Categories
Hints and Tips Logstash Regular Expressions Ruby

What’s in a word? (\w regexp shorthand class)

Well, not just letters of the alphabet, it seems.

Take the case of the logstash pattern WORD:

WORD \b\w+\b

But the shorthand character class \w matches [a-zA-Z0-9_] – notice the digits and underscore! So WORD is not really a WORD!

REALWORD \b[a-zA-Z]+\b

would be better … although things may differ under Unicode, where \w can match considerably more. Generally, though, log files may be Unicode-encoded while the data itself is frequently still effectively ASCII.
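A quick perl demonstration of the difference (the same character classes apply inside grok patterns):

#!/usr/bin/perl
use strict;
use warnings;

# \w takes in digits and underscores; [a-zA-Z] does not
for my $token ('hello', 'user_42', 'ABC123') {
    my @matches;
    push @matches, 'WORD'     if $token =~ /^\w+$/;
    push @matches, 'REALWORD' if $token =~ /^[a-zA-Z]+$/;
    print "$token => @matches\n";
}

# Output:
# hello => WORD REALWORD
# user_42 => WORD
# ABC123 => WORD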

Categories
Hints and Tips One-liners Perl SQL Server Windows

Opening UTF-16LE files in Perl

A useful code snippet to keep handy:

open my $fh, '<:raw:perlio:encoding(UTF-16LE):crlf', $filename

which will convert CR/LF combinations to LF only. Alternatively, to keep them intact:

open my $fh, '<:raw:perlio:encoding(UTF-16LE)', $filename

Useful for reading Windows registry export files, SQL Server log export files, etc.
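For example, here's a minimal sketch that lists the key names in a registry export (the file name is illustrative):

use strict;
use warnings;

my $filename = 'hklm_software.reg';   # illustrative - any regedit export

# regedit exports are UTF-16LE with CR/LF line endings
open my $fh, '<:raw:perlio:encoding(UTF-16LE):crlf', $filename
    or die "Cannot open $filename: $!";

while ( my $line = <$fh> ) {
    chomp $line;
    print "$line\n" if $line =~ /^\[/;   # registry key names
}

close $fh;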

Categories
Elasticsearch ELK Kibana Peoplesoft Perl SQL Server

Visualizing SQL Server Logspace Usage with Elasticsearch, Kibana and Perl

There are many possible ways to collect logspace usage data from SQL Server – this was a quick one using the tools I had at hand: DBCC, perl, Elasticsearch and Kibana.

All I did was capture the output of:

DBCC SQLPERF(logspace)

into a temporary table with a simple perl script using DBI. I then ran a SELECT against the temporary table in the same script and posted the resulting data into ES using the Search::Elasticsearch CPAN module. The ES index was very simple – just four fields: database, logspace, logspaceused (%) and the timestamp of the capture (GETDATE()).
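Here's a minimal sketch of the approach – the DSN, index and type names are illustrative assumptions, not the exact original script, and mappings/error handling are omitted for brevity:

use strict;
use warnings;
use DBI;
use Search::Elasticsearch;

my $dbh = DBI->connect( 'dbi:ODBC:MYSQLSERVER', '', '' ) or die $DBI::errstr;
my $es  = Search::Elasticsearch->new( nodes => ['localhost:9200'] );

# DBCC SQLPERF(logspace) returns: database name, log size (MB),
# log space used (%) and a status column
$dbh->do(q{
    CREATE TABLE #logspace (
        databasename sysname,
        logsize      float,
        logspaceused float,
        status       int
    )
});
$dbh->do(q{ INSERT INTO #logspace EXEC('DBCC SQLPERF(logspace)') });

my $rows = $dbh->selectall_arrayref(
    q{ SELECT databasename, logsize, logspaceused, GETDATE() FROM #logspace }
);

# One ES document per database per capture
for my $r (@$rows) {
    $es->index(
        index => 'logspace',
        type  => 'usage',
        body  => {
            database     => $r->[0],
            logspace     => $r->[1],
            logspaceused => $r->[2],
            timestamp    => $r->[3],
        },
    );
}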

After that, all I had to do was visualize the data using Kibana. Here’s some sample output:

[Graph: Logspace (%) usage over time]

A great way to see the log space pressure points, which can easily be tied back to the specific batch processes running at those times.

Categories
Hints and Tips Java Peoplecode

Ceil/Ceiling function in PeopleCode

I just used the Java Math library's ceil function from PeopleCode to solve the “round up to the nearest 0.5” problem, e.g.

Local JavaObject &mathclass;
Local number &number_to_round, &result;

/* Instantiate the java.lang.Math class */
&mathclass = GetJavaClass("java.lang.Math");

For &number_to_round = 0.1 To 2.0 Step 0.1
   /* ceil(x * 2) / 2 rounds x up to the next multiple of 0.5 */
   &result = &mathclass.ceil(&number_to_round * 2) / 2;
   MessageBox(0, "", 0, 0, "Number to Round: " | &number_to_round | " Result: " | &result);
End-For;
Categories
App Engine Languages Peoplesoft PeopleTools Perl Process Scheduler SQR

Perl and PeopleSoft

Way back in 1998 I was implementing PeopleSoft Financials 7.5 for a UK Charity. SQR and Application Engine (the COBOL version back then) were the only options available in the PeopleSoft toolset for updating the database. Other than straight SQL updates in SQL*Plus, of course!

Whilst SQR was an OK tool, I always felt it lacked so many capabilities. In fact, at that point it could not even read a CSV file – I had to code a user DLL in C to achieve even that. All very frustrating.

Having rescued various projects using perl scripts prior to this, I decided I would add perl as an available language to the Process Scheduler. Taking the SQR include files for the Process Scheduler API as an example, I emulated the same approach in perl. It worked brilliantly and allowed me to add some sophisticated features to PeopleSoft, including:

  • SQL and query output to CSV and XLS formats (remember this was prior to the PeopleSoft Internet Architecture) through the Spreadsheet::WriteExcel and DBD::CSV CPAN modules
  • User defined SFTP/FTP/SCP file transfers to and from third-party systems
  • Bank Statement loads by encapsulating mainframe remote access software into process scheduler jobs
  • Exchange rate loading via Website “screen scraping”
  • Spreadsheet Aged Debt reporting
  • Fuzzy duplicate customer identification/matching
  • Automatic customer identification in Accounts Receivable

Here’s the start of one such perl script from 2004:

#!/usr/bin/perl 
#
# This is a perl routine to find possible matches for originator's
# sort code and bank account by looking to find possible customers.
#
# (1) Fetch the list of bank statement entries.
# (2) Try to find customer like this.
#
# Author: XXX
# Date : 29th January 2004.
#
# Amendment History
# -----------------
# 29-JAN-2004 XXX First version
#
#$debug = 1;
use lib 'h:\perl';
use strict;
use DBI;
use Spreadsheet::WriteExcel::Big;
use String::Approx qw(amatch);
require 'prcsapi.pl';
use Date::Calc qw(Delta_Days);

# Command-line parameters resolved by prcsapi.pl
our ($dbtype, $dbname, $accessid, $accesspswd);

my $row = 0;

#
# Connect to database using parameters resolved from command line
#
my $dbh = DBI->connect( "dbi:$dbtype:$dbname", $accessid, $accesspswd ) or die $DBI::errstr;

The require of prcsapi.pl brings in all the subroutines needed for the Process Scheduler API. Updating the process scheduler status is then simply a call to the appropriate API function:

Update_Process_Status($prcs_run_status_processing,'Processing has started.');
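prcsapi.pl itself is too long to reproduce here, but conceptually Update_Process_Status just wraps a SQL update against the Process Scheduler request table – something like this sketch (simplified from memory, not the actual code):

# Sketch only - the real prcsapi.pl does rather more than this
sub Update_Process_Status {
    my ($run_status, $message) = @_;

    # Process Scheduler tracks each request's state in PSPRCSRQST.RUNSTATUS;
    # $prcs_process_instance is resolved from the command line at startup
    $dbh->do(
        q{ UPDATE PSPRCSRQST SET RUNSTATUS = ? WHERE PRCSINSTANCE = ? },
        undef, $run_status, $prcs_process_instance
    );
    $dbh->commit;
}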

More recently, I have taken a similar approach for ruby … more on that later.

Enjoy.

Categories
Java Languages Peoplesoft PeopleTools Tuning Weblogic

Weblogic Java VM Memory Parameters

When increasing the Java VM memory parameters for a PeopleSoft system under Weblogic (either in setEnv.cmd or manually by editing the cmdline registry entry if that’s your “thing”), be very careful not to increase the -XX:MaxPermSize memory allocation too much.

In fact, I would leave it at 128MB or perhaps 256MB. Whatever you do, don't increase all three memory parameters (-Xms, -Xmx and -XX:MaxPermSize) in step, especially on a 32-bit Java, as you will hit Out of Memory errors on Weblogic startup far earlier than you expect. The MaxPermSize memory is allocated outside of the heap, so the total memory you potentially need is more than you might think.
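For example, in setEnv.cmd (values illustrative – the exact variable name varies between PeopleTools releases):

REM Heap can grow with user load, but keep MaxPermSize modest -
REM it is allocated on top of -Xmx, not within it
SET JAVA_OPTIONS_WIN=-server -Xms512m -Xmx512m -XX:MaxPermSize=128m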

Categories
Big Data Elasticsearch ELK Java Javascript Kibana Languages Logstash nx-log Ruby

ELK and PeopleSoft

I have spent some time looking into Elasticsearch, Logstash and Kibana (ELK) for analysis of PeopleSoft web, application and process scheduler log files.

Whilst commercial solutions exist that can be configured to do this, they all seem somewhat overpriced for a relatively common and essentially simple problem – log file shipping, consolidation/aggregation and analysis. This is where ELK steps in … bringing a mix of Java, Ruby and Javascript to the party.

IMHO, ELK runs best on flavours of Unix – Linux, FreeBSD or even Solaris. I have also found that the most effective solution for servers running Windows is to ship the logs, after some simple pre-processing, to a number of logstash processes on Linux, using NXLog running as a Windows service. This reduces the CPU load on the Windows servers so they can get on with their primary functions. Check out the NXLog Community Edition for more details.

Determining the parsing rules for the various log file formats is probably the most difficult part. Provided you are reasonably familiar with both the data and regular expression matching, you should have no problem understanding and transforming your data into a format that is easy to visualise in Kibana.
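As a starting point, a logstash filter for a log format is typically just a grok match plus a date filter. Here's a minimal sketch – the pattern and field names are illustrative, not tied to any particular PeopleSoft log format:

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:severity} %{GREEDYDATA:detail}" }
  }
  date {
    match => [ "timestamp", "ISO8601" ]
  }
}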

However, when you hit any significant data volumes you really need to look carefully at the system settings for each component. Elasticsearch scales very well, but performs best when given plenty of memory.
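For example, in older Elasticsearch releases (the 1.x era) the heap is set via the ES_HEAP_SIZE environment variable – the value below is illustrative:

# e.g. in /etc/default/elasticsearch or /etc/sysconfig/elasticsearch
# Rule of thumb: around half the physical RAM, but below ~31GB
ES_HEAP_SIZE=8g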

Here’s a simple example – the body of an im_file <Input> block – from an nxlog.conf file on Windows:

Module im_file
#SavePos FALSE
#ReadFromLast FALSE
File 'I:\LOGS\PROD\APPSRV_*.LOG'
InputType multiline
Exec convert_fields("AUTO","utf-8");
Exec $filename = file_basename(file_name());
Exec $filedir = file_dirname(file_name());
Exec if $raw_event =~ /(GetCertificate|_dflt|Token authentication succeeded|PSJNI:|GetNextNumberWithGaps|RunAe|Switching to new log file|PublishSubscribe|Token=)/ { drop();};
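# NB: $3 ($server_ip) and $5 ($domain) below assume a longer, site-specific path pattern than the simplified one shown here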
Exec if $filedir =~ /\\(appserv|appserv\\prcs)\\([A-Z0-9\-]+)\\LOGS/ { $stack = $1; $server_name = $2; $server_ip = $3; $domain = $5;};
Exec $server_ip =~ s/_/./g;
Exec $host = $server_ip;
Exec if $raw_event =~ /([A-Za-z0-9\-_]+)@(\d+\.\d+\.\d+\.\d+)/ { $oprid = $1; $client_ip = $2;};
Exec if $raw_event =~ /^([A-Za-z_0-9]+)\.(\d+) \((\d+)\) \[(\d{2}.\d{2}.\d{2} \d{2}:\d{2}:\d{2})/ { $server_process = $1; $pid = $2; $task_no = $3; $datestamp = $4; };
Exec delete($EventReceivedTime); 
Exec delete($filedir); 
Exec delete($filename); 
Exec delete($SourceModuleType);
Exec $message = $raw_event;
Exec $message =~ s/^.*?\]//;
Exec $message =~ s/^\(\d+\)\s+//;
Exec to_json();

This is just an example showing some reasonable nxlog directives to pre-process the PeopleSoft Application Server logs into a consistent and usable format. Some of the regular expressions are specific to my use case, but they illustrate a few simple techniques you may find handy.