Pattern Matcher (xm_pattern)
This module makes it possible to execute pattern matching with a pattern database file in XML format. Using xm_pattern is more efficient than having NXLog regular expression rules listed in Exec directives, because it was designed in such a way that patterns do not need to be matched linearly. Regular expression sub-capturing can be used to set additional fields in the event record and arbitrary fields can be added under the scope of a pattern match for message classification. In addition, the module does an automatic on-the-fly pattern reordering internally for further speed improvements.
To examine the supported platforms, see the list of installer packages in the Available Modules chapter.
There are other techniques, such as the radix tree, which solve the linearity problem; the drawback is that these usually require the user to learn a special syntax for specifying patterns. If the log message is already parsed and is not treated as a single-line message, it is possible to process only a subset of the patterns, which partially solves the linearity problem. With the other performance improvements employed within the xm_pattern module, its speed is comparable to that of the other techniques. Yet the xm_pattern module uses regular expressions, which are familiar to users and can easily be migrated from other tools.
Traditionally, pattern matching on log messages has employed a technique where the log message was one string and the pattern (regular expression or radix tree based pattern) was executed against it. To match patterns against logs which contain structured data (such as the Windows EventLog), this structured data (the fields of the log) must be converted to a single string. This is a simple but inefficient method used by many tools.
The NXLog patterns defined in the XML pattern database file can contain more than one field. This allows multi-dimensional pattern matching. Thus with NXLog’s xm_pattern module there is no need to convert all fields into a single string as it can work with multiple fields.
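As an illustration of matching on more than one field, the fragment below sketches a single pattern with two matchfield blocks. It uses only elements that appear in the pattern database example in this section. The id, the pattern name, the $Hostname field, and the value gateway01 are hypothetical, and the sketch assumes that every matchfield block of a pattern must match for the pattern to match.

```xml
<!-- Sketch only: id, name, Hostname, and gateway01 are hypothetical. -->
<pattern>
  <id>100</id>
  <name>ssh auth success on gateway</name>
  <!-- First field: exact comparison on an already parsed field. -->
  <matchfield>
    <name>Hostname</name>
    <type>exact</type>
    <value>gateway01</value>
  </matchfield>
  <!-- Second field: regular expression on the message text. -->
  <matchfield>
    <name>Message</name>
    <type>regexp</type>
    <value>^Accepted \S+ for (\S+) from \S+ port \d+ ssh2</value>
    <!-- The single capture group is stored in $AccountName. -->
    <capturedfield>
      <name>AccountName</name>
      <type>string</type>
    </capturedfield>
  </matchfield>
</pattern>
```

Because each condition is checked against its own field, nothing has to be concatenated into a single string before matching.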
Patterns can be grouped together under pattern groups. Pattern groups serve an optimization purpose. A group can have an optional matchfield block which checks a condition. If the condition (such as $SourceName matching sshd) is satisfied, the xm_pattern module will descend into the group and check each pattern against the log. If the pattern group's condition did not match ($SourceName was not sshd), the module can skip all patterns in the group without having to check each pattern individually.
When the xm_pattern module finds a matching pattern, the $PatternID and $PatternName fields are set on the log message. These can be used later in conditional processing and in the correlation rules of the pm_evcorr module, for example.
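The xm_pattern module does not process all patterns. It exits after the first matching pattern is found, so at most one pattern can match a log message. Multiple patterns that can match the same subset of logs should be avoided. For example, if two regular expression patterns can both match the same message, only one of them will be applied, and because the module reorders patterns internally, which one is applied may not be consistent.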
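One way to avoid overlap is to write the regular expressions so that no message can satisfy two of them. The following group is a minimal sketch of that idea, not a recommended rule set: the group and pattern names and all ids are hypothetical. The second expression cannot match an "invalid user" message because it requires the word "from" directly after the account name.

```xml
<!-- Sketch only: names and ids are hypothetical. -->
<group>
  <name>ssh failures</name>
  <id>2</id>
  <pattern>
    <id>10</id>
    <name>ssh auth failure, unknown account</name>
    <matchfield>
      <name>Message</name>
      <type>regexp</type>
      <!-- Matches only messages containing "for invalid user NAME from". -->
      <value>^Failed \S+ for invalid user (\S+) from \S+</value>
      <capturedfield>
        <name>AccountName</name>
        <type>string</type>
      </capturedfield>
    </matchfield>
  </pattern>
  <pattern>
    <id>11</id>
    <name>ssh auth failure, known account</name>
    <matchfield>
      <name>Message</name>
      <type>regexp</type>
      <!-- Matches only messages containing "for NAME from". -->
      <value>^Failed \S+ for (\S+) from \S+</value>
      <capturedfield>
        <name>AccountName</name>
        <type>string</type>
      </capturedfield>
    </matchfield>
  </pattern>
</group>
```

With disjoint expressions like these, the result does not depend on the order in which the module happens to try the patterns.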
The XML Schema Definition (XSD) for the pattern database file is available in the nxlog-public/contrib repository.
The following procedures are exported by xm_pattern.
This configuration reads Syslog messages from a file and parses them with parse_syslog(). The events are then further processed with a pattern file and the corresponding match_pattern() procedure to add additional fields to SSH authentication success or failure events. The matching is done against the $SourceName and $Message fields, so the Syslog parsing must be performed before the pattern matching will work.
<Extension _syslog>
    Module        xm_syslog
</Extension>

<Extension pattern>
    Module        xm_pattern
    PatternFile   'modules/extension/pattern/patterndb2-3.xml'
</Extension>

<Input in>
    Module        im_file
    File          'test2.log'
    <Exec>
        parse_syslog();
        match_pattern();
    </Exec>
</Input>
The following pattern database contains two patterns to match SSH authentication messages. The patterns are under a group named ssh which checks whether the $SourceName field is sshd and only tries to match the patterns if the logs are indeed from sshd. Both patterns extract the $AuthMethod, $AccountName, and $SourceIP4Address fields from the log message when the pattern matches the log. Additionally, $TaxonomyStatus and $TaxonomyAction are set. The second pattern shows an Exec block example, which is evaluated when the pattern matches.
<?xml version='1.0' encoding='UTF-8'?>
<patterndb>
  <created>2018-01-01 01:02:03</created>
  <version>4</version>
  <group>
    <name>ssh</name>
    <id>1</id>
    <matchfield>
      <name>SourceName</name>
      <type>exact</type>
      <value>sshd</value>
    </matchfield>
    <pattern>
      <id>1</id>
      <name>ssh auth success</name>
      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Accepted (\S+) for (\S+) from (\S+) port \d+ ssh2</value>
        <capturedfield>
          <name>AuthMethod</name>
          <type>string</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>string</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>ipaddr</type>
        </capturedfield>
      </matchfield>
      <set>
        <field>
          <name>TaxonomyStatus</name>
          <value>success</value>
          <type>string</type>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <value>authenticate</value>
          <type>string</type>
        </field>
      </set>
    </pattern>
    <pattern>
      <id>2</id>
      <name>ssh auth failure</name>
      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Failed (\S+) for invalid user (\S+) from (\S+) port \d+ ssh2</value>
        <capturedfield>
          <name>AuthMethod</name>
          <type>string</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>string</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>ipaddr</type>
        </capturedfield>
      </matchfield>
      <set>
        <field>
          <name>TaxonomyStatus</name>
          <value>failure</value>
          <type>string</type>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <value>authenticate</value>
          <type>string</type>
        </field>
      </set>
      <exec>
        $TestField = 'test';
        $TestField = $TestField + 'value';
      </exec>
    </pattern>
  </group>
</patterndb>
This configuration uses match_pattern() as a condition and drops every event that does not match a pattern.

<Extension _syslog>
    Module        xm_syslog
</Extension>

<Extension pattern>
    Module        xm_pattern
    PatternFile   modules/extension/pattern/patterndb2-3.xml
</Extension>

<Input in>
    Module        im_file
    File          'test2.log'
    <Exec>
        parse_syslog();
        if not match_pattern() drop();
    </Exec>
</Input>