Pattern Matcher (xm_pattern)
This module parses unstructured data using an NXLog pattern definition. xm_pattern is designed for efficient, non-linear pattern-matching and is more efficient than processing logs against regular expressions defined in an Exec directive. In addition, the module automatically reprioritizes patterns according to your logs to further improve processing speed.
NXLog patterns are defined in XML format and support matching multiple fields. Therefore, unlike traditional pattern-matching techniques, you do not need to convert log records into a single string. The format also supports pattern groups, in which you define multiple patterns in a block. If a log record matches the group criteria, the module continues to process the patterns in the group. Otherwise, it skips the entire group. See the examples below for a practical application of pattern groups.
When a log record matches a pattern, xm_pattern adds two fields: $PatternID and $PatternName. You can use these fields for conditional processing and correlation further down the pipeline, for example, to process with pm_evcorr.
xm_pattern stops processing patterns after the first match. Therefore, a log record can only match one pattern. As a result, we recommend that you avoid defining patterns that match the same subset of logs. For example, if you define two regular expression patterns,
The pattern database XML Schema Definition (XSD) is available in our public git repository.
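Because a log record is tagged by the first pattern it matches, one way to avoid overlapping patterns is to make their regular expressions mutually exclusive. The fragment below is a minimal sketch under stated assumptions, not a definitive implementation: the group and pattern IDs, the pattern names, and both regular expressions are hypothetical, and the sshd message format is assumed to be the one used in the examples on this page. It also assumes that a pattern without capturedfield or set elements is acceptable when only $PatternID and $PatternName are needed. The negative lookahead `(?!invalid user )` is standard PCRE2 syntax.

```xml
<!-- Hypothetical group: IDs, names, and regular expressions are illustrative only -->
<group>
  <id>1</id>
  <name>ssh</name>
  <matchfield>
    <name>SourceName</name>
    <type>exact</type>
    <value>sshd</value>
  </matchfield>
  <!-- Matches only failed logins for unknown accounts -->
  <pattern>
    <id>1</id>
    <name>ssh failure, invalid user</name>
    <matchfield>
      <name>Message</name>
      <type>regexp</type>
      <value>^Failed \S+ for invalid user \S+ from</value>
    </matchfield>
  </pattern>
  <!-- The negative lookahead rejects any message that continues with "invalid user ",
       so this pattern cannot match the records covered by pattern 1 -->
  <pattern>
    <id>2</id>
    <name>ssh failure, known user</name>
    <matchfield>
      <name>Message</name>
      <type>regexp</type>
      <value>^Failed \S+ for (?!invalid user )\S+ from</value>
    </matchfield>
  </pattern>
</group>
```

With expressions written this way, the result does not depend on which of the two patterns the module happens to evaluate first.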
Configuration
The xm_pattern module accepts the following directives in addition to the common module directives.
Required directives
The following directives are required for the module to start.
PatternFile | Specify the path to the pattern database file. |
Functions
The following functions are exported by xm_pattern.
- boolean match_pattern()
  Execute the match_pattern() procedure. If the event is successfully matched, return TRUE, otherwise FALSE.
Procedures
The following procedures are exported by xm_pattern.
- match_pattern();
  Attempt to match the current event according to the PatternFile. Execute statements and add fields as specified.
Pattern database schema
The following table lists the XML elements used to create the pattern database.
| Element | Description |
|---|---|
| capturedfield | Defines a field captured from a regular expression. |
| | Defines a captured field and value. |
| created | Defines the creation date of the pattern database. |
| | Used to add comments to the schema. |
| exec | Used to execute commands in the NXLog language. |
| field | Defines a field name and value. |
| group | Defines a group of related patterns. |
| id | The unique identifier of the parent element. |
| matchfield | Used to define matching criteria for a field. |
| name | The name of the parent element or a field. |
| pattern | Used to define matching criteria. |
| patterndb | The top-level element of the pattern database. |
| set | Defines fields and values to be set if the event matches the pattern. |
| | Defines a field and value to match in a pattern. |
| type | Used to define the data type of a field. |
| value | Used to define the value of a field. |
| version | Defines the version of the pattern database. |
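To show how these elements nest, and how a single pattern can test more than one field, here is a minimal sketch of a pattern database. It is an illustration under stated assumptions rather than a reference layout: the group name, pattern name, host name `web01`, the regular expression, and the `TaxonomyAction` value are hypothetical, and it assumes that a pattern matches only when all of its matchfield elements match. The $Hostname, $SourceName, and $Message fields are assumed to have been set earlier, for example by parse_syslog().

```xml
<?xml version='1.0' encoding='UTF-8'?>
<patterndb>
  <created>2018-01-01 01:02:03</created>
  <version>4</version>
  <group>
    <id>1</id>
    <name>sudo</name>
    <!-- Group criterion: skip the whole group unless $SourceName is "sudo" -->
    <matchfield>
      <name>SourceName</name>
      <type>exact</type>
      <value>sudo</value>
    </matchfield>
    <pattern>
      <id>1</id>
      <name>sudo command on web01</name>
      <!-- First criterion: exact match on $Hostname (hypothetical host name) -->
      <matchfield>
        <name>Hostname</name>
        <type>exact</type>
        <value>web01</value>
      </matchfield>
      <!-- Second criterion: regular expression on $Message, no capture groups -->
      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>COMMAND=</value>
      </matchfield>
      <!-- Field added to records that satisfy both criteria (hypothetical value) -->
      <set>
        <field>
          <name>TaxonomyAction</name>
          <type>STRING</type>
          <value>execute</value>
        </field>
      </set>
    </pattern>
  </group>
</patterndb>
```

Validating a file like this against the published XSD before deploying it is a reasonable way to confirm that the element order and types are accepted.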
Examples
This configuration reads syslog messages from a file and parses them with parse_syslog(). It then processes log records with the match_pattern() procedure and a corresponding pattern database file to enrich SSH authentication success and failure events. The patterns use the $SourceName and $Message fields. Therefore, the log records must be parsed into structured data before processing them against the pattern database.
<Extension syslog>
    Module xm_syslog
</Extension>
<Extension pattern>
    Module xm_pattern
    PatternFile 'modules/extension/pattern/pattern.xml'
</Extension>
<Input system_messages>
    Module im_file
    File '/var/log/syslog'
    <Exec>
        parse_syslog();
        match_pattern();
    </Exec>
</Input>
The following pattern database contains two patterns to match SSH authentication messages. The patterns are under a group named ssh, which checks whether the $SourceName field is sshd. If it is, the log record is processed further with the patterns defined in the group. Both patterns extract the $AuthMethod, $AccountName, and $SourceIP4Address fields from the message. In addition, they enrich matching records with the $TaxonomyStatus and $TaxonomyAction fields. The second pattern also uses an Exec block to add further fields to log records that match.
For the full syntax and semantics of PCRE2-compliant regular expressions, please see the pcre2pattern documentation.
The number of fields captured by the regular expression must equal the number of capturedfield elements. Otherwise, parsing will terminate.
<?xml version='1.0' encoding='UTF-8'?>
<patterndb>
  <created>2018-01-01 01:02:03</created>
  <version>4</version>
  <!-- First pattern group in this file -->
  <group>
    <id>1</id>
    <name>ssh</name>
    <!-- Only try to match this group if $SourceName == "sshd" -->
    <matchfield>
      <name>SourceName</name>
      <type>exact</type>
      <value>sshd</value>
    </matchfield>
    <!-- First pattern in this pattern group -->
    <pattern>
      <id>1</id>
      <name>ssh auth success</name>
      <!-- Do regular expression match on $Message field -->
      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Accepted (\S+) for (\S+) from (\S+) port \d+ ssh2</value>
        <!-- Set 3 event record fields from captured strings -->
        <capturedfield>
          <name>AuthMethod</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>IP4ADDR</type>
        </capturedfield>
      </matchfield>
      <!-- Set additional fields if pattern matches -->
      <set>
        <field>
          <name>TaxonomyStatus</name>
          <type>STRING</type>
          <value>success</value>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <type>STRING</type>
          <value>authenticate</value>
        </field>
      </set>
    </pattern>
    <!-- Second pattern in this pattern group -->
    <pattern>
      <id>2</id>
      <name>ssh auth failure</name>
      <!-- Do regular expression match on $Message field -->
      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Failed (\S+) for invalid user (\S+) from (\S+) port \d+ ssh2</value>
        <!-- Set 3 event record fields from captured strings -->
        <capturedfield>
          <name>AuthMethod</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>IP4ADDR</type>
        </capturedfield>
      </matchfield>
      <!-- Set additional fields if pattern matches -->
      <set>
        <field>
          <name>TaxonomyStatus</name>
          <type>STRING</type>
          <value>failure</value>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <type>STRING</type>
          <value>authenticate</value>
        </field>
      </set>
      <!-- Exec block which is evaluated when the pattern matches -->
      <exec>
        $TestField = 'test';
        $TestField = $TestField + 'value';
      </exec>
    </pattern>
  </group>
</patterndb>
This example is the same as the previous one and uses the same pattern database, but it uses the match_pattern() function to discard any event that does not match any of the patterns defined in the pattern database.
<Extension syslog>
    Module xm_syslog
</Extension>
<Extension pattern>
    Module xm_pattern
    PatternFile 'modules/extension/pattern/pattern.xml'
</Extension>
<Input system_messages>
    Module im_file
    File '/var/log/syslog'
    <Exec>
        parse_syslog();
        if not match_pattern() drop();
    </Exec>
</Input>
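The pattern database used in these examples relies on the rule that the number of fields captured by a regular expression must equal the number of capturedfield elements. The fragment below is a hedged sketch of one way to keep the two in step when part of the expression needs grouping but should not produce a field. The regular expression and the choice of alternatives (`password`, `publickey`) are hypothetical. The non-capturing group syntax `(?:...)` is standard PCRE2, and the field names and types are the ones already used in the pattern database above.

```xml
<!-- Hypothetical matchfield: two capturing groups, therefore two capturedfield elements -->
<matchfield>
  <name>Message</name>
  <type>regexp</type>
  <!-- (?:password|publickey) groups the alternatives without capturing,
       so it does not count toward the number of captured fields -->
  <value>^Accepted (?:password|publickey) for (\S+) from (\S+) port \d+ ssh2</value>
  <!-- First capturing group -->
  <capturedfield>
    <name>AccountName</name>
    <type>STRING</type>
  </capturedfield>
  <!-- Second capturing group -->
  <capturedfield>
    <name>SourceIP4Address</name>
    <type>IP4ADDR</type>
  </capturedfield>
</matchfield>
```

Counting the plain `(` groups in the value element and comparing that number with the capturedfield elements that follow is a quick manual check before loading the file.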