Pattern Matcher (xm_pattern)

This module parses unstructured data using an NXLog pattern definition. xm_pattern is designed for non-linear pattern matching and is more efficient than evaluating regular expressions defined in an Exec directive. In addition, the module automatically reprioritizes patterns according to your logs to further improve processing speed.

NXLog patterns are defined in XML format and support matching multiple fields. Therefore, unlike traditional pattern-matching techniques, you do not need to convert log records into a single string. The pattern database also supports pattern groups, which let you define multiple patterns in a block. If a log record matches the group criteria, the module continues with the patterns in the group. Otherwise, it skips the entire group. See the examples below for a practical application of pattern groups.

When a log record matches a pattern, xm_pattern adds two fields: $PatternID and $PatternName. You can use these fields for conditional processing and correlation further down the pipeline, for example, with pm_evcorr.
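
As a minimal sketch of such conditional processing, the Exec block below branches on $PatternName after calling match_pattern(). It assumes the pattern name ssh auth failure from the pattern database used in the examples of this section. The instance name auth_events and the $Alert field are hypothetical and chosen only for illustration.

```config
<Extension syslog>
    Module         xm_syslog
</Extension>

<Extension pattern>
    Module         xm_pattern
    PatternFile    'modules/extension/pattern/pattern.xml'
</Extension>

<Input auth_events>
    Module         im_file
    File           '/var/log/syslog'
    <Exec>
        parse_syslog();
        match_pattern();
        # $PatternName is added only when a pattern matched,
        # so check that it is defined before comparing it
        if defined($PatternName) and $PatternName == 'ssh auth failure'
        {
            # Hypothetical field, used here only to mark the event
            $Alert = TRUE;
        }
    </Exec>
</Input>
```

The same kind of check could be written against $PatternID if you prefer to compare against the numeric identifier instead of the name.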

xm_pattern stops processing patterns after the first match, so a log record can match only one pattern. We therefore recommend that you avoid defining patterns that match the same subset of logs.

For example, if you define two regular expression patterns, ^\d+ and ^\d\d, a log record will match either one or the other, but not both. There are no rules about which pattern it will match because xm_pattern reorders the priority of patterns according to the highest match count. If you require linear pattern matching, you are better off using regular expressions in an Exec directive.
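
The sketch below shows what linear matching in an Exec directive could look like for the two expressions above. The conditions are evaluated in the order written, so the more specific expression is listed first. The instance name, the file path, and the $NumberClass field are hypothetical.

```config
<Input linear_match>
    Module    im_file
    File      '/var/log/app.log'
    <Exec>
        # Conditions are tested top to bottom, so the order is fixed
        if $raw_event =~ /^\d\d/
        {
            # The record starts with at least two digits
            $NumberClass = 'two or more leading digits';
        }
        else if $raw_event =~ /^\d+/
        {
            # Reached only when the first test failed,
            # so the record starts with exactly one digit
            $NumberClass = 'one leading digit';
        }
    </Exec>
</Input>
```

Unlike xm_pattern, this approach never reorders the tests, which makes the outcome predictable at the cost of evaluating each expression in turn.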

The pattern database XML Schema Definition (XSD) is available in our public git repository.

Configuration

The xm_pattern module accepts the following directives in addition to the common module directives.

Required directives

The following directives are required for the module to start.

PatternFile

Specify the path to the pattern database file.

Functions

The following functions are exported by xm_pattern.

boolean match_pattern()

Execute the match_pattern() procedure. Return TRUE if the event matches a pattern, FALSE otherwise.

Procedures

The following procedures are exported by xm_pattern.

match_pattern();

Attempt to match the current event according to the PatternFile. Execute statements and add fields as specified.

Fields

The following fields are used by xm_pattern.

$PatternID (type: integer)

The ID of the pattern that matched the event.

$PatternName (type: string)

The name of the pattern that matched the event.

Pattern database schema

The following table lists the XML elements used to create the pattern database.

Element          Description
capturedfield    Defines a field captured from a regular expression.
capturedvalue    Defines a captured field and value.
created          Defines the creation date of the pattern database.
description      Used to add comments to the schema.
exec             Used to execute commands in the NXLog language.
field            Defines a field name and value.
group            Defines a group of related patterns.
id               The unique identifier of the parent element.
matchfield       Used to define matching criteria for a field.
name             The name of the parent element or a field.
pattern          Used to define matching criteria.
patterndb        The top-level element of the pattern database.
set              Defines fields and values to be set if the event matches the pattern.
testcase         Defines a field and value to match in a pattern.
type             Used to define the data type of a field.
value            Used to define the value of a field.
version          Defines the version of the pattern database.

Examples

Example 1. Using the match_pattern() procedure

This configuration reads syslog messages from a file and parses them with parse_syslog(). It then processes log records with the match_pattern() procedure and a corresponding pattern database file to enrich SSH authentication success and failure events. The patterns use the $SourceName and $Message fields. Therefore, the log records must be parsed into structured data before processing them against the pattern database.

nxlog.conf
<Extension syslog>
    Module         xm_syslog
</Extension>

<Extension pattern>
    Module         xm_pattern
    PatternFile    'modules/extension/pattern/pattern.xml'
</Extension>

<Input system_messages>
    Module         im_file
    File           '/var/log/syslog'
    <Exec>
        parse_syslog();
        match_pattern();
    </Exec>
</Input>

The following pattern database contains two patterns to match SSH authentication messages. The patterns are under a group named ssh, which checks whether the $SourceName field is sshd. If it is, the log record is processed further with the patterns defined in the group. Both patterns extract the $AuthMethod, $AccountName, and $SourceIP4Address fields from the message. In addition, they enrich matching records with the $TaxonomyStatus and $TaxonomyAction fields. The second pattern also uses an Exec block to add further fields to log records that match.

For the full syntax and semantics of PCRE2-compliant regular expressions, please see the pcre2pattern documentation.

The number of fields captured by the regular expression must equal the number of capturedfield elements. Otherwise, parsing will terminate.

pattern.xml
<?xml version='1.0' encoding='UTF-8'?>
<patterndb>
  <created>2018-01-01 01:02:03</created>
  <version>4</version>
  <!-- First pattern group in this file -->
  <group>
    <id>1</id>
    <name>ssh</name>
    <!-- Only try to match this group if $SourceName == "sshd" -->
    <matchfield>
      <name>SourceName</name>
      <type>exact</type>
      <value>sshd</value>
    </matchfield>
    <!-- First pattern in this pattern group -->
    <pattern>
      <id>1</id>
      <name>ssh auth success</name>
      <!-- Do regular expression match on $Message field -->
      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Accepted (\S+) for (\S+) from (\S+) port \d+ ssh2</value>
        <!-- Set 3 event record fields from captured strings -->
        <capturedfield>
          <name>AuthMethod</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>IP4ADDR</type>
        </capturedfield>
      </matchfield>
      <!-- Set additional fields if pattern matches -->
      <set>
        <field>
          <name>TaxonomyStatus</name>
          <type>STRING</type>
          <value>success</value>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <type>STRING</type>
          <value>authenticate</value>
        </field>
      </set>
    </pattern>
    <!-- Second pattern in this pattern group -->
    <pattern>
      <id>2</id>
      <name>ssh auth failure</name>
      <!-- Do regular expression match on $Message field -->
      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Failed (\S+) for invalid user (\S+) from (\S+) port \d+ ssh2</value>
        <!-- Set 3 event record fields from captured strings -->
        <capturedfield>
          <name>AuthMethod</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>IP4ADDR</type>
        </capturedfield>
      </matchfield>
      <!-- Set additional fields if pattern matches -->
      <set>
        <field>
          <name>TaxonomyStatus</name>
          <type>STRING</type>
          <value>failure</value>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <type>STRING</type>
          <value>authenticate</value>
        </field>
      </set>
      <!-- Exec block which is evaluated when the pattern matches -->
      <exec>
        $TestField = 'test';
        $TestField = $TestField + 'value';
      </exec>
    </pattern>

  </group>

</patterndb>

Example 2. Using the match_pattern() function

This example uses the same configuration and pattern database as the previous one, but it calls the match_pattern() function instead of the procedure to discard any event that does not match a pattern in the pattern database.

nxlog.conf
<Extension syslog>
    Module         xm_syslog
</Extension>

<Extension pattern>
    Module         xm_pattern
    PatternFile    'modules/extension/pattern/pattern.xml'
</Extension>

<Input system_messages>
    Module         im_file
    File           '/var/log/syslog'
    <Exec>
        parse_syslog();
        if not match_pattern() drop();
    </Exec>
</Input>