NXLog Docs

Pattern Matcher (xm_pattern)

This module performs pattern matching using a pattern database file in XML format. Using xm_pattern is more efficient than listing NXLog regular expression rules in Exec directives, because it is designed so that patterns do not need to be matched linearly. Regular expression capture groups can be used to set additional fields in the event record, and arbitrary fields can be added when a pattern matches, which is useful for message classification. In addition, the module automatically reorders patterns on the fly for further speed improvements.

To check which platforms are supported, see the list of installer packages in the Available Modules chapter.

Other techniques, such as the radix tree, also solve the linearity problem; the drawback is that they usually require the user to learn a special syntax for specifying patterns. If the log message is already parsed into fields rather than treated as a single string, it is possible to process only a subset of the patterns, which partially solves the linearity problem. Together with the other performance improvements employed within the xm_pattern module, this makes its speed comparable to the other techniques. At the same time, xm_pattern uses regular expressions, which are familiar to users and can easily be migrated from other tools.

Traditionally, pattern matching on log messages has employed a technique where the log message was one string and the pattern (regular expression or radix tree based pattern) was executed against it. To match patterns against logs which contain structured data (such as the Windows EventLog), this structured data (the fields of the log) must be converted to a single string. This is a simple but inefficient method used by many tools.

The NXLog patterns defined in the XML pattern database file can contain more than one field. This allows multi-dimensional pattern matching. Thus with NXLog’s xm_pattern module there is no need to convert all fields into a single string as it can work with multiple fields.

Patterns can be grouped together under pattern groups, which serve an optimization purpose. A group can have an optional matchfield block that checks a condition. If the condition (such as $SourceName matches sshd) is satisfied, the xm_pattern module descends into the group and checks each pattern against the log. If the pattern group’s condition does not match ($SourceName is not sshd), the module skips all patterns in the group without having to check each pattern individually.

When the xm_pattern module finds a matching pattern, the $PatternID and $PatternName fields are set on the log message. These can be used later in conditional processing and correlation rules of the pm_evcorr module, for example.
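
As a minimal sketch of such conditional processing, the Exec block below logs the name of the matched pattern and drops events that no pattern matched. It assumes the syslog and pattern extension instances defined in the Examples section, and it assumes that $PatternID is left undefined when nothing matches; the log message text is purely illustrative.

```conf
<Input in>
    Module          im_file
    File            'test2.log'
    <Exec>
        parse_syslog();
        match_pattern();
        # Assumption: $PatternID is only set when a pattern matched
        if defined $PatternID
        {
            log_info("matched pattern: " + $PatternName);
        }
        else
        {
            drop();
        }
    </Exec>
</Input>
```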

The xm_pattern module does not process all patterns. It exits after the first matching pattern is found, which means that at most one pattern can match a log message. Multiple patterns that can match the same subset of logs should be avoided. For example, with the two regular expression patterns ^\d+ and ^\d\d, only one will be matched, and which one is not consistent, because xm_pattern dynamically changes the internal order of patterns and pattern groups (patterns with the highest match count are placed and tried first). For a strictly linearly executing pattern matcher, see the Exec directive.

The XML Schema Definition (XSD) for the pattern database file is available in the nxlog-public/contrib repository.
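
For comparison, a strictly linear matcher built with the Exec directive might look like the sketch below. It is not part of xm_pattern; it only illustrates rules that are always tried top to bottom, so the first rule listed wins whenever two rules overlap. The file name and field names are borrowed from the Examples section, and the captured address is stored as a plain string here.

```conf
<Extension syslog>
    Module          xm_syslog
</Extension>

<Input in>
    Module          im_file
    File            'test2.log'
    <Exec>
        parse_syslog();
        # Rules are evaluated in the order written; no reordering takes place
        if $SourceName == 'sshd'
        {
            if $Message =~ /^Accepted (\S+) for (\S+) from (\S+) port \d+ ssh2/
            {
                $AuthMethod = $1;
                $AccountName = $2;
                $SourceIP4Address = $3;
                $TaxonomyStatus = 'success';
            }
            else if $Message =~ /^Failed (\S+) for invalid user (\S+) from (\S+) port \d+ ssh2/
            {
                $AuthMethod = $1;
                $AccountName = $2;
                $SourceIP4Address = $3;
                $TaxonomyStatus = 'failure';
            }
        }
    </Exec>
</Input>
```

Every event is tested against each regular expression in turn until one matches, which is the linear cost that pattern groups and reordering in xm_pattern are meant to avoid.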

Configuration

The xm_pattern module accepts the following directives in addition to the common module directives.

PatternFile

This mandatory directive specifies the name of the pattern database file.

Functions

The following functions are exported by xm_pattern.

boolean match_pattern()

Execute the match_pattern() procedure. Return TRUE if the event is successfully matched, otherwise FALSE.

Procedures

The following procedures are exported by xm_pattern.

match_pattern();

Attempt to match the current event according to the PatternFile. Execute statements and add fields as specified.

Fields

The following fields are used by xm_pattern.

$PatternID (type: integer)
The ID of the pattern that matched the event.
$PatternName (type: string)
The name of the pattern that matched the event.

Examples

Example 1. Using the match_pattern() procedure

This configuration reads syslog messages from a file and parses them with parse_syslog(). The events are then processed with a pattern file and the corresponding match_pattern() procedure, which adds fields to SSH authentication success and failure events. The matching is done against the $SourceName and $Message fields, so the syslog parsing must be performed before the pattern matching.

nxlog.conf
<Extension syslog>
    Module          xm_syslog
</Extension>

<Extension pattern>
    Module          xm_pattern
    PatternFile     'modules/extension/pattern/patterndb2-3.xml'
</Extension>

<Input in>
    Module          im_file
    File            'test2.log'
    <Exec>
        parse_syslog();
        match_pattern();
    </Exec>
</Input>

The following pattern database contains two patterns to match SSH authentication messages. The patterns are under a group named ssh which checks whether the $SourceName field is sshd and only tries to match the patterns if the logs are indeed from sshd. The patterns both extract $AuthMethod, $AccountName, and $SourceIP4Address fields from the log message when the pattern matches the log. Additionally $TaxonomyStatus and $TaxonomyAction are set. The second pattern shows an Exec block example, which is evaluated when the pattern matches.

For the full syntax and semantics of the regular expressions supported by PCRE2, please see the pcre2pattern documentation.

The number of captured fields should be exactly equal to the number of capturedfield records, otherwise the parsing will terminate.

patterndb2-3.xml
<?xml version='1.0' encoding='UTF-8'?>
<patterndb>
  <created>2018-01-01 01:02:03</created>
  <version>4</version>

  <group>
    <id>1</id>
    <name>ssh</name>
    <matchfield>
      <name>SourceName</name>
      <type>exact</type>
      <value>sshd</value>
    </matchfield>

    <pattern>
      <id>1</id>
      <name>ssh auth success</name>

      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Accepted (\S+) for (\S+) from (\S+) port \d+ ssh2</value>
        <capturedfield>
          <name>AuthMethod</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>IP4ADDR</type>
        </capturedfield>
      </matchfield>

      <set>
        <field>
          <name>TaxonomyStatus</name>
          <type>STRING</type>
          <value>success</value>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <type>STRING</type>
          <value>authenticate</value>
        </field>
      </set>
    </pattern>

    <pattern>
      <id>2</id>
      <name>ssh auth failure</name>

      <matchfield>
        <name>Message</name>
        <type>regexp</type>
        <value>^Failed (\S+) for invalid user (\S+) from (\S+) port \d+ ssh2</value>

        <capturedfield>
          <name>AuthMethod</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>AccountName</name>
          <type>STRING</type>
        </capturedfield>
        <capturedfield>
          <name>SourceIP4Address</name>
          <type>IP4ADDR</type>
        </capturedfield>
      </matchfield>

      <set>
        <field>
          <name>TaxonomyStatus</name>
          <type>STRING</type>
          <value>failure</value>
        </field>
        <field>
          <name>TaxonomyAction</name>
          <type>STRING</type>
          <value>authenticate</value>
        </field>
      </set>

      <exec>
        $TestField = 'test';
        $TestField = $TestField + 'value';
      </exec>
    </pattern>

  </group>

</patterndb>

Example 2. Using the match_pattern() function

This example uses the same pattern file as the previous one, but calls the match_pattern() function instead of the procedure in order to discard any event that is not matched by the pattern file.

nxlog.conf
<Extension syslog>
    Module          xm_syslog
</Extension>

<Extension pattern>
    Module          xm_pattern
    PatternFile     'modules/extension/pattern/patterndb2-3.xml'
</Extension>

<Input in>
    Module          im_file
    File            'test2.log'
    <Exec>
        parse_syslog();
        if not match_pattern() drop();
    </Exec>
</Input>