NXLog Legacy Documentation

Patterns

Patterns provide a way to extract important information (e.g. user names, IP addresses, URLs) from free-form log messages. Many log sources, syslog for example, embed this information in a human-readable sentence or message.

Consider the following example generated by the SSH server when an authentication failure occurs:

Failed password for john from 127.0.0.1 port 1542 ssh2

To be able to create a report about authentication failures, the username (john in the above example) needs to be extracted. Regular expressions are commonly used for this purpose.
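As a rough illustration of the idea, the following Python sketch extracts the user name from the message above with a single regular expression. The function name and the group name are illustrative choices. Python's re module is used here only for demonstration; NXLog itself evaluates patterns with PCRE.

```python
import re

# Regular expression for the SSH authentication failure message shown above.
# The group name "user" is an illustrative choice.
FAILED_PASSWORD = re.compile(
    r"^Failed password for (?P<user>\S+) from \S+ port \d+ ssh2$"
)

def extract_user(message):
    """Return the user name from a failed-password message, or None."""
    match = FAILED_PASSWORD.match(message)
    return match.group("user") if match else None

print(extract_user("Failed password for john from 127.0.0.1 port 1542 ssh2"))  # john
```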

The patterns used by NXLog Manager and NXLog are more than single regular expressions:

  • Patterns contain match criteria to be executed against one or more fields. A pattern matches only if all fields with match criteria match. This technique allows patterns to be used with structured logs as well.

  • The matching executed against the field(s) can be an exact match or a regular expression.

  • Patterns can extract data from strings using captured substrings and store these in separate fields.

  • Patterns can modify the log by setting additional fields. This is useful for message classification.

  • Patterns can contain test cases for validation.

  • Patterns are grouped under Pattern Groups.
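The properties listed above can be modeled in a few lines of Python. This is only a conceptual sketch, not the way NXLog implements patterns: the class names, the SourceAddress capture field, and the EventCategory field are hypothetical and chosen for illustration.

```python
import re
from dataclasses import dataclass, field

@dataclass
class MatchField:
    name: str    # event field to test, e.g. "Message"
    kind: str    # "EXACT" or "REGEXP"
    value: str   # literal string or regular expression
    captures: list = field(default_factory=list)  # target fields for capture groups

@dataclass
class Pattern:
    match_fields: list
    set_fields: dict = field(default_factory=dict)  # extra fields set on a match

    def apply(self, event):
        """Return an enriched copy of event if every match field matches, else None."""
        extracted = {}
        for mf in self.match_fields:
            subject = event.get(mf.name)
            if subject is None:
                return None
            if mf.kind == "EXACT":
                if subject != mf.value:
                    return None
            else:
                m = re.search(mf.value, subject)
                if m is None:
                    return None
                # Store each captured substring in its own field.
                extracted.update(zip(mf.captures, m.groups()))
        return {**event, **extracted, **self.set_fields}

ssh_failed = Pattern(
    match_fields=[
        MatchField("SourceName", "EXACT", "sshd"),
        MatchField("Message", "REGEXP",
                   r"^Failed password for (\S+) from (\S+) port \d+ ssh2",
                   captures=["AccountName", "SourceAddress"]),
    ],
    set_fields={"EventCategory": "authentication-failure"},  # hypothetical field
)

event = {"SourceName": "sshd",
         "Message": "Failed password for john from 127.0.0.1 port 1542 ssh2"}
result = ssh_failed.apply(event)
print(result["AccountName"])  # john
```

Because the pattern tests fields of an already structured event, the same approach works whether the fields come from a syslog parser or from a natively structured log source.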

Patterns are executed by the NXLog agent. This makes it possible to distribute pattern matching to the agents, so that the central server receives preprocessed, ready-to-store logs instead of having to parse all logs itself. This approach can yield a significant reduction in CPU load on the central log server.

For more information about the patterns used by the NXLog agent, please refer to the Pattern Matcher (pm_pattern) module documentation in the NXLog Reference Manual.

Pattern Groups

Pattern groups collect patterns that match log messages generated by the same application or log source. Because not every pattern group applies to every log source, pattern groups make it easy to exclude patterns that could never match because the source does not generate such log messages. For example, if there is no SSH service on a system, there is no point in matching the patterns in the SSHd group against the logs coming from this system.

Pattern groups also serve an optimization purpose. They can have optional match criteria: one or more fields can be specified using either an EXACT or a REGEXP match. The log message is first checked against these match criteria. Only if it matches are the patterns belonging to the group matched against the log message.
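The effect of group-level match criteria can be sketched as follows. This is a simplified model under assumed data structures (plain dictionaries and tuples, with hypothetical group and pattern names), not the actual NXLog implementation.

```python
import re

def group_applies(criteria, event):
    """criteria is a list of (field, kind, value) tuples; all of them must match."""
    for name, kind, value in criteria:
        subject = event.get(name)
        if subject is None:
            return False
        if kind == "EXACT":
            if subject != value:
                return False
        elif re.search(value, subject) is None:
            return False
    return True

def match_event(groups, event):
    """Return the name of the first matching pattern, or None."""
    for group in groups:
        if not group_applies(group["criteria"], event):
            continue  # one cheap check skips every pattern in the group
        for name, regex in group["patterns"]:
            if re.search(regex, event.get("Message", "")):
                return name
    return None

groups = [{
    "name": "ssh",
    "criteria": [("SourceName", "EXACT", "sshd")],
    "patterns": [
        ("ssh-failed-password", r"^Failed password for \S+ from \S+ port \d+ ssh2"),
    ],
}]

message = "Failed password for john from 127.0.0.1 port 1542 ssh2"
print(match_event(groups, {"SourceName": "sshd", "Message": message}))  # ssh-failed-password
print(match_event(groups, {"SourceName": "cron", "Message": message}))  # None
```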

To create a pattern group, the following form needs to be filled out.

Creating a pattern group

After form submission, the pattern group can be viewed:

Viewing a pattern group

In the above example, the ssh patterns will only be checked against the log if the SourceName field matches the string sshd. The SourceName field must be extracted from the syslog message with a syslog parser before running the logs through the pattern matcher.

Creating a pattern

Patterns can be created directly by clicking on the Create pattern menu item. In this case, an empty form must be filled out.

Pattern information block

Here you should enter the basic pattern information. Make sure the Pattern Group is set.

Next, the Match block must be populated with one or more fields and their values. For example, a message field:

Pattern’s Match block pre-populated with values

This can be made more generic so that the pattern extracts the user name and the destination IP address from the message:

Pattern’s Match block

The parts of the message that are not static are replaced with regular expression constructs (\S+ in the above example). Captured substrings (the (\S+) groups) are stored in the selected fields. In the above example, AccountName and DestinationIPv4Address are used to store the extracted values.

If necessary, add more than one field to match against. The match type can be either EXACT or REGEXP. If it is set to REGEXP, NXLog Manager will offer to escape special characters:

Escaping special characters in a pattern
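The reason escaping matters can be shown with a short Python sketch. The sample message below is illustrative only, and Python's re.escape stands in for the escaping that NXLog Manager offers; the exact set of characters escaped may differ between the two.

```python
import re

# Illustrative sample message containing regular expression metacharacters.
sample = "pam_unix(sshd:auth): authentication failure; user=john"

# Used unescaped, the parentheses form a capture group instead of
# matching literal "(" and ")" characters, so the pattern fails.
naive = "^" + sample + "$"
print(re.search(naive, sample))  # None

# Escape the static part, then append the variable part as a capture group.
static_prefix = "pam_unix(sshd:auth): authentication failure; user="
pattern = "^" + re.escape(static_prefix) + r"(\S+)$"
print(re.search(pattern, sample).group(1))  # john
```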

If the regular expression does not start with a caret (^), the regular expression engine will try to find an occurrence anywhere in the subject string, which is a costly operation. Usually the regular expression is intended to match from the start of the string, and it is easy to forget the leading caret. For this reason, the interface will show a hint:

Consider using ^
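The difference between an anchored and an unanchored expression can be demonstrated with Python's re module, whose caret behaves like the PCRE caret in the default (non-multiline) mode. The second input string is a made-up example.

```python
import re

unanchored = re.compile(r"Failed password for (\S+)")
anchored = re.compile(r"^Failed password for (\S+)")

line = "Failed password for john from 127.0.0.1 port 1542 ssh2"
quoted = "user reported: Failed password for john"  # made-up input

# Both expressions match a line that starts with the expected text.
print(unanchored.search(line) is not None)   # True
print(anchored.search(line) is not None)     # True

# Without the caret, the engine scans forward and finds the text mid-string.
print(unanchored.search(quoted) is not None)  # True

# With the caret, the attempt is made at position 0 only and fails.
print(anchored.search(quoted) is None)        # True
```

Besides saving the scan over every start position, the anchored form also avoids accidental matches on messages that merely quote the text.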

The regular expressions are compiled and executed by the NXLog engine using the PCRE library. The regular expression must be PCRE-compatible to work.

The last block is for optional test cases:

Test case for a pattern

This built-in testing interface is extremely useful for verifying that patterns work as intended. Without it, the only way to find out that a pattern does not work would be to load it into the agent and run it against a set of logs.

The field value is already filled in with the log message that the pattern was created from. After clicking the Calculate Fields button, the captured field values should appear correctly. If they do not, the match criteria or the captured field assignments contain an error that needs to be corrected.
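Conceptually, a test case pairs a sample input with the field values the pattern is expected to produce. The following sketch shows one way such a check could work; the function and its return format are hypothetical and do not describe how NXLog Manager implements the feature.

```python
import re

def run_test_case(regex, capture_fields, sample, expected):
    """Return {field: (expected, actual)} for every mismatch; empty means pass."""
    match = re.search(regex, sample)
    actual = dict(zip(capture_fields, match.groups())) if match else {}
    return {name: (want, actual.get(name))
            for name, want in expected.items()
            if actual.get(name) != want}

sample = "Failed password for john from 127.0.0.1 port 1542 ssh2"
expected = {"AccountName": "john"}

good = r"^Failed password for (\S+) from \S+ port \d+ ssh2"
bad = r"^Failed password for \S+ from (\S+) port \d+ ssh2"  # wrong group captured

print(run_test_case(good, ["AccountName"], sample, expected))  # {}
print(run_test_case(bad, ["AccountName"], sample, expected))   # {'AccountName': ('john', '127.0.0.1')}
```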

Message classification with patterns

Besides loading captured substrings into fields, patterns can set additional fields to specified values. This feature can be used for message classification and to tag log messages with special values which can be used later in the processing chain.

Set block of the Create Pattern form

There are five special fields starting with Taxonomy. NXLog Manager comes with a dictionary for these taxonomy values and only a value from this dictionary list can be set. Event taxonomy is an important concept that allows the processing of events generically regardless of their source.

If you do not want to classify the event with the Taxonomy fields, or one or more of them are not applicable, click Delete to remove them. The Taxonomy fields are optional, but setting them is recommended. It is also possible to add other fields and specify their values.

Searching patterns

The pattern list has a simple search input box in the upper right corner. This can search for entries in the list and will show rows that contain the specified keyword.

Pattern list

There is also a more powerful search interface that allows searching in any of the patterns' properties (fields, test cases, etc.). Click on the Search Pattern menu item under the PATTERN menu.

Searching for patterns

Exporting and importing patterns

NXLog Manager can export and import patterns in an XML format. This is the same format used by the NXLog agent. To export a pattern or a pattern group, check its checkbox in the list and click Export. You can import a pattern database file by clicking on the Import Pattern menu item or the Import button under the pattern list.

Using patterns

Patterns are used and executed by the NXLog engine. Unlike other log analysis solutions that utilize a single pattern matcher in the central engine, the architecture of NXLog Manager allows patterns to be used on the agents as well.

To use the patterns in an NXLog agent, add a pm_pattern processor module and select the appropriate pattern groups:

Configuring the pm_pattern module

The patterns will be pushed to the NXLog agent after clicking Update config and they will take effect after a restart. See the Agents chapter for more information about agent configuration details.

Some patterns work with a set of fields, which in some cases requires preprocessing (e.g. syslog parsing). Instead of writing a regular expression to match a full syslog line including the header (priority, timestamp, hostname, etc.), it is a lot more efficient to have a syslog parser store the header information in separate fields before the pattern matching, and to write the regular expression to match the Message field instead of the raw_event field. Such patterns also remain usable when the same message is collected over a different protocol.
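The two-stage approach can be sketched in Python as follows. The header expression is a simplified approximation of the BSD syslog format written for this example, the sample line is made up, and field names other than SourceName, Message, and AccountName (for instance SourceAddress) are illustrative. In an actual deployment the header would be parsed by a syslog parser in the agent, not by this code.

```python
import re

# Simplified BSD syslog header; a real parser handles many more variations.
SYSLOG_HEADER = re.compile(
    r"^<(?P<Priority>\d{1,3})>"
    r"(?P<EventTime>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<Hostname>\S+) "
    r"(?P<SourceName>[^\[:\s]+)(?:\[(?P<ProcessID>\d+)\])?: "
    r"(?P<Message>.*)$"
)

# The pattern only knows about the Message field, not the transport format.
MESSAGE_PATTERN = re.compile(r"^Failed password for (\S+) from (\S+) port \d+ ssh2")

def parse_syslog(raw_event):
    """Stage 1: split the header into fields, leaving the text in Message."""
    m = SYSLOG_HEADER.match(raw_event)
    return m.groupdict() if m else {"Message": raw_event}

def extract(event):
    """Stage 2: match the pattern against the Message field only."""
    m = MESSAGE_PATTERN.search(event["Message"])
    if m is None:
        return event
    return {**event, "AccountName": m.group(1), "SourceAddress": m.group(2)}

raw = ("<38>Oct 11 22:14:15 myhost sshd[1234]: "
       "Failed password for john from 127.0.0.1 port 1542 ssh2")
parsed = extract(parse_syslog(raw))
print(parsed["SourceName"], parsed["AccountName"])  # sshd john

# The same pattern works on an event that arrived already structured.
structured = extract({"Message": "Failed password for john from 127.0.0.1 port 1542 ssh2"})
print(structured["AccountName"])  # john
```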