Patterns
Patterns provide a way to extract important information (e.g. user names, IP addresses, URLs, etc.) from free-form log messages. Many sources, syslog for example, generate free-form log messages in which this information is embedded in a human-readable sentence.
Consider the following example generated by the SSH server when an authentication failure occurs:
Failed password for john from 127.0.0.1 port 1542 ssh2
To be able to create a report about authentication failures, the username (john in the above example) needs to be extracted. Regular expressions are commonly used for this purpose.
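As a minimal sketch of that idea, the snippet below pulls the user name out of the sample line with a regular expression. It uses Python's re module purely for illustration; the group names user, ip and port are assumptions made for this example and are not NXLog field names.

```python
import re

# Sample line from the text above.
line = "Failed password for john from 127.0.0.1 port 1542 ssh2"

# \S+ matches a run of non-whitespace characters, \d+ a run of digits.
# The group names are illustrative only.
FAILED_PASSWORD = re.compile(
    r"^Failed password for (?P<user>\S+) from (?P<ip>\S+) port (?P<port>\d+) ssh2$"
)

match = FAILED_PASSWORD.match(line)
fields = match.groupdict() if match else {}
print(fields.get("user"))
```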
The patterns used by NXLog Manager and NXLog are more than single regular expressions:
- Patterns contain match criteria to be executed against one or more fields. A pattern matches only if all fields with match criteria match. This technique allows patterns to be used with structured logs as well.
- The matching executed against the field(s) can be an exact match or a regular expression.
- Patterns can extract data from strings using captured substrings and store these in separate fields.
- Patterns can modify the log by setting additional fields. This is useful for message classification.
- Patterns can contain test cases for validation.
- Patterns are grouped under Pattern Groups.
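The first three properties in the list above can be modelled roughly as follows. This is a sketch only: the dictionary layout and the apply_pattern helper are hypothetical and do not reflect how NXLog stores or evaluates patterns internally.

```python
import re

# Hypothetical in-memory description of a pattern (illustration only).
pattern = {
    "match": [
        {"field": "SourceName", "type": "EXACT", "value": "sshd"},
        {"field": "Message", "type": "REGEXP",
         "value": r"^Failed password for (\S+) from ",
         "captures": ["AccountName"]},
    ],
}

def apply_pattern(pattern, event):
    """Return the event plus captured fields, or None if any criterion fails."""
    extracted = {}
    for crit in pattern["match"]:
        value = event.get(crit["field"])
        if value is None:
            return None                      # field missing: no match
        if crit["type"] == "EXACT":
            if value != crit["value"]:
                return None
        else:                                # REGEXP
            m = re.search(crit["value"], value)
            if m is None:
                return None
            # Store each captured substring in its own field.
            extracted.update(zip(crit.get("captures", []), m.groups()))
    return {**event, **extracted}

event = {"SourceName": "sshd",
         "Message": "Failed password for john from 127.0.0.1 port 1542 ssh2"}
result = apply_pattern(pattern, event)
```

Because every criterion must match, the same message arriving with a different SourceName would be rejected before any field is extracted.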
Patterns are used by the NXLog agent. This makes it possible to distribute this task to the agents and receive preprocessed and ready-to-store logs instead of parsing all logs at the central server. This approach can yield a significant reduction in CPU load on the central log server.
For more information about the patterns used by the NXLog agent, please refer to the Pattern Matcher (pm_pattern) module documentation in the NXLog Reference Manual.
Pattern Groups
Pattern groups collect patterns that match log messages generated by the same application or log source. Not every pattern group applies to every log source. Pattern groups make it easy to exclude patterns that can never match because the source would never generate such log messages. For example, if a system runs no SSH service, there is no point in matching the patterns in the SSHd group against the logs coming from that system.
Pattern groups also serve an optimization purpose. They can have optional match criteria: one or more fields can be specified using either an EXACT or a REGEXP match. The log message is first checked against these match criteria. Only if it matches are the patterns belonging to the group matched against the log message.
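The gating behaviour described above might be sketched like this. The group definitions and helper names are hypothetical; the point is only that a cheap group-level check decides whether a group's patterns are evaluated at all.

```python
import re

def criteria_match(criteria, event):
    """True if every (field, type, value) criterion matches the event."""
    for field, kind, value in criteria:
        actual = event.get(field)
        if actual is None:
            return False
        if kind == "EXACT" and actual != value:
            return False
        if kind == "REGEXP" and re.search(value, actual) is None:
            return False
    return True

# Hypothetical groups: name -> group-level match criteria.
GROUPS = {
    "ssh": [("SourceName", "EXACT", "sshd")],
    "postfix": [("SourceName", "REGEXP", r"^postfix/")],
}

def groups_to_evaluate(event):
    """Names of the groups whose patterns would be tried for this event."""
    return [name for name, criteria in GROUPS.items()
            if criteria_match(criteria, event)]

print(groups_to_evaluate({"SourceName": "sshd", "Message": "..."}))
```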
To create a pattern group, the following form needs to be filled out.
After form submission, the pattern group can be viewed:
In the above example, the ssh patterns will only be checked against the log if the field SourceName matches the string sshd. The SourceName field must be extracted from the syslog message with a syslog parser before running the logs through the pattern matcher.
Creating a pattern
Patterns can be created directly by clicking on the Create pattern menu item. In this case, an empty form must be filled out.
Here you should enter the basic pattern information. Make sure the Pattern Group is set.
Next, the Match block must be populated with one or more fields and the values to match. For example, a message field:
The match can be made more generic so that the pattern extracts the user name and the destination IP address from the message:
We replace the parts of the message that are not static with regular expression constructs (\S+ in the above example). Captured substrings (the (\S+)) are stored in the fields we select. In the above example, AccountName and DesinationIPVAddress are used to store the extracted values.
If necessary, add more than one field to match against. The match type can be either EXACT or REGEXP. If it is set to REGEXP, NXLog Manager will offer to escape special characters:
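To illustrate why escaping matters when a literal message is turned into a regular expression, the following sketch uses Python's re.escape as a stand-in for what the interface offers. The sample message is made up for this example, and the exact escaping rules applied by NXLog Manager may differ.

```python
import re

# Illustrative message containing the regex metacharacters "(" and ")".
sample = "pam_unix(sshd:auth): authentication failure; user=john"

# Used unescaped, the parentheses form a capture group instead of matching
# a literal "(" and ")", so the text does not even match itself.
unescaped_hit = re.search("^" + sample, sample)

# Escape the static text first, then replace the variable part with a capture.
escaped = re.escape(sample)
regex = "^" + escaped.replace("john", r"(\S+)") + "$"
escaped_hit = re.match(regex, sample)

print(unescaped_hit is None, escaped_hit.group(1))
```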
If the regular expression does not start with a caret (^), the regular expression engine tries to find an occurrence anywhere in the subject string, which is a costly operation. Usually the regular expression is meant to match from the start of the string, and it is easy to forget the leading caret. For this reason, the interface will show a hint:
The regular expressions are compiled and executed by the NXLog engine using the PCRE library. The regular expression must be PCRE-compatible to work.
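The effect of the leading caret can be seen in a small experiment. Python's re module is used here as an approximation; it is not PCRE, although the simple constructs below behave the same way in both. The second input line is a hypothetical message that merely quotes the text of interest.

```python
import re

unanchored = re.compile(r"Failed password for (\S+) from (\S+)")
anchored = re.compile(r"^Failed password for (\S+) from (\S+)")

expected = "Failed password for john from 127.0.0.1 port 1542 ssh2"
# Hypothetical line that only quotes the text somewhere in the middle.
quoted = "audit: saw 'Failed password for john from 127.0.0.1' in report"

# For each line: (unanchored matched?, anchored matched?)
results = [
    (bool(unanchored.search(line)), bool(anchored.search(line)))
    for line in (expected, quoted)
]
print(results)
```

The unanchored expression has to be tried at every offset of the subject string and also accepts the quoted line, while the anchored one can be rejected after the first position.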
The last block is for optional test cases:
This built-in testing interface is extremely useful for verifying that our patterns work. Without it, the only way to find out that a pattern does not work would be to load it into the agent and run it against a set of logs.
The field value is already filled in with the log message that the pattern was created from. After clicking the Calculate Fields button, the captured field values should appear correctly. If they do not, the pattern contains a mistake that needs to be corrected.
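Conceptually, a test case pairs an input message with the field values the pattern is expected to produce. The sketch below mimics that check outside of NXLog Manager; the calculate_fields helper, the SourceAddress field name and the test-case layout are assumptions made for this illustration.

```python
import re

REGEX = r"^Failed password for (\S+) from (\S+) port \d+ ssh2$"
# Fields receiving the captured substrings; SourceAddress is a made-up name.
CAPTURE_FIELDS = ["AccountName", "SourceAddress"]

def calculate_fields(message):
    """Return the captured fields for a message, or None if it does not match."""
    m = re.search(REGEX, message)
    if m is None:
        return None
    return dict(zip(CAPTURE_FIELDS, m.groups()))

# Each test case: (input message, expected captured fields).
TEST_CASES = [
    ("Failed password for john from 127.0.0.1 port 1542 ssh2",
     {"AccountName": "john", "SourceAddress": "127.0.0.1"}),
]

def failing_cases():
    """List the test cases whose calculated fields differ from the expectation."""
    return [(msg, expected, calculate_fields(msg))
            for msg, expected in TEST_CASES
            if calculate_fields(msg) != expected]

print(failing_cases())
```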
Message classification with patterns
Besides loading captured substrings into fields, patterns can set additional fields to fixed values. This feature can be used for message classification and for tagging log messages with special values that can be used later in the processing chain.
There are five special fields starting with Taxonomy. NXLog Manager comes with a dictionary for these taxonomy values and only a value from this dictionary list can be set. Event taxonomy is an important concept that allows the processing of events generically regardless of their source.
If you do not want to classify the event with the Taxonomy fields, or if one or more of them are not applicable, click Delete to remove them. The Taxonomy fields are optional, but setting them is recommended. It is also possible to add other fields and specify their values.
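The benefit of classification is that later processing can test the tags instead of the message text. A rough sketch follows, in which the field names TaxonomyAction and TaxonomyStatus and their values are assumptions made for illustration and are not taken from the NXLog Manager dictionary.

```python
import re

# Hypothetical pattern: one capture plus fixed fields to set on a match.
pattern = {
    "regex": r"^Failed password for (\S+) from ",
    "captures": ["AccountName"],
    "set": {"TaxonomyAction": "authenticate", "TaxonomyStatus": "failure"},
}

def classify(event, pattern):
    """Return a copy of the event with captured and fixed fields added on a match."""
    m = re.search(pattern["regex"], event.get("Message", ""))
    if m is None:
        return event
    tagged = dict(event)
    tagged.update(zip(pattern["captures"], m.groups()))
    tagged.update(pattern["set"])
    return tagged

def is_failed_login(event):
    """Source-independent check that relies on the tags only."""
    return (event.get("TaxonomyAction") == "authenticate"
            and event.get("TaxonomyStatus") == "failure")

tagged = classify(
    {"Message": "Failed password for john from 127.0.0.1 port 1542 ssh2"},
    pattern,
)
print(is_failed_login(tagged))
```

A report on failed logins could then filter on the tags alone, regardless of which application produced the original message.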
Searching patterns
The pattern list has a simple search input box in the upper right corner. This can search for entries in the list and will show rows that contain the specified keyword.
There is a more powerful search interface that allows searching in any of the patterns' properties (fields, test cases, etc). Click on the Search Pattern menu item under the PATTERN menu.
Exporting and importing patterns
NXLog Manager can export and import patterns in an XML format. This is the same format used by the NXLog agent. To export a pattern or a pattern group, check its checkbox in the list and click Export. You can import a pattern database file by clicking on the Import Pattern menu item or the Import button under the pattern list.
Using patterns
Patterns are used and executed by the NXLog engine. Unlike other log analysis solutions that utilize a single pattern matcher in the central engine, the architecture of NXLog Manager allows patterns to be used on the agents as well.
To use the patterns in an NXLog agent, add a pm_pattern processor module and select the appropriate pattern groups:
The patterns will be pushed to the NXLog agent after clicking Update config and they will take effect after a restart. See the Agents chapter for more information about agent configuration details.
Some patterns work with a set of fields, which in some cases requires preprocessing (e.g. syslog parsing). Instead of writing a regular expression to match a full syslog line including the header (priority, timestamp, hostname, etc.), it is a lot more efficient to write the regular expression to match the Message field (instead of the raw_event field) and have a syslog parser store the header information in separate fields before the pattern matching. Such patterns also remain usable when the same message is collected over a different protocol.
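The division of labour between the syslog parser and the pattern can be illustrated as follows. This is a simplified sketch: the header expression only handles a basic BSD-style syslog line, the sample line and the ProcessID field name are assumptions, and in NXLog the header parsing would be done by a syslog parser rather than by hand-written code.

```python
import re

# Simplified BSD-syslog header: <PRI>Mmm dd hh:mm:ss host tag[pid]: message
SYSLOG_HEADER = re.compile(
    r"^<(?P<Priority>\d{1,3})>"
    r"(?P<EventTime>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<Hostname>\S+) "
    r"(?P<SourceName>[^\s\[:]+)(?:\[(?P<ProcessID>\d+)\])?: "
    r"(?P<Message>.*)$"
)

# The pattern itself only needs to know about the Message field.
MESSAGE_PATTERN = re.compile(r"^Failed password for (\S+) from (\S+)")

def parse_syslog(raw_event):
    """Split the header into fields; fall back to the raw text as Message."""
    m = SYSLOG_HEADER.match(raw_event)
    return m.groupdict() if m else {"Message": raw_event}

def account_name(event):
    m = MESSAGE_PATTERN.match(event["Message"])
    return m.group(1) if m else None

# Made-up sample line for this illustration.
from_syslog = parse_syslog(
    "<38>Nov 22 10:30:12 myhost sshd[8459]: "
    "Failed password for john from 127.0.0.1 port 1542 ssh2"
)
# The same message collected by another transport that already provides Message.
from_other = {"Message": "Failed password for john from 127.0.0.1 port 1542 ssh2"}

print(account_name(from_syslog), account_name(from_other))
```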