Parsing multi-line logs

Multi-line log messages, such as exception logs and stack traces, are quite common. Unfortunately, these messages are often stored in files or forwarded over the network without any encapsulation. In this case, simple line-based parsers, which treat every line as a separate event, cannot parse the newline characters in the messages correctly and split each message into several events.

Multi-line events may have one or more of the following:

a header in the first line (with timestamp and severity field, for example),
a closing character sequence marking the end, and
a fixed line count.

Based on this information, NXLog can be configured to reconstruct the original messages, creating a single event for each multi-line log message.

xm_multiline

NXLog provides xm_multiline for multi-line log parsing; this dedicated extension module is the recommended way to parse multi-line log messages. It supports header lines, footer lines, and fixed line counts. Once configured, the xm_multiline module instance can be used as a parser via the input module’s InputType directive.

Example 1. Using the xm_multiline module

This configuration creates a single event record with the matching HeaderLine and all successive lines until an EndLine is received.

nxlog.conf

<Extension multiline_parser>
    Module      xm_multiline
    HeaderLine  "---------------"
    EndLine     "END------------"
</Extension>

<Input in>
    Module      im_file
    File        "/var/log/app-multiline.log"
    InputType   multiline_parser
</Input>
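
When every message spans the same number of lines, the fixed line count mentioned above can be used instead of header and end markers. The following is a minimal sketch, not a tested configuration: it assumes that each message in the file is exactly four lines long, and the instance names and file path are hypothetical.

nxlog.conf

```
<Extension fixed_parser>
    Module          xm_multiline
    # Assumption: every message in the file spans exactly four lines
    FixedLineCount  4
</Extension>

<Input fixed_in>
    Module      im_file
    # Hypothetical path, used for illustration only
    File        "/var/log/app-fixed.log"
    InputType   fixed_parser
</Input>
```

Because the parser knows how many lines belong to a message, it should not need to wait for the next message before completing the current one. The trade-off is that a single message with a different line count would misalign every event that follows it.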

It is also possible to use regular expressions with the HeaderLine and EndLine directives.

Example 2. Using regular expressions with xm_multiline

Here, a new event record is created beginning with each line that matches the regular expression.

nxlog.conf

<Extension tomcat_parser>
    Module      xm_multiline
    HeaderLine  /^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3} \S+ \[\S+\] \- .*/
</Extension>

<Input log4j>
    Module      im_file
    File        "/var/log/tomcat6/catalina.out"
    InputType   tomcat_parser
</Input>

Because the EndLine directive is not specified in this configuration, the xm_multiline parser cannot know that a log message is finished until it receives the HeaderLine of the next message. The log message is kept in the buffers, waiting to be forwarded, until either a new log message is read or the im_file module instance’s PollInterval has expired. See the xm_multiline AutoFlush directive.
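
The flush behavior described above can be made explicit in the configuration. The sketch below reuses the configuration from Example 2 and adds two settings: AutoFlush on the parser and PollInterval on the input. The 5-second interval is an assumed value chosen for illustration, and stating AutoFlush explicitly assumes it accepts a boolean; check both against the module reference for the NXLog version in use.

nxlog.conf

```
<Extension tomcat_parser>
    Module      xm_multiline
    HeaderLine  /^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3} \S+ \[\S+\] \- .*/
    # Assumption: flush the buffered message when the im_file PollInterval expires
    AutoFlush   TRUE
</Extension>

<Input log4j>
    Module        im_file
    File          "/var/log/tomcat6/catalina.out"
    InputType     tomcat_parser
    # Assumed value: check the file for new data every 5 seconds
    PollInterval  5
</Input>
```

With settings like these, the last message in the file should be forwarded within roughly one poll interval even if no further header line arrives. A longer interval delays that final message, while a very short one may increase the chance of flushing a message that the application is still writing.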

Module variables

It is also possible to parse multi-line messages by using module variables, as shown below. However, it is generally recommended to use the xm_multiline module instead, because it offers some significant advantages:

more efficient message processing,
a more readable configuration,
correctly incremented module event counters (one increment per multi-line message versus one per line), and
operation on the message source level rather than the module instance level (each file for a wildcarded im_file module instance or each TCP connection for an im_tcp/im_ssl instance).

Example 3. Parsing multi-line messages with module variables

This example saves the matching line and successive lines in the saved variable. When another matching line is read, an internal log message is generated with the contents of the saved variable.

nxlog.conf

<Input log4j>
    Module  im_file
    File    "/var/log/tomcat6/catalina.out"
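    <Exec>
        # A line matching the header pattern starts a new message
        if $raw_event =~ /(?x)^\d{4}\-\d{2}\-\d{2}\ \d{2}\:\d{2}\:\d{2},\d{3}\ \S+
                          \ \[\S+\]\ \-\ .*/
        {
            if defined(get_var('saved'))
            {
                # Swap: emit the buffered message and buffer the new header line
                $tmp = $raw_event;
                $raw_event = get_var('saved');
                set_var('saved', $tmp);
                delete($tmp);
                log_info($raw_event);
            }
            else
            {
                # First header line seen: buffer it, there is nothing to emit yet
                set_var('saved', $raw_event);
                drop();
            }
        }
        else
        {
            # Continuation line: append it to the buffered message
            set_var('saved', get_var('saved') + "\n" + $raw_event);
            drop();
        }
    </Exec>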
</Input>

As with the previous example, a log message is kept in the saved variable, and not forwarded, until a new log message is read.