Parsing multi-line logs
Multi-line log messages, such as exception logs and stack traces, are quite common. Unfortunately, these messages are often stored in files or forwarded over the network without any encapsulation. In this case, simple line-based parsers, which treat every line as a separate event, cannot correctly handle the newline characters inside the messages.
Multi-line events may have one or more of:
- a header in the first line (with timestamp and severity field, for example),
- a closing character sequence marking the end, and
- a fixed line count.
Based on this information, NXLog can be configured to reconstruct the original messages, creating a single event for each multi-line log message.
xm_multiline
NXLog provides the dedicated xm_multiline extension module, which is the recommended way to parse multi-line log messages. It supports header lines, footer lines, and fixed line counts. Once configured, the xm_multiline module instance can be used as a parser via the input module’s InputType directive.
This configuration creates a single event record with the matching HeaderLine and all successive lines until an EndLine is received.
<Extension multiline_parser>
    Module      xm_multiline
    HeaderLine  "---------------"
    EndLine     "END------------"
</Extension>

<Input in>
    Module     im_file
    File       "/var/log/app-multiline.log"
    InputType  multiline_parser
</Input>
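The example above covers the header and footer cases. For messages that always span the same number of lines, a minimal sketch using a fixed line count might look like the following. It assumes the xm_multiline FixedLineCount directive; the instance names, the count of 4, and the file path are hypothetical and chosen only for illustration.

```
<Extension fixed_parser>
    Module          xm_multiline
    # Assumption: every message spans exactly four lines
    FixedLineCount  4
</Extension>

<Input fixed_in>
    Module     im_file
    # Hypothetical path, for illustration only
    File       "/var/log/app-fixed.log"
    InputType  fixed_parser
</Input>
```

No header or footer pattern is needed with this approach, but a single message with a different line count would shift every following event, so it only suits strictly regular formats.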
It is also possible to use regular expressions with the HeaderLine and EndLine directives.
Here, a new event record is created beginning with each line that matches the regular expression.
<Extension tomcat_parser>
    Module      xm_multiline
    HeaderLine  /^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3} \S+ \[\S+\] \- .*/
</Extension>

<Input log4j>
    Module     im_file
    File       "/var/log/tomcat6/catalina.out"
    InputType  tomcat_parser
</Input>
Because the EndLine directive is not specified in this configuration, the xm_multiline parser cannot know that a log message is finished until it receives the HeaderLine of the next message. The log message is kept in the buffers, waiting to be forwarded, until either a new log message is read or the im_file module instance’s PollInterval has expired. See the xm_multiline AutoFlush directive.
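As a hedged sketch of how this flush behavior could be made explicit, the following variant sets AutoFlush on the parser and PollInterval on the input. It assumes that AutoFlush is a boolean xm_multiline directive and that PollInterval is given in seconds; the instance names and the values shown are illustrative and should be checked against the module reference for the NXLog version in use.

```
<Extension tomcat_parser_flush>
    Module      xm_multiline
    HeaderLine  /^\d{4}\-\d{2}\-\d{2} \d{2}\:\d{2}\:\d{2},\d{3} \S+ \[\S+\] \- .*/
    # Assumption: flush the buffered message once the input's
    # PollInterval expires without a new header line arriving
    AutoFlush   TRUE
</Extension>

<Input log4j_flush>
    Module        im_file
    File          "/var/log/tomcat6/catalina.out"
    InputType     tomcat_parser_flush
    # Assumption: interval in seconds; the value is illustrative
    PollInterval  1
</Input>
```

A shorter interval reduces how long the last message of a burst waits in the buffer, at the cost of checking the file more often.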
Module variables
It is also possible to parse multi-line messages by using module variables, as shown below. However, it is generally recommended to use the xm_multiline module instead, because it offers some significant advantages:
- more efficient message processing,
- a more readable configuration,
- correctly incremented module event counters (one increment per multi-line message versus one per line), and
- operation on the message source level rather than the module instance level (each file for a wildcarded im_file module instance or each TCP connection for an im_tcp/im_ssl instance).
This example saves the matching line and successive lines in the saved variable. When another matching line is read, an internal log message is generated with the contents of the saved variable.
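<Input log4j>
    Module  im_file
    File    "/var/log/tomcat6/catalina.out"
    <Exec>
        # A line matching the timestamp header starts a new message
        if $raw_event =~ /(?x)^\d{4}\-\d{2}\-\d{2}\ \d{2}\:\d{2}\:\d{2},\d{3}\ \S+
                          \ \[\S+\]\ \-\ .*/
        {
            if defined(get_var('saved'))
            {
                # Emit the previously saved message and save the new header
                $tmp = $raw_event;
                $raw_event = get_var('saved');
                set_var('saved', $tmp);
                delete($tmp);
                log_info($raw_event);
            }
            else
            {
                # First header seen: save it and wait for more lines
                set_var('saved', $raw_event);
                drop();
            }
        }
        else
        {
            # Continuation line: append it to the saved message
            set_var('saved', get_var('saved') + "\n" + $raw_event);
            drop();
        }
    </Exec>
</Input>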
As with the previous example, a log message is kept in the saved variable, and not forwarded, until a new log message is read.