Timestamps
Timestamps are an important aspect of event logging. Knowing the date and time when an event occurred is necessary when it comes to analyzing events for troubleshooting, event correlation, and forensic audits. Without a correct timestamp, event logs lose much of their value.
When dealing with timestamps, there are many variables to consider. Different systems produce timestamps in different formats. A timestamp may be represented as a string or an integer. It may contain a timezone offset or not, and it may be precise up to the second, millisecond, or microsecond. Being aware of the kind of timestamps produced by your log sources and making sure that you are handling all these variables will ensure that events make it to your log repository with the correct timestamp.
NXLog provides functions for parsing timestamps into datetime values, as well as functions for converting datetime values into formatted strings. These functions not only help you to output timestamps in a unified format, but also to fix broken formats, such as timestamps without a year in the case of syslog BSD, and to normalize timestamps across different time zones.
Parsing timestamps
Most timestamps found in logs can be parsed with the parsedate() function, which will automatically parse any of the supported formats.
Consider the following line-based input sample. Each event record begins with a timestamp followed by a tab.
2016-10-11T22:14:15.003Z machine.example.com An account failed to log on.
This example configuration uses a regular expression to capture the
string up to the first tab. Then the
parsedate() function is used to parse the
resulting string and set the $EventTime
field to the corresponding
datetime value. This value can be converted to a
timestamp string as required in later processing, either explicitly or as
defined by the global DateFormat directive (see
Formatting timestamps).
<Input in>
Module im_file
File 'in.log'
Exec if $raw_event =~ /^([^\t])\t/ $EventTime = parsedate($1);
</Input>
The parsedate() function is especially useful if the timestamp format varies within the log records being processed. A timestamp of any supported format will be parsed. In this example, the timestamp must be at the beginning of the event and followed by a tab character to be matched by the regular expression. |
Sometimes a log source will contain a few events with invalid or unexpected formatting. If parsedate() fails to parse the input string, it will return an undefined datetime value. This allows the user to configure a fallback timestamp.
This example statement uses a vague regular expression that may in
some cases match an invalid string. If
parsedate() fails to parse the timestamp, it
will return an undefined value.
In this case, the final line below will set $EventTime
to the current time.
if $raw_event =~ /^(\S+)\s+(\S+)/
$EventTime = parsedate($1 + " " + $2);
# Make sure $EventTime is set
if not defined($EventTime) $EventTime = now();
$EventTime =
$EventReceivedTime could be
used instead to set a timestamp according to when the event was
received by NXLog.
|
For parsing more exotic formats, the strptime() function can be used.
In this input sample, the date and time are two distinct fields delimited by a tab. It also uses a non-standard single digit format instead of fixed width with double digits.
2011-5-29 0:3:2 GMT WINDOWSDC An account failed to log on.
To parse this, a regular expression can be used to capture the timestamp string. This string is then parsed with the strptime() function.
if $raw_event =~ /^(\d+-\d+-\d+\t\d+:\d+:\d+ \w+)/
$EventTime = strptime($1, '%Y-%m-%d%t%H:%M:%S %Z');
Adjusting timestamps
Sometimes a log source sends events with incorrect or incomplete timestamps. For example, some network devices may not have the correct time (especially immediately after rebooting); also, the BSD Syslog header provides neither the year nor the timezone. NXLog can be configured to apply timestamp corrections in various ways.
Reliably applying timezone offsets is difficult due to complications like daylight savings time (DST) and networking and processing delays. For this reason, it is best to use clock synchronization (such as NTP) and timezone-aware timestamps at the log source when possible. |
The simplest solution for incorrect timestamps is to replace them with the
time when the event was received by NXLog. This is a good option for
devices with untrusted clocks on the local network that send logs to
NXLog in real-time. The
$EventReceivedTime field is automatically
added to each event record by NXLog; this field can be stored
alongside the event’s own timestamp (normally $EventTime
) if all fields are
preserved when the event is stored/forwarded. Alternatively, this field can be
used as the event timestamp as shown below. This would have the effect of
influencing the timestamp used on most outputs, such as with the
to_syslog_ietf() procedure.
This configuration accepts Syslog messages via UDP with the im_udp module.
Events are parsed with the parse_syslog() procedure, which adds an EventTime field from the Syslog header timestamp.
The $EventTime
value, however, is replaced by the timestamp set by NXLog in the $EventReceivedTime field.
Any later processing that uses the $EventTime
field will operate on the updated timestamp.
For example, if the to_syslog_ietf() procedure is used, the resulting IETF Syslog header will contain the $EventReceivedTime
timestamp.
<Extension _syslog>
Module xm_syslog
</Extension>
<Input syslog>
Module im_udp
<Exec>
parse_syslog();
$EventTime = $EventReceivedTime;
</Exec>
</Input>
In some edge cases, a UTC timestamp that does not have the timezone specified is parsed as local time. This can happen if BSD Syslog timestamps are in UTC, or when reading a non-timezone-aware ID timestamp with im_odbc. In this case, it is necessary to either manually re-parse (see Parsing timestamps) or apply a corresponding reverse offset.
This statement uses the parsedate() and strftime() functions to apply a reverse offset after an incorrect local-to-UTC timezone conversion. To reduce the likelihood of an incorrect offset during the daylight saving time (DST) transition, this should be done in the Input module instance which is collecting the events (see the warning above).
$EventTime = parsedate(strftime($EventTime, '%Y-%m-%d %H:%M:%SZ'));
For the general case of adjusting timestamps, the plus
(+
) and minus (-
) operators can be used to adjust a
timestamp by a specified number of seconds.
This statement adds two hours to the $EventTime
field.
This simple method may not be suitable for correction of a timezone that uses daylight saving time (DST). In that case the required offset may change based on whether DST is in effect. |
$EventTime = $EventTime + (2 * 3600);
Formatting timestamps
After a timestamp has been parsed to a datetime value, it will usually need to be converted back to a string at some point before being sent to the output. This can be done automatically by the output configuration.
Consider an event record with an $EventTime
field (as a
datetime value) and a $Message
field. Note
that the table below shows the $EventTime
value as it is stored
internally: as microseconds since the epoch.
Field | Value |
---|---|
$EventTime |
1493425133541851 |
$Message |
EXT4-fs (dm-0): mounted filesystem with ordered data mode. |
The following output module instance uses the to_json() procedure without specifying the timestamp format.
<Output out>
Module om_file
File 'out.log'
Exec to_json();
</Output>
The output of the $EventTime
field in this case will depend on the
DateFormat directive of the xm_json
module. The default DateFormat is YYYY-MM-DDThh:mm:ss.sTZ
(local time).
{
"EventTime": "2017-01-02 15:19:22",
"Message": "EXT4-fs (dm-0): mounted filesystem with ordered data mode."
}
A different timestamp may be used in some cases, depending on
the procedure used to convert the field and the output
module. The to_syslog_bsd()
procedure, for example, will use the $EventTime value to
generate a RFC 3164 format timestamp regardless of how the
DateFormat directive is set.
|
Alternatively, the strftime() function can be used to explicitly convert a datetime value to a string with the required format.
Again, consider an event record with an $EventTime
field (as a
datetime value) and a $Message
field. In this
example, the strftime() function is used with a
format string (see the strftime(3) manual) to convert $EventTime
to
a string in the local time zone. Then the
to_json() procedure is used to set the
$raw_event
field.
<Output out>
Module om_file
File 'out.log'
<Exec>
$EventTime = strftime($EventTime, '%Y-%m-%dT%H:%M:%S%z');
to_json();
</Exec>
</Output>
{
"EventTime": "2017-04-29T02:18:53+0200",
"Message": "EXT4-fs (dm-0): mounted filesystem with ordered data mode."
}
NXLog Enterprise Edition supports a few additional format strings for formats that the stock C strftime() does not offer, including formats with fractional seconds and in UTC time. See the Reference Manual strftime() documentation for the list.
The following statement will convert $EventTime
to a timestamp
format with fractional seconds and in UTC (regardless of the current
time zone).
$EventTime = strftime($EventTime, 'YYYY-MM-DDThh:mm:ss.sUTC');
The resulting timestamp string in this case would be
2017-04-29T00:18:53.541851Z
.
Preserving datetime values across time zones
Timestamps written by NXLog can be formatted to include the timezone offset. However, there are instances when you may want to normalize timestamps to UTC, such as if the receiving end doesn’t handle timezone offset and your log sources are located in different time zones.
To normalize timestamps to UTC, you need to consider both how timestamps are parsed when reading the logs, as well as how NXLog writes timestamps in the output. Several options are available for parsing and generating timestamps in UTC:
- parsedate() core function
-
When called with the utc argument set to
TRUE
, this function treats timestamps without a timezone offset as UTC. - DateFormat global directive
-
This directive defines the date format for converting datetime values to string. A date format with the UTC suffix specifies that the timestamp should be written in UTC.
- DateFormat directive in xm_json
-
This directive defines the date format when converting data to JSON. It can be defined per xm_json module instance and is independent of the global DateFormat directive.
- UTCTimestamp directive in xm_syslog
-
When this directives is set to
TRUE
, the timestamp in the syslog header is written in UTC. - integer() function
-
When this function is called with a datetime argument, it returns an integer representing the number of microseconds that have elapsed since the epoch, also known as Unix time.
- ParseDateInUTC global directive
-
When this directive is set to
TRUE
, timestamps in the formatYYYY-MM-DD hh:mm:ss
and which do not contain a timezone offset are treated as UTC. - GenerateDateInUTC global directive
-
When this directive is set to
TRUE
, timestamps in the formatYYYY-MM-DD hh:mm:ss
are generated in UTC.
Parsing timestamps without timezone offset
By default, the parsedate() function parses timestamps
that do not contain timezone information as local time. The function also
accepts a second boolean argument utc, which when TRUE
treats
timestamps without timezone information as UTC.
This example configuration parses line-based log records and converts them to JSON. Below is an input sample of a log record containing a timestamp without timezone offset.
2021-09-22T20:14:15.003 machine.example.com An account failed to log on.
In this configuration, log records are parsed with a regular expression and the
event time is converted to datetime using the
parsedate() function. The second argument of this
function call is set to TRUE
, to specify that if the event time does not
contain timezone information, it should be treated as UTC.
The log record is converted to JSON using the to_json()
procedure of the xm_json module. At this stage, the value of $EventTime
is
converted to the format specified by the DateFormat
directive and the output is written to the $raw_event
field.
<Extension _json>
Module xm_json
DateFormat YYYY-MM-DDThh:mm:ss.sUTC
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
if $raw_event =~ /^(.+)\t(.+)\t(.+)/
{
$EventTime = parsedate($1,TRUE);
$Hostname = $2;
$Message = $3;
to_json();
}
</Exec>
</Input>
{
"EventReceivedTime": "2021-09-22T20:16:04.526253Z",
"SourceModuleName": "file",
"SourceModuleType": "im_file",
"EventTime": "2021-09-22T20:14:15.003000Z",
"Hostname": "machine.example.com",
"Message": "An account failed to log on."
}
JSON does not contain a datetime type, therefore timestamps are often found in JSON logs as strings. By default, the xm_json module attempts to parse strings that begin with 4 digits as datetime and will treat timestamps without a timezone offset as local time. This behavior can be controlled with the ParseDate directive.
This example configuration parses and outputs log records in JSON format. Below is an input sample of a log record containing a timestamp without timezone offset.
{
"EventTime": "2021-09-22T20:14:15.003",
"Hostname": "machine.example.com",
"Message": "An account failed to log on."
}
In this configuration, default date parsing by the xm_json module is turned
off by setting the ParseDate
directive to FALSE
. Log records are parsed with
the parse_json() procedure and the event time is
converted to datetime using the
parsedate() function. The second argument of this
function call is set to TRUE
, to specify that if the event time does not
contain timezone information, it should be treated as UTC.
Log records are converted back to JSON using the
to_json() procedure. At this stage the value of
$EventTime
is converted to the format specified by the
DateFormat directive of the xm_json module
instance and the output is written to the $raw_event
field.
<Extension _json>
Module xm_json
ParseDate FALSE
DateFormat YYYY-MM-DDThh:mm:ss.sUTC
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
parse_json();
$EventTime = parsedate($EventTime, TRUE);
to_json();
</Exec>
</Input>
{
"EventReceivedTime": "2021-09-22T20:16:04.526253Z",
"SourceModuleName": "file",
"SourceModuleType": "im_file",
"EventTime": "2021-09-22T20:14:15.003000Z",
"Hostname": "machine.example.com",
"Message": "An account failed to log on."
}
Generating timestamps in UTC
How a datetime value is converted to a string depends on the function, procedure, and module being used. The global DateFormat directive applies to most functions and procedures, however modules like xm_json and xm_syslog include separate handling of datetime values. See the examples below for how to output dates in UTC when generating CSV, JSON, and syslog formats.
The following log record contains a timestamp without a timezone offset. In these examples, we used this sample as input data and executed the configurations on a machine with the timezone set to UTC+2.
2021-09-22T22:14:15.003 machine.example.com An account failed to log on.
Timestamp variations that would produce the same output:
-
2021-09-22T22:14:15.003+02:00
-
2021-09-22T20:14:15.003Z
This configuration sets the DateFormat global
directive to an ISO 8601 date format with the UTC
suffix.
Log records are parsed with a regular expression and the event time is converted to datetime using the parsedate() function. Unless the event time contains timezone information, this function treats the timestamp as local time.
The log record is converted to CSV using the to_csv()
procedure, at which stage the value of $EventTime
is converted to the
format specified by the DateFormat
directive and the output is written
to the $raw_event
field.
DateFormat YYYY-MM-DDThh:mm:ss.sUTC
<Extension _csv>
Module xm_csv
Fields EventTime,Hostname,Message
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
if $raw_event =~ /^(.+)\t(.+)\t(.+)/
{
$EventTime = parsedate($1);
$Hostname = $2;
$Message = $3;
_csv->to_csv();
}
</Exec>
</Input>
2021-09-22T20:14:15.003000Z,"machine.example.com","An account failed to log on."
The DateFormat global directive also applies to the format of the
timestamp in the NXLog LogFile.
|
This configuration sets the DateFormat
directive of the xm_json module instance to an ISO 8601 date format with the
UTC
suffix.
Log records are parsed with a regular expression and the event time is converted to datetime using the parsedate() function. Unless the event time contains timezone information, this function treats the timestamp as local time.
The log record is converted to JSON using the to_json()
procedure, at which stage the value of $EventTime
is converted to the
format specified by the DateFormat
directive and the output is written to the
$raw_event
field.
<Extension _json>
Module xm_json
DateFormat YYYY-MM-DDThh:mm:ss.sUTC
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
if $raw_event =~ /^(.+)\t(.+)\t(.+)/
{
$EventTime = parsedate($1);
$Hostname = $2;
$Message = $3;
to_json();
}
</Exec>
</Input>
{
"EventReceivedTime": "2021-09-22T20:16:04.526253Z",
"SourceModuleName": "file",
"SourceModuleType": "im_file",
"EventTime": "2021-09-22T20:14:15.003000Z",
"Hostname": "machine.example.com",
"Message": "An account failed to log on."
}
This configuration sets the UTCTimestamp
directive of the xm_syslog module instance to TRUE
.
Log records are parsed with a regular expression and the event time is converted to datetime using the parsedate() function. Unless the event time contains timezone information, this function treats the timestamp as local time.
The log record is converted to syslog using the
to_syslog_bsd() procedure, at which stage the
value of $EventTime
is converted to UTC and the output is written to the
$raw_event
field.
<Extension _syslog>
Module xm_syslog
UTCTimestamp TRUE
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
if $raw_event =~ /^(.+)\t(.+)\t(.+)/
{
$EventTime = parsedate($1);
$Hostname = $2;
$Message = $3;
to_syslog_bsd();
}
</Exec>
</Input>
<13>Sep 22 20:14:15 machine.example.com An account failed to log on.
In this configuration, log records are parsed with a regular expression and the
event time is converted to datetime using the
parsedate() function. Unless the event time contains
timezone information, this function treats the timestamp as local time.
It then uses the integer() function to retrieve the UTC
timestamp in the Unix time format and stores it in another field $UTCTimestamp
.
The log record is converted to CSV using the to_csv()
procedure, at which stage the value of $EventTime
will be written to the
$raw_event
field as local time and in the NXLog default format.
<Extension _csv>
Module xm_csv
Fields EventTime,UTCTimestamp,Hostname,Message
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
if $raw_event =~ /^(.+)\t(.+)\t(.+)/
{
$EventTime = parsedate($1);
$UTCTimestamp = integer($EventTime);
$Hostname = $2;
$Message = $3;
_csv->to_csv();
}
</Exec>
</Input>
2021-09-22 22:14:15,1632341655003000,"machine.example.com","An account failed to log on."
Using the GenerateDateInUTC and ParseDateInUTC global directives
These two directives only apply to dates in the format YYYY-MM-DD hh:mm:ss
.
This is the default format used by NXLog.
ParseDateInUTC is equivalent to using
parsedate(arg,TRUE)
to parse timestamps without a timezone offset as UTC.
GenerateDateInUTC is equivalent to
specifying DateFormat YYYY-MM-DD hh:mm:ssUTC
to output timestamps in UTC.
This example configuration parses line-based log records and converts them to CSV. The following log record contains a timestamp without a timezone offset. In this example, we used this sample as input data and executed the configuration on a machine with the timezone set to UTC+2.
2021-09-22 15:14:15 machine.example.com An account failed to log on.
Event time variations that would produce the same output:
-
2021-09-22 17:14:15+02:00
-
2021-09-22 15:14:15Z
The configuration sets the ParseDateInUTC
and GenerateDateInUTC
global
directives to TRUE
.
Log records are parsed with a regular expression and the event time is converted
to datetime using the
parsedate()
function. Unless the event time contains timezone information, the timestamp
will be treated as UTC according to the ParseDateInUTC
directive.
Records are converted to CSV using the to_csv()
procedure, at which stage the value of $EventTime
is converted to UTC as
specified by the GenerateDateInUTC
directive and the output is written to
the $raw_event
field.
GenerateDateInUTC TRUE
ParseDateInUTC TRUE
<Extension _csv>
Module xm_csv
Fields EventTime,Hostname,Message
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
if $raw_event =~ /^(.+)\t(.+)\t(.+)/
{
$EventTime = parsedate($1);
$Hostname = $2;
$Message = $3;
_csv->to_csv();
}
</Exec>
</Input>
2021-09-22 15:14:15,"machine.example.com","An account failed to log on."
The above configuration does not apply when parsing logs with the
parse_json() procedure. See the next example for a configuration that
uses the ParseDateInUTC global directive when parsing JSON logs.
|
In this example, JSON log records are converted to CSV. The following is a log record containing a timestamp without a timezone offset. In this example, we used this sample as input data and executed the configuration on a machine with the timezone set to UTC+2.
{
"EventTime": "2021-09-22 15:14:15",
"Hostname": "machine.example.com",
"Message": "An account failed to log on."
}
Event time variations that would produce the same output:
-
2021-09-22 17:14:15+02:00
-
2021-09-22 15:14:15Z
This configuration sets the ParseDateInUTC
and GenerateDateInUTC
global
directives to TRUE
.
It also sets the ParseDate directive of the
xm_json module to FALSE
. Otherwise, timestamps without timezone
information will be parsed by the module as local time and the ParseDateInUTC
directive will have no effect.
Log records are parsed with the parse_json()
procedure and the event time is converted to datetime
using the parsedate() function. Unless the event time
contains timezone information, the timestamp will be treated as UTC according to
the ParseDateInUTC
directive.
The log record is converted to CSV using the to_csv()
procedure, at which stage the value of $EventTime
is converted to UTC as
specified by the GenerateDateInUTC
directive and the output is written to
the $raw_event
field.
GenerateDateInUTC TRUE
ParseDateInUTC TRUE
<Extension _json>
Module xm_json
ParseDate FALSE
</Extension>
<Extension _csv>
Module xm_csv
Fields EventTime,Hostname,Message
</Extension>
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
parse_json();
$EventTime = parsedate($EventTime);
_csv->to_csv();
</Exec>
</Input>
2021-09-22 15:14:15,"machine.example.com","An account failed to log on."
GenerateDateInUTC does not apply when creating JSON output with the
to_json() procedure or function. The
DateFormat directive of xm_json
needs to be used in that case. See
Generating JSON output with UTC timestamps.
|
Preserving datetime values between NXLog instances
When logs are transferred between two or more NXLog instances, fields can be preserved by using the Binary format. This means that when it comes to timestamps, you only need to parse them when the log is processed by the first instance and convert them to string in the last instance.
The following is an example of how to configure two NXLog instances to send and receive logs in Binary format.
2021-09-22T20:14:15.003Z machine.example.com An account failed to log on.
The first NXLog instance reads logs from file using the im_file input module. It parses log records using a regular expression and the event time is converted to datetime using the parsedate() function.
Log records are then forwarded to another NXLog instance using the
om_tcp output module. The OutputType
directive is set to Binary
, which will output the data
in an NXLog proprietary format that preserves fields and their values.
<Input file>
Module im_file
File '/path/to/log/file'
<Exec>
if $raw_event =~ /^(.+)\t(.+)\t(.+)/
{
$EventTime = parsedate($1);
$Hostname = $2;
$Message = $3;
}
</Exec>
</Input>
<Output out>
Module om_tcp
Host 192.168.0.123:514
OutputType Binary
</Output>
The second NXLog instance uses the im_tcp input module to
listen for TCP connections on port 1514. The InputType
directive is set to Binary
, which means it expects to
receive logs from another NXLog instance. Since the logs are already
parsed by the first NXLog instance, they do not need to be parsed again
by this instance.
Log records are converted to JSON using the to_json() procedure of the xm_json module before they are sent to their final destination, which could be a central log repository or SIEM.
<Extension _json>
Module xm_json
</Extension>
<Input tcp_listen>
Module im_tcp
ListenAddr 0.0.0.0:1514
InputType Binary
Exec to_json();
</Input>
The following is an output sample in JSON format. Notice how the
SourceModuleName
and SourceModuleType
contain values for the first instance
since these fields where created there and preserved by using
the Binary
format. The EventTime
field was also converted to the local
date and time, since the field was parsed to a datetime value by the first
instance. The MessageSourceAddress
is a field added by the im_tcp input
module.
{
"EventReceivedTime": "2021-09-22T22:16:09.984898+02:00",
"SourceModuleName": "file",
"SourceModuleType": "im_file",
"EventTime": "2021-09-22T22:14:15.003000+02:00",
"Hostname": "machine.example.com",
"Message": "An account failed to log on.",
"MessageSourceAddress": "192.168.0.122"
}