Normalize logs with NXLog Agent
Data normalization enables SIEMs to interpret logs from different sources efficiently, facilitates event correlation, and makes it easier for you to work with the data in dashboards and reports. Almost all SIEM solutions have taxonomies for different types of logs. In addition, they use a set of standard metadata fields, such as labels that describe the system that generated the event. Such data is often not part of the event record, so you must add it from an external source.
Below, we provide examples of transforming and enriching log records with NXLog Agent.
Rewrite logs
One of the most common methods to search for and replace text is using regular expressions.
This configuration reads syslog messages from a file with the im_file input module, which stores each log line in the $raw_event field.
It then uses a case-insensitive regular expression to search the line for the string error and, if found, enriches the log record with a $Severity field set to ERROR.
Finally, it copies the original line to the $Message field and converts records to JSON format using the to_json() procedure of the xm_json module.
<Extension json>
    Module    xm_json
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        if $raw_event =~ /error/i
        {
            $Severity = "ERROR";
        }
        $Message = $raw_event;
        to_json();
    </Exec>
</Input>
The following is a syslog message collected from a Linux host.
Oct 11 15:50:21 SERVER-1 xdg-desktop-por[15999]: Failed to get application states: GDBus.Error:org.freedesktop.portal.Error.Failed: Could not get window list
The following JSON object shows the same log record after NXLog Agent processed it.
{
"EventReceivedTime": "2023-10-11T15:50:26.252732+02:00",
"SourceModuleName": "system_messages",
"SourceModuleType": "im_file",
"Hostname": "SERVER-1",
"Severity": "ERROR",
"Message": "Oct 11 15:50:21 SERVER-1 xdg-desktop-por[15999]: Failed to get application states: GDBus.Error:org.freedesktop.portal.Error.Failed: Could not get window list"
}
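Regular expressions can also extract values from a log line. The following is a minimal sketch, assuming that substrings captured by parentheses are available as $1, $2, and so on after a successful match, as is usual in the NXLog language. The $Program and $PID field names are illustrative and are not part of the original example.

```conf
<Extension json>
    Module    xm_json
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        # Match a "program[pid]:" token such as "xdg-desktop-por[15999]:"
        if $raw_event =~ /(\S+)\[(\d+)\]:/
        {
            # Illustrative field names; both values are stored as strings
            $Program = $1;
            $PID = $2;
        }
        $Message = $raw_event;
        to_json();
    </Exec>
</Input>
```

For production use, the parse_syslog() procedure of the xm_syslog module is usually a better choice for syslog data; the sketch only shows how capture groups work.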
NXLog Agent also supports the s/// substitution operator, which replaces text in place. Add the /g global modifier to replace every match instead of only the first.
This configuration reads syslog messages from a file with the im_file input module, which stores each log line in the $raw_event field.
It then uses a regular expression to search the line for IPv4 addresses and, if found, replaces them with ******.
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        if $raw_event =~ s/((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}/******/g
        {
            log_info("Sanitized IP address");
        }
    </Exec>
</Input>
The following is a syslog message containing an IP address.
Oct 11 09:17:41 SERVER-1 NetworkManager[547]: <info> [1698218261.0783] dhcp4 (enp0s3): option ip_address => '10.0.0.103'
The following is the same log message after NXLog Agent processed it.
Oct 11 09:17:41 SERVER-1 NetworkManager[547]: <info> [1698218261.0783] dhcp4 (enp0s3): option ip_address => '******'
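The substitution operator is not limited to $raw_event. As a sketch, assuming the record is first parsed with the parse_syslog() procedure of the xm_syslog module, the following variation masks IP addresses in the $Message field only and leaves the other fields untouched. It reuses the regular expression from the previous configuration and converts the record to JSON, because changes to $Message are not reflected in $raw_event on their own.

```conf
<Extension json>
    Module    xm_json
</Extension>
<Extension syslog>
    Module    xm_syslog
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        parse_syslog();
        # Mask IPv4 addresses in the parsed message text only
        if $Message =~ s/((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}/******/g
        {
            log_debug("Sanitized IP address in Message");
        }
        to_json();
    </Exec>
</Input>
```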
Rename and delete fields
The NXLog language provides the rename_field() and delete() procedures for simple manipulation of fields.
This configuration reads syslog messages from a file and parses records into structured data using the parse_syslog() procedure of the xm_syslog module.
It renames the NXLog Agent $SourceModuleType and $SourceModuleName core fields to $NXLogModuleType and $NXLogModuleName and deletes the $SeverityValue and $Severity fields.
Finally, it converts records to JSON format using the to_json() procedure of the xm_json module.
<Extension json>
    Module    xm_json
</Extension>
<Extension syslog>
    Module    xm_syslog
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        parse_syslog();
        # Rename a field by passing the field names as strings
        # or the fields themselves
        rename_field("SourceModuleType", "NXLogModuleType");
        rename_field($SourceModuleName, $NXLogModuleName);
        # Delete a field by passing the field name as a string
        # or the field itself
        delete("SeverityValue");
        delete($Severity);
        to_json();
    </Exec>
</Input>
The following is a syslog message collected from a Linux host.
Oct 11 23:11:26 SERVER-1 systemd[1]: Started NXLog daemon.
The following JSON object shows the same log record after NXLog Agent processed it.
{
"EventReceivedTime": "2023-10-11T23:11:26.880942+01:00",
"NXLogModuleName": "system_messages",
"NXLogModuleType": "im_file",
"SyslogFacilityValue": 1,
"SyslogFacility": "USER",
"SyslogSeverityValue": 5,
"SyslogSeverity": "NOTICE",
"Hostname": "SERVER-1",
"EventTime": "2023-10-11T23:11:26.000000+01:00",
"SourceName": "systemd",
"ProcessID": 1,
"Message": "Started NXLog daemon."
}
Map fields
For more advanced log transformation, you can use the xm_rewrite module to rename or delete fields, specify a list of fields to retain, and transform the data with custom processing.
This configuration reads syslog messages from a file. It parses records into structured data using the parse_syslog() procedure of the xm_syslog module.
It then uses the xm_rewrite module to map NXLog Agent fields to the Elastic Common Schema and converts records to JSON format using the to_json() procedure of the xm_json module.
<Extension json>
    Module    xm_json
</Extension>
<Extension syslog>
    Module    xm_syslog
</Extension>
<Extension syslog_ecs>
    Module    xm_rewrite
    Rename    EventTime, @timestamp
    Rename    EventReceivedTime, event.ingested
    # ECS defines event.severity as a number and log.level as a keyword
    Rename    SeverityValue, event.severity
    Rename    Severity, log.level
    Rename    SyslogSeverityValue, log.syslog.severity.code
    Rename    SyslogSeverity, log.syslog.severity.name
    Rename    SyslogFacilityValue, log.syslog.facility.code
    Rename    SyslogFacility, log.syslog.facility.name
    Rename    ProcessID, process.pid
    Rename    SourceName, service.type
    Rename    Message, message
    Rename    SourceModuleType, nxlog.module.type
    Rename    SourceModuleName, nxlog.module.name
    <Exec>
        ${event.original} = $raw_event;
        ${ecs.version} = "8.0.1";
        # Store the value as an IP address if it looks like one,
        # otherwise as a hostname
        if $Hostname =~ /^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$/
        {
            ${host.ip} = $Hostname;
        }
        else
        {
            ${host.hostname} = $Hostname;
        }
    </Exec>
    Delete    Hostname
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        parse_syslog();
        syslog_ecs->process();
        to_json();
    </Exec>
</Input>
The following is a syslog message collected from a Linux host.
Oct 11 23:11:26 SERVER-1 systemd[1]: Started NXLog daemon.
The following JSON object shows the same log record after NXLog Agent processed it.
{
"event.ingested": "2023-10-11T23:11:26.485096+01:00",
"nxlog.module.name": "system_messages",
"nxlog.module.type": "im_file",
"log.syslog.facility.code": 1,
"log.syslog.facility.name": "USER",
"log.syslog.severity.code": 5,
"log.syslog.severity.name": "NOTICE",
"event.severity": 2,
"log.level": "INFO",
"@timestamp": "2023-10-11T23:11:26.000000+01:00",
"service.type": "systemd",
"process.pid": 1,
"message": "Started NXLog daemon.",
"event.original": "Oct 11 23:11:26 SERVER-1 systemd[1]: Started NXLog daemon.",
"ecs.version": "8.0.1",
"host.hostname": "SERVER-1"
}
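The xm_rewrite module can also retain a fixed list of fields and discard everything else. The following is a minimal sketch, assuming the module's Keep directive accepts a comma-separated list of field names in the same way as Rename. The syslog_keep instance name and the choice of fields are illustrative.

```conf
<Extension json>
    Module    xm_json
</Extension>
<Extension syslog>
    Module    xm_syslog
</Extension>
<Extension syslog_keep>
    Module    xm_rewrite
    # Retain only these fields; all other fields are removed from the record
    Keep      EventTime, Hostname, SourceName, Message
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        parse_syslog();
        syslog_keep->process();
        to_json();
    </Exec>
</Input>
```

An allowlist of this kind is useful when the destination charges by volume or rejects unknown fields, since fields added later by the agent or a parser are not forwarded unless you list them.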
Use environment variables
You can enrich log records with information from the operating system’s environment variables with the envvar general directive. NXLog Agent can access any environment variable that is visible to its service user.
This configuration loads three environment variables containing CPU information and references them in the configuration as %NAME%. It uses the im_msvistalog module to collect Windows events from the System log.
It then enriches log records with the CPU information from the environment variables and converts them to JSON format using the to_json() procedure of the xm_json module.
envvar PROCESSOR_IDENTIFIER
envvar PROCESSOR_ARCHITECTURE
envvar NUMBER_OF_PROCESSORS
<Extension json>
    Module    xm_json
</Extension>
<Input eventlog>
    Module    im_msvistalog
    <QueryXML>
        <QueryList>
            <Query Id="0">
                <Select Path="System">*</Select>
            </Query>
        </QueryList>
    </QueryXML>
    <Exec>
        ${meta.processor} = '%PROCESSOR_IDENTIFIER%';
        ${meta.processor_arch} = '%PROCESSOR_ARCHITECTURE%';
        ${meta.processor_count} = %NUMBER_OF_PROCESSORS%;
        to_json();
    </Exec>
</Input>
The following JSON object shows a Windows System event after NXLog Agent processed it.
{
"EventTime": "2023-10-11T18:25:40.946416+01:00",
"Hostname": "SERVER-1.example.com",
"Keywords": "9259400833873739776",
"LevelValue": 4,
"EventType": "INFO",
"SeverityValue": 2,
"Severity": "INFO",
"EventID": 7036,
"SourceName": "Service Control Manager",
"ProviderGuid": "{555908D1-A6D7-4695-8E1E-26931D2012F4}",
"Version": 0,
"TaskValue": 0,
"OpcodeValue": 0,
"RecordNumber": 23944,
"ExecutionProcessID": 536,
"ExecutionThreadID": 1700,
"Channel": "System",
"Message": "The nxlog service entered the running state.",
"Level": "Information",
"param1": "nxlog",
"param2": "running",
"EventData.Binary": "6E0078006C006F0067002F0034000000",
"EventReceivedTime": "2022-03-09T18:25:42.962159+01:00",
"SourceModuleName": "eventlog",
"SourceModuleType": "im_msvistalog",
"meta.processor": "Intel64 Family 6 Model 165 Stepping 5, GenuineIntel",
"meta.processor_arch": "AMD64",
"meta.processor_count": 16
}
Load data from a file or script
The include and include_stdout general directives allow you to load data from a file or script into the NXLog Agent configuration. For example, with include_stdout, you can execute a script to read dynamic data and inject the script’s output into the configuration.
This configuration uses two files to inject static and dynamic values into the NXLog Agent configuration. The first file defines two static values for the operating system name and version.
define OS_NAME Linux Ubuntu
define OS_VER 20.04
The second is a bash script to retrieve CPU information from the operating system and output the values to the standard output.
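#!/bin/bash
# CPU model line from /proc/cpuinfo, de-duplicated across cores
PROCESSOR=$(grep 'model name' /proc/cpuinfo | uniq)
# Number of logical processors
PROCESSOR_COUNT=$(grep -c '^processor' /proc/cpuinfo)
# Machine hardware name, for example x86_64
PROCESSOR_ARCH=$(uname -m)
# Print define directives for include_stdout to load
echo "define PROCESSOR $PROCESSOR"
echo "define PROCESSOR_COUNT $PROCESSOR_COUNT"
echo "define PROCESSOR_ARCH $PROCESSOR_ARCH"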
The above files are included in the NXLog Agent configuration using the include and include_stdout directives.
The configuration reads syslog messages from a file with the im_file input module and parses records into structured data using the parse_syslog() procedure of the xm_syslog module. It then enriches log records with the operating system and CPU information and converts them to JSON format using the to_json() procedure of the xm_json module.
include /opt/nxlog/etc/env.conf
include_stdout /opt/nxlog/etc/env.sh
<Extension json>
    Module    xm_json
</Extension>
<Extension syslog>
    Module    xm_syslog
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        parse_syslog();
        ${meta.os_name} = '%OS_NAME%';
        ${meta.os_ver} = '%OS_VER%';
        ${meta.processor} = '%PROCESSOR%';
        ${meta.processor_arch} = '%PROCESSOR_ARCH%';
        ${meta.processor_count} = %PROCESSOR_COUNT%;
        to_json();
    </Exec>
</Input>
The following is a syslog message collected from a Linux host.
Oct 11 23:11:26 SERVER-1 systemd[1]: Started NXLog daemon.
The following JSON object shows the same log record after NXLog Agent processed it.
{
"EventReceivedTime": "2023-10-11T23:11:26.172998+01:00",
"SourceModuleName": "system_messages",
"SourceModuleType": "im_file",
"SyslogFacilityValue": 1,
"SyslogFacility": "USER",
"SyslogSeverityValue": 5,
"SyslogSeverity": "NOTICE",
"SeverityValue": 2,
"Severity": "INFO",
"Hostname": "SERVER-1",
"EventTime": "2023-10-11T23:11:26.000000+01:00",
"SourceName": "systemd",
"ProcessID": 1,
"Message": "Started NXLog daemon.",
"meta.os_name": "Linux Ubuntu",
"meta.os_ver": "20.04",
"meta.processor": "model name\t: Intel(R) Core(TM) i7-10700T CPU @ 2.00GHz",
"meta.processor_arch": "x86_64",
"meta.processor_count": 16
}
See also
- The Logs table reference lists the NXLog Agent core and standard fields.
- The NXLog language contains several other functions that you can use for log enrichment, such as the host_ip() and hostname() functions. See Functions in the NXLog Agent Reference Manual for a complete listing.
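As a brief illustration of the host_ip() and hostname() functions, the following sketch adds the agent's hostname and IP address to each record. It assumes both functions can be called without arguments; the ${meta.hostname} and ${meta.host_ip} field names are illustrative. Check the Reference Manual for the exact return types and optional arguments.

```conf
<Extension json>
    Module    xm_json
</Extension>
<Extension syslog>
    Module    xm_syslog
</Extension>
<Input system_messages>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        parse_syslog();
        # Enrich the record with details of the host running the agent
        ${meta.hostname} = hostname();
        ${meta.host_ip} = host_ip();
        to_json();
    </Exec>
</Input>
```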