Convert character sets with NXLog Agent

Sometimes you need to convert logs from one character set to another, for example, when you collect records from UTF-16-encoded files but your SIEM requires UTF-8. You can do this with NXLog Agent’s xm_charconv module, which lets you configure the input and output encodings and provides functions to detect and convert character sets.

Below, we provide examples of using the xm_charconv module to convert your logs' character encoding.

Auto-detect input encoding

If you have multiple sources producing logs in different character sets and want to normalize them to a single encoding, you can use the AutodetectCharsets directive combined with the convert_fields() procedure.

Example 1. Auto-detect and convert character sets

This configuration uses an xm_charconv instance and specifies a list of character sets that input logs might use. It then converts all text fields in each record to UTF-8 with the convert_fields() procedure, specifying auto for the input encoding.

nxlog.conf
<Extension charconv>
    Module                xm_charconv
    AutodetectCharsets    utf-8, utf-16, utf-32, shift-jis, euc-jp
</Extension>

<Input input_file>
    Module                im_file
    File                  '/path/to/logs/*'
    Exec                  convert_fields("auto", "utf-8");
</Input>
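
When only one field needs converting, the module's convert() function can be applied to that field instead of converting every text field in the record. The following is a minimal sketch, not a definitive configuration: it assumes the same xm_charconv instance as above and that the text to convert is held in the $raw_event field. The instance name input_file_single is hypothetical.

nxlog.conf
<Extension charconv>
    Module                xm_charconv
    AutodetectCharsets    utf-8, utf-16, utf-32, shift-jis, euc-jp
</Extension>

# Hypothetical input instance used for illustration only
<Input input_file_single>
    Module                im_file
    File                  '/path/to/logs/*'
    # Convert only $raw_event; "auto" relies on the AutodetectCharsets list
    Exec                  $raw_event = convert($raw_event, "auto", "utf-8");
</Input>

Converting a single field can be preferable when other fields are already known to be in the target encoding.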

Convert a specific character set

If you want to convert logs between specific character sets, the easiest approach is to use the InputEncoding and OutputEncoding directives, which register input reader and output writer functions.

Example 2. Convert a specific character set to UTF-8

This configuration uses an xm_charconv instance with the input encoding set to shift-jis, a character set for the Japanese language. It then sets the InputType of the im_file instance to shift_jis, i.e., the name of the xm_charconv instance.

Because the configuration does not set OutputEncoding explicitly, it outputs logs in the default UTF-8 character set.

nxlog.conf
<Extension shift_jis>
    Module           xm_charconv
    InputEncoding    shift-jis
</Extension>

<Input input_file>
    Module           im_file
    File             '/path/to/logs/*'
    InputType        shift_jis
</Input>
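
If the destination requires an encoding other than UTF-8, the OutputEncoding directive can be set on the same xm_charconv instance. The following is a sketch under stated assumptions rather than a verified configuration: it assumes the output writer function is selected with the OutputType directive of the output instance, by analogy with InputType above. The instance names jp_convert and output_file and the output path are hypothetical, and the routing between input and output is omitted.

nxlog.conf
# Hypothetical instance name; reads shift-jis, writes euc-jp
<Extension jp_convert>
    Module            xm_charconv
    InputEncoding     shift-jis
    OutputEncoding    euc-jp
</Extension>

<Input input_file>
    Module            im_file
    File              '/path/to/logs/*'
    InputType         jp_convert
</Input>

# Assumption: OutputType accepts the xm_charconv instance name
<Output output_file>
    Module            om_file
    File              '/path/to/output.log'
    OutputType        jp_convert
</Output>

Check the directive names against the xm_charconv reference for your NXLog Agent version before relying on this layout.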