Compression (xm_zlib)

This module provides compression and decompression functionality using the gzip data format defined in RFC 1952 and the zlib format defined in RFC 1950. Decompression of input data is defined within im_file module instances, while compression of output data is specified within om_file module instances. The functionality of xm_zlib can be combined with other modules providing data conversion such as xm_crypto.

When compressed files are written in the zlib or gzip data format, compressed data does not appear in the output file immediately, especially if the amount of data is small and fits into the internal buffer of 16384 bytes. This is a normal property of stream processing: data is held in the buffer until enough has accumulated to be written out. Do not expect the compressed output file to contain complete data right away.

Configuration

The xm_zlib module accepts the following directives in addition to the common module directives.

Optional directives

Format

This optional directive specifies the data format to be used for compressing and decompressing log data. The accepted values are gzip and zlib. The default value is gzip.

CompressionLevel

This optional directive specifies the compression level as a value between 0 and 9. A value of 0 gives the lowest compression with the highest performance, while 9 gives the highest compression with the lowest performance. If this directive is not specified, the compression level defaults to the zlib library default, which is usually 6.

CompBufSize

This optional directive specifies the number of bytes to be allocated for the compression memory buffer. The minimum value is 8192 bytes. The default value is 16384.

DecompBufSize

This optional directive specifies the number of bytes to be allocated for the decompression memory buffer. The minimum value is 16384 bytes. The default value is 32768.

DataType

This optional directive specifies the type of data being processed and is used by the compress data converter. Specifying the data type improves compression results. The accepted values are unknown, text, and binary. The default value is unknown.

MemoryLevel

This optional directive specifies the amount of memory available for compression and accepts values between 1 (least memory) and 9 (most memory). The default value is 8.
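As a sketch of how these directives fit together, the instance below sets every optional directive explicitly. The instance name compressor and the chosen values are illustrative assumptions that stay within the documented ranges; they are not recommended settings.

```
<Extension compressor>
    Module                 xm_zlib
    # Data format: gzip (default) or zlib
    Format                 zlib
    # 0 (fastest) to 9 (smallest output)
    CompressionLevel       6
    # Buffer sizes in bytes; the minimums are 8192 and 16384 respectively
    CompBufSize            32768
    DecompBufSize          65536
    # Hint for the compress data converter: unknown, text, or binary
    DataType               text
    # Compression memory, 1 to 9
    MemoryLevel            9
</Extension>
```

Any directive left out of such a block falls back to the default value listed above.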

Data conversion

The xm_zlib module implements data converters to be used with the im_file and om_file modules. These are specified in the InputType and OutputType directives of their respective module and are invoked using dot notation:

<InstanceName>.<DataConverterName>

Where <InstanceName> is the given name of the xm_zlib instance and <DataConverterName> is the name of the converter being invoked.
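For instance, the instance name does not have to match the name of the data format. In the minimal sketch below, the instance name archive and the file path are hypothetical, and the converter is referenced as archive.compress.

```
<Extension archive>
    Module                 xm_zlib
    Format                 gzip
</Extension>

<Output output_file>
    Module                 om_file
    File                   '/tmp/output.gz'
    # <InstanceName>.<DataConverterName>
    OutputType             archive.compress
</Output>
```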

The following data converters are available to compress and decompress log data.

compress

This data converter is used to compress log data. It should be specified in the OutputType directive after the output writer function. The compressed result is similar to running the following command:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - input_file | gzip -c > compressed_file

decompress

This data converter is used to decompress log data. It should be specified in the InputType directive before the input reader function. The decompressed result is similar to running the following command:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - compressed_file | gzip -dc

Examples

The examples below describe various ways of processing logs with the xm_zlib module.

Example 1. Compression of logs

The configuration below uses the im_systemd module to read systemd messages and converts them to JSON using the xm_json module. The JSON-formatted messages are then written to a file and compressed using the compress data converter.

nxlog.conf
<Extension gzip>
    Module                 xm_zlib
    Format                 gzip
    CompressionLevel       9
    CompBufSize            16384
    DecompBufSize          16384
</Extension>

<Extension json>
    Module                 xm_json
</Extension>

<Input input_systemd>
    Module                 im_systemd
    Exec                   to_json();
</Input>

<Output output_file>
    Module                 om_file
    OutputType             gzip.compress
    File                   '/tmp/output'
</Output>

Example 2. Decompression of logs

The following configuration uses the decompress converter to process gzip-compressed log files in the input instance. The result is saved to a file.

nxlog.conf
<Extension gzip>
    Module                 xm_zlib
    Format                 gzip
    CompressionLevel       9
    CompBufSize            16384
    DecompBufSize          16384
</Extension>

<Input input_file>
    Module                 im_file
    File                   '/tmp/input'
    InputType              gzip.decompress
</Input>

<Output output_file>
    Module                 om_file
    File                   '/tmp/output'
</Output>

The xm_zlib module can process data via a single instance or multiple instances.

Multiple instances provide flexibility because each instance can be customized for a specific scenario, whereas a single instance keeps the configuration shorter.

Example 3. Processing data with multiple module instances

The configuration below specifies two instances of the xm_zlib module. The gzip instance is used to decompress gzip-compressed data at the input. Once decompressed, the messages are converted to JSON using the xm_json module. The zlib instance is then used at the output to compress the JSON data in zlib format and save it to a file.

nxlog.conf
<Extension gzip>
    Module                 xm_zlib
    Format                 gzip
    CompressionLevel       9
    CompBufSize            16384
    DecompBufSize          16384
</Extension>

<Extension zlib>
    Module                 xm_zlib
    Format                 zlib
    CompressionLevel       3
    CompBufSize            64000
    DecompBufSize          64000
</Extension>

<Extension json>
    Module                 xm_json
</Extension>

<Input input_file>
    Module                 im_file
    File                   '/tmp/input'
    InputType              gzip.decompress
    Exec                   to_json();
</Input>

<Output output_file>
    Module                 om_file
    File                   '/tmp/output'
    OutputType             zlib.compress
</Output>

Example 4. Processing data with a single module instance

The configuration below specifies a single xm_zlib module instance with default parameters. The input instance decompresses gzip-compressed log files using the decompress data converter and converts the log records to IETF Syslog format using the xm_syslog module. The output instance then writes the results to a file and compresses it using the compress data converter.

nxlog.conf
<Extension gzip>
    Module                 xm_zlib
</Extension>

<Extension syslog>
    Module                 xm_syslog
</Extension>

<Input input_file>
    Module                 im_file
    File                   '/tmp/input'
    InputType              gzip.decompress
    Exec                   to_syslog_ietf();
</Input>

<Output output_file>
    Module                 om_file
    File                   '/tmp/output'
    OutputType             gzip.compress
</Output>

Data conversion operations can be chained together to create a workflow. For example, the xm_zlib module functionality can be combined with the xm_crypto module to perform compression and encryption operations on log files.

Data converters are processed sequentially from left to right, so the order in which they are specified matters. In input instances, decryption should always occur before decompression, while in output instances, compression should always precede encryption, as illustrated in the following table.

Table 1. Sequential order of operations for compression/encryption and decompression/decryption
Operation                   Directive   First operation         Second operation  Third operation
Compression + Encryption    OutputType  Output writer function  compress          aes_encrypt
Decompression + Decryption  InputType   aes_decrypt             decompress        Input reader function

Example 5. Processing data with multiple data converters

The configuration below processes a gzip-compressed and encrypted log file containing log records in the NXLog Agent Binary format. The input instance decrypts the file using the aes_decrypt data converter of the xm_crypto module, and then decompresses it using the decompress converter of the xm_zlib module. Log records that do not contain the string stdout are dropped; the remaining records are passed on to the output instance.

The processed log data is written to a file by the output instance in the NXLog Agent Binary format. The data is compressed using the compress converter and finally encrypted using the aes_encrypt converter.

nxlog.conf
<Extension gzip>
    Module              xm_zlib
    Format              gzip
    CompressionLevel    9
    CompBufSize         16384
    DecompBufSize       16384
</Extension>

<Extension cryptography>
    Module              xm_crypto
    UseSalt             TRUE
    PasswordFile        passwordfile
</Extension>

<Input input_file>
    Module              im_file
    File                '/tmp/input'
    InputType           cryptography.aes_decrypt, gzip.decompress, Binary
    Exec                if not ($raw_event =~ /stdout/) drop();
</Input>

<Output output_file>
    Module              om_file
    File                '/tmp/output'
    OutputType          Binary, gzip.compress, cryptography.aes_encrypt
</Output>

The default input reader and output writer function for the im_file and om_file modules is LineBased. When using the default function, it can be omitted from the respective InputType or OutputType directive.
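To illustrate, the fragments below are a sketch of the input and output instances from Example 4 with the default function written out explicitly. They assume an xm_zlib instance named gzip, as defined in that example. Following the ordering in Table 1, LineBased comes last in InputType and first in OutputType.

```
<Input input_file>
    Module                 im_file
    File                   '/tmp/input'
    # Equivalent to: InputType gzip.decompress
    InputType              gzip.decompress, LineBased
</Input>

<Output output_file>
    Module                 om_file
    File                   '/tmp/output'
    # Equivalent to: OutputType gzip.compress
    OutputType             LineBased, gzip.compress
</Output>
```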

For more information and examples of combined data conversion operations, see NXLog Agent log compression and encryption in the NXLog Platform User Guide.