Compression (xm_zlib)
This module provides compression and decompression functionality using the gzip data format defined in RFC 1952 and the zlib format defined in RFC 1950.
Decompression of input data is configured within im_file module instances, while compression of output data is configured within om_file module instances.
The functionality of xm_zlib can be combined with other modules providing data conversion such as xm_crypto.
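To make the difference between the two formats concrete, here is a minimal Python sketch that uses the standard-library zlib bindings. It is not part of NXLog and does not use xm_zlib; the `payload` sample and the `deflate` helper are hypothetical names chosen for illustration. It shows only that the gzip (RFC 1952) and zlib (RFC 1950) wrappers carry different headers around the same DEFLATE stream.

```python
import zlib

# Hypothetical sample data standing in for log records.
payload = b"<13>Jan  1 00:00:00 host app: hello\n" * 20


def deflate(data: bytes, wbits: int) -> bytes:
    """Compress data in one shot using the wrapper selected by wbits."""
    comp = zlib.compressobj(level=6, wbits=wbits)
    return comp.compress(data) + comp.flush()


gzip_bytes = deflate(payload, 31)  # 16 + 15 selects the gzip wrapper (RFC 1952)
zlib_bytes = deflate(payload, 15)  # 15 selects the zlib wrapper (RFC 1950)

print(gzip_bytes[:2].hex())  # 1f8b, the gzip magic number
print(zlib_bytes[:1].hex())  # 78, the zlib CMF byte for a 32 KiB window

# wbits=47 (32 + 15) lets zlib auto-detect either wrapper when decompressing.
assert zlib.decompress(gzip_bytes, 47) == payload
assert zlib.decompress(zlib_bytes, 47) == payload
```

Because the wrappers differ, a reader has to be told, or has to detect, which format was used when the data was written.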
Compressed data written in the zlib or gzip format does not appear in the output file immediately, especially when the data is small enough to fit into the internal buffer of 16384 bytes. This is a normal consequence of stream processing, so do not expect data to be available in the compressed output file right away.
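The buffering behavior described above is a general property of streaming DEFLATE compressors. The following Python sketch illustrates it with the standard-library zlib module. It does not model the NXLog internal buffer, and the `record` value is a hypothetical sample.

```python
import zlib

# Hypothetical small record, far smaller than any internal buffer.
record = b'{"Message":"short event"}\n'

comp = zlib.compressobj(wbits=31)  # gzip framing
early = comp.compress(record)      # bytes released right after the write
tail = comp.flush()                # bytes released when the stream is finished

# A reader that only sees the early bytes cannot recover any payload yet.
partial = zlib.decompressobj(wbits=31).decompress(early)
complete = zlib.decompress(early + tail, 31)

print(len(early), len(tail))
assert partial == b""
assert complete == record
```

In this sketch the payload becomes readable only after the compressor has been flushed, which is why a small write may leave the output file looking empty or incomplete for a while.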
Configuration
The xm_zlib module accepts the following directives in addition to the common module directives.
Optional directives
Format
This optional directive specifies the data format to be used for compressing and decompressing log data. The accepted values are gzip and zlib. The default value is gzip.
CompressionLevel
This optional directive specifies the compression level as an integer between 0 and 9. 0 applies the least compression but gives the highest performance, while 9 applies the most compression but gives the lowest performance.
If this directive is not specified, the compression level is set to the default of the zlib library.
CompBufSize
This optional directive specifies the number of bytes to be allocated for the compression memory buffer. The minimum value is 8192 bytes. The default value is 16384.
DecompBufSize
This optional directive specifies the number of bytes to be allocated for the decompression memory buffer. The minimum value is 16384 bytes. The default value is 32768.
DataType
This optional directive specifies the type of data being processed and is used by the compress data converter. Specifying the data type improves compression results. The accepted values are unknown, text, and binary. The default value is unknown.
MemoryLevel
This optional directive specifies the amount of memory available for compression and accepts values between 1 and 9. The default value is 8.
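The trade-off between compression level and output size can be explored outside NXLog with the zlib library itself. The Python sketch below is illustrative only: it assumes the level and memory settings behave like zlib's own `level` and `memLevel` parameters, which this document does not confirm for xm_zlib. The `lines` sample and the `compressed_size` helper are hypothetical.

```python
import zlib

# Hypothetical, highly repetitive log data.
lines = b"2024-01-01T00:00:00Z host sshd[42]: Accepted publickey for user\n" * 1000


def compressed_size(level: int, mem_level: int = 8) -> int:
    """Return the gzip-framed size of `lines` at the given level and memLevel."""
    comp = zlib.compressobj(level=level, wbits=31, memLevel=mem_level)
    return len(comp.compress(lines) + comp.flush())


sizes = {level: compressed_size(level) for level in (0, 1, 6, 9)}
for level, size in sizes.items():
    print(f"level {level}: {size} bytes (input {len(lines)} bytes)")
```

In zlib, level 0 stores the data without compressing it, so the output is slightly larger than the input because of the framing overhead. Higher levels spend more CPU time searching for matches.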
Data conversion
The xm_zlib module implements data converters to be used with the im_file and om_file modules. These are specified in the InputType and OutputType directives of the respective module and are invoked using dot notation:
<InstanceName>.<DataConverterName>
Where <InstanceName> is the name given to the xm_zlib instance and <DataConverterName> is the name of the converter being invoked.
The following data converters are available to compress and decompress log data.
- compress
  This data converter is used to compress log data. It should be specified in the OutputType directive after the output writer function. The compressed result is similar to running the following command:
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - input_file | gzip -c > compressed_file
- decompress
  This data converter is used to decompress log data. It should be specified in the InputType directive before the input reader function. The decompressed result is similar to running the following command:
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - compressed_file | gzip -dc
Examples
The examples below describe various ways of processing logs with the xm_zlib module.
The configuration below utilizes the im_systemd module to read systemd messages and converts them to JSON using the xm_json module. The JSON-formatted messages are then written to a file and compressed using the compress data converter.
<Extension gzip>
    Module xm_zlib
    Format gzip
    CompressionLevel 9
    CompBufSize 16384
    DecompBufSize 16384
</Extension>

<Extension json>
    Module xm_json
</Extension>

<Input input_systemd>
    Module im_systemd
    Exec to_json();
</Input>

<Output output_file>
    Module om_file
    OutputType gzip.compress
    File '/tmp/output'
</Output>
The following configuration uses the decompress converter to process gzip-compressed log files in the input instance. The result is saved to a file.
<Extension gzip>
    Module xm_zlib
    Format gzip
    CompressionLevel 9
    CompBufSize 16384
    DecompBufSize 16384
</Extension>

<Input input_file>
    Module im_file
    File '/tmp/input'
    InputType gzip.decompress
</Input>

<Output output_file>
    Module om_file
    File '/tmp/output'
</Output>
The xm_zlib module can process data using a single instance or multiple instances.
Multiple instances provide flexibility because each instance can be customized for a specific scenario, whereas a single instance keeps the configuration shorter.
The configuration below specifies two instances of the xm_zlib module. The gzip instance is used to decompress gzip-compressed data at the input. Once decompressed, the messages are converted to JSON using the xm_json module. The zlib instance is then used at the output to compress the JSON data in zlib format and save it to a file.
<Extension gzip>
    Module xm_zlib
    Format gzip
    CompressionLevel 9
    CompBufSize 16384
    DecompBufSize 16384
</Extension>

<Extension zlib>
    Module xm_zlib
    Format zlib
    CompressionLevel 3
    CompBufSize 64000
    DecompBufSize 64000
</Extension>

<Extension json>
    Module xm_json
</Extension>

<Input input_file>
    Module im_file
    File '/tmp/input'
    InputType gzip.decompress
    Exec to_json();
</Input>

<Output output_file>
    Module om_file
    File '/tmp/output'
    OutputType zlib.compress
</Output>
The configuration below specifies a single xm_zlib module instance with default parameters. The input instance decompresses gzip-compressed log files using the decompress data converter and converts the log records to IETF Syslog format using the xm_syslog module. The output instance then writes the results to a file and compresses it using the compress data converter.
<Extension gzip>
    Module xm_zlib
</Extension>

<Extension syslog>
    Module xm_syslog
</Extension>

<Input input_file>
    Module im_file
    File '/tmp/input'
    InputType gzip.decompress
    Exec to_syslog_ietf();
</Input>

<Output output_file>
    Module om_file
    File '/tmp/output'
    OutputType gzip.compress
</Output>
Data conversion operations can be chained together to create a workflow. For example, the xm_zlib module functionality can be combined with the xm_crypto module to perform compression and encryption operations on log files.
Data converters are processed sequentially from left to right, so the order in which they are specified matters. In input instances, decryption should always occur before decompression, while in output instances, compression should always precede encryption, as illustrated in the following table.
Workflow | Directive | First Operation | Second Operation | Third Operation
---|---|---|---|---
Compression + Encryption | OutputType | Output writer function | compress | aes_encrypt
Decompression + Decryption | InputType | aes_decrypt | decompress | Input reader function
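The reason for this ordering can be illustrated with a short Python sketch. Encrypted data looks random and therefore barely compresses, so compression has to happen first on output and be undone last on input. The sketch uses a toy XOR keystream as a stand-in for encryption. It is not AES, not secure, and unrelated to the xm_crypto implementation. The names `toy_cipher`, `write_side`, `read_side`, and the key are hypothetical.

```python
import hashlib
import zlib


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. Stand-in only, NOT AES."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def write_side(records: bytes, key: bytes) -> bytes:
    # Output order from the table: writer output, then compress, then encrypt.
    return toy_cipher(zlib.compress(records), key)


def read_side(blob: bytes, key: bytes) -> bytes:
    # Input order from the table: decrypt, then decompress, then reader input.
    return zlib.decompress(toy_cipher(blob, key))


records = b"2024-01-01 00:00:00 INFO service started\n" * 500
key = b"hypothetical-key"

good = write_side(records, key)
bad = zlib.compress(toy_cipher(records, key))  # wrong order: encrypt first

print(len(records), len(good), len(bad))
assert read_side(good, key) == records
```

With this sample, compressing before encrypting yields a much smaller result than encrypting first, and the read side recovers the original records only when it reverses the write-side steps in the opposite order.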
The configuration below processes a gzip-compressed and encrypted log file containing log records in the NXLog Binary format. The input instance decrypts the file using the aes_decrypt data converter of the xm_crypto module, and then decompresses it using the decompress converter of the xm_zlib module. Each log record is then checked: records that contain the stdout string are passed on to the output instance, and all other records are dropped.
The processed log data is written to a file by the output instance in the NXLog Binary format. The data is compressed using the compress converter and finally encrypted using the aes_encrypt converter.
<Extension gzip>
    Module xm_zlib
    Format gzip
    CompressionLevel 9
    CompBufSize 16384
    DecompBufSize 16384
</Extension>

<Extension cryptography>
    Module xm_crypto
    UseSalt TRUE
    PasswordFile passwordfile
</Extension>

<Input input_file>
    Module im_file
    File '/tmp/input'
    InputType cryptography.aes_decrypt, gzip.decompress, Binary
    Exec if not ($raw_event =~ /stdout/) drop();
</Input>

<Output output_file>
    Module om_file
    File '/tmp/output'
    OutputType Binary, gzip.compress, cryptography.aes_encrypt
</Output>
For more information and examples of combined data conversion operations, see the topic on Compression and Encryption.