Protect against duplicate logs

NXLog Agent implements mechanisms to protect against data loss. For example, several input modules save the last log event they processed in a cache file. If the cache file is corrupt, the module re-reads all available logs rather than risk losing data, which may result in duplicate log records.

Another instance where duplicate logs may occur is when using persistent log queues. Log records are only removed from the queue after they are delivered successfully; if a crash occurs after delivery but before the records are removed, the module resends them when NXLog Agent restarts.
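For reference, persistent log queues are enabled with the PersistLogqueue directive, which can be set globally to apply to all module instances. The following sketch also sets SyncLogqueue, which syncs the queue to disk after each operation for stronger crash protection at the cost of performance; whether this trade-off is acceptable depends on your throughput requirements.

nxlog.conf
# Store log queues on disk so undelivered records survive a restart
PersistLogqueue    TRUE

# Sync the queue to disk after each operation (safer, but slower)
SyncLogqueue       TRUE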

If log duplication is undesirable, you can configure NXLog Agent to prevent duplicate log records.

Detect duplicate log records

NXLog Agent provides the duplicate_guard() procedure to detect and discard duplicate log records in the cases described above. It works by comparing each log record's serial number with the last serial number saved by the module; if the record's serial number is older, the procedure discards the record.

Example 1. Detecting and deleting duplicate logs

This configuration enables disk-based log queues for all processor and output module instances. It then uses the duplicate_guard() procedure to ensure duplicate records are not sent to the SIEM.

nxlog.conf
PersistLogqueue    TRUE    (1)

<Input file>
    Module         im_file
    File           '/path/to/input.log'
</Input>

<Output siem>
    Module         om_http
    URL            http://siem.example.com:8080/
    Exec           duplicate_guard();
</Output>
1 The PersistLogqueue directive enables disk-based log queues for all module instances.