Protect against duplicate data
NXLog Agent implements mechanisms to protect against data loss. For example, several input modules save the position of the last record they processed in a cache file. If the cache file becomes corrupt, the module reads all available data, accepting possible duplicate records rather than risking data loss.
Duplicate data may also occur when using persistent queues. Because data is only removed from the queue after it is delivered successfully, a crash that happens just before the data is removed causes the module to resend it when NXLog Agent restarts.
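The sketch below (plain Python, not NXLog code) illustrates this at-least-once behavior with a hypothetical in-memory queue: a record is removed only after it has been sent, so a crash between the send and the removal would cause the record to be sent again after a restart.

from collections import deque

# Hypothetical stand-ins: "queue" plays the role of a persistent log queue,
# and send() plays the role of delivering a record to the next module or server.
queue = deque(["record-1", "record-2"])
delivered = []

def send(record):
    delivered.append(record)  # pretend this is a successful network delivery

while queue:
    record = queue[0]         # read the record without removing it
    send(record)
    # A crash at this point would leave "record" in the queue, so it would be
    # sent a second time after restart, producing a duplicate downstream.
    queue.popleft()           # remove only after successful delivery

print(delivered)              # ['record-1', 'record-2']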
If data duplication is undesirable, you can configure NXLog Agent to prevent duplicate records.
Detect duplicate records
NXLog Agent provides the duplicate_guard() procedure to detect and discard duplicate data in the cases described above. It checks whether the record's serial number is older than the last serial number saved by the module and, if so, discards the record.
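The following sketch (plain Python with hypothetical record and field names, not NXLog's internal implementation) shows the kind of check this implies: a record whose serial number is not newer than the last one saved is treated as a duplicate and dropped.

# Hypothetical illustration of a serial-number check; the field name "serial"
# and this function are illustrative only.
last_serial = 0  # last serial number saved by the module

def is_duplicate(record):
    """Return True when the record's serial number is not newer than the last one saved."""
    global last_serial
    if record["serial"] <= last_serial:
        return True            # not newer than the saved serial number: discard
    last_serial = record["serial"]
    return False

records = [{"serial": 1}, {"serial": 2}, {"serial": 2}, {"serial": 3}]
kept = [r for r in records if not is_duplicate(r)]
print(kept)                    # [{'serial': 1}, {'serial': 2}, {'serial': 3}]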
The following configuration enables disk-based log queues for all processor and output module instances and uses the duplicate_guard() procedure in the output module to ensure duplicate records are not sent to the SIEM.
PersistLogqueue TRUE    (1)

<Input file>
    Module  im_file
    File    '/path/to/input.log'
</Input>

<Output siem>
    Module  om_http
    URL     http://siem.example.com:8080/
    Exec    duplicate_guard();
</Output>
| 1 | Enables PersistLogqueue so that processor and output module instances use disk-based log queues. |