NXLog Agent buffering and flow control
NXLog Agent implements several data buffering features. Two of these are particularly important and switched on by default: log queues and flow control. In addition, stream-oriented modules use read and write buffers, which you can configure with module-level directives.
See Configure buffering for examples of setting log queues and buffer sizes, managing flow control, and using other buffering functionality.
Log queues
Every processor and output module instance has an input log queue that holds the data the instance has not yet processed. The LogqueueSize directive defines the size of the log queue. When the preceding module instance processes a batch of records, it places the batch in this queue if the queue is less than 70% full. Otherwise, flow control kicks in.
Because batches are defined by the number of records rather than the data size, the log queue size is not a hard limit. If the log queue is less than 70% full and an incoming batch contains large records, the size limit may be exceeded. Conversely, even when the queue is well below its size limit, no more batches will be placed in it once it is more than 70% full.
When NXLog Agent stops, it writes any records remaining in the log queue to a file on disk. Once it restarts, modules process these records first, before any new incoming records. This can result in NXLog Agent statistics showing more outgoing than incoming records after a restart.
Log queues are switched on by default for all processor and output module instances; adjusting log queue sizes is the preferred way to control buffering behavior.
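As a minimal sketch of adjusting a log queue, the following configures a larger queue for a single output instance. The instance name, destination address, and queue size are illustrative, not recommendations:

```
<Output to_siem>
    Module        om_tcp
    Host          192.0.2.10
    Port          1514
    # Allow more batches to accumulate in this instance's input
    # log queue before flow control takes effect
    LogqueueSize  500
</Output>
```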
Flow control
NXLog Agent’s flow control functionality provides automatic, zero-configuration handling of many cases where buffering would otherwise be required. Flow control takes effect when the following sequence of events occurs in a route:
- A processor or output module instance cannot process data at the incoming rate.
- That module instance's log queue becomes full.
- An input, processor, or output module instance has flow control switched on (the default).
In this case, flow control will cause the input or processor module instance to suspend processing until the succeeding instance is ready to accept more data.
You can configure flow control per module instance. If two consecutive module instances in a route have conflicting flow control settings, the setting that switches flow control off takes precedence.
When a route contains multiple output instances, it is possible to turn off flow control for one output instance so that it will not suspend processing for the entire route if it is blocked. In this case, the preceding input or processor module instance will continue to process data and forward it to the output instances. The blocked output instance will discard data until it recovers.
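As a sketch of this scenario, the following disables flow control for one output instance so that a blocked destination does not suspend the whole route. The instance name and address are illustrative:

```
<Output unreliable_dest>
    Module       om_tcp
    Host         192.0.2.20
    Port         514
    # If this destination blocks, discard its data rather than
    # suspending the preceding module instances in the route
    FlowControl  FALSE
</Output>
```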
Read and write buffers
Stream-oriented input and output modules, i.e., those connecting over the network (UDP, TCP, and HTTP) or reading from and writing to files, use read and write buffers. You can control the buffer size with the BufferSize common module directive.
It is important to note that a module instance can create multiple buffers. Modules that receive or send data over the network, like *m_tcp and *m_http, create a buffer for each active connection. On the other hand, *m_file module instances create a buffer for each open file. The size you set with the BufferSize directive applies to each buffer. For example, the diagram below illustrates an im_file module instance collecting telemetry data from three files. The instance therefore creates three read buffers of 65,000 bytes (the default) each, totaling 195 KB.
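A minimal sketch of setting a larger per-buffer size for a file input; the file path and buffer size are illustrative, and each file matched by the wildcard gets its own read buffer of this size:

```
<Input app_logs>
    Module      im_file
    File        "/var/log/app/*.log"
    # Applies to each open file's read buffer, in bytes
    BufferSize  150000
</Input>
```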
Use caution when defining the buffer size:

- Avoid configurations that use large chunks of memory. Network modules can create an undefined number of connections, each with a read/write buffer of the size you specify. Similarly, a wildcard that matches many files and a large buffer is not a good combination when collecting data with im_file. In such cases, consider using the ActiveFiles directive.
- If a module receives data larger than the buffer size, it will truncate it and record an error in the NXLog Agent log file.
In addition to the above, some modules use other buffers specific to their functionality, as shown in the following table.
| Module(s) | Directive | Default value |
|---|---|---|
| im_udp | | OS default |
| xm_zlib | | 16,384 bytes |
| xm_zlib | | 32,768 bytes |
Other buffering functionality
In addition to the functionality mentioned above, several NXLog Agent modules implement specialized buffering features, such as the ones listed below.
- You can configure the UDP modules im_udp, om_udp, and om_udpspoof to set the socket buffer size with the SockBufSize directive.
- Use modules like im_exec, im_perl, im_python, om_exec, om_perl, and om_python to implement custom buffering solutions.
- Some modules, such as om_batchcompress, om_elasticsearch, and om_webhdfs, buffer data internally to forward records to the destination in batches.
- Control buffering with the pm_blocker module by blocking or unblocking the data processing flow in a route. This module is especially useful for testing buffering.
- To test buffering behavior, you can also simulate a blocked output with the om_blocker module.
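As a sketch of using pm_blocker for buffer testing, the following toggles blocking on a schedule so you can observe log queues filling and draining. The instance names (in, buffer_test, out) and the interval are illustrative, and assume matching input and output instances exist elsewhere in the configuration:

```
<Processor buffer_test>
    Module  pm_blocker
    <Schedule>
        Every  1 min
        # Alternate between blocked and unblocked states
        Exec   if buffer_test->is_blocking() buffer_test->block(FALSE); \
               else buffer_test->block(TRUE);
    </Schedule>
</Processor>

<Route r1>
    Path  in => buffer_test => out
</Route>
```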
The following diagram shows all buffers used in a simple route. The socket buffers only apply to network-based modules.