NXLog Documentation

Optimizing the configuration

NXLog’s module-based configuration gives you the flexibility to parse, process, and format log records according to your needs. However, heavy processing can impact performance, especially in high-volume logging environments. Regular expressions, data-extraction functions like extract_json() and extract_xml(), and calls to external scripts or services all affect overall processing speed. Therefore, creating an efficient configuration is essential.

Load balancing

Network load balancer

Whenever possible, using a network layer load balancer is the best method to distribute connections for higher throughput. A network load balancer can distribute connections between multiple identically-configured NXLog agents.

Processing logs with multiple NXLog agents
Figure 1. Processing logs with multiple NXLog agents

Alternatively, you can take advantage of NXLog’s multi-threaded architecture to distribute log processing between multiple input module instances of the same agent.

Processing logs with multiple input instances
Figure 2. Processing logs with multiple input instances

There are several commercial and open-source network load balancers available. For example, NGINX Plus and NGINX Open Source version 1.9.0 or later support TCP and UDP Load Balancing. See the installation instructions in the NGINX Admin Guide to get started.

Example 1. NGINX example configuration

This NGINX configuration distributes UDP and TCP connections to an NXLog log agent configured with multiple input instances listening on different ports. On Debian-based systems, the default location of the NGINX configuration file is /etc/nginx/nginx.conf, but this may vary depending on your distribution.

The NGINX load balancer routes UDP traffic per message and TCP traffic per connection. As a result, load balancing TCP traffic works best when log sources send a similar number of events.
nginx.conf
load_module /usr/lib/nginx/modules/ngx_stream_module.so; (1)

stream {
  upstream nxlog_udp { (2)
    server 192.168.1.81:1001;
    server 192.168.1.81:1002;
  }

  upstream nxlog_tcp { (3)
    server 192.168.1.81:1003;
    server 192.168.1.81:1004;
  }

  server {
    listen 192.168.1.81:514 udp; (4)
    proxy_pass nxlog_udp;
    proxy_responses 0;
  }

  server {
    listen 192.168.1.81:1514; (5)
    proxy_pass nxlog_tcp;
  }
}

worker_rlimit_nofile 1000000;

events {
  worker_connections 20000; (6)
}
1 The NGINX stream module must be loaded from the configuration or enabled with the --with-stream configuration parameter.
2 Lists the NXLog input instances listening for UDP connections.
3 Lists the NXLog input instances listening for TCP connections.
4 Specifies the IP address and port NGINX will listen on for UDP connections. Configure your sources to send logs to this IP and port.
5 Specifies the IP address and port NGINX will listen on for TCP connections. Configure your sources to send logs to this IP and port.
6 The maximum number of simultaneous connections allowed.

Refer to the NGINX documentation for more information on the available configuration directives.

Example 2. Receiving logs on multiple inputs

This NXLog configuration defines two identical instances of the im_udp input module listening for connections on different ports.

nxlog.conf
<Extension syslog>
    Module        xm_syslog
</Extension>

<Extension json>
    Module        xm_json
</Extension>

<Input udp_1>
    Module        im_udp
    ListenAddr    0.0.0.0:1001
    <Exec> (1)
        parse_syslog(); (2)
        to_json(); (3)
    </Exec>
</Input>

<Input udp_2>
    Module        im_udp
    ListenAddr    0.0.0.0:1002
    <Exec>
        parse_syslog();
        to_json();
    </Exec>
</Input>

<Output file>
    Module        om_file
    File          '/path/to/output/file'
</Output>

<Route r1>
    Path          udp_1, udp_2 => file (4)
</Route>
1 Exec block for heavy parsing.
2 Parses syslog messages into structured data using the parse_syslog() procedure of the xm_syslog module.
3 Converts the record to JSON using the to_json() procedure of the xm_json module.
4 Routes messages from all input instances to a single output.

Modules as threads

If deploying a Network Load Balancer is not an option, you can implement parallelization within the NXLog configuration. There are several options depending on your use case.

The first method is to implement a selector function in the input instance to reroute individual events to multiple identical output instances. This way, any intensive log processing is distributed between different threads.
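As a rough sketch, the selector logic amounts to a round-robin counter over the route names. The Python below is illustrative only (NXLog uses its own configuration language, as shown in the example that follows):

```python
# Illustrative round-robin selector: a counter cycles through the route
# names so events are spread evenly across identical output instances.
ROUTES = ["1", "2", "3"]  # route names, matching the NXLog example

def select_route(counter: int) -> str:
    """Return the route name the next event should be sent to."""
    return ROUTES[counter % len(ROUTES)]

# Nine events distributed round-robin: each route receives three.
assignments = [select_route(i) for i in range(9)]
```
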

Example 3. Routing individual messages

This configuration uses the im_tcp input module to listen for connections on port 1514. It then reroutes messages to three identical output instances to distribute the load between them.

Flow control is disabled when rerouting messages, so NXLog drops messages if the target module's log queue is full.
nxlog.conf
<Extension syslog>
    Module       xm_syslog
</Extension>

<Extension json>
    Module       xm_json
</Extension>

<Input tcp_routing>
   Module        im_tcp
   ListenAddr    0.0.0.0:1514
   <Exec>
      if (get_var("linecounter") == undef ) set_var("linecounter", 0); (1)
      set_var("linecounter", get_var("linecounter")+1); (2)

      if get_var("linecounter") == 2 reroute("2"); (3)

      if get_var("linecounter") == 3 {
          reroute("3");
          set_var("linecounter", 0); (4)
      }
      log_info(get_var("linecounter")); (5)
   </Exec>
</Input>

<Input null>
    Module       im_null
</Input>

<Output file_1>
    Module       om_file
    File         '/path/to/output/file_1'
    <Exec> (6)
        parse_syslog(); (7)
        to_json(); (8)
    </Exec>
</Output>

<Output file_2>
    Module       om_file
    File         '/path/to/output/file_2'
    <Exec>
        parse_syslog();
        to_json();
    </Exec>
</Output>

<Output file_3>
    Module       om_file
    File         '/path/to/output/file_3'
    <Exec>
        parse_syslog();
        to_json();
    </Exec>
</Output>

<Route 1>
    Path         tcp_routing => file_1
</Route>

<Route 2>
    Path         null => file_2
</Route>

<Route 3>
    Path         null => file_3
</Route>
1 Creates a module variable using the get_var() function and set_var() procedure to initialize a counter. The message falls through to route 1.
2 Increases the counter by 1.
3 Reroutes the message to the relevant output module with the reroute() procedure.
4 Resets the counter once it reaches the maximum number of output instances.
5 The log_info() procedure is used to write the counter’s value to the log file for testing purposes only.
6 Exec block for heavy parsing.
7 Parses syslog messages into structured data using the parse_syslog() procedure of the xm_syslog module.
8 Converts the record to JSON using the to_json() procedure of the xm_json module.

Another option when receiving logs over the network is to route connections to multiple identical input instances by enabling the ReusePort directive of the im_tcp or im_udp modules, which allows multiple threads to receive data on the same port. This approach works best when many simultaneous connections deliver a similar number of events; otherwise, connection distribution may be skewed and yield little benefit.

Let’s consider an example where each of four input threads can handle 7,000 EPS with parsing enabled, and three agents send a cumulative 22,000 EPS.

Routing connections to multiple inputs
Figure 3. Routing connections to multiple inputs

One might conclude that the total throughput of the four threads (28,000 EPS) is sufficient to handle the influx. However, each source’s connection is pinned to a single input thread. Therefore, if source A delivers 20,000 EPS while sources B and C deliver 1,000 EPS each, the maximum throughput will not scale as expected. Instead, it will equal the saturation throughput of Input 1 plus 2 × 1,000 EPS, resulting in 9,000 EPS. The remaining 13,000 EPS (22,000 − 9,000) of backpressure will cause significant delivery delays.
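The throughput calculation above can be worked through in a few lines. The assumptions (illustrative, matching the scenario described) are that each input thread saturates at 7,000 EPS and that each source's TCP connection is pinned to a single thread, so excess load on one thread cannot be absorbed by idle threads:

```python
# Worked version of the throughput calculation (figures in EPS).
THREAD_CAPACITY = 7_000  # saturation point of one input thread
sources = {"A": 20_000, "B": 1_000, "C": 1_000}

# Each connection is served by one thread, capped at that thread's capacity.
effective = sum(min(eps, THREAD_CAPACITY) for eps in sources.values())
incoming = sum(sources.values())
backpressure = incoming - effective  # events accumulating per second

print(effective)     # 9000
print(backpressure)  # 13000
```
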

Distributing connections between threads is handled by the operating system. In our tests, we noticed poor results with a small number of connections.
Example 4. Routing TCP connections

This configuration defines multiple identical instances of the im_tcp input module listening for connections on port 1514. The ReusePort directive allows the instances to receive data simultaneously on the same port.

nxlog.conf
<Extension syslog>
    Module       xm_syslog
</Extension>

<Extension json>
    Module       xm_json
</Extension>

<Input tcp_1>
    Module        im_tcp
    ListenAddr    0.0.0.0:1514
    ReusePort     TRUE
    <Exec> (1)
        parse_syslog(); (2)
        to_json(); (3)
    </Exec>
</Input>

<Input tcp_2>
    Module        im_tcp
    ListenAddr    0.0.0.0:1514
    ReusePort     TRUE
    <Exec>
        parse_syslog();
        to_json();
    </Exec>
</Input>

<Output file>
   Module         om_file
   File           '/path/to/output/file'
</Output>

<Route 1>
    Path          tcp_1, tcp_2 => file (4)
</Route>
1 Exec block for heavy parsing.
2 Parses syslog messages into structured data using the parse_syslog() procedure of the xm_syslog module.
3 Converts the record to JSON using the to_json() procedure of the xm_json module.
4 Routes messages from all input instances to a single output.

Configuration redundancy

When your NXLog configuration contains identical blocks like the examples above, it is best to split it into separate files and use the include general directive to add configuration snippets as needed. Doing so ensures that your instances are configured identically, and you only need to apply configuration updates in a single location.

The default configuration includes all files with the .conf extension. Therefore, you need to use a different extension for partial configuration files, or they will be added out of context.
Example 5. Managing redundant configuration

This configuration uses the include general directive to load duplicate configuration snippets.

nxlog.conf
define CONFDIR /opt/nxlog/etc/nxlog.d (1)
include %CONFDIR%/*.conf (2)

<Extension syslog>
    Module       xm_syslog
</Extension>

<Extension json>
    Module       xm_json
</Extension>

<Input tcp_1>
include %CONFDIR%/tcp-input.partial (3)
</Input>

<Input tcp_2>
include %CONFDIR%/tcp-input.partial
</Input>
1 Defines the CONFDIR constant to store the location of the partial configuration files.
2 The default NXLog configuration includes all .conf files.
3 Includes the input instance configuration from an external file.
tcp-input.partial
Module        im_tcp
ListenAddr    0.0.0.0:1514
ReusePort     TRUE (1)
include %CONFDIR%/tcp-exec.partial (2)
1 Enables the ReusePort directive to allow multiple instances to receive data simultaneously on the same port.
2 Includes the message processing configuration from an external file.
tcp-exec.partial
<Exec>
    parse_syslog(); (1)
    to_json(); (2)
</Exec>
1 Parses syslog messages into structured data using the parse_syslog() procedure of the xm_syslog module.
2 Converts the record to JSON using the to_json() procedure of the xm_json module.

Memory usage

Sometimes you may need to increase the default BatchSize and LogqueueSize for processor and output module instances so that input modules can read events from the source faster. However, a side effect of increasing the batch size and log queue is higher memory consumption. Use the following formula to determine the required memory according to your BatchSize and LogqueueSize settings:

\[QueueMemoryConsumption = BatchSize \times LogqueueSize \times EventSize \times (NumberOfOutputs + NumberOfProcessors)\]

In addition, you must consider the unacknowledged network data in the output buffer, which may amount to a considerable size. In an otherwise unobstructed network, the size of the unacknowledged data is the product of the network’s throughput and latency:

\[OutputBufferSize = EventSize \times EPS \times RoundTripLatency \times NumberOfOutputs\]
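The two sizing formulas can be checked with a quick calculation. The figures below are assumed values for illustration (512-byte events, BatchSize 100, LogqueueSize 100, one processor, one output, 5,000 EPS, and 50 ms of round-trip latency):

```python
# Quick check of the queue-memory and output-buffer formulas above.
batch_size = 100           # events per batch (BatchSize)
log_queue_size = 100       # batches per queue (LogqueueSize)
event_size = 512           # bytes per event (assumed average)
outputs, processors = 1, 1

queue_memory = batch_size * log_queue_size * event_size * (outputs + processors)

eps = 5_000                # events per second
round_trip_latency = 0.05  # seconds
output_buffer = event_size * eps * round_trip_latency * outputs

print(queue_memory)   # 10240000 bytes, roughly 10 MB for the queues
print(output_buffer)  # 128000.0 bytes of unacknowledged output data
```
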

Disk I/O

Implementing measures to reduce the risk of data loss, such as enabling the PersistLogQueue directive or using disk-based buffers with the pm_buffer module, can negatively affect performance since they substantially increase disk read and write operations. By default, NXLog uses log queues and buffers with a flow control mechanism to mitigate data loss. Therefore you only need to implement additional measures if reliability is of utmost importance. In most cases, finding a balance between log processing speed and data integrity is necessary.

Example 6. Combining memory and disk-based buffering

This configuration uses the im_tcp input module to listen for connections on port 1514. It first stores received logs in a memory-based buffer and then in a larger disk-based buffer until they are saved to a file.

<Input tcp_listen>
    Module        im_tcp
    ListenAddr    0.0.0.0:1514
</Input>

<Processor memory_buffer>
    Module        pm_buffer
    MaxSize       1024 (1)
    Type          Mem
</Processor>

<Processor disk_buffer>
    Module        pm_buffer
    MaxSize       500000 (2)
    Type          Disk
    WarnLimit     400000 (3)
</Processor>

<Output output_file>
    Module        om_file
    File          '/path/to/output/file'
</Output>

<Route 1>
    Path         tcp_listen => memory_buffer => disk_buffer => output_file
</Route>
1 Sets the compulsory MaxSize directive to 1 MB for the memory-based buffer.
2 Sets the disk-based buffer to 500 MB.
3 Specifies the optional WarnLimit directive to generate a warning when the buffer reaches 400 MB.

Troubleshooting performance issues

To help with troubleshooting performance issues, NXLog provides several ways to output server information and log processing status.

The easiest method is to send signals to the NXLog process. By sending SIGUSR1 on Linux-based operating systems or SC code 200 on Windows, you instruct NXLog to write a summary to its log file. The summary includes the status of each module instance and its queue size, which can help you identify whether an instance is a bottleneck.

SIGUSR1 output sample
2022-08-22 22:44:19 INFO [CORE|main] event queue has 2 events;
jobgroup with priority 1;non-module job, events: 0;
jobgroup with priority 10;job of module null/im_null, events: 0;job of module tcp_routing/im_tcp, events: 1; POLL: 1;job of module file_1/om_file, events: 0;job of module file_2/om_file, events: 0;
jobgroup with priority 99;non-module job, events: 0;non-module job, events: 0;non-module job, events: 0;non-module job, events: 0;non-module job, events: 0;non-module job, events: 0;non-module job, events: 0;non-module job, events: 0;
[route 1]; - tcp_routing: type INPUT, status: RUNNING queuesize: 0; - file_1: type OUTPUT, status: RUNNING queuesize: 0;
[route 2]; - null: type INPUT, status: RUNNING queuesize: 0; - file_2: type OUTPUT, status: RUNNING queuesize: 0;

See Send debug dump to the internal log in the Troubleshooting section for more information on sending signals.

Another option is to use the xm_admin module to request NXLog status information via HTTP(S). xm_admin supports JSON and SOAP, and provides the ServerInfo endpoint for retrieving detailed NXLog status information.

You can send HTTP requests to an xm_admin-enabled agent using a utility like curl. In addition, any third-party system or application monitoring tool can integrate with NXLog. For example, NXLog Agent Minder can gather agent status information and provides a metrics endpoint for easy integration with monitoring solutions like Prometheus.

serverInfo response
{
    "response": "serverInfoReply",
    "status": "success",
    "data": {
        "server-info": {
            "started": 1661856836559441,
            "load": 0.21999999880790710449,
            "pid": 11452,
            "mem": 6119424,
            "os": "Linux",
            "version": "5.5.7638",
            "systeminfo": "OS: Linux, Hostname: NXLog-1, Release: 5.15.0-46-generic, Version: #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022, Arch: x86_64, 1 CPU(s), 1.9Gb memory",
            "hostname": "NXLog-1",
            "servertime": 1661856845614052,
            "modules": {
                "tcp_1": {
                    "module_name": "tcp_1",
                    "evt-recvd": 0,
                    "evt-drop": 0,
                    "evt-fwd": 0,
                    "queuesize": 0,
                    "queuelimit": 0,
                    "batchsize": 50,
                    "status": 3,
                    "module-type": 1,
                    "module": "im_tcp",
                    "tcp": {
                        "current-listener-count": 1,
                        "listeners": [
                            {
                                "address": "0.0.0.0",
                                "port": 1514
                            }
                        ],
                        "current-connection-count": 0,
                        "cumulative-connection-count": 0
                    },
                    "variables": {}
                },
                "tcp_2": {
                    "module_name": "tcp_2",
                    "evt-recvd": 0,
                    "evt-drop": 0,
                    "evt-fwd": 0,
                    "queuesize": 0,
                    "queuelimit": 0,
                    "batchsize": 50,
                    "status": 3,
                    "module-type": 1,
                    "module": "im_tcp",
                    "tcp": {
                        "current-listener-count": 1,
                        "listeners": [
                            {
                                "address": "0.0.0.0",
                                "port": 1514
                            }
                        ],
                        "current-connection-count": 0,
                        "cumulative-connection-count": 0
                    },
                    "variables": {}
                },
                "output_file": {
                    "module_name": "output_file",
                    "evt-recvd": 0,
                    "evt-drop": 0,
                    "evt-fwd": 0,
                    "queuesize": 0,
                    "queuelimit": 100,
                    "batchsize": 0,
                    "status": 3,
                    "module-type": 3,
                    "module": "om_file",
                    "variables": {}
                }
            },
            "labels": {},
            "extensions": {
                "admin": {
                    "module-name": "admin",
                    "module": "xm_admin"
                }
            },
            "routes": {
                "1": {
                    "route-modules": {
                        "tcp_1": {
                            "module_name": "tcp_1",
                            "evt-recvd": 0,
                            "evt-drop": 0,
                            "evt-fwd": 0,
                            "queuesize": 0,
                            "queuelimit": 0,
                            "batchsize": 50,
                            "status": 3,
                            "module-type": 1,
                            "module": "im_tcp",
                            "tcp": {
                                "current-listener-count": 1,
                                "listeners": [
                                    {
                                        "address": "0.0.0.0",
                                        "port": 1514
                                    }
                                ],
                                "current-connection-count": 0,
                                "cumulative-connection-count": 0
                            },
                            "variables": {}
                        },
                        "tcp_2": {
                            "module_name": "tcp_2",
                            "evt-recvd": 0,
                            "evt-drop": 0,
                            "evt-fwd": 0,
                            "queuesize": 0,
                            "queuelimit": 0,
                            "batchsize": 50,
                            "status": 3,
                            "module-type": 1,
                            "module": "im_tcp",
                            "tcp": {
                                "current-listener-count": 1,
                                "listeners": [
                                    {
                                        "address": "0.0.0.0",
                                        "port": 1514
                                    }
                                ],
                                "current-connection-count": 0,
                                "cumulative-connection-count": 0
                            },
                            "variables": {}
                        },
                        "output_file": {
                            "module_name": "output_file",
                            "evt-recvd": 0,
                            "evt-drop": 0,
                            "evt-fwd": 0,
                            "queuesize": 0,
                            "queuelimit": 100,
                            "batchsize": 0,
                            "status": 3,
                            "module-type": 3,
                            "module": "om_file",
                            "variables": {}
                        }
                    },
                    "route-name": "1",
                    "evt-recvd": 0,
                    "evt-drop": 0,
                    "evt-fwd": 0,
                    "in-use": "true"
                }
            }
        }
    }
}
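Because the reply is plain JSON, it is straightforward to post-process. The sketch below is not part of NXLog; it uses the field names shown in the sample response above to flag module instances whose log queue is close to its limit:

```python
import json

def find_bottlenecks(reply: dict, threshold: float = 0.8) -> list:
    """Return module instances whose queue usage exceeds the threshold."""
    modules = reply["data"]["server-info"]["modules"]
    return [
        name
        for name, info in modules.items()
        if info.get("queuelimit", 0)
        and info.get("queuesize", 0) / info["queuelimit"] >= threshold
    ]

# Trimmed-down serverInfo reply for illustration:
sample = json.loads("""
{"data": {"server-info": {"modules": {
    "tcp_1": {"queuesize": 0, "queuelimit": 0},
    "output_file": {"queuesize": 95, "queuelimit": 100}
}}}}
""")
print(find_bottlenecks(sample))  # ['output_file']
```
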

See the Configuration examples in the xm_admin reference to configure NXLog with this module.

Troubleshooting performance issues in a live environment may be challenging. A better option is to simulate the load in a segregated test environment, allowing you to carry out troubleshooting steps without affecting business operations. Generating test data in the Troubleshooting section provides options to simulate input data for various scenarios.