Elasticsearch (om_elasticsearch)
This module forwards logs to an Elasticsearch server. It will connect to the URL specified in the configuration in either plain HTTP or HTTPS mode. This module supports bulk data operations and dynamic indexing. Event data is sent in batches, reducing the latency caused by the HTTP responses, thus improving Elasticsearch server performance. HTTP protocol errors result in the entire batch being retried. For data errors reported by the Elasticsearch server, the server response is parsed and only the failed Event data is included in the retry. If the same batch (or partial batch) has not been accepted by the server after RetryLimit retries are exhausted, the batch will be dropped, or the module will stop (according to OnError).
This module requires the xm_json extension module to be loaded to convert the payload to JSON. See the Output log format section for information on the format of the payload.
Output log format
om_elasticsearch forwards log records over HTTP(S) as a JSON payload.
The JSON format depends on the value of the $raw_event field.
The module checks if the value of $raw_event is valid JSON and applies the following rules:
- If it is valid JSON, the value is forwarded as is.
- If it is not valid JSON, the log record is converted to JSON in the following format:
  { "raw_event": "<json_escaped_raw_event>" }
Additional metadata, including the NXLog Agent-specific fields EventReceivedTime, SourceModuleName, and SourceModuleType, will not be included in the output unless these values have been written to the $raw_event field.
The processing required to achieve this depends on the format of the input data, but generally it means you need to:
- Parse the log record according to the data format.
  - If the input data is already in JSON format, use parse_json() to parse $raw_event into fields.
  - If the input is unstructured plain text data, copy the value of $raw_event to a custom field.
- Create and populate any additional custom fields.
- Use to_json() to convert the fields to JSON format and update the value of $raw_event.
See the Examples section for NXLog Agent configuration examples of the above.
Date format in Elasticsearch
Date strings in the JSON output need to be in a format that is recognized by Elasticsearch for them to be saved as date fields. See the Elasticsearch documentation on the Date field type and format. NXLog Agent provides several ways to format datetime fields in the output and which one to use depends on how log records are being processed.
- By default, the xm_json module outputs datetime fields in a format that is compatible with Elasticsearch: YYYY-MM-DDThh:mm:ss.sTZ. If you are using the to_json() procedure and need to output dates in a different format, specify the DateFormat module level directive.
- If you are not using the xm_json module to convert data to JSON, you can make use of the DateFormat global directive or convert datetime fields individually with the strftime function.
For more information on how to handle date fields, see Log event timestamps in the NXLog Platform User Guide.
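As a minimal sketch of the first option, the following xm_json instance overrides the datetime format used by to_json(). The value shown is an assumption chosen for illustration; check the DateFormat directive documentation for the values accepted by your NXLog Agent version.

```
<Extension json>
    Module      xm_json
    # Assumed example value: ISO 8601 timestamps with fractional seconds in UTC
    DateFormat  YYYY-MM-DDThh:mm:ss.sUTC
</Extension>
```

Whichever format you choose, it must still be one that Elasticsearch recognizes as a date, otherwise the field is unlikely to be saved as a date field.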
Configuration
The om_elasticsearch module accepts the following directives in addition to the common module directives.
The following directives are required for the module to start.
Required directives
URL
This mandatory directive specifies the URL for the module to POST the event data.
If multiple URL directives are specified, the module works in a failover configuration.
If a destination becomes unavailable, the module automatically fails over to the next one.
If the last destination becomes unavailable, the module will fail over to the first destination.
The module operates in plain HTTP or HTTPS mode depending on the URL provided. If the port number is not explicitly indicated in the URL, it defaults to port 80 for HTTP and port 443 for HTTPS.
The URL should point to the _bulk endpoint, or Elasticsearch will return 400 Bad Request.
When sending logs to a data stream, the URL needs to contain the stream’s name, e.g.
TLS/SSL directives
The following directives are for configuring secure data transfer via TLS/SSL.
HTTP basic authorization username.
HTTP basic authorization password.
This boolean directive specifies whether the connection should be allowed with an expired certificate.
If set to
This boolean directive specifies that the connection should be allowed regardless of the certificate verification results.
If set to
HTTPSCADir
This directive specifies a path to a directory containing certificate authority (CA) certificates. These certificates will be used to verify the certificate presented by the remote server. The certificate files must be named using the OpenSSL hashed format, i.e. the hash of the certificate followed by .0, .1 etc. To find the hash of a certificate using OpenSSL:
For example, if the certificate hash is
A remote server’s self-signed certificate (which is not signed by a CA) can also be trusted by including a copy of the certificate in this directory. The default operating system root certificate store will be used if this directive is not specified.
Unix-like operating systems commonly store root certificates in
HTTPSCAFile
This specifies the path of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote server. A remote server’s self-signed certificate (which is not signed by a CA) can be trusted by specifying the remote server certificate itself. In case of certificates signed by an intermediate CA, the certificate specified must contain the complete certificate chain (certificate bundle).
This optional directive specifies the thumbprint of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote server. The hexadecimal fingerprint string can be copied from Windows Certificate Manager (certmgr.msc). Whitespace is automatically removed. The certificate must be added to a Windows certificate store that is accessible by NXLog Agent. This directive is only supported on Windows and is mutually exclusive with the HTTPSCADir and HTTPSCAFile directives.
HTTPSCertFile
This specifies the path of the certificate file that will be presented to the remote server during the HTTPS handshake.
HTTPSCertKeyFile
This specifies the path of the private key file that was used to generate the certificate specified by the HTTPSCertFile directive. This is used for the HTTPS handshake.
This optional directive specifies the thumbprint of the certificate that will be presented to the remote server during the HTTPS handshake. The hexadecimal fingerprint string can be copied from Windows Certificate Manager (certmgr.msc). Whitespace is automatically removed. The certificate must be imported to the
This directive is only supported on Windows and is mutually exclusive with the HTTPSCertFile and HTTPSCertKeyFile directives.
This directive specifies a path to a directory containing certificate revocation list (CRL) files. These CRL files will be used to check for certificates that were revoked and should no longer be accepted. The files must be named using the OpenSSL hashed format, i.e. the hash of the issuer followed by .r0, .r1 etc. To find the hash of the issuer of a CRL file using OpenSSL:
For example, if the hash is
This specifies the path of the certificate revocation list (CRL) which will be used to check for certificates that have been revoked and should no longer be accepted. Example to generate a CRL file using OpenSSL:
This optional directive specifies a file with dh-parameters for Diffie-Hellman key exchange. These parameters can be generated with dhparam(1ssl). If no directive is specified, default parameters will be used. See OpenSSL Wiki for further details.
This directive specifies the passphrase of the private key specified by the HTTPSCertKeyFile directive. A passphrase is required when the private key is encrypted. Example to generate a private key with Triple DES encryption using OpenSSL:
This directive is not needed for passwordless private keys.
This optional boolean directive, when set to
HTTPSSSLCipher
This optional directive can be used to set the permitted SSL cipher list, overriding the default. Use the format described in the ciphers(1ssl) man page.
For example specify
This optional directive can be used to set the permitted cipher list for TLSv1.3. Use the same format as in the HTTPSSSLCipher directive. Refer to the OpenSSL documentation for a list of valid TLS v1.3 cipher suites. The default value is:
This boolean directive allows you to enable data compression when sending data over the network. The compression mechanism is based on the zlib compression library. If the directive is not specified, it defaults to FALSE: compression is disabled.
This directive can be used to set the allowed SSL/TLS protocol(s). It takes a comma-separated list of values which can be any of the following:
Optional directives
This optional directive specifies an additional header to be added to each HTTP request.
This boolean directive determines whether the event data is inserted into a data stream. By default its value is
This directive allows you to specify a custom _id field for Elasticsearch documents. If the directive is not defined, Elasticsearch uses a GUID for the
Index
This directive specifies the index to insert the event data into. It must be a string type expression. If the expression in the Index directive is not a constant string (it contains functions, field names, or operators), it will be evaluated for each event to be inserted.
The default is
IndexType
This directive specifies the index type to use in the bulk index command. It must be a string type expression. If the expression in the IndexType directive is not a constant string (it contains functions, field names, or operators), it will be evaluated for each event to be inserted. By default, no index type is sent for compatibility with Elasticsearch 8.x. Index mapping types have been gradually deprecated starting with Elasticsearch 6.0.0, and support for them was completely removed in Elasticsearch 8.0.0. IndexType should only be used if required for Elasticsearch 7.x or older, or for custom types in Elasticsearch 8.x. See Removal of mapping types in the Elasticsearch Reference for more info.
This optional directive specifies the local port number of the connection. If this is not specified, a random high port number will be used, which is not always ideal in firewalled network environments.
Proxy
This optional directive is used to specify the IP address (or hostname) and port number of the HTTP proxy server to be used.
The format is
This directive has been deprecated. Please use the Proxy directive instead.
This directive has been deprecated. Please use the Proxy directive instead.
This optional directive sets the reconnect interval in seconds. If it is set, the module attempts to reconnect at the specified interval. If it is not set, the reconnect interval starts at 1 second and doubles with every attempt. If the duration of the successful connection is greater than the current reconnect interval, the reconnect interval is reset to 1 second.
RetryLimit
This specifies how many times the module will attempt to resend events that were rejected by the server. A negative value disables retry limit checking. If not specified, it defaults to 2.
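As a minimal sketch, the retry limit could be raised as follows. The value 5 is an arbitrary choice for illustration, and the URL is taken from the examples in this document.

```
<Output elasticsearch>
    Module      om_elasticsearch
    URL         http://localhost:9200/_bulk
    # Illustrative value: allow up to 5 resend attempts instead of the default 2
    RetryLimit  5
</Output>
```

Once the limit is reached, the batch is dropped or the module stops, as described at the top of this page.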
This optional directive specifies the hostname used for Server Name Indication (SNI) in HTTPS mode. If not specified, it defaults to the hostname in the URL directive.
OnError
This optional block directive can be used to specify a group of statements to handle errors reported by the Elasticsearch server for each document/record. All response status codes that are not between 200 and 299 are treated as errors. OnError can be used to perform custom error handling. For example, records that are rejected by the Elasticsearch server can be dropped or rerouted.
Functions
The following functions are exported by om_elasticsearch.
- integer get_response_code()
  Returns the response code for the current record. This function can only be used inside OnError Exec blocks.
- integer get_retry_count()
  Returns the retry count for the current record. The retry count starts at 1 when processing the first JSON response received for an Elasticsearch request, and is incremented by 1 for every subsequent response for the same record. This function can only be used inside OnError Exec blocks.
Procedures
The following procedures are exported by om_elasticsearch.
- reconnect();
  Force a reconnection. This can be used from a Schedule block to periodically reconnect to the server.
The reconnect() procedure must be used with caution. If configured, it can attempt to reconnect after every event sent, potentially overloading the destination system.
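A periodic reconnection from a Schedule block might look like the following minimal sketch. The one-hour interval is an assumption chosen for illustration; pick an interval long enough not to load the destination system.

```
<Output elasticsearch>
    Module  om_elasticsearch
    URL     http://localhost:9200/_bulk

    <Schedule>
        # Assumed interval for illustration only
        Every   1 hour
        Exec    reconnect();
    </Schedule>
</Output>
```

Calling reconnect() on a schedule, and not from an Exec statement that runs for every event, avoids the reconnect-per-event behavior mentioned in the caution.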
Examples
This configuration reads log records from a file and forwards them to the Elasticsearch server on localhost. No further processing is done on the log records.
<Extension json>
    Module              xm_json
</Extension>

<Input file>
    Module              im_file
    File                '/var/log/myapp*.log'
    BatchSize           200
    BatchFlushInterval  2
    # Parse log here if needed
    # $EventTime should be set here
</Input>

<Output elasticsearch>
    Module              om_elasticsearch
    URL                 http://localhost:9200/_bulk
    # Create an index daily
    Index               strftime($EventTime, "nxlog-%Y%m%d")
    # Or use the following if $EventTime is not set
    # Index             strftime(now(), "nxlog-%Y%m%d")
</Output>
The following is a log record sample read by NXLog Agent.
Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded.
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
  "raw_event": "Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded."
}
This configuration reads log records from a file and adds a $Hostname metadata field.
Log records are converted to JSON using the to_json() procedure of the xm_json module before they are forwarded to Elasticsearch.
<Extension json>
    Module  xm_json
</Extension>

<Input file>
    Module  im_file
    File    '/var/log/myapp*.log'
    Exec    $Hostname = hostname();
    Exec    $Message = $raw_event;
</Input>

<Output elasticsearch>
    Module  om_elasticsearch
    URL     http://localhost:9200/_bulk
    Exec    to_json();
</Output>
The following is a log record sample read by NXLog Agent.
Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded.
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
  "EventReceivedTime": "2021-03-24T16:52:20.457348+01:00",
  "SourceModuleName": "file",
  "SourceModuleType": "im_file",
  "Hostname": "pc1",
  "Message": "Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded."
}
This configuration reads syslog records from a file. It uses the parse_syslog() procedure of the xm_syslog module to parse logs into structured data. Log records are then converted to JSON using the to_json() procedure of the xm_json module before they are forwarded to Elasticsearch.
<Extension syslog>
    Module  xm_syslog
</Extension>

<Extension json>
    Module  xm_json
</Extension>

<Input file>
    Module  im_file
    File    '/var/log/myapp*.log'
    Exec    parse_syslog();
</Input>

<Output elasticsearch>
    Module  om_elasticsearch
    URL     http://localhost:9200/_bulk
    Exec    to_json();
</Output>
The following is a log record sample read by NXLog Agent.
Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded.
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
  "EventReceivedTime": "2021-03-24T16:30:18.920342+01:00",
  "SourceModuleName": "file",
  "SourceModuleType": "im_file",
  "SyslogFacilityValue": 1,
  "SyslogFacility": "USER",
  "SyslogSeverityValue": 5,
  "SyslogSeverity": "NOTICE",
  "SeverityValue": 2,
  "Severity": "INFO",
  "Hostname": "pc1",
  "EventTime": "2021-03-24T15:58:53.000000+01:00",
  "SourceName": "systemd",
  "ProcessID": 1452,
  "Message": "tracker-store.service: Succeeded."
}
This configuration reads JSON-formatted log records from a file.
It uses the parse_json() procedure of the xm_json module to parse logs into structured data and adds an $EventType metadata field.
Log records are then converted back to JSON using the to_json() procedure before they are forwarded to Elasticsearch.
<Extension json>
    Module  xm_json
</Extension>

<Input file>
    Module  im_file
    File    '/var/log/myapp*.log'
    Exec    parse_json();
    Exec    $EventType = "browser-history";
</Input>

<Output elasticsearch>
    Module  om_elasticsearch
    URL     http://localhost:9200/_bulk
    Exec    to_json();
</Output>
The following is a log record sample read by NXLog Agent.
{
  "AccessTime": "2021-03-24T16:30:43.000000+01:00",
  "URL": "https://nxlog.co",
  "Title": "High Performance Log Collection Solutions",
  "Username": "user1"
}
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
  "EventReceivedTime": "2021-03-24T17:14:23.908155+01:00",
  "SourceModuleName": "file",
  "SourceModuleType": "im_file",
  "AccessTime": "2021-03-24T16:30:43.000000+01:00",
  "URL": "https://nxlog.co",
  "Title": "High Performance Log Collection Solutions",
  "Username": "user1",
  "EventType": "browser-history"
}
This configuration sends log records to an Elasticsearch server in a failover configuration (multiple URLs defined).
The actual destinations used in this case are http://localhost:9200/_bulk, http://192.168.1.1:9200/_bulk, and http://example.com:9200/_bulk.
<Extension json>
    Module  xm_json
</Extension>

<Output elasticsearch>
    Module  om_elasticsearch
    URL     http://localhost:9200/_bulk
    URL     http://192.168.1.1:9200/_bulk
    URL     http://example.com:9200/_bulk
</Output>
This configuration collects all records that the Elasticsearch server failed to process and reroutes them to a file.
<Extension json>
    Module  xm_json
</Extension>

<Output elastic>
    Module              om_elasticsearch
    URL                 http://localhost:9200/_bulk
    Index               strftime(now(), "test")
    NoDefaultIndexType  TRUE
    DropOnError         TRUE
    <Exec>
        json->to_json();
    </Exec>
    <OnRecordError>
        <Exec>
            $resp_code = get_response_code();
            if $resp_code == 400
            {
                reroute("reroute_es_errors");
            }
        </Exec>
    </OnRecordError>
</Output>

<Input null>
    Module  im_null
</Input>

<Output failed_es_logs>
    Module  om_file
    File    "/var/log/failed_es.logs"
</Output>

<Route reroute_es_errors>
    Path    null => failed_es_logs
</Route>
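The TLS/SSL directives described in the Configuration section can be combined as in the following minimal sketch, which sends logs over HTTPS and presents a client certificate. The hostname and file paths are hypothetical placeholders, not values from this documentation. The port is given explicitly because an HTTPS URL without one defaults to 443.

```
<Extension json>
    Module  xm_json
</Extension>

<Output elasticsearch>
    Module            om_elasticsearch
    # Hypothetical host; adjust to your Elasticsearch server
    URL               https://elastic.example.com:9200/_bulk
    # Hypothetical paths: CA certificate used to verify the server
    HTTPSCAFile       /opt/nxlog/cert/ca.pem
    # Hypothetical paths: client certificate and its private key
    HTTPSCertFile     /opt/nxlog/cert/client-cert.pem
    HTTPSCertKeyFile  /opt/nxlog/cert/client-key.pem
</Output>
```

If the server does not require client certificates, HTTPSCertFile and HTTPSCertKeyFile can likely be omitted, leaving only the CA certificate used to verify the server.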