Elasticsearch (om_elasticsearch)
This module forwards logs to an Elasticsearch server. It will connect to the URL specified in the configuration in either plain HTTP or HTTPS mode. This module supports bulk data operations and dynamic indexing. Event data is sent in batches, reducing the latency caused by the HTTP responses, thus improving Elasticsearch server performance.
To examine the supported platforms, see the list of installer packages in the Available Modules chapter.
This module requires the xm_json extension module to be loaded in order to convert the payload to JSON. See the Output log format section for information on the format of the payload.
Output log format
om_elasticsearch forwards log records over HTTP(S) as a JSON payload.
The JSON format depends on the value of the $raw_event field.
The module checks whether the value of $raw_event is valid JSON and applies the following rules:
- If it is valid JSON, the value is forwarded as is.
- If it is not valid JSON, the log record is converted to JSON in the following format:
{ "raw_event": "<json_escaped_raw_event>" }
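The two rules above can be sketched in Python. This is only an illustration of the decision, not the module's implementation: the function name build_payload is hypothetical, and treating any text that json.loads accepts as "valid JSON" is an assumption about how strict the module's check is.

```python
import json


def build_payload(raw_event: str) -> str:
    """Return the JSON text that would be forwarded for one log record."""
    try:
        # Assumption: "valid JSON" means the text parses without error.
        json.loads(raw_event)
    except ValueError:
        # Not valid JSON: wrap the escaped text in a raw_event object.
        return json.dumps({"raw_event": raw_event})
    # Valid JSON: forward the value unchanged.
    return raw_event


print(build_payload('{"Message": "ok"}'))
print(build_payload('Mar 24 15:58:53 pc1 systemd[1452]: Succeeded.'))
```

The first call prints its input unchanged, while the second prints an object with a single raw_event key.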
Additional metadata, including the NXLog-specific fields EventReceivedTime, SourceModuleName, and SourceModuleType, will not be included in the output unless these values have been written to the $raw_event field.
The processing required to achieve this depends on the format of the input data, but in general you need to:
- Parse the log record according to the data format.
  - If the input data is already in JSON format, use parse_json() to parse $raw_event into fields.
  - If the input is unstructured plain text data, copy the value of $raw_event to a custom field.
- Create and populate any additional custom fields.
- Use to_json() to convert the fields to JSON format and update the value of $raw_event.
See the Examples section for NXLog configuration examples of the above.
Date format in Elasticsearch
Date strings in the JSON output need to be in a format that is recognized by Elasticsearch in order for them to be saved as date fields. See the Elasticsearch documentation on the Date field type and format. NXLog provides several ways to format datetime fields in the output, and which one to use depends on how log records are being processed.
- By default, the xm_json module outputs datetime fields in a format that is compatible with Elasticsearch: YYYY-MM-DDThh:mm:ss.sTZ. If you are using the to_json() procedure and need to output dates in a different format, specify the DateFormat module level directive.
- If you are not using the xm_json module to convert data to JSON, you can make use of the DateFormat global directive or convert datetime fields individually with the strftime function.
For more information on how to handle date fields, see Timestamps in the NXLog User Guide.
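As a rough illustration of the default format described above, the following Python sketch prints a timestamp in the same YYYY-MM-DDThh:mm:ss.sTZ shape. It uses only the Python standard library and does not involve NXLog; the sample value is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Sample instant with microseconds and a +01:00 UTC offset.
event_time = datetime(2021, 3, 24, 16, 52, 20, 457348,
                      tzinfo=timezone(timedelta(hours=1)))

# isoformat() yields an ISO 8601 string in the shape described above.
iso_timestamp = event_time.isoformat()
print(iso_timestamp)  # 2021-03-24T16:52:20.457348+01:00
```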
Using Elasticsearch with NXLog Enterprise Edition 3.x
Some setup is required when using Elasticsearch with NXLog Enterprise Edition 3.x. Consider the following points. None of this is required with NXLog Enterprise Edition 4.1 and later.
Date format
By default, Elasticsearch will not automatically detect the date format used by NXLog Enterprise Edition 3.x.
As a result, NXLog datetime values, such as $EventTime, will be mapped as strings rather than dates.
To fix this on Elasticsearch version 7.0 and newer, add a mapping for the date field in your mapping definition. Specify the date formats generated by NXLog Enterprise Edition 3.x as follows:
curl -X PUT "localhost:9200/nxlog?pretty" -H 'Content-Type: application/json' -d'
{
"mappings": {
"properties": {
"date": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||YYYY-MM-dd HH:mm:ss.SSSSSSZ"
}
}
}
}'
To fix this on older versions of Elasticsearch, add an Elasticsearch template for indices matching the specified pattern (nxlog*).
Extend the dynamic_date_formats setting to include additional date formats.
For compatibility with indices created with Elasticsearch 5.x or older, use _default_ instead of _doc (but note that _default_ is not supported as of Elasticsearch 7.0.0).
$ curl -X PUT localhost:9200/_template/nxlog?pretty \
-H 'Content-Type: application/json' -d '
{
"index_patterns" : ["nxlog*"],
"mappings" : {
"_doc": {
"dynamic_date_formats": [
"strict_date_optional_time",
"YYYY-MM-dd HH:mm:ss.SSSSSSZ",
"YYYY-MM-dd HH:mm:ss"
]
}
}
}'
Index type
The IndexType directive should be set to _doc (the default in NXLog Enterprise Edition 3.x is logs).
However, for compatibility with indices created with Elasticsearch 5.x or older, set IndexType as required for the configured mapping types.
See the IndexType directive below for more information.
Configuration
The om_elasticsearch module accepts the following directives in addition to the common module directives. The URL directive is required.
- URL
-
This mandatory directive specifies the URL for the module to POST the event data. If multiple URL directives are specified, the module works in a failover configuration. If a destination becomes unavailable, the module automatically fails over to the next one. If the last destination becomes unavailable, the module will fail over to the first destination. The module operates in plain HTTP or HTTPS mode depending on the URL provided. If the port number is not explicitly indicated in the URL, it defaults to port 80 for HTTP and port 443 for HTTPS. The URL should point to the _bulk endpoint, or Elasticsearch will return 400 Bad Request.
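The failover order described above can be modeled as a simple rotation over the configured URLs. The sketch below is a hypothetical Python model (the names next_destination and urls are illustrative), not NXLog code.

```python
def next_destination(urls: list[str], current: int) -> int:
    """Return the index of the URL to try after urls[current] fails."""
    # Wrap around: after the last destination, go back to the first one.
    return (current + 1) % len(urls)


urls = [
    "http://localhost:9200/_bulk",
    "http://192.168.1.1:9200/_bulk",
    "http://example.com:9200/_bulk",
]

current = 0
for _ in range(3):
    current = next_destination(urls, current)
    print(urls[current])
```

Starting from the first URL, three consecutive failures visit the second, the third, and then the first destination again.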
- AddHeader
-
This optional directive specifies an additional header to be added to each HTTP request.
- FlushInterval
-
This directive has been deprecated. See Batch processing for details.
- FlushLimit
-
This directive has been deprecated. See Batch processing for details.
- Index
-
This directive specifies the index to insert the event data into. It must be a string type expression. If the expression in the Index directive is not a constant string (it contains functions, field names, or operators), it will be evaluated for each event to be inserted. The default is nxlog. Typically, an expression with strftime() is used to generate an index name based on the event’s time or the current time (for example, strftime(now(), "nxlog-%Y%m%d")).
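To show what such an expression evaluates to, here is the same format string applied with Python's strftime. This is only an analogy: NXLog evaluates its own strftime() function, and the sample date is arbitrary.

```python
from datetime import datetime

event_time = datetime(2021, 3, 24, 15, 58, 53)

# Same format string as in the Index example: one index per day.
index_name = event_time.strftime("nxlog-%Y%m%d")
print(index_name)  # nxlog-20210324
```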
- IndexType
-
This directive specifies the index type to use in the bulk index command. It must be a string type expression. If the expression in the IndexType directive is not a constant string (it contains functions, field names, or operators), it will be evaluated for each event to be inserted. The default is _doc. This default will be removed in the next major version of NXLog. Note that index mapping types were deprecated in Elasticsearch 7.0 and removed in 8.0 (see Removal of mapping types in the Elasticsearch Reference). IndexType should only be used if required for indices created with Elasticsearch 5.x or older. See NoDefaultIndexType for Elasticsearch 8.x and newer.
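For orientation, the Elasticsearch Bulk API expects newline-delimited JSON in which each document is preceded by an action line naming the target index. The Python sketch below shows where the Index and IndexType values would appear in such a request body. The exact bytes om_elasticsearch sends are not documented here, so treat the helper name bulk_body and the layout as an assumption based on the Bulk API format.

```python
import json
from typing import Optional


def bulk_body(events: list[dict], index: str,
              index_type: Optional[str] = None) -> str:
    """Build a Bulk API request body (NDJSON) for a batch of events."""
    lines = []
    for event in events:
        action = {"_index": index}
        if index_type is not None:
            # Mapping types only matter for older Elasticsearch versions.
            action["_type"] = index_type
        lines.append(json.dumps({"index": action}))
        lines.append(json.dumps(event))
    # The Bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


print(bulk_body([{"Message": "first"}, {"Message": "second"}], "nxlog"))
```

Leaving index_type unset omits _type from the action line, which is roughly the effect described for NoDefaultIndexType.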
- ID
-
This directive allows you to specify a custom _id field for Elasticsearch documents. If the directive is not defined, Elasticsearch uses a GUID for the _id field. Setting custom _id fields can be useful for correlating Elasticsearch documents in the future and can help to prevent storing duplicate events in the Elasticsearch storage. The directive’s argument must be a string type expression. If the expression in the ID directive is not a constant string (it contains functions, field names, or operators), it will be evaluated for each event to be submitted. You can use a concatenation of event fields and the event timestamp to uniquely and informatively identify events in the Elasticsearch storage.
- HTTPBasicAuthUser
-
HTTP basic authorization username.
- HTTPBasicAuthPassword
-
HTTP basic authorization password.
HTTP authorization works only when both parameters are set.
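For reference, HTTP basic authentication (RFC 7617) combines the two values into a single Authorization header. The sketch below shows that encoding in Python; it illustrates the standard scheme rather than NXLog internals, and the credentials are placeholders.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Return the Authorization header value for HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


print(basic_auth_header("Aladdin", "open sesame"))
# Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Because the value is only Base64-encoded, not encrypted, basic authentication is best combined with an HTTPS URL.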
- HTTPSAllowExpired
-
This boolean directive specifies whether the connection should be allowed with an expired certificate. If set to TRUE, the connection will be allowed even if the remote server presents an expired certificate. The default is FALSE: the remote server must present a certificate that is not expired.
- HTTPSAllowUntrusted
-
This boolean directive specifies that the connection should be allowed regardless of the certificate verification results. If set to TRUE, the connection will be allowed with any unexpired certificate provided by a server. The default value is FALSE: the remote server must present a trusted certificate.
- HTTPSCADir
-
This directive specifies a path to a directory containing certificate authority (CA) certificates. These certificates will be used to verify the certificate presented by the remote server. The certificate files must be named using the OpenSSL hashed format, i.e. the hash of the certificate followed by .0, .1 etc. To find the hash of a certificate using OpenSSL:
$ openssl x509 -hash -noout -in ca.crt
For example, if the certificate hash is e2f14e4a, then the certificate filename should be e2f14e4a.0. If there is another certificate with the same hash, then it should be named e2f14e4a.1, and so on.
A remote server’s self-signed certificate (which is not signed by a CA) can also be trusted by including a copy of the certificate in this directory.
Unix-like operating systems use the /etc/ssl/certs path as their default for certificates. Windows uses the Windows Certificate Store as a default path for certificates.
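The naming rule above (hash, a dot, then the first unused sequence number) can be sketched as follows. This Python helper is hypothetical and only computes a filename from a hash you have already obtained with the openssl command; it does not read certificates.

```python
def hashed_filename(cert_hash: str, existing: set[str]) -> str:
    """Return the first free '<hash>.<n>' name not present in existing."""
    n = 0
    while f"{cert_hash}.{n}" in existing:
        n += 1
    return f"{cert_hash}.{n}"


print(hashed_filename("e2f14e4a", set()))           # e2f14e4a.0
print(hashed_filename("e2f14e4a", {"e2f14e4a.0"}))  # e2f14e4a.1
```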
- HTTPSCAFile
-
This specifies the path of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote server. A remote server’s self-signed certificate (which is not signed by a CA) can be trusted by specifying the remote server certificate itself. In case of certificates signed by an intermediate CA, the certificate specified must contain the complete certificate chain (certificate bundle).
- HTTPSCAThumbprint
-
This optional directive specifies the thumbprint of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote server. The hexadecimal fingerprint string can be copied from Windows Certificate Manager (certmgr.msc). Whitespaces are automatically removed. The certificate must be added to a Windows certificate store that is accessible by NXLog. This directive is only supported on Windows and is mutually exclusive with the HTTPSCADir and HTTPSCAFile directives.
- HTTPSSearchAllCertStores
-
This optional boolean directive, when set to
TRUE
, enables the loading of all available Windows certificates into NXLog, for use during remote certificate verification. Any required certificates must be added to a Windows certificate store that NXLog can access. This directive is mutually exclusive with the HTTPSCAThumbprint, HTTPSCADir and HTTPSCAFile directives.
- HTTPSCertFile
-
This specifies the path of the certificate file that will be presented to the remote server during the HTTPS handshake.
- HTTPSCertKeyFile
-
This specifies the path of the private key file that was used to generate the certificate specified by the HTTPSCertFile directive. This is used for the HTTPS handshake.
- HTTPSCertThumbprint
-
This optional directive specifies the thumbprint of the certificate that will be presented to the remote server during the HTTPS handshake. The hexadecimal fingerprint string can be copied from Windows Certificate Manager (certmgr.msc). Whitespaces are automatically removed. The certificate must be imported to the Local Computer\Personal certificate store in PFX format for NXLog to find it. To create a PFX file from the certificate and private key using OpenSSL:
$ openssl pkcs12 -export -out server.pfx -inkey server.key -in server.pem
This directive is only supported on Windows and is mutually exclusive with the HTTPSCertFile and HTTPSCertKeyFile directives.
- HTTPSCRLDir
-
This directive specifies a path to a directory containing certificate revocation list (CRL) files. These CRL files will be used to check for certificates that were revoked and should no longer be accepted. The files must be named using the OpenSSL hashed format, i.e. the hash of the issuer followed by .r0, .r1 etc. To find the hash of the issuer of a CRL file using OpenSSL:
$ openssl crl -hash -noout -in crl.pem
For example, if the hash is e2f14e4a, then the filename should be e2f14e4a.r0. If there is another file with the same hash, then it should be named e2f14e4a.r1, and so on.
- HTTPSCRLFile
-
This specifies the path of the certificate revocation list (CRL) which will be used to check for certificates that have been revoked and should no longer be accepted. Example to generate a CRL file using OpenSSL:
$ openssl ca -gencrl -out crl.pem
- HTTPSDHFile
-
This optional directive specifies a file with dh-parameters for Diffie-Hellman key exchange. These parameters can be generated with dhparam(1ssl). If this directive is not specified, default parameters will be used. See the OpenSSL Wiki for further details.
- HTTPSKeyPass
-
This directive specifies the passphrase of the private key specified by the HTTPSCertKeyFile directive. A passphrase is required when the private key is encrypted. Example to generate a private key with Triple DES encryption using OpenSSL:
$ openssl genrsa -des3 -out server.key 2048
This directive is not needed for passwordless private keys.
- HTTPSSSLCipher
-
This optional directive can be used to set the permitted SSL cipher list, overriding the default. Use the format described in the ciphers(1ssl) man page. For example, specify RSA:!COMPLEMENTOFALL to include all ciphers with RSA authentication but leave out ciphers without encryption.
If RSA or DSA ciphers with Diffie-Hellman key exchange are used, HTTPSDHFile can be set for specifying custom dh-parameters.
- HTTPSSSLCiphersuites
-
This optional directive can be used to set the permitted cipher list for TLSv1.3. Use the same format as in the HTTPSSSLCipher directive. Refer to the OpenSSL documentation for a list of valid TLS v1.3 cipher suites. The default value is:
TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256
- HTTPSSSLCompression
-
This boolean directive allows you to enable data compression when sending data over the network. The compression mechanism is based on the zlib compression library. If the directive is not specified, it defaults to FALSE: compression is disabled.
Some Linux packages (for example, Debian) use the OpenSSL library provided by the OS and may not support the zlib compression mechanism. The module will emit a warning on startup if the compression support is missing. The generic deb/rpm packages are bundled with a zlib-enabled libssl library.
- HTTPSSSLProtocol
-
This directive can be used to set the allowed SSL/TLS protocol(s). It takes a comma-separated list of values which can be any of the following: SSLv2, SSLv3, TLSv1, TLSv1.1, TLSv1.2, and TLSv1.3. By default, the TLSv1.2 and TLSv1.3 protocols are allowed. Note that the OpenSSL library shipped by Linux distributions may not support SSLv2 and SSLv3, and these will not work even if enabled with this directive.
- LocalPort
-
This optional directive specifies the local port number of the connection. If this is not specified, a random high port number will be used, which is not always ideal in firewalled network environments.
Due to the required TIME-WAIT delay in closing connections, attempts to bind to LocalPort may fail. In such cases, the message Address already in use will be written to nxlog.log. If the situation persists, it could impede network performance.
- Proxy
-
This optional directive is used to specify the IP address (or hostname) and port number of the HTTP proxy server to be used. The format is hostname:port. If the port number is omitted, it defaults to 80.
The om_elasticsearch module supports HTTP proxying only. SOCKS4/SOCKS5 proxying is not supported.
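The hostname:port format and its default can be illustrated with a small Python helper. The function parse_proxy is hypothetical and makes one simplifying assumption: the host part contains no colon, so IPv6 literals are not handled.

```python
def parse_proxy(value: str, default_port: int = 80) -> tuple[str, int]:
    """Split 'hostname:port' and apply the default port when omitted."""
    host, _, port = value.partition(":")
    return host, (int(port) if port else default_port)


print(parse_proxy("proxy.example.com"))  # ('proxy.example.com', 80)
print(parse_proxy("10.0.0.1:3128"))      # ('10.0.0.1', 3128)
```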
- ProxyAddress
-
This directive has been deprecated. Please use the Proxy directive instead.
- ProxyPort
-
This directive has been deprecated. Please use the Proxy directive instead.
- Reconnect
-
This optional directive sets the reconnect interval in seconds. If it is set, the module attempts to reconnect at the specified interval. If it is not set, the reconnect interval starts at 1 second and doubles with every attempt. If the duration of the successful connection is greater than the current reconnect interval, then the reconnect interval is reset to 1 second.
The Reconnect directive must be used with caution. If it is used on multiple systems, it can send reconnect requests simultaneously to the same destination, potentially overloading the destination system. It may also cause NXLog to use unusually high system resources or cause NXLog to become unresponsive.
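The default behavior when Reconnect is not set (start at 1 second, double on every attempt, reset after a sufficiently long connection) can be modeled as below. This Python sketch is an interpretation of the description, not NXLog code: the function name next_interval is hypothetical, and the absence of an upper bound on the interval is an assumption.

```python
from typing import Optional


def next_interval(current: float,
                  connected_for: Optional[float] = None) -> float:
    """Return the next reconnect interval in seconds.

    current: the interval used for the previous attempt.
    connected_for: how long the last successful connection lasted,
    or None if the previous attempt failed.
    """
    if connected_for is not None and connected_for > current:
        # A connection outlived the current interval: start over at 1 second.
        return 1.0
    # Otherwise double the interval for the next attempt.
    return current * 2


interval = 1.0
for attempt in range(1, 5):
    print(f"attempt {attempt}: wait {interval:g}s")
    interval = next_interval(interval)
```

With four consecutive failures the waits are 1, 2, 4, and 8 seconds. Doubling spreads out retries, which is why a fixed Reconnect value across many hosts is more likely to produce simultaneous reconnects.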
- NoDefaultIndexType
-
This boolean directive can be set to TRUE to disable sending the default IndexType to Elasticsearch 7.x/8.x. The default value is FALSE. Note that this directive will be removed in the next major version of NXLog.
- SNI
-
This optional directive specifies the hostname used for Server Name Indication (SNI) in HTTPS mode. If not specified, it defaults to the hostname in the URL directive.
Procedures
The following procedures are exported by om_elasticsearch.
reconnect();
-
Force a reconnection. This can be used from a Schedule block to periodically reconnect to the server.
The reconnect() procedure must be used with caution. If configured, it can attempt to reconnect after every event sent, potentially overloading the destination system.
Examples
This configuration reads log records from file and forwards them to the Elasticsearch server on localhost. No further processing is done on the log records.
<Extension json>
Module xm_json
</Extension>
<Input file>
Module im_file
File '/var/log/myapp*.log'
BatchSize 200
BatchFlushInterval 2
# Parse log here if needed
# $EventTime should be set here
</Input>
<Output elasticsearch>
Module om_elasticsearch
URL http://localhost:9200/_bulk
# Create an index daily
Index strftime($EventTime, "nxlog-%Y%m%d")
# Or use the following if $EventTime is not set
# Index strftime(now(), "nxlog-%Y%m%d")
</Output>
The following is a log record sample read by NXLog.
Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded.
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
"raw_event": "Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded."
}
This configuration reads log records from file and adds a $Hostname metadata field.
Log records are converted to JSON using the to_json() procedure of the xm_json module before they are forwarded to Elasticsearch.
<Extension json>
Module xm_json
</Extension>
<Input file>
Module im_file
File '/var/log/myapp*.log'
Exec $Hostname = hostname();
Exec $Message = $raw_event;
</Input>
<Output elasticsearch>
Module om_elasticsearch
URL http://localhost:9200/_bulk
Exec to_json();
</Output>
The following is a log record sample read by NXLog.
Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded.
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
"EventReceivedTime": "2021-03-24T16:52:20.457348+01:00",
"SourceModuleName": "file",
"SourceModuleType": "im_file",
"Hostname": "pc1",
"Message": "Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded."
}
This configuration reads syslog records from file. It uses the parse_syslog() procedure of the xm_syslog module to parse logs into structured data. Log records are then converted to JSON using the to_json() procedure of the xm_json module before they are forwarded to Elasticsearch.
<Extension syslog>
Module xm_syslog
</Extension>
<Extension json>
Module xm_json
</Extension>
<Input file>
Module im_file
File '/var/log/myapp*.log'
Exec parse_syslog();
</Input>
<Output elasticsearch>
Module om_elasticsearch
URL http://localhost:9200/_bulk
Exec to_json();
</Output>
The following is a log record sample read by NXLog.
Mar 24 15:58:53 pc1 systemd[1452]: tracker-store.service: Succeeded.
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
"EventReceivedTime": "2021-03-24T16:30:18.920342+01:00",
"SourceModuleName": "file",
"SourceModuleType": "im_file",
"SyslogFacilityValue": 1,
"SyslogFacility": "USER",
"SyslogSeverityValue": 5,
"SyslogSeverity": "NOTICE",
"SeverityValue": 2,
"Severity": "INFO",
"Hostname": "pc1",
"EventTime": "2021-03-24T15:58:53.000000+01:00",
"SourceName": "systemd",
"ProcessID": 1452,
"Message": "tracker-store.service: Succeeded."
}
This configuration reads JSON-formatted log records from file.
It uses the parse_json() procedure of the xm_json module to parse logs into structured data and adds an $EventType metadata field.
Log records are then converted back to JSON using the to_json() procedure before they are forwarded to Elasticsearch.
<Extension json>
Module xm_json
</Extension>
<Input file>
Module im_file
File '/var/log/myapp*.log'
Exec parse_json();
Exec $EventType = "browser-history";
</Input>
<Output elasticsearch>
Module om_elasticsearch
URL http://localhost:9200/_bulk
Exec to_json();
</Output>
The following is a log record sample read by NXLog.
{
"AccessTime": "2021-03-24T16:30:43.000000+01:00",
"URL": "https://nxlog.co",
"Title": "High Performance Log Collection Solutions",
"Username": "user1"
}
The following is the JSON-formatted log record that will be sent to Elasticsearch.
{
"EventReceivedTime": "2021-03-24T17:14:23.908155+01:00",
"SourceModuleName": "file",
"SourceModuleType": "im_file",
"AccessTime": "2021-03-24T16:30:43.000000+01:00",
"URL": "https://nxlog.co",
"Title": "High Performance Log Collection Solutions",
"Username": "user1",
"EventType": "browser-history"
}
This configuration sends log records to an Elasticsearch server in a failover configuration (multiple URLs defined).
The actual destinations used in this case are http://localhost:9200/_bulk, http://192.168.1.1:9200/_bulk, and http://example.com:9200/_bulk.
<Extension json>
Module xm_json
</Extension>
<Output elasticsearch>
Module om_elasticsearch
URL http://localhost:9200/_bulk
URL http://192.168.1.1:9200/_bulk
URL http://example.com:9200/_bulk
</Output>