NXLog Legacy Documentation

WebHDFS (om_webhdfs)

This module allows logs to be stored in Hadoop HDFS using the WebHDFS protocol.

Configuration

The om_webhdfs module accepts the following directives in addition to the common module directives. The File and URL directives are required.

Required directives

The following directives are required for the module to start.

File

This mandatory directive specifies the name of the destination file. It must be a string type expression. If the expression in the File directive is not a constant string (it contains functions, field names, or operators), it will be evaluated before each request is dispatched to the WebHDFS REST endpoint (and after the Exec is evaluated). Note that the filename must be quoted to be a valid string literal, unlike in other directives which take a filename argument.
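As an illustration of a non-constant File expression, the following sketch builds the destination filename from an event field. It assumes that every event carries a $Hostname field (many input modules set it, but this is not guaranteed) and uses the placeholder server name from the example at the end of this page.

```conf
<Output hdfs>
    Module  om_webhdfs
    URL     http://hdfsserver.domain.com/
    # Not a constant string, so it is evaluated before each request.
    # $Hostname is assumed to be present in every event.
    File    $Hostname + ".log"
</Output>
```

Because the expression is evaluated per request, events from different hosts would end up in different files under this configuration.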

URL

This mandatory directive specifies the URL of the WebHDFS REST endpoint where the module should POST the event data. The module operates in plain HTTP or HTTPS mode depending on the URL provided, and connects to the hostname specified in the URL. If the port number is not explicitly indicated in the URL, it defaults to port 80 for HTTP and port 443 for HTTPS.
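Hadoop NameNodes typically do not listen on port 80 or 443, so the port usually needs to be stated in the URL. A minimal sketch, assuming a Hadoop 3 NameNode on its default HTTP port 9870 (the port is an assumption about the cluster, not an om_webhdfs default, and the hostname is a placeholder):

```conf
<Output hdfs>
    Module  om_webhdfs
    # Explicit port; without it the module would use port 80 for HTTP.
    URL     http://hdfsserver.domain.com:9870/
    File    "myfile"
</Output>
```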

HTTPS directives

The following directives configure secure data transfer via HTTPS.

HTTPSAllowExpired

Specifies if the connection should be allowed with an expired certificate. If set to TRUE, the connection will be allowed even if the remote host presents an expired certificate. The default is FALSE: the certificate must not be expired. This directive is only valid if HTTPSRequireCert is set to TRUE.

HTTPSAllowHostnameValidation

Specifies if the certificate FQDN should be validated against the server hostname or not. If set to TRUE, the connection will only be allowed if the certificate FQDN corresponds to the server hostname. The default value is FALSE: the remote server hostname is not validated.

HTTPSAllowUntrusted

Specifies if the connection should be allowed without certificate verification. If set to TRUE, the connection will be allowed even if the remote host presents an unknown or self-signed certificate. The default value is FALSE: the remote host must present a trusted certificate.

HTTPSCADir

The path to a directory containing certificate authority (CA) certificates. These certificates will be used to verify the certificate presented by the remote host. The certificate files must be named using the OpenSSL hashed format, i.e. the hash of the certificate followed by .0, .1 etc. To find the hash of a certificate using OpenSSL:

$ openssl x509 -hash -noout -in ca.crt

For example, if the certificate hash is e2f14e4a, then the certificate filename should be e2f14e4a.0. If there is another certificate with the same hash then it should be named e2f14e4a.1 and so on.

A remote host’s self-signed certificate (which is not signed by a CA) can also be trusted by including a copy of the certificate in this directory.

The default operating system root certificate store will be used if this directive is not specified. Unix-like operating systems commonly store root certificates in /etc/ssl/certs. Windows operating systems use the Windows Certificate Store, while macOS uses the Keychain Access Application as the default certificate store. See NXLog TLS/SSL configuration in the User Guide for more information on using this directive.

In addition, Microsoft’s PKI repository contains root certificates for Microsoft services.

HTTPSCAFile

The path of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote host. A remote host’s self-signed certificate (which is not signed by a CA) can be trusted by specifying the remote host certificate itself. In case of certificates signed by an intermediate CA, the certificate specified must contain the complete certificate chain (certificate bundle).

HTTPSCAPattern

This optional directive, supported only on Windows, defines a pattern for locating a suitable CA (Certificate Authority) certificate and its thumbprint in the native Windows Certificate Storage. The pattern must follow PCRE2 rules and use the format "SUBJECT=, CN=, DN=, SAN=" where DN is "CN=, O=, OU=, L=, ST=, C=". During configuration, this directive resolves into the corresponding HTTPSCAThumbprint value. If multiple certificates match, the first thumbprint encountered is selected. We recommend keeping the certificate store well maintained for optimal performance. This feature is not dynamic; the agent must be restarted if the certificate changes. This directive is mutually exclusive with the HTTPSCAThumbprint directive.

Configuration examples:

HTTPSCAPattern    'Test' + ' ' + 'Root'

or

HTTPSCAPattern    $domain

Typical log output looks as follows:

matching pattern [DN=CN=Client\.example\.com;.*?SAN=DNS:Client\.example\.com] to certificate [SUBJECT=US, ClientState, ClientCity, ClientCompany, ClientUnit, Client.example.com, CN=Client.example.com; DN=CN=Client.example.com, O=ClientCompany, OU=ClientUnit, L=ClientCity, ST=ClientState, C=US; SAN=DNS:Client.example.com; DNS:www.Client.example.com; IP:127.0.0.3; ]

HTTPSCAThumbprint

This optional directive, supported only on Windows, specifies the thumbprint of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote host. The hexadecimal fingerprint string can be copied from Windows Certificate Manager (certmgr.msc). Whitespace is automatically removed. The certificate must be added to a Windows certificate store that is accessible by NXLog. This directive is mutually exclusive with the HTTPSCADir and HTTPSCAFile directives.

HTTPSCertFile

The path of the certificate file that will be presented to the remote host during the HTTPS handshake.

HTTPSCertKeyFile

The path of the private key file that was used to generate the certificate specified by the HTTPSCertFile directive. This is used for the HTTPS handshake.

HTTPSCertPattern

This optional directive, supported only on Windows, defines a pattern for identifying a corresponding certificate and its thumbprint within the native Windows Certificate Storage. The pattern must follow PCRE2 rules and use the format "SUBJECT=, CN=, DN=, SAN=" where DN is "CN=, O=, OU=, L=, ST=, C=". The certificate must be imported in PFX format into the Local Computer\Personal certificate store for NXLog to locate it. During configuration, this directive is resolved into the corresponding HTTPSCertThumbprint value. If multiple certificates match the pattern, the first thumbprint found is chosen. We recommend keeping the certificate store well maintained for optimal performance. This feature is not dynamic; the agent must be restarted if the certificate changes. This directive is mutually exclusive with the HTTPSCertThumbprint directive.

Configuration examples:

HTTPSCertPattern    $hostname + 'Cert'

or

HTTPSCertPattern    DN=CN=Client\.example\.com;.*?SAN=DNS:Client\.example\.com

Typical log output looks as follows:

matching pattern [DN=CN=Client\.example\.com;.*?SAN=DNS:Client\.example\.com] to certificate [SUBJECT=US, ClientState, ClientCity, ClientCompany, ClientUnit, Client.example.com, CN=Client.example.com; DN=CN=Client.example.com, O=ClientCompany, OU=ClientUnit, L=ClientCity, ST=ClientState, C=US; SAN=DNS:Client.example.com; DNS:www.Client.example.com; IP:127.0.0.3; ]

HTTPSCertThumbprint

This optional directive, supported only on Windows, specifies the thumbprint of the certificate that will be presented to the remote host during the HTTPS handshake. The hexadecimal fingerprint string can be copied from Windows Certificate Manager (certmgr.msc). Whitespace is automatically removed. The certificate must be imported to the Local Computer\Personal certificate store in PFX format for NXLog to find it. Run the following command to create a PFX file from the certificate and private key using OpenSSL:

$ openssl pkcs12 -export -out server.pfx -inkey server.key -in server.pem

When the global directive UseCNGCertificates is set to FALSE, the private key associated with the certificate must be exportable.

  • If you generate the certificate request using Windows Certificate Manager, enable the Make private key exportable option from the certificate properties.

  • If you import the certificate with the Windows Certificate Import Wizard, make sure that the Mark this key as exportable option is enabled.

  • If you migrate the certificate and associated private key from one Windows machine to another, select Yes, export the private key when exporting from the source machine.

Conversely, when the global directive UseCNGCertificates is set to TRUE, the private key associated with the certificate does not have to be exportable. In some cases, such as keys stored in a TPM, the private key is always non-exportable.

The usage of the directive is the same in all cases:

HTTPSCertThumbprint    7c2cc5a5fb59d4f46082a510e74df17da95e2152

HTTPSCRLDir

The path to a directory containing certificate revocation list (CRL) files. These CRL files will be used to check for certificates that were revoked and should no longer be accepted. The files must be named using the OpenSSL hashed format, i.e. the hash of the issuer followed by .r0, .r1 etc. To find the hash of the issuer of a CRL file using OpenSSL:

$ openssl crl -hash -noout -in crl.pem

For example, if the hash is e2f14e4a, then the filename should be e2f14e4a.r0. If there is another file with the same hash, then it should be named e2f14e4a.r1 and so on.

HTTPSCRLFile

The path of the certificate revocation list (CRL) that will be used to check for certificates that have been revoked and should no longer be accepted. To generate a CRL file using OpenSSL:

$ openssl ca -gencrl -out crl.pem

HTTPSDHFile

This optional directive specifies a file with dh-parameters for Diffie-Hellman key exchange. These parameters can be generated with dhparam(1ssl). If this directive is not specified, default parameters will be used. See OpenSSL Wiki for further details.

HTTPSKeyPass

The passphrase of the private key specified by the HTTPSCertKeyFile directive. A passphrase is required when the private key is encrypted. To generate a private key with Triple DES encryption using OpenSSL:

$ openssl genrsa -des3 -out server.key 2048

This directive is not needed for passwordless private keys.
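The file-based certificate directives described above (HTTPSCAFile, HTTPSCertFile, HTTPSCertKeyFile, and HTTPSKeyPass) could be combined as in the following sketch. All paths and the passphrase are hypothetical placeholders, and the sketch assumes the WebHDFS endpoint is reachable over HTTPS on the default port 443.

```conf
<Output hdfs>
    Module            om_webhdfs
    URL               https://hdfsserver.domain.com/
    File              "myfile"
    # CA certificate used to verify the certificate of the remote host
    HTTPSCAFile       /opt/nxlog/cert/ca.pem
    # Certificate and private key presented during the HTTPS handshake
    HTTPSCertFile     /opt/nxlog/cert/client-cert.pem
    HTTPSCertKeyFile  /opt/nxlog/cert/client-key.pem
    # Only needed because the key in this sketch is assumed to be encrypted
    HTTPSKeyPass      secret
</Output>
```

If the private key is not encrypted, the HTTPSKeyPass line can be left out.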

HTTPSLoadCertificateChains

If set to TRUE, the module attempts to load higher-level certificates from the referenced PEM file, which may contain only one certificate or the whole chain. The default value is FALSE: certificates will instead be loaded from the operating system certificate store.

This directive is only supported on Windows.

HTTPSRequireCert

Specifies if the remote HTTPS host must present a certificate. If set to TRUE and there is no certificate presented during the connection handshake, the connection will be refused. The default value is TRUE: each connection must use a certificate.

HTTPSSearchAllCertStores

This optional directive, if set to TRUE, enables the loading of all available Windows certificates into NXLog, for use during remote certificate verification. Any required certificates must be added to a Windows certificate store that NXLog can access. This directive is mutually exclusive with the HTTPSCAThumbprint, HTTPSCADir and HTTPSCAFile directives.

This directive is only supported on Windows.

HTTPSSSLCipher

This optional directive can be used to set the permitted SSL cipher list, overriding the default. Use the format described in the ciphers(1ssl) man page. For example, specify RSA:!COMPLEMENTOFALL to include all ciphers with RSA authentication but leave out ciphers without encryption.

If RSA or DSA ciphers with Diffie-Hellman key exchange are used, HTTPSDHFile can be set to specify custom dh-parameters.

HTTPSSSLCiphersuites

This optional directive can be used to set the permitted cipher list for TLSv1.3. Use the same format as in the HTTPSSSLCipher directive. Refer to the OpenSSL documentation for a list of valid TLS v1.3 cipher suites. The default value is:

TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256

HTTPSSSLCompression

If set to TRUE, enables data compression when sending data over the network. The compression mechanism is based on the zlib compression library. If the directive is not specified, it defaults to FALSE (compression is disabled).

Some Linux packages (for example, Debian) use the OpenSSL library provided by the OS and may not support the zlib compression mechanism. The module will emit a warning on startup if the compression support is missing. The generic deb/rpm packages are bundled with a zlib-enabled libssl library.

HTTPSSSLProtocol

This directive can be used to set the allowed SSL/TLS protocol(s). It takes a comma-separated list of values which can be any of the following: SSLv2, SSLv3, TLSv1, TLSv1.1, TLSv1.2 and TLSv1.3. By default, the TLSv1.2 and TLSv1.3 protocols are allowed. Note that the OpenSSL library shipped by Linux distributions may not support SSLv2 and SSLv3, and these will not work even if enabled with this directive.
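As a sketch of how the protocol and cipher suite directives might be used together, the following instance allows only TLSv1.3 and narrows the cipher suite list to two of the default suites listed above. The choice of suites is illustrative, not a recommendation, and the server must support at least one of them.

```conf
<Output hdfs>
    Module                om_webhdfs
    URL                   https://hdfsserver.domain.com/
    File                  "myfile"
    # Allow TLSv1.3 only (the default also allows TLSv1.2)
    HTTPSSSLProtocol      TLSv1.3
    # Subset of the default TLSv1.3 cipher suites
    HTTPSSSLCiphersuites  TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
</Output>
```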

Sigalgs

The signature algorithm parameter that is passed to the Windows SSL library. Allowed values depend on the available encryption providers.

This directive is only supported on Windows.

SNI

This optional directive specifies the hostname used for Server Name Indication (SNI) in HTTPS mode. If not specified, it defaults to the hostname in the URL directive.

UseCNGCertificates

If set to TRUE, the module uses the Windows Cryptography API: Next Generation (CNG) to access the private keys associated with certificates identified by a thumbprint.

This directive is only supported on Windows.

Optional directives

FlushInterval

The module will send the data to the endpoint defined in URL after this amount of time in seconds, unless FlushLimit is reached first. This defaults to 5 seconds.

FlushLimit

When the number of events in the output buffer reaches the value specified by this directive, the module will send the data to the endpoint defined in URL. This defaults to 500 events. The FlushInterval may trigger sending the write request before this limit is reached if the log volume is low to ensure that data is sent promptly.
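The interaction between FlushInterval and FlushLimit can be sketched as follows. The values are arbitrary and only chosen to illustrate the behavior: with this configuration a request would be sent once 1000 events are buffered, or after 2 seconds, whichever comes first.

```conf
<Output hdfs>
    Module         om_webhdfs
    URL            http://hdfsserver.domain.com/
    File           "myfile"
    # Send buffered data at least every 2 seconds (default: 5)
    FlushInterval  2
    # Send as soon as 1000 events are buffered (default: 500)
    FlushLimit     1000
</Output>
```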

QueryParam

This directive can be used to specify additional HTTP query parameters, such as blocksize. It may be specified more than once to define multiple parameters:

QueryParam blocksize 42
QueryParam destination /foo

Reconnect

This optional directive sets the reconnect interval in seconds. If it is set, the module attempts to reconnect at this fixed interval. If it is not set, the reconnect interval starts at 1 second and doubles with every attempt. If a successful connection lasts longer than the current reconnect interval, the reconnect interval is reset to 1 second.

The Reconnect directive must be used with caution. If it is used on multiple systems, it can send reconnect requests simultaneously to the same destination, potentially overloading the destination system. It may also cause NXLog to use unusually high system resources or cause NXLog to become unresponsive.

ReconnectOnData

This optional directive defines the behavior when the connection with the remote host is lost. When set to TRUE, the module only attempts to reconnect when it has data to send. The default value is FALSE; it will always keep a connection open with the remote host.
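A minimal sketch of an instance that reconnects only when there is data to send. Reconnect is deliberately left unset here, so the doubling reconnect interval described above applies.

```conf
<Output hdfs>
    Module           om_webhdfs
    URL              http://hdfsserver.domain.com/
    File             "myfile"
    # Do not hold an idle connection open; reconnect when data is pending
    ReconnectOnData  TRUE
</Output>
```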

Procedures

The following procedures are exported by om_webhdfs.

reconnect();

Force a reconnection. This can be used from a Schedule block to periodically reconnect to the server.

The reconnect() procedure must be used with caution. If configured, it can attempt to reconnect after every event sent, potentially overloading the destination system.
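The Schedule block usage mentioned above might look like the following sketch. The one-hour interval is an arbitrary choice for illustration. Calling reconnect() from a Schedule block rather than from an Exec directive avoids triggering a reconnection for every event.

```conf
<Output hdfs>
    Module  om_webhdfs
    URL     http://hdfsserver.domain.com/
    File    "myfile"
    <Schedule>
        # Force a reconnection once per hour
        Every   1 hour
        Exec    reconnect();
    </Schedule>
</Output>
```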

Examples

Example 1. Sending Logs to a WebHDFS Server

This example output module instance forwards messages to the specified URL and file using the WebHDFS protocol.

nxlog.conf
<Output hdfs>
   Module       om_webhdfs
   URL          http://hdfsserver.domain.com/
   File         "myfile"
   QueryParam   blocksize 42
   QueryParam   destination /foo
</Output>