Google Cloud Logging (im_googlelogging)

Google Cloud Logging is a managed service that stores and analyzes log data from applications hosted on Google Cloud and Amazon Web Services.

The Google Cloud Logging API enables you to retrieve logs from Cloud Logging. This module uses the REST version of the API to collect logs from monitored resources such as organizations, projects, and folders.

To examine the supported platforms, see the list of installation packages.

Configuring a Google Cloud service account

im_googlelogging requires a Google Cloud service account and a corresponding private key file in JSON format to connect to the Cloud Logging API. Follow these instructions to create a new service account and download its private key file for an existing project.

1. Log in to your Google Cloud account and switch to the project you want to configure.
2. From the navigation menu, click IAM & Admin > Service Accounts.
3. Click CREATE SERVICE ACCOUNT.
4. Enter a service account name and description and click CREATE AND CONTINUE.
5. Select the Owner role from the Role drop-down and click DONE.
6. Click on the newly created account on the Service accounts page to open its configuration page.
7. Click the KEYS tab, expand the ADD KEY drop-down and select Create new key.
8. Select JSON for the key type and click CREATE to download the private key. Save the private key file to a location accessible by NXLog Agent. This file is required for the NXLog Agent configuration.
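
Before pointing NXLog Agent at the downloaded file, it can help to confirm that it really is a service account key. The following Python sketch is an illustration only and is not part of NXLog Agent. It assumes the key file is a JSON object with type set to service_account and with private_key and client_email fields, which is how Google Cloud service account keys are laid out. The function names check_key and check_key_file are hypothetical.

```python
import json

# Fields this sketch expects in a service account key (assumption, see lead-in).
REQUIRED_KEYS = ("type", "private_key", "client_email")


def check_key(key):
    """Return a list of problems found in a parsed service account key."""
    if not isinstance(key, dict):
        return ["key file does not contain a JSON object"]
    problems = [f"missing field: {name}" for name in REQUIRED_KEYS if name not in key]
    if key.get("type") != "service_account":
        problems.append("type is not 'service_account'")
    return problems


def check_key_file(path):
    """Load the JSON key file at `path` and run the same checks on it."""
    with open(path, encoding="utf-8") as handle:
        return check_key(json.load(handle))
```

An empty list from check_key_file("/path/to/credentials.json") means the file has the expected shape. It does not prove that the key is still valid or that the account has the required permissions.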

Configuration

The im_googlelogging module accepts the following directives in addition to the common module directives. The CredentialsFile and ResourceName directives are required.

Required directives

The following directives are required for the module to start.

CredentialsFile

This mandatory directive specifies the path to the private key file of the service account required for authenticating with the Cloud Logging API. See Configuring a Google Cloud service account for more information.

ResourceName

This mandatory directive specifies the name of a parent Google Cloud resource from which to retrieve log entries. ResourceName can be specified multiple times to collect from several resources. Examples of accepted values:

projects/[PROJECT_ID]
organizations/[ORGANIZATION_ID]
billingAccounts/[BILLING_ACCOUNT_ID]
folders/[FOLDER_ID]

For more information, see the Google Cloud Logging documentation.
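
A typo in a resource name is easy to miss, so a quick format check before deploying a configuration can be useful. The Python sketch below is an illustration, not NXLog Agent code. It only recognizes the four forms listed above, which the text presents as examples, so treat a failed check as a hint rather than a definitive rejection. The name is_valid_resource_name is hypothetical.

```python
import re

# The four parent resource forms listed in this section, each followed by
# a single non-empty identifier that contains no slash or whitespace.
RESOURCE_NAME = re.compile(r"(projects|organizations|billingAccounts|folders)/[^/\s]+")


def is_valid_resource_name(name):
    """Return True if `name` has the shape <resource type>/<identifier>."""
    return RESOURCE_NAME.fullmatch(name) is not None
```

For example, is_valid_resource_name("projects/myfirstproject-343508") passes, while "project/myfirstproject-343508" does not because the resource type is misspelled.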

HTTP(S) directives

The following directives are for configuring HTTP(S) connection settings.

AddHeader

This optional directive can be specified multiple times to add custom headers to each HTTP request.

Compression

This optional directive can be used to enable HTTP compression for outgoing HTTP messages. The possible values are none, gzip, and deflate. By default, compression is disabled. Please note that some HTTP servers may not accept compressed HTTP requests. If a server doesn’t support a specific compression method, it may return 415 Unsupported Media Type errors in response to compressed requests.

HTTPBasicAuthPassword

HTTP basic authentication password. You must also set the HTTPBasicAuthUser directive to use HTTP authentication.

HTTPBasicAuthUser

HTTP basic authentication username. You must also set the HTTPBasicAuthPassword directive to use HTTP authentication.

HTTPSAllowExpired

Specifies if the connection should be allowed with an expired certificate. If set to TRUE, the connection will be allowed even if the remote host presents an expired certificate. The default is FALSE: the certificate must not be expired.

HTTPSAllowUntrusted

Specifies if the connection should be allowed without certificate verification. If set to TRUE, the connection will be allowed even if the remote host presents an unknown or self-signed certificate. The default value is FALSE: the remote host must present a trusted certificate.

HTTPSCADir

The path to a directory containing certificate authority (CA) certificates. These certificates will be used to verify the certificate presented by the remote host. The certificate files must be named using the OpenSSL hashed format, i.e. the hash of the certificate followed by .0, .1 etc. To find the hash of a certificate using OpenSSL:

$ openssl x509 -hash -noout -in ca.crt

For example, if the certificate hash is e2f14e4a, then the certificate filename should be e2f14e4a.0. If there is another certificate with the same hash then it should be named e2f14e4a.1 and so on.
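
The naming rule can be sketched in Python as follows. This is an illustration only. It assumes the hash has already been obtained with the openssl command above, and the function name hashed_filename is hypothetical.

```python
def hashed_filename(cert_hash, existing_names):
    """Return the first free <hash>.<n> filename for a CA certificate.

    `existing_names` is the collection of filenames already present in the
    CA directory. Numbering starts at 0 and increases on hash collisions.
    """
    index = 0
    while f"{cert_hash}.{index}" in existing_names:
        index += 1
    return f"{cert_hash}.{index}"
```

With an empty directory, hashed_filename("e2f14e4a", set()) gives e2f14e4a.0. If e2f14e4a.0 already exists, the same call gives e2f14e4a.1.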

A remote host’s self-signed certificate (which is not signed by a CA) can also be trusted by including a copy of the certificate in this directory.

The default operating system root certificate store will be used if this directive is not specified. Unix-like operating systems commonly store root certificates in /etc/ssl/certs. Windows operating systems use the Windows Certificate Store, while macOS uses the Keychain Access Application as the default certificate store. See Certification Authority (CA) certificates in the NXLog Platform User Guide for more information on using this directive.

In addition, Microsoft’s PKI repository contains root certificates for Microsoft services.

HTTPSCAFile

The path of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote host. A remote host’s self-signed certificate (which is not signed by a CA) can be trusted by specifying the remote host certificate itself. In case of certificates signed by an intermediate CA, the certificate specified must contain the complete certificate chain (certificate bundle).

HTTPSCertFile

The path of the certificate file that will be presented to the remote host during the HTTPS handshake.

HTTPSCertKeyFile

The path of the private key file that was used to generate the certificate specified by the HTTPSCertFile directive. This is used for the HTTPS handshake.

Proxy

This optional directive is used to specify the protocol, IP address (or hostname) and port number of the HTTP or SOCKS proxy host to be used. The format is protocol://hostname:port.

Reconnect

This optional directive sets the reconnect interval in seconds. If it is set, the module attempts to reconnect at the specified interval. If it is not set, the reconnect interval starts at 1 second and doubles with every attempt. In the latter case, once a reconnection succeeds, the reconnect interval is immediately reset to 1 second.

The Reconnect directive must be used with caution. If it is used on multiple systems, it can send reconnect requests simultaneously to the same destination, potentially overloading the destination system. It may also cause NXLog Agent to use unusually high system resources or cause NXLog Agent to become unresponsive.
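
To make the default behavior (Reconnect not set) concrete, the Python sketch below simulates the wait before each connection attempt. It is a model of the description above, not NXLog Agent's actual implementation. No upper bound on the interval is described in this section, so the sketch applies none. The name reconnect_waits is hypothetical.

```python
def reconnect_waits(outcomes):
    """Return the wait in seconds before each attempt.

    `outcomes` is a sequence of booleans, one per attempt, where True means
    the attempt succeeded. The interval starts at 1 second, doubles after
    every failed attempt, and resets to 1 second after a success.
    """
    interval = 1
    waits = []
    for succeeded in outcomes:
        waits.append(interval)
        interval = 1 if succeeded else interval * 2
    return waits
```

Three failures followed by a success and another failure, reconnect_waits([False, False, False, True, False]), produce waits of 1, 2, 4, 8, and 1 seconds.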

ReconnectOnData

This optional directive defines the behavior when the connection with the remote host is lost. When set to TRUE, the module only attempts to reconnect when it has data to send. The default value is FALSE; it will always keep a connection open with the remote host.

Optional directives

Filter

A query to filter log entries. See Logging query language in the Google Cloud documentation. Only log entries that match the filter will be collected. If a filter is not specified, the module will collect all log entries from the resources listed in ResourceName. Referencing a parent resource not included in ResourceName will return no results. The maximum length of the filter is 20000 characters.

OrderBy

This optional directive specifies how to sort the results. The accepted values are timestamp asc to order results by oldest first and timestamp desc to order results by newest first. The default value is timestamp asc.

PollInterval

This optional directive specifies how frequently, in seconds, the module checks for new events. If this directive is not specified, it defaults to 20 seconds.

ReadFromLast

This boolean directive instructs the module on where to start reading events from the log source when NXLog Agent starts.

When TRUE, NXLog Agent will only read events logged after NXLog Agent started, unless SavePos is TRUE and a saved position for this log source exists in the cache file.

When FALSE, NXLog Agent will read all events from the log source, unless SavePos is TRUE and a saved position for this log source exists in the cache file.

The default is TRUE.

The following matrix shows the outcome of this directive in conjunction with the SavePos directive:

ReadFromLast  SavePos  Saved position  Outcome
TRUE          TRUE     Yes             Reads events from the saved position.
TRUE          TRUE     No              Reads events that are logged after NXLog Agent is started.
TRUE          FALSE    Yes             Reads events that are logged after NXLog Agent is started.
TRUE          FALSE    No              Reads events that are logged after NXLog Agent is started.
FALSE         TRUE     Yes             Reads events from the saved position.
FALSE         TRUE     No              Reads all events.
FALSE         FALSE    Yes             Reads all events.
FALSE         FALSE    No              Reads all events.

If the NoCache directive is TRUE, it overrides the SavePos directive. In this case, the module behaves as if SavePos is FALSE.
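
The decision described above reduces to two checks, which the following Python sketch spells out. It is a model of the documented outcomes, not NXLog Agent code, and the function name start_position and its return strings are hypothetical.

```python
def start_position(read_from_last, save_pos, has_saved_position, no_cache=False):
    """Return where reading starts for a given combination of settings."""
    # NoCache overrides SavePos: the module behaves as if SavePos is FALSE.
    if no_cache:
        save_pos = False
    # A saved position is only used when SavePos is TRUE and one exists.
    if save_pos and has_saved_position:
        return "saved position"
    # Otherwise ReadFromLast alone decides.
    return "new events only" if read_from_last else "all events"
```

For instance, start_position(True, True, True) returns "saved position", while the same call with no_cache=True returns "new events only".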

SavePos

This boolean directive instructs the module whether to save the position of the last read event before NXLog Agent exits. On the next startup, NXLog Agent will try to read the saved position from the cache file. Together with the ReadFromLast directive, this directive allows the agent to continue reading events from the saved position.

The default is TRUE; the position of the last read event is saved and will be read from the cache file on the next startup.

If the NoCache directive is TRUE, it overrides the SavePos directive. In this case, the module behaves as if SavePos is FALSE.

StartFrom

This optional directive specifies the time of the first event to pull. If this directive is not set, the module reads events according to the ReadFromLast directive.

Creating and populating fields

When the im_googlelogging module reads a record from the server, it creates and populates the fields corresponding to the LogEntry structure.

Fields

The following fields are used by im_googlelogging.

$raw_event (type: string)

A list of event fields in key-value pairs.

$HttpRequest (type: hash)

An object containing HTTP request details related to the log entry.

$HttpRequest('CacheFillBytes') (type: integer)

The number of HTTP response bytes inserted into cache. Set only when a cache fill was attempted.

$HttpRequest('CacheHit') (type: boolean)

Whether or not an entity was served from cache (with or without validation).

$HttpRequest('CacheLookup') (type: boolean)

Whether or not a cache lookup was attempted.

$HttpRequest('CacheValidatedWithOriginServer') (type: boolean)

Whether or not the response was validated with the origin server before being served from cache. This field is only meaningful if CacheHit is true.

$HttpRequest('Latency') (type: string)

The request processing latency on the server, from the time the request was received until the response was sent.

$HttpRequest('Method') (type: string)

The request method. Examples: "GET", "HEAD", "PUT", "POST".

$HttpRequest('Protocol') (type: string)

Protocol used for the request. Examples: "HTTP/1.1", "HTTP/2", "websocket".

$HttpRequest('Referer') (type: string)

The referer URL of the request.

$HttpRequest('RemoteIp') (type: string)

The IP address (IPv4 or IPv6) of the client that issued the HTTP request. This field can include port information. Examples: "192.168.1.1", "10.0.0.1:80", "FE80::0202:B3FF:FE1E:8329".

$HttpRequest('RequestSize') (type: integer)

The size of the HTTP request message in bytes, including the request headers and the request body.

$HttpRequest('ResponseSize') (type: integer)

The size of the HTTP response message sent back to the client, in bytes, including the response headers and the response body.

$HttpRequest('ServerIp') (type: string)

The IP address (IPv4 or IPv6) of the origin server that the request was sent to. This field can include port information. Examples: "192.168.1.1", "10.0.0.1:80", "FE80::0202:B3FF:FE1E:8329".

$HttpRequest('Status') (type: integer)

The response code indicating the status of the response. Examples: 200, 404.

$HttpRequest('Url') (type: string)

The request URL attached to the log.

$HttpRequest('UserAgent') (type: string)

The user agent sent by the client.

$InsertId (type: string)

A unique identifier for the log entry.

$JsonPayload (type: string)

The log entry payload, represented as a structure expressed as a JSON object. Only one of TextPayload, JsonPayload, or ProtoPayload will contain data.

$Labels (type: hash)

A list of user-defined labels stored as key-value pairs. Use the format $Labels('MyKey') to access individual labels.

$LogName (type: string)

The resource name of the log to which this log entry belongs.

$LogSplit (type: hash)

An object containing additional information for log correlation. It is a compound value made up of Uid, Index, and TotalSplits.

$LogSplit('Index') (type: integer)

The index of this LogEntry in the sequence of split log entries. Log entries are given Index values 0, 1, …, n-1 for a sequence of n log entries.

$LogSplit('TotalSplits') (type: integer)

The total number of log entries that the original LogEntry was split into.

$LogSplit('Uid') (type: string)

A globally unique identifier for all log entries in a sequence of split log entries. All log entries with the same $LogSplit('Uid') are assumed to be part of the same sequence of split log entries.

$Operation (type: hash)

An object containing information about an operation the log entry is associated with. It is made up of Id, Producer, First, and Last.

$Operation('First') (type: boolean)

If true, this is the first log entry in the operation.

$Operation('Id') (type: string)

An arbitrary operation identifier. Log entries with the same identifier are assumed to be part of the same operation.

$Operation('Last') (type: boolean)

If true, this is the last log entry in the operation.

$Operation('Producer') (type: string)

An arbitrary producer identifier.

$ProtoPayload (type: string)

The log entry payload, represented as a protocol buffer. Some Google Cloud Platform services use this field for their log entry payload. Use the format $ProtoPayload('Key') to access individual fields. Nested fields can be accessed recursively in the same way.

NXLog Agent only supports signed 64-bit integers. Floating point numbers will be stored as strings.

Items in arrays can be accessed using their index. For example, the following JSON:

{
  "protoPayload": {
    "authorizationInfo": [
      {
        "permission": "logging.logs.delete",
        "granted": true
      },
      {
        "permission": "logging.logs.insert",
        "granted": true
      }
    ]
  }
}

will be transformed to:

"ProtoPayload.authorizationInfo.0.permission": "logging.logs.delete"
"ProtoPayload.authorizationInfo.0.granted": true
"ProtoPayload.authorizationInfo.1.permission": "logging.logs.insert"
"ProtoPayload.authorizationInfo.1.granted": true
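
The flattening shown above can be reproduced with a short recursive function. The Python sketch below is an illustration of the naming scheme only, not the module's implementation. It does not model the conversion of floating point numbers to strings, and the name flatten is hypothetical.

```python
def flatten(value, prefix):
    """Flatten nested objects and arrays into dotted keys.

    Object members are addressed by name and array items by their
    zero-based index, each joined to the parent key with a dot.
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Scalar value: this is a leaf, so emit it under the current key.
        return {prefix: value}
    flat = {}
    for key, child in items:
        flat.update(flatten(child, f"{prefix}.{key}"))
    return flat


payload = {
    "protoPayload": {
        "authorizationInfo": [
            {"permission": "logging.logs.delete", "granted": True},
            {"permission": "logging.logs.insert", "granted": True},
        ]
    }
}

# The top-level key is passed in with the capitalized name used in the output above.
flat = flatten(payload["protoPayload"], "ProtoPayload")
```

Here flat contains the same four keys as the transformed output above, for example ProtoPayload.authorizationInfo.1.permission with the value logging.logs.insert.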

Based on the Google Logging API reference, only the following protocol buffer types are supported:

Only one of TextPayload, JsonPayload, or ProtoPayload will contain data.

$ReceiveTimestamp (type: datetime)

The time the log entry was received by Google Logging.

$Resource (type: hash)

An object representing the monitored resource. It is a compound value made up of the resource Type and Labels.

$Resource('Labels') (type: hash)

List of key-value pairs of the labels included in the associated monitored resource descriptor. To access individual items, use the format $Resource('Labels')('MyLabel').

$Resource('Type') (type: string)

The monitored resource type.

$Severity (type: string)

The severity of the log entry.

$SourceLocation (type: hash)

An object containing information about the source code that generated the log entry. It is made up of File, Line, and Function.

$SourceLocation('File') (type: string)

Source file name. Depending on the runtime environment, this might be a simple name or a fully-qualified name.

$SourceLocation('Function') (type: string)

Human-readable name of the function or method being invoked, with optional context such as the class or package name. This information may be used in contexts such as the logs viewer, where a file and line number are less meaningful. The format can vary by language. For example: qual.if.ied.Class.method (Java), dir/package.func (Go), function (Python).

$SourceLocation('Line') (type: integer)

Line within the source file. 1-based; 0 indicates no line number available.

$SpanId (type: string)

The ID of the Cloud Trace span associated with the current operation in which the log is being written.

$TextPayload (type: string)

The log entry payload. Only one of TextPayload, JsonPayload, or ProtoPayload will contain data.

$Timestamp (type: datetime)

The time the event described by the log entry occurred.

$Trace (type: string)

The REST resource name of the trace being written to Cloud Trace in association with this log entry.

$TraceSampled (type: boolean)

The sampling decision of the trace associated with the log entry.

True means that the trace resource name in the trace field was sampled for storage in a trace backend. False means that the trace was not sampled for storage when this log entry was written, or the sampling decision was unknown at the time. A non-sampled trace value is still useful as a request correlation identifier.

Examples

Example 1. Collecting logs from Google Cloud Logging

This configuration uses the im_googlelogging input module to collect logs from two Google Cloud projects named myfirstproject and mysecondproject.

<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json (1)
    ResourceName        projects/myfirstproject-343508 (2)
    ResourceName        projects/mysecondproject-343509
    Filter              prod (3)
</Input>

1	Credentials file for authenticating with the Cloud Logging API. See Configuring a Google Cloud service account for more information.
2	List of monitored Google Cloud resources to poll.
3	This filter retrieves entries that have the label `prod`. NXLog Agent will append `timestamp > <date>` to the filter depending on the ReadFromLast and SavePos directives.