Google Cloud Logging (im_googlelogging)

Google Cloud Logging is a managed service that stores and analyzes log data from applications hosted on Google Cloud and Amazon Web Services.

The Google Cloud Logging API enables you to retrieve logs from Cloud Logging. This module uses the REST version of the API to collect logs from monitored resources such as organizations, projects, and folders.

Configuring a Google Cloud service account

im_googlelogging requires a Google Cloud service account and a corresponding private key file in JSON format to connect to the Cloud Logging API. Follow these instructions to create a new service account and download its private key file for an existing project.

  1. Log in to your Google Cloud account and switch to the project you want to configure.

  2. From the navigation menu, click on IAM & Admin > Service Accounts.

  3. Click CREATE SERVICE ACCOUNT.

  4. Enter a service account name and description and click CREATE AND CONTINUE.

  5. Select the Owner role from the Role drop-down and click DONE.

  6. Click on the newly created account on the Service accounts page to open its configuration page.

  7. Click the KEYS tab, expand the ADD KEY drop-down and select Create new key.

  8. Select JSON for the key type and click CREATE to download the private key. Save the private key file to a location accessible by NXLog Agent. This file is required for the NXLog Agent configuration.


Configuration

The im_googlelogging module accepts the following directives in addition to the common module directives. The CredentialsFile and ResourceName directives are required.

Required directives

The following directives are required for the module to start.

CredentialsFile

This mandatory directive specifies the path to the private key file of the service account required for authenticating with the Cloud Logging API. See Configuring a Google Cloud service account for more information.

ResourceName

This mandatory directive specifies the name of a parent Google Cloud resource from which to retrieve log entries. ResourceName can be specified multiple times to collect from several resources. Examples of accepted values:

  • projects/[PROJECT_ID]

  • organizations/[ORGANIZATION_ID]

  • billingAccounts/[BILLING_ACCOUNT_ID]

  • folders/[FOLDER_ID]

For more information, see the Google Cloud Logging documentation.
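As a minimal sketch of the accepted value formats, the following input instance collects logs from one project and one folder. The instance name, the credentials path, and both resource IDs are hypothetical placeholders, not values taken from a real account.

```
<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    # Hypothetical IDs; replace them with your own project and folder IDs.
    ResourceName        projects/my-project-123456
    ResourceName        folders/123456789012
</Input>
```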

HTTP(S) directives

The following directives are for configuring HTTP(S) connection settings.

AddHeader

This optional directive can be specified multiple times to add custom headers to each HTTP request.

Compression

This optional directive can be used to enable HTTP compression for outgoing HTTP messages. The possible values are none, gzip, and deflate. By default, compression is disabled. Please note that some HTTP servers may not accept compressed HTTP requests. If a server doesn’t support a specific compression method, it may return 415 Unsupported Media Type errors in response to compressed requests.

HTTPBasicAuthUser

HTTP basic authorization username. You must also set the HTTPBasicAuthPassword directive to use HTTP authorization.

HTTPBasicAuthPassword

HTTP basic authorization password. You must also set the HTTPBasicAuthUser directive to use HTTP authorization.

HTTPSAllowExpired

This boolean directive specifies whether the connection should be allowed with an expired certificate. If set to TRUE, the connection will be allowed even if the remote host presents an expired certificate. The default is FALSE: the certificate must not be expired. This directive is only valid if HTTPSRequireCert is set to TRUE.

HTTPSAllowUntrusted

This boolean directive specifies whether the connection should be allowed without certificate verification. If set to TRUE, the connection will be allowed even if the remote host presents an unknown or self-signed certificate. The default value is FALSE: the remote host must present a trusted certificate.

HTTPSCADir

This directive specifies a path to a directory containing certificate authority (CA) certificates. These certificates will be used to verify the certificate presented by the remote host. The certificate files must be named using the OpenSSL hashed format, i.e. the hash of the certificate followed by .0, .1 etc. To find the hash of a certificate using OpenSSL:

$ openssl x509 -hash -noout -in ca.crt

For example, if the certificate hash is e2f14e4a, then the certificate filename should be e2f14e4a.0. If there is another certificate with the same hash then it should be named e2f14e4a.1 and so on.

A remote host’s self-signed certificate (which is not signed by a CA) can also be trusted by including a copy of the certificate in this directory.

The default operating system root certificate store will be used if this directive is not specified. Unix-like operating systems commonly store root certificates in /etc/ssl/certs. Windows operating systems use the Windows Certificate Store, while macOS uses the Keychain Access Application as the default certificate store.

HTTPSCAFile

This specifies the path of the certificate authority (CA) certificate that will be used to verify the certificate presented by the remote host. A remote host’s self-signed certificate (which is not signed by a CA) can be trusted by specifying the remote host certificate itself. In the case of certificates signed by an intermediate CA, the certificate specified must contain the complete certificate chain (certificate bundle).
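The following sketch shows where HTTPSCAFile fits in an input instance. It assumes a scenario where the default operating system certificate store is not sufficient, for example because traffic passes through a gateway that uses a private CA. The bundle path and the resource ID are hypothetical.

```
<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    ResourceName        projects/my-project-123456
    # Hypothetical path to a PEM bundle containing the complete certificate chain.
    HTTPSCAFile         /path/to/ca-bundle.pem
</Input>
```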

HTTPSCertFile

This specifies the path of the certificate file that will be presented to the remote host during the HTTPS handshake.

HTTPSCertKeyFile

This specifies the path of the private key file that was used to generate the certificate specified by the HTTPSCertFile directive. This is used for the HTTPS handshake.

Proxy

This optional directive is used to specify the protocol, IP address (or hostname) and port number of the HTTP or SOCKS proxy server to be used. The format is protocol://hostname:port.
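For illustration, the following instance sends its requests through an HTTP proxy using the protocol://hostname:port format described above. The proxy hostname, port, and resource ID are hypothetical.

```
<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    ResourceName        projects/my-project-123456
    # Hypothetical proxy server.
    Proxy               http://proxy.example.com:8080
</Input>
```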

Reconnect

This optional directive sets the reconnect interval in seconds. If it is set, the module attempts to reconnect at the defined interval. If it is not set, the reconnect interval starts at 1 second and doubles with every attempt. If the duration of a successful connection is greater than the current reconnect interval, the reconnect interval is reset to 1 second.

Optional directives

Filter

A query to filter log entries. See Logging query language in the Google Cloud documentation. Only log entries that match the filter will be collected. If a filter is not specified, the module will collect all log entries from the resources listed in ResourceName. Referencing a parent resource not included in ResourceName will return no results. The maximum length of the filter is 20000 characters.
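As a hedged sketch, the following instance uses a Logging query language comparison to collect only entries with a severity of ERROR or higher. The expression is written unquoted, in the same way as the filter in the example at the end of this page; whether longer expressions that contain quotes need additional escaping in the NXLog Agent configuration is not covered here and should be verified. The resource ID is hypothetical.

```
<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    ResourceName        projects/my-project-123456
    # Logging query language expression; collects ERROR and higher severities.
    Filter              severity>=ERROR
</Input>
```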

OrderBy

This optional directive specifies how to sort the results. The accepted values are timestamp asc to order results by oldest first and timestamp desc to order results by newest first. The default value is timestamp asc.

PollInterval

This optional directive specifies how frequently, in seconds, the module checks for new events. If this directive is not specified, it defaults to 20 seconds.
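For example, the following sketch polls once per minute instead of using the 20-second default. The value of 60 is an arbitrary illustration, not a recommendation, and the resource ID is hypothetical.

```
<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    ResourceName        projects/my-project-123456
    # Check for new events every 60 seconds.
    PollInterval        60
</Input>
```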

ReadFromLast

This optional boolean directive instructs the module to only read logs that arrive after NXLog Agent is started. This directive comes into effect if a saved position is not found, for example on the first start, or when the SavePos directive is FALSE. When the SavePos directive is TRUE and a previously saved position is found, the module will always resume reading from the saved position. If ReadFromLast is FALSE, the module will read all the available logs. This can result in a lot of messages and is usually not the expected behavior. If this directive is not specified, it defaults to TRUE.

The following matrix shows the outcome of this directive in conjunction with the SavePos directive:

ReadFromLast   SavePos   Saved position   Outcome
TRUE           TRUE      No               Reads events that are logged after NXLog Agent is started.
TRUE           TRUE      Yes              Reads events from the saved position.
TRUE           FALSE     No               Reads events that are logged after NXLog Agent is started.
TRUE           FALSE     Yes              Reads events that are logged after NXLog Agent is started.
FALSE          TRUE      No               Reads all events.
FALSE          TRUE      Yes              Reads events from the saved position.
FALSE          FALSE     No               Reads all events.
FALSE          FALSE     Yes              Reads all events.

SavePos

If this boolean directive is set to TRUE, the timestamp of the last read event will be saved when NXLog Agent exits. The timestamp will be read from the cache file upon startup. The default is TRUE: the last timestamp will be saved if this directive is not specified. This directive affects the outcome of the ReadFromLast directive. The SavePos directive can be overridden by the global NoCache directive.
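The following sketch combines ReadFromLast and SavePos. Going by the matrix under ReadFromLast, this instance reads all available events on its first start, when no saved position exists, and resumes from the saved position on later starts. The resource ID is hypothetical.

```
<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    ResourceName        projects/my-project-123456
    # No saved position: read all events. Saved position found: resume from it.
    ReadFromLast        FALSE
    SavePos             TRUE
</Input>
```

Because reading all available logs can produce a large number of messages, a setup like this is best paired with a Filter that narrows the result set.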

StartFrom

This optional directive specifies the timestamp of the first event to pull. If this directive is not set, the module reads events according to the ReadFromLast directive.

Creating and populating fields

When the im_googlelogging module reads a record from the server, it creates and populates the fields corresponding to the LogEntry structure.

Fields

The following fields are used by im_googlelogging.

$raw_event (type: string)

A list of event fields in key-value pairs.

$HttpRequest (type: hash)

An object containing HTTP request details related to the log entry.

$HttpRequest('CacheFillBytes') (type: integer)

The number of HTTP response bytes inserted into cache. Set only when a cache fill was attempted.

$HttpRequest('CacheHit') (type: boolean)

Whether or not an entity was served from cache (with or without validation).

$HttpRequest('CacheLookup') (type: boolean)

Whether or not a cache lookup was attempted.

$HttpRequest('CacheValidatedWithOriginServer') (type: boolean)

Whether or not the response was validated with the origin server before being served from cache. This field is only meaningful if $HttpRequest('CacheHit') is TRUE.

$HttpRequest('Latency') (type: string)

The request processing latency on the server, from the time the request was received until the response was sent.

$HttpRequest('Method') (type: string)

The request method. Examples: "GET", "HEAD", "PUT", "POST".

$HttpRequest('Protocol') (type: string)

Protocol used for the request. Examples: "HTTP/1.1", "HTTP/2", "websocket".

$HttpRequest('Referer') (type: string)

The referer URL of the request.

$HttpRequest('RemoteIp') (type: string)

The IP address (IPv4 or IPv6) of the client that issued the HTTP request. This field can include port information. Examples: "192.168.1.1", "10.0.0.1:80", "FE80::0202:B3FF:FE1E:8329".

$HttpRequest('RequestSize') (type: integer)

The size of the HTTP request message in bytes, including the request headers and the request body.

$HttpRequest('ResponseSize') (type: integer)

The size of the HTTP response message sent back to the client, in bytes, including the response headers and the response body.

$HttpRequest('ServerIp') (type: string)

The IP address (IPv4 or IPv6) of the origin server that the request was sent to. This field can include port information. Examples: "192.168.1.1", "10.0.0.1:80", "FE80::0202:B3FF:FE1E:8329".

$HttpRequest('Status') (type: integer)

The response code indicating the status of response. Examples: 200, 404.

$HttpRequest('Url') (type: string)

The request URL attached to the log entry.

$HttpRequest('UserAgent') (type: string)

The user agent sent by the client.

$InsertId (type: string)

A unique identifier for the log entry.

$JsonPayload (type: string)

The log entry payload, represented as a structure expressed as a JSON object. Only one of TextPayload, JsonPayload, or ProtoPayload will contain data.

$Labels (type: hash)

A list of user-defined labels stored as key-value pairs. Use the format $Labels('MyKey') to access individual labels.

$LogName (type: string)

The resource name of the log to which this log entry belongs.

$LogSplit (type: hash)

An object containing additional information for log correlation. It is a compound value made up of Uid, Index, and TotalSplits.

$LogSplit('Index') (type: integer)

The index of this LogEntry in the sequence of split log entries. Log entries are given index values 0, 1, …, n-1 for a sequence of n log entries.

$LogSplit('TotalSplits') (type: integer)

The total number of log entries that the original LogEntry was split into.

$LogSplit('Uid') (type: string)

A globally unique identifier for all log entries in a sequence of split log entries. All log entries with the same $LogSplit('Uid') are assumed to be part of the same sequence of split log entries.

$Operation (type: hash)

An object containing information about an operation the log entry is associated with. It is made up of Id, Producer, First, and Last.

$Operation('First') (type: boolean)

If TRUE, this is the first log entry in the operation.

$Operation('Id') (type: string)

An arbitrary operation identifier. Log entries with the same identifier are assumed to be part of the same operation.

$Operation('Last') (type: boolean)

If TRUE, this is the last log entry in the operation.

$Operation('Producer') (type: string)

An arbitrary producer identifier.

$ProtoPayload (type: string)

The log entry payload, represented as a protocol buffer. Some Google Cloud Platform services use this field for their log entry payload. Use the format $ProtoPayload('Key') to access individual fields. The same format works recursively for nested fields.

NXLog Agent only supports signed 64-bit integers. Floating point numbers will be stored as strings.

Items in arrays can be accessed using their index. For example, the following JSON:

{
  "protoPayload": {
    "authorizationInfo": [
      {
        "permission": "logging.logs.delete",
        "granted": true
      },
      {
        "permission": "logging.logs.insert",
        "granted": true
      }
    ]
  }
}

will be transformed to:

"ProtoPayload.authorizationInfo.0.permission": "logging.logs.delete"
"ProtoPayload.authorizationInfo.0.granted": true
"ProtoPayload.authorizationInfo.1.permission": "logging.logs.insert"
"ProtoPayload.authorizationInfo.1.granted": true

Based on the Google Logging API reference, only the following protocol buffer types are supported:

Only one of TextPayload, JsonPayload, or ProtoPayload will contain data.

$ReceiveTimestamp (type: datetime)

The time the log entry was received by Google Logging.

$Resource (type: hash)

An object representing the monitored resource. It is a compound value made up of the resource Type and Labels.

$Resource('Labels') (type: hash)

List of key-value pairs of the labels included in the associated monitored resource descriptor. To access individual items, use the format $Resource('Labels')('MyLabel').

$Resource('Type') (type: string)

The monitored resource type.

$Severity (type: string)

The severity of the log entry.

$SourceLocation (type: hash)

An object containing information about the source code that generated the log entry. It is made up of File, Line, and Function.

$SourceLocation('File') (type: string)

Source file name. Depending on the runtime environment, this might be a simple name or a fully-qualified name.

$SourceLocation('Function') (type: string)

Human-readable name of the function or method being invoked, with optional context such as the class or package name. This information may be used in contexts such as the logs viewer, where a file and line number are less meaningful. The format can vary by language. For example: qual.if.ied.Class.method (Java), dir/package.func (Go), function (Python).

$SourceLocation('Line') (type: integer)

Line within the source file. 1-based; 0 indicates no line number available.

$SpanId (type: string)

The ID of the Cloud Trace span associated with the current operation in which the log is being written.

$TextPayload (type: string)

The log entry payload. Only one of TextPayload, JsonPayload, or ProtoPayload will contain data.

$Timestamp (type: datetime)

The time the event described by the log entry occurred.

$Trace (type: string)

The REST resource name of the trace being written to Cloud Trace in association with this log entry.

$TraceSampled (type: boolean)

The sampling decision of the trace associated with the log entry.

True means that the trace resource name in the trace field was sampled for storage in a trace backend. False means that the trace was not sampled for storage when this log entry was written, or the sampling decision was unknown at the time. A non-sampled trace value is still useful as a request correlation identifier.
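As a sketch of how these fields can be used in processing, the following instance discards debug-level entries before they are forwarded. It assumes that $Severity holds the LogEntry severity name as a string, such as DEBUG; verify this against your own data before relying on it. The resource ID is hypothetical.

```
<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    ResourceName        projects/my-project-123456
    <Exec>
        # Assumption: $Severity contains the severity name, for example DEBUG.
        if $Severity == 'DEBUG' drop();
    </Exec>
</Input>
```

Filtering on the server side with the Filter directive is usually cheaper, since entries that do not match are never downloaded. An Exec condition like this one is better suited to decisions that depend on several fields or on other NXLog Agent state.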

Examples

Example 1. Collecting logs from Google Cloud Logging

This configuration uses the im_googlelogging input module to collect logs from two Google Cloud projects named myfirstproject and mysecondproject.

<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json (1)
    ResourceName        projects/myfirstproject-343508 (2)
    ResourceName        projects/mysecondproject-343509
    Filter              prod (3)
</Input>
1 Credentials file for authenticating with the Cloud Logging API. See Configuring a Google Cloud service account for more information.
2 List of monitored Google Cloud resources to poll.
3 This filter retrieves entries that have the label prod. NXLog Agent will append timestamp > <date> to the filter depending on the ReadFromLast and SavePos directives.
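The configuration above defines only the input instance. The following sketch shows one way to complete the pipeline, assuming the collected records should be written to a local file as JSON. It uses the xm_json extension and the om_file output module; the instance names, the output path, and the resource ID are hypothetical.

```
<Extension json>
    Module              xm_json
</Extension>

<Input google_logging>
    Module              im_googlelogging
    CredentialsFile     /path/to/credentials.json
    ResourceName        projects/my-project-123456
</Input>

<Output file>
    Module              om_file
    # Hypothetical output path.
    File                '/path/to/google_logging.json'
    # Serialize the event fields to JSON before writing.
    Exec                to_json();
</Output>

<Route google_to_file>
    Path                google_logging => file
</Route>
```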