Cloud Instance Metadata

Cloud providers often allow retrieval of metadata about a virtual machine directly from the instance. NXLog can be configured to enrich the log data with this information, which may include details such as instance ID and type, hostname, and currently used public IP address.

The examples below use the xm_python module and Python scripts for this purpose. Each of the scripts depends on the requests module, which can be installed by running pip install requests or with the system’s package manager (for example, apt install python-requests on Debian-based systems).

Example 1. Adding instance metadata to events

In this example, NXLog reads from a generic file with im_file. In the Output block, the xm_python python_call() procedure is used to execute the get_attribute() Python function, which adds one or more metadata fields to the event record. The output is then converted to JSON format and written to a file.

This configuration applies to each of the cloud providers listed in the following sections. Only the Python code in metadata.py differs from one provider to another.

nxlog.conf
<Extension python>
    Module      xm_python
    PythonCode  metadata.py
</Extension>

<Extension json>
    Module      xm_json
</Extension>

<Input in>
    Module      im_file
    File        '/var/log/input'
</Input>

<Output out>
    Module      om_file
    File        '/tmp/output'
    <Exec>
        # Call Python function; this will add one or more fields to the event
        python_call('get_attribute');

        # Save contents of $raw_event field in $Message prior to JSON conversion
        $Message = $raw_event;

        # Save all fields in event record to $raw_event field in JSON format
        $raw_event = to_json();
    </Exec>
</Output>
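
The get_attribute() functions in the following sections use only two things from the event object that python_call() passes in: the event.module store and the event.set_field() method. The sketch below shows one way to try such a function outside NXLog with a stand-in object. FakeEvent and make_get_attribute() are hypothetical names introduced for illustration and are not part of the xm_python API. The stand-in assumes the module store behaves like a dictionary only to the extent that the scripts in this section rely on it.

```python
class FakeEvent:
    """Hypothetical stand-in for the event object passed by python_call()."""

    def __init__(self, module_store):
        # The scripts treat event.module like a dict shared between events
        self.module = module_store
        self.fields = {}

    def set_field(self, name, value):
        # Record the field instead of writing it to a real event record
        self.fields[name] = value


def make_get_attribute(fetch, attribute):
    """Builds a get_attribute(event) function around a metadata fetcher."""
    def get_attribute(event):
        module = event.module
        # Fetch once, then reuse the value cached in the module store
        if 'metadata' not in module:
            module['metadata'] = fetch(attribute)
        event.set_field(attribute, module['metadata'])
    return get_attribute


def fake_fetch(item):
    # Stub used in place of an HTTP request to a metadata service
    return 'value-of-' + item


# Process two events that share one module store
get_attribute = make_get_attribute(fake_fetch, 'instance-id')
store = {}
for _ in range(2):
    event = FakeEvent(store)
    get_attribute(event)
    print(event.fields)
```

Passing the fetcher in as an argument keeps the caching logic separate from the provider-specific HTTP request, so the same check works for any of the scripts below.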

Amazon Web Services

The AWS instance metadata service can be accessed with a GET request to 169.254.169.254. For example:

$ curl http://169.254.169.254/

See the Instance Metadata and User Data documentation for more information about retrieving metadata from the AWS EC2 service.

Example 2. Using a Python script to retrieve EC2 instance metadata

The following Python script, which can be used with the xm_python module, collects the instance ID from the EC2 metadata service and adds a field to the event record.

metadata.py
import nxlog, requests

def request_metadata(item):
    """Gets value of metadata attribute 'item', returns text string"""
    # Set metadata URL
    metaurl = 'http://169.254.169.254/latest/meta-data/{0}'.format(item)

    # Send HTTP GET request
    r = requests.get(metaurl)

    # Use the text payload only if the request succeeded
    if r.status_code == 200:
        value = r.text
    else:
        value = None

    # Return text value
    return value

def get_attribute(event):
    """Reads metadata and stores as an event field"""
    # Get nxlog module object
    module = event.module

    # Set an attribute to retrieve; in this case: AWS EC2 instance-id
    attribute = 'instance-id'

    # Request metadata only if not already present in the module
    if 'metadata' not in module:
        module['metadata'] = request_metadata(attribute)

    # Save metadata as an event field
    event.set_field(attribute, module['metadata'])
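
On instances that are configured to require IMDSv2, the session-oriented version of the metadata service, a plain GET request is rejected and a session token must be requested first. The following is a minimal sketch of that two-step flow and is not part of the original example. It uses the standard library's urllib.request instead of requests so that it has no third-party dependency. The function name request_metadata_v2, the injectable urlopen parameter, and the timeout and TTL values are illustrative assumptions.

```python
import urllib.request

# Base URL of the EC2 instance metadata service
IMDS_BASE = 'http://169.254.169.254/latest'


def request_metadata_v2(item, urlopen=urllib.request.urlopen, timeout=2):
    """Gets metadata attribute 'item' using an IMDSv2 session token."""
    # Step 1: request a session token with an HTTP PUT
    token_request = urllib.request.Request(
        IMDS_BASE + '/api/token',
        method='PUT',
        headers={'X-aws-ec2-metadata-token-ttl-seconds': '21600'})
    with urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode('utf-8')

    # Step 2: send the token along with the metadata GET request
    metadata_request = urllib.request.Request(
        '{0}/meta-data/{1}'.format(IMDS_BASE, item),
        headers={'X-aws-ec2-metadata-token': token})
    with urlopen(metadata_request, timeout=timeout) as response:
        return response.read().decode('utf-8')
```

Note that urlopen raises an exception on HTTP errors and timeouts instead of returning a status code, so a caller such as get_attribute() would need to handle those cases itself.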

Azure Cloud

The Azure Instance Metadata Service provides a REST endpoint available at a non-routable IP address (169.254.169.254), which can be accessed only from within the virtual machine. It is necessary to provide the header Metadata: true in order to get the response. For example, the request below retrieves the vmId:

$ curl -H "Metadata:true" \
  "http://169.254.169.254/metadata/instance/compute/vmId?api-version=2017-08-01&format=text"

See the Azure Instance Metadata service for more information about retrieving the metadata of an Azure instance.

Example 3. Using a Python script to retrieve Azure VM metadata

The following Python script, which can be used with the xm_python module, collects the metadata attributes from the Azure Instance Metadata Service API and adds a field to the event record for each.

metadata.py
import json, nxlog, requests

def request_metadata():
    """Gets all metadata values for compute instance, returns dict"""
    # Set metadata URL
    metaurl = 'http://169.254.169.254/metadata/instance/compute?api-version=2017-08-01'
    # Set header required to retrieve metadata
    metaheader = {'Metadata':'true'}

    # Send HTTP GET request
    r = requests.get(metaurl, headers=metaheader)

    # Return an empty dict if the request did not succeed
    if r.status_code != 200:
        return {}

    # Load JSON data into Python dictionary and return
    return json.loads(r.text)

def get_attribute(event):
    """Reads metadata and stores as event fields"""
    # Get nxlog module object
    module = event.module

    # Request metadata only if not already present in the module
    if 'metadata' not in module:
        module['metadata'] = request_metadata()

    # Get metadata stored in module object
    metadata = module['metadata']

    # Save attributes and their values as event fields
    for attribute in metadata:
        event.set_field(attribute, metadata[attribute])
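
The loop in get_attribute() above assumes that every value in the compute document is a simple value. Responses may also contain nested objects or lists, depending on the api-version requested. The sketch below shows one possible way to flatten such a document into single-level string fields before calling event.set_field(). The flatten() helper, the underscore-joined field names, and the sample data are assumptions made for illustration. Whether set_field() accepts non-string values is not covered here, so the sketch converts everything to strings.

```python
def flatten(data, prefix=''):
    """Flattens nested dicts and lists into a single-level dict of strings."""
    flat = {}
    # Lists are walked by index, dicts by key
    items = data.items() if isinstance(data, dict) else enumerate(data)
    for key, value in items:
        name = '{0}_{1}'.format(prefix, key) if prefix else str(key)
        if isinstance(value, (dict, list)):
            # Descend into nested structures, extending the field name
            flat.update(flatten(value, name))
        else:
            # Store scalars as text; None becomes an empty string
            flat[name] = '' if value is None else str(value)
    return flat


# Made-up sample resembling a metadata document with nested values
sample = {
    'vmId': '00000000-0000-0000-0000-000000000000',
    'plan': {'name': '', 'product': ''},
    'tagsList': [{'name': 'env', 'value': 'test'}],
}
for field, text in flatten(sample).items():
    print(field, '=', text)
```

With this helper, the final loop in get_attribute() could iterate over flatten(metadata).items() instead of over the raw dictionary.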

Google Compute Engine

The Google Cloud metadata server is available at metadata.google.internal. It is necessary to provide the header Metadata-Flavor: Google in order to get the response. For example, the request below retrieves the instance ID:

$ curl -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/id"

See Storing and Retrieving Instance Metadata for more information about retrieving metadata from Google Compute Engine.

Example 4. Using a Python script to retrieve GCP instance metadata

The following Python script, which can be used with the xm_python module, collects the instance ID from the GCP metadata server and adds a field to the event record.

metadata.py
import nxlog, requests

def request_metadata(item):
    """Gets value of metadata attribute 'item', returns text string"""
    # Set metadata URL
    metaurl = 'http://metadata.google.internal/computeMetadata/v1/instance/{0}'.format(item)
    # Set header required to retrieve metadata
    metaheader = {'Metadata-Flavor':'Google'}

    # Send HTTP GET request
    r = requests.get(metaurl, headers=metaheader)

    # Use the text payload only if the request succeeded
    if r.status_code == 200:
        value = r.text
    else:
        value = None

    # Return text value
    return value

def get_attribute(event):
    """Reads metadata and stores as an event field"""
    # Get nxlog module object
    module = event.module

    # Set an attribute to retrieve; in this case: GCE instance id
    attribute = 'id'

    # Request metadata only if not already present in the module
    if 'metadata' not in module:
        module['metadata'] = request_metadata(attribute)

    # Save metadata as an event field
    event.set_field(attribute, module['metadata'])
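
The script above retrieves a single attribute. Some other instance attributes, such as zone and machine-type, are returned by the metadata server as full resource paths (for example, projects/<project number>/zones/<zone>). The sketch below shows one way to collect several attributes and keep only the last path component for those values. The helper names, the PATH_VALUED tuple, and the fetch argument are illustrative assumptions. The fetch argument is expected to behave like request_metadata() above, returning a string or None.

```python
# Attributes whose values are resource paths rather than plain names
PATH_VALUED = ('zone', 'machine-type')


def short_name(value):
    """Returns the last component of a slash-separated resource path."""
    return value.rstrip('/').rsplit('/', 1)[-1]


def collect_attributes(fetch, attributes):
    """Fetches each attribute with fetch(item) and returns a dict of fields."""
    fields = {}
    for item in attributes:
        value = fetch(item)
        if value is None:
            # Skip attributes the metadata server did not return
            continue
        fields[item] = short_name(value) if item in PATH_VALUED else value
    return fields
```

In get_attribute(), the resulting dictionary could be cached in module['metadata'] and written to the event with one event.set_field() call per entry, in the same way as the Azure example.
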
Disclaimer

While we endeavor to keep the information in this topic up to date and correct, NXLog makes no representations or warranties of any kind, express or implied about the completeness, accuracy, reliability, suitability, or availability of the content represented here.

Last revision: 23 May 2019