NXLog Legacy Documentation

Salesforce

This add-on is available for purchase. For more information, please contact us.

The Salesforce add-on provides support for fetching Salesforce logs with NXLog. The script collects Event Log Files from a Salesforce instance by periodically running SOQL queries via the REST API. The events can then be passed to NXLog by different means, depending on how the data collection is configured.

For more information about the Event Log File API, see EventLogFile in the Salesforce SOAP API Developer Guide.

The Event Logs feature of Salesforce is a paid add-on. Make sure this feature is enabled on the Salesforce instance before continuing.

General usage

The salesforce.py script can be configured both from the command line and from a configuration file. The configuration file collect.conf.json must be located in the same directory as salesforce.py, so that the script can load the configuration parameters automatically. Passing arguments from the command line overrides the corresponding parameter read from the configuration file. The following is a sample configuration file:

collect.conf.json
{
  "log_level": "DEBUG",
  "log_file": "var/collector.log",
  "user": "user@example.com",
  "password": "UxQqx847sQ",
  "token": "ZsQO0k5gAgJch3mLUtEqt0K",
  "url": "https://login.salesforce.com/services/Soap/u/39.0/",
  "checkpoint": "var/checkpoint/",
  "keep_csv": "True",
  "output": "structured",
  "header": "none",
  "mode": "across",
  "transport": "stdout",
  "target": "file",
  "limit": "5",
  "delay": "3",
  "request_delay": "3600"
}
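
The documented precedence (command-line arguments override values read from the configuration file) can be pictured with a short Python sketch. The merge logic below is an illustration of that behavior, not the script's actual code; the parameter names are taken from the sample file:

```python
import argparse
import json

def merge_settings(file_settings, cli_args):
    """Command-line values override the configuration file (sketch)."""
    merged = dict(file_settings)
    # argparse leaves options that were not passed as None;
    # only explicitly supplied options replace file values
    merged.update({k: v for k, v in vars(cli_args).items() if v is not None})
    return merged

parser = argparse.ArgumentParser()
parser.add_argument("--mode", choices=["loop", "across"])
parser.add_argument("--output", choices=["json", "structured"])

file_settings = json.loads('{"mode": "across", "output": "structured"}')
args = parser.parse_args(["--mode", "loop"])

settings = merge_settings(file_settings, args)
print(settings["mode"], settings["output"])  # loop structured
```

Here "mode" comes from the command line while "output" keeps the value from the configuration file.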

A compact view of the command line options is shown below. Use salesforce.py -h to get help, including a short explanation of the options.

salesforce.py usage
usage: salesforce.py [-h] [--config CONFIG] [--user USER]
                     [--password PASSWORD] [--token TOKEN] [--url URL]
                     [--checkpoint CHECKPOINT] [--keep_csv {True,False}]
                     [--output {json,structured}] [--header {none,syslog}]
                     [--mode {loop,across}] [--target TARGET] [--delay DELAY]
                     [--limit LIMIT] [--request_delay REQUEST_DELAY]
                     [--transport {file,socket,pipe,stdout}]
                     [--log_level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}]
                     [--log_file LOG_FILE]

Authentication and data retrieval

The user needs to set the authentication parameters (username, password, and token) so that the script can connect to Salesforce and retrieve the Event Logs. The url parameter supplied with the sample configuration file is correct at the time of writing but it may change in the future. The log_level and log_file parameters can be used as an aid during the initial setup, as well as to identify problems during operation.

It is not possible to find the security token of an existing profile. The solution is to reset it as described in Reset Your Security Token on Salesforce Help.

Depending on your setup, the mode parameter can be set to loop so that the script will look for new events continuously or to across so that once all the available events are retrieved the script will terminate. When in loop mode, the request_delay can be configured for the script to wait the specified number of seconds before requesting more events.
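
The difference between the two modes can be sketched as follows. fetch_new_events is a hypothetical placeholder for the script's retrieval logic, and max_rounds only keeps the demonstration finite:

```python
import time

def fetch_new_events():
    """Hypothetical stand-in for one round of SOQL queries."""
    return []

def run(mode, request_delay, max_rounds=3):
    """Illustrative sketch of loop versus across mode."""
    rounds = 0
    while rounds < max_rounds:
        fetch_new_events()
        rounds += 1
        if mode == "across":
            break                  # across: terminate after one pass
        time.sleep(request_delay)  # loop: wait before requesting more events
    return rounds

print(run("across", 0))  # 1: one pass, then the script exits
print(run("loop", 0))    # 3: keeps polling until the demo cap is reached
```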

Local storage and processing

The script temporarily stores the Event Log Files in a directory structure under the directory given by the checkpoint parameter. The events are stored in CSV format. Files with the same name but with a .state extension hold the current state, so that no events are lost or duplicated even if the script terminates unexpectedly. The directory structure is shown below.

Directories and files are created automatically when an event of that type is logged by Salesforce.
var/checkpoint/ApexExecution:
	2018-02-08T00:00:00.000+0000.csv
	2018-02-08T00:00:00.000+0000.state
var/checkpoint/LightningError:
	2018-02-08T00:00:00.000+0000.csv
	2018-02-08T00:00:00.000+0000.state
var/checkpoint/Login:
	2018-02-08T00:00:00.000+0000.csv
	2018-02-08T00:00:00.000+0000.state
var/checkpoint/Logout:
	2018-02-08T00:00:00.000+0000.csv
	2018-02-08T00:00:00.000+0000.state
var/checkpoint/PackageInstall:
	2018-03-01T00:00:00.000+0000.csv
	2018-03-01T00:00:00.000+0000.state
If this directory structure is removed, the script will be unable to determine the state, and all available events stored in your Salesforce instance will be retrieved and passed to NXLog again. This behavior can also be used deliberately: after testing and determining that everything is configured correctly, delete the directory structure to reset the state.
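
The contents of the .state files are internal to the script, but the resume mechanism can be pictured with a simplified sketch in which the state file stores a byte offset into its CSV. The offset format here is purely hypothetical and for illustration only:

```python
import os
import tempfile

def read_new_rows(csv_path, state_path):
    """Return CSV lines added since the last run, then advance the state.

    Hypothetically assumes the .state file holds the byte offset up to
    which the CSV has already been processed.
    """
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            offset = int(f.read().strip() or 0)
    with open(csv_path) as f:
        f.seek(offset)
        rows = f.readlines()
        new_offset = f.tell()
    with open(state_path, "w") as f:
        f.write(str(new_offset))
    return rows

tmp = tempfile.mkdtemp()
csv_path = os.path.join(tmp, "2018-02-08.csv")
state_path = os.path.join(tmp, "2018-02-08.state")
with open(csv_path, "w") as f:
    f.write("EVENT_TYPE,TIMESTAMP\nLogin,20180302083919.878\n")

first = read_new_rows(csv_path, state_path)   # both lines are new
second = read_new_rows(csv_path, state_path)  # nothing added since
print(len(first), len(second))  # 2 0
```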

Once all the available events have been downloaded and the script determines that no further events have been added, it proceeds to process them and produce the final output. The limit and delay parameters can be set to throttle processing: limit caps the number of records in a block, and delay specifies the number of seconds to wait between blocks.

The script deletes the CSV files once they have been processed. However, the keep_csv parameter can be set to True to preserve them.

Data format and transport

The processed events can be presented in two different formats: either as structured output or as JSON. This is selected by setting the output parameter accordingly. Furthermore, a Syslog-style header can be added before the event data by means of the header parameter. The output types are shown below.

Structured output
CLIENT_IP="46.198.211.113" OS_NAME="LINUX" DEVICE_SESSION_ID="33ddcf5f751fdaf4b6a010d73014710ed2f13e33" BROWSER_NAME="CHROME" BROWSER_VERSION="64" USER_AGENT=""Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36"" CLIENT_ID="" REQUEST_ID="" SESSION_KEY="qomr/wgmbMU73iG6" DEVICE_ID="" CONNECTION_TYPE="" EVENT_TYPE="LightningError" SDK_APP_VERSION="" SDK_APP_TYPE="" UI_EVENT_SOURCE="storage" SDK_VERSION="" UI_EVENT_SEQUENCE_NUM="" LOGIN_KEY="5ujU+09kPSKatTxR" UI_EVENT_TYPE="error" PAGE_START_TIME="1519928816975" DEVICE_MODEL="" USER_TYPE="Standard" ORGANIZATION_ID="00D1r000000rH0F" OS_VERSION="" USER_ID_DERIVED="0051r000007NyeqAAC" UI_EVENT_ID="ltng:error" APP_NAME="one:one" UI_EVENT_TIMESTAMP="1519928819334" USER_ID="0051r000007Nyeq" TIMESTAMP="20180301182702.187" TIMESTAMP_DERIVED="2018-03-01T18:27:02.187Z" DEVICE_PLATFORM="SFX:BROWSER:DESKTOP"
JSON output
{"CLIENT_IP": "Salesforce.com IP", "REQUEST_ID": "4GVCi4pxSjCESP-qby__7-", "SESSION_KEY": "", "API_TYPE": "", "EVENT_TYPE": "Login", "SOURCE_IP": "46.198.211.113", "RUN_TIME": "143", "LOGIN_KEY": "", "USER_NAME": "user@example.com", "CPU_TIME": "57", "BROWSER_TYPE": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36", "URI": "/index.jsp", "ORGANIZATION_ID": "00D1r000000rH0F", "USER_ID_DERIVED": "0051r000007NyeqAAC", "DB_TOTAL_TIME": "47093446", "LOGIN_STATUS": "LOGIN_NO_ERROR", "USER_ID": "0051r000007Nyeq", "TIMESTAMP": "20180302083919.878", "TLS_PROTOCOL": "TLSv1.2", "REQUEST_STATUS": "", "CIPHER_SUITE": "ECDHE-RSA-AES256-GCM-SHA384", "TIMESTAMP_DERIVED": "2018-03-02T08:39:19.878Z", "URI_ID_DERIVED": "", "API_VERSION": "9998.0"}
Structured output with Syslog header
<14>1 2018-03-05T18:37:56.157860 eu12.salesforce.com - - - - NUMBER_FIELDS="2" CLIENT_IP="46.198.211.113" ENTITY_NAME="EventLogFile" DB_CPU_TIME="0" USER_AGENT="5238" REQUEST_ID="4GUW0E969JxN49-qbzCo8-" SESSION_KEY="mmOUNLlL4HlSzrSq" EVENT_TYPE="RestApi" RUN_TIME="8" RESPONSE_SIZE="706" METHOD="GET" CPU_TIME="4" LOGIN_KEY="szBoBvcp+3dHeuff" STATUS_CODE="200" URI="/services/data/v37.0/sobjects/EventLogFile/0AT1r000000NWSKGA4/LogFile" ORGANIZATION_ID="00D1r000000rH0F" REQUEST_STATUS="S" DB_TOTAL_TIME="3319055" ROWS_PROCESSED="1" MEDIA_TYPE="text/csv" DB_BLOCKS="15" USER_ID="0051r000007Nyeq" TIMESTAMP="20180301190010.634" URI_ID_DERIVED="0AT1r000000NWSKGA4" REQUEST_SIZE="0" USER_ID_DERIVED="0051r000007NyeqAAC" TIMESTAMP_DERIVED="2018-03-01T19:00:10.634Z"
The samples above are not from the same event.
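
When consuming the structured output outside of NXLog, the KEY="value" pairs can be split with a short parser like the sketch below. It assumes values contain no embedded double quotes; note that the USER_AGENT field in the sample above shows doubled quotes, which a production parser would also need to handle:

```python
import re

# one KEY="value" pair; assumes no embedded double quotes in the value
PAIR = re.compile(r'([A-Z0-9_]+)="([^"]*)"')

def parse_structured(line):
    """Turn one structured output line into a dict of field names to values."""
    return dict(PAIR.findall(line))

sample = 'EVENT_TYPE="LightningError" CLIENT_IP="46.198.211.113" BROWSER_NAME="CHROME"'
event = parse_structured(sample)
print(event["EVENT_TYPE"], event["CLIENT_IP"])  # LightningError 46.198.211.113
```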

The formatted output can then be displayed on standard output, passed to another program through a named pipe, saved to a file, or sent to another program using Unix Domain Sockets (UDS). This is controlled by setting the transport parameter to stdout, pipe, file, or socket respectively. When the transport is pipe, file, or socket, the target parameter sets the name of the pipe, file, or socket.
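
A rough sketch of how the transport and target parameters could map to an output writer is shown below; the dispatch logic is illustrative, not the script's actual implementation:

```python
import socket
import sys

def open_writer(transport, target=None):
    """Return a write(line) callable for the chosen transport (sketch).

    stdout needs no target; for file, pipe, and socket, target names
    the file, named pipe, or Unix domain socket respectively.
    """
    if transport == "stdout":
        return sys.stdout.write
    if transport in ("file", "pipe"):
        # a named pipe is opened for writing like a regular file;
        # line buffering flushes each event as it is written
        return open(target, "a", buffering=1).write
    if transport == "socket":
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(target)  # e.g. the UDS that NXLog's im_uds listens on
        return lambda line: sock.sendall(line.encode())
    raise ValueError("unknown transport: " + transport)

write = open_writer("file", "output.log")
write('EVENT_TYPE="Login" LOGIN_STATUS="LOGIN_NO_ERROR"\n')
```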

Configuring NXLog

The versatility of the salesforce.py script, combined with NXLog, allows for several different ways to collect the Event Log Files from Salesforce.

In the first scenario, NXLog runs the script directly and consumes the data it produces. For this to work, the script should run in loop mode so that events are fetched from Salesforce periodically.

Example 1. Loop mode

NXLog executes salesforce.py, which in turn collects events every hour, processes them, formats them as JSON with a Syslog header, and forwards them to NXLog.

collect.conf.json
{
  "log_level": "DEBUG",
  "log_file": "var/collector.log",
  "user": "user@example.com",
  "password": "UxQqx847sQ",
  "token": "ZsQO0k5gAgJch3mLUtEqt0K",
  "url": "https://login.salesforce.com/services/Soap/u/39.0/",
  "checkpoint": "var/checkpoint/",
  "keep_csv": "True",
  "output": "json",
  "header": "syslog",
  "mode": "loop",
  "transport": "stdout",
  "target": "file",
  "limit": "100",
  "delay": "3",
  "request_delay": "3600"
}
nxlog.conf
<Extension _syslog>
    Module  xm_syslog
</Extension>

<Extension _json>
    Module  xm_json
</Extension>

<Input messages>
    Module  im_exec
    Command ./salesforce.py
    <Exec>
        parse_syslog();
        parse_json($Message);
    </Exec>
</Input>

<Output out>
    Module  om_file
    File    "output.log"
</Output>

<Route messages_to_file>
    Path    messages => out
</Route>

In a second scenario, NXLog listens on a UDS for events while either NXLog itself or an external scheduler runs salesforce.py. In this case, salesforce.py runs in across mode.

Be sure to provide ample time for the script to finish executing before the scheduler starts a new execution, or use a shell script that prevents multiple instances from running simultaneously.
Example 2. Across mode with NXLog as scheduler
collect.conf.json
{
  "log_level": "DEBUG",
  "log_file": "var/collector.log",
  "user": "user@example.com",
  "password": "UxQqx847sQ",
  "token": "ZsQO0k5gAgJch3mLUtEqt0K",
  "url": "https://login.salesforce.com/services/Soap/u/39.0/",
  "checkpoint": "var/checkpoint/",
  "keep_csv": "True",
  "output": "structured",
  "header": "none",
  "mode": "across",
  "transport": "socket",
  "target": "uds_socket",
  "limit": "100",
  "delay": "3",
  "request_delay": "3600"
}
nxlog.conf
<Extension exec>
    Module  xm_exec
    <Schedule>
        Every   1 hour
        <Exec>
            log_info("Scheduled execution at " + now());
            exec_async("./salesforce.py");
        </Exec>
    </Schedule>
</Extension>

<Input messages>
    Module  im_uds
    UDS     ./uds_socket
    UDSType stream
</Input>

<Output out>
    Module  om_file
    File    "output.log"
</Output>

<Route messages_to_file>
    Path    messages => out
</Route>

It is even possible to start salesforce.py manually in loop mode with a large request_delay and collect the events via UDS (as shown above) without the xm_exec instance. Alternatively, set the transport to file and configure NXLog to read the events with im_file.

Though events are captured in real time, Salesforce generates the Event Log Files during non-peak hours.