File (im_file)
This module can be used to read log messages from files. The file position can be persistently saved across restarts to avoid reading from the beginning again when NXLog Agent is restarted. External rotation tools are also supported. When the module is not able to read any more data from the file, it checks whether the opened file descriptor belongs to the same filename it opened originally. If the inodes differ, the module assumes the file was moved and reopens its input.
im_file uses a one-second interval to monitor files for new messages. This method was implemented because polling a regular file is not supported on all platforms. If there is no more data to read, the module will sleep for 1 second.
By using wildcards, the module can read multiple files simultaneously and will open new files as they appear. It will also enter newly created directories if recursion is enabled.
The module needs to scan the directory content for wildcarded file monitoring. This can present a significant load if there are many files (hundreds or thousands) in the monitored directory. For this reason, it is highly recommended to rotate files out of the monitored directory either using the built-in log rotation capabilities of NXLog Agent or with external tools.
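As a minimal sketch of wildcarded monitoring, the instance below reads every file matching a pattern and descends into subdirectories. The instance name, the path, and the use of the Recursive directive to enable recursion are assumptions made for illustration; check them against the directive reference for your version.

```
<Input app_logs>
    Module     im_file
    # Hypothetical path: match every .log file in this directory
    File       "/var/log/app/*.log"
    # Assumed directive for enabling recursion into subdirectories
    Recursive  TRUE
</Input>
```

Keeping the number of files in the monitored directory small limits the cost of each directory scan.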
Configuration
The im_file module accepts the following directives in addition to the common module directives. The File directive is required.
Required directives
The following directives are required for the module to start.
The mandatory File directive specifies the name of the input file to open.
It may be used more than once in a single im_file module instance.
The value must be a string type expression.
If the expression in the File directive is not a constant string (it contains functions, field names, or operators), it will be evaluated before each event is written to the file (and after the Exec is evaluated).
For relative filenames, you should be aware that NXLog Agent changes its working directory to "/" unless the global SpoolDir is set to something else.
On Windows systems, the directory separator is the backslash (\).
Wildcards are supported in filenames and directories. Wildcards are not regular expressions but are patterns commonly used by Unix shells to expand filenames (also known as "globbing").
Optional directives
This directive specifies the maximum number of files NXLog Agent will actively monitor. If there are modifications to more files in parallel than the value of this directive, then modifications to files above this limit will only get noticed after the DirCheckInterval (all data should be collected eventually). Typically there are only a few log sources actively appending data to log files, and the rest of the files are dormant after being rotated, so the default value of 10 files should be sufficient in most cases. This directive is also only relevant in the case of a wildcarded File path.
If set to
This directive specifies how frequently, in seconds, the module will check the monitored directory for modifications to files and new files in case of a wildcarded File path. The default is twice the value of the PollInterval directive (if PollInterval is not set, the default is 2 seconds). Fractional seconds may be specified. We recommend increasing the default if many files cannot be rotated out and the NXLog Agent process is causing a high CPU load.
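The following sketch shows one way to relax both intervals when a directory holds many files that cannot be rotated out. The instance name, path, and interval values are illustrative assumptions, not recommended settings.

```
<Input app_logs>
    Module            im_file
    # Hypothetical path with a wildcard
    File              "/var/log/app/*.log"
    # Check open files for new data every 2 seconds
    PollInterval      2
    # Rescan the directory for new or modified files every 10 seconds
    DirCheckInterval  10
</Input>
```

Longer intervals reduce CPU load but delay the detection of new files and new data.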
This boolean directive specifies whether the backslash (
This directive can specify a file or a set of files (using wildcards) to be excluded. More than one occurrence of the Exclude directive can be specified.
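As a short illustration, the instance below reads all matching files except those named in the two Exclude directives. The instance name and all paths are hypothetical.

```
<Input app_logs>
    Module   im_file
    File     "/var/log/app/*.log"
    # Skip one specific file
    Exclude  "/var/log/app/debug.log"
    # Skip a set of files using a wildcard
    Exclude  "/var/log/app/*.tmp.log"
</Input>
```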
See the InputType directive in the list of common module directives. If this directive is not specified, the default is LineBased (the module will use CRLF as the record terminator on Windows, or LF on Unix). This directive also supports data converters; see the description in the InputType section.
This optional block directive can be used to specify a group of statements to execute when a file has been fully read (on end-of-file). Only one OnEOF block can be specified per im_file module instance. The following directives are used inside this block.
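A minimal sketch of an OnEOF block is given here. The instance name and path are hypothetical, and the nested Exec directive is an assumption, because the list of directives accepted inside the block is not reproduced in this text.

```
<Input app_logs>
    Module  im_file
    File    "/var/log/app/*.log"
    <OnEOF>
        # Assumed nested directive: statements to run once end-of-file is reached
        Exec  log_info("Finished reading " + file_name());
    </OnEOF>
</Input>
```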
This directive specifies how frequently the module will check for new files and new log entries, in seconds.
If this directive is not specified, it defaults to 1 second.
Fractional seconds may be specified.
This optional boolean directive instructs the module to only read logs that arrive after NXLog Agent is started.
This directive comes into effect if a saved position is not found, for example on the first start, or when the SavePos directive is FALSE. The following matrix shows the outcome of this directive in conjunction with the SavePos directive:
This optional directive specifies the reading order of the elements in a directory. The accepted values are none, CtimeOldestFirst, CtimeNewestFirst (Ctime is the file creation time), MtimeOldestFirst, MtimeNewestFirst (Mtime is the file modification time), NameAsc, and NameDesc (sorted according to the ASCII codes of the characters in the name). If this directive is not specified, none is used as the default, which means that the order in which directory entries are read is not specified.
If set to
This directive specifies whether the module should treat symlinked directories and files as regular directories and files.
If the directive is
If set to
If this boolean directive is set to
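The interaction between SavePos and ReadFromLast described above can be sketched as follows. The instance name and path are hypothetical, and the comments only restate the behavior described in this section.

```
<Input app_logs>
    Module        im_file
    File          "/var/log/app/*.log"
    # Persist the file position so a restart resumes where reading stopped
    SavePos       TRUE
    # With no saved position (for example on first start), read only new logs
    ReadFromLast  TRUE
</Input>
```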
Functions
The following functions are exported by im_file.
- string file_name(): Returns the name of the currently open file, including the path, from which the log was read.
- integer record_number(): Returns the number of processed records (including the current record) of the currently open file since it was opened or truncated.
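As a rough usage sketch, both functions can be called from an Exec block of the im_file instance. The instance name, the path, and the $SourceFile field are hypothetical; the example assumes that the first record of each file is a header line to be skipped.

```
<Input csv_logs>
    Module  im_file
    File    "/var/log/app/*.csv"
    <Exec>
        # Skip the first record of each file, for example a CSV header line
        if record_number() == 1
        {
            drop();
        }
        # Record which file the log was read from
        $SourceFile = file_name();
    </Exec>
</Input>
```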
Creating and populating fields
im_file populates the $raw_event core field with the log message read from file.
Further processing of this field can be done to parse the message into structured data or convert it to a different output format, such as JSON or XML.
See Parsing and converting log records below and Parsing standard log formats in the NXLog Platform User Guide for examples.
Examples
This configuration reads logs from the files in a directory, following symlinked files and directories, and forwards the logs via TCP to a remote host.
<Input messages>
    Module          im_file
    File            "/var/log/messages/*"
    FollowSymlinks  TRUE
</Input>
<Output tcp>
    Module          om_tcp
    Host            192.168.1.1:514
</Output>
This configuration reads logs from a file and parses the $raw_event field with a regular expression.
If the regular expression matches, fields are created according to the captured groups, otherwise the log record is dropped.
Finally, the record is converted to JSON format using the to_json() procedure of the xm_json module.
<Extension json>
    Module  xm_json
</Extension>
<Input messages>
    Module  im_file
    File    '/path/to/log/file'
    <Exec>
        if $raw_event =~ /(?x)^(\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d\+\d\d:\d\d),
                          (.+),(.+)$/
        {
            $EventTime = parsedate($1);
            $Severity = $2;
            $Message = $3;
        }
        else
        {
            drop();
        }
        to_json();
    </Exec>
</Input>
2021-11-05T14:03:40+01:00,INFO,The service started successfully
{
  "EventReceivedTime": "2021-11-05T14:04:24.244343+01:00",
  "SourceModuleName": "messages",
  "SourceModuleType": "im_file",
  "EventTime": "2021-11-05T14:03:40.000000+01:00",
  "Severity": "INFO",
  "Message": "The service started successfully"
}