File (im_file)

This module reads data from one or more files. You can use wildcards to match multiple files, allowing the module to process them concurrently and automatically open new files as they appear. When you enable recursion, the module also detects and reads files in newly created subdirectories.

The module populates the $raw_event core field with data read from the file. You can then process this field further, for example, by parsing it into structured data or transforming it into another format, such as JSON or XML. See Parsing and converting log records below and Parse common event formats in the NXLog Platform User Guide for examples.

Filepath definition

The module treats filenames as case-insensitive on Windows and case-sensitive on Unix/Linux systems. When you specify a relative filename, note that NXLog Agent sets its working directory to / unless you define a different directory with the global SpoolDir directive.

On Windows, the directory separator is the backslash (\). For compatibility, you can use the forward slash (/) as a directory separator, but only when the filename does not include wildcards. If you specify a filename with wildcards, you must use the backslash (\) as the directory separator.
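
As a minimal sketch of the separator rule above, the following instance uses backslashes throughout because the filename contains a wildcard. The instance name and the C:\logs path are hypothetical and only serve as an illustration.

```
<Input windows_logs>
    Module    im_file
    # The filename contains a wildcard, so every separator is a backslash.
    File      'C:\logs\*.log'
</Input>
```

The path is single-quoted so that the backslashes are taken literally.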

Wildcards

You can use wildcards in filenames and directories. Wildcards are not regular expressions but patterns commonly used by Unix shells to expand filenames (also known as "globbing").

The supported wildcards are:

?

Matches a single character only.

*

Matches zero or more characters.

\*

Matches the asterisk (*) character.

\?

Matches the question mark (?) character.

[…​]

Matches a single character using a character class. The class description consists of individual characters and character ranges separated by a hyphen (-). If the class description starts with ^ or !, the class matches any character that is not listed. You can prefix any character with a backslash (\), which the parser ignores. This allows you to include special characters such as ], -, ^, and !.
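
The wildcards listed above might be combined as in the following sketch. The instance name, the paths, and the filenames in the comments are hypothetical examples, not taken from a real deployment.

```
<Input app_logs>
    Module    im_file
    # ? matches exactly one character, for example app-1.log or app-a.log.
    File      '/path/to/log/app-?.log'
    # [0-9] matches a single digit, for example db-0.log through db-9.log.
    File      '/path/to/log/db-[0-9].log'
    # [!0-9] matches a single character that is not a digit, for example web-a.log.
    File      '/path/to/log/web-[!0-9].log'
    # * matches zero or more characters, for example audit.log or audit-2026.log.
    File      '/path/to/log/audit*.log'
</Input>
```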

Escape characters

By default, the escape character is the backslash (\). This character is also the directory separator on Windows. Because of this, the module doesn’t support escaping wildcard characters on Windows. See the EscapeGlobPatterns directive for more information.

The module evaluates string literals differently depending on the quotation type:

  • Single-quoted strings are interpreted as-is without escaping, for example, 'C:\t???\*.log' stays C:\t???\*.log.

  • Escape sequences in double-quoted strings are processed, for example "C:\\t???\*.log" becomes C:\t???\*.log.

In both cases, the evaluated string is the same and is split into parts, each holding the glob pattern for one directory level. In this example, the parts are C:, t???, and *.log. NXLog Agent matches the parts at the relevant directory levels to find all matching files.

Configuration

The im_file module accepts the following directives in addition to the common module directives.

Required directives

The following directive is required for the module to start.

File

Set this directive to specify the input file(s). You can use this directive multiple times in a single im_file module instance, to read from multiple sources.

Specify the value as a string expression. If the expression is not a constant string (contains functions, field names, or operators), it’s evaluated whenever the module checks for new content as defined by PollInterval and DirCheckInterval.
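
A non-constant expression might look like the sketch below, which builds a filename from the current date. It assumes that the now() and strftime() core functions are available in your NXLog Agent version. The instance name and path are hypothetical.

```
<Input daily_logs>
    Module    im_file
    # Assumption: now() and strftime() are core functions in this version.
    # The expression is not a constant string, so the module evaluates it
    # again on each check and the date part follows the current day.
    File      '/path/to/log/app-' + strftime(now(), '%Y-%m-%d') + '.log'
</Input>
```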

See Filepath definition for more details on using this directive.

If a monitored directory contains hundreds or thousands of files, scanning them can incur substantial performance overhead. To minimize this load, rotate files out of the monitored directory. You can use NXLog Agent’s built-in log rotation features or an external tool. For more information, see Rotate input log files in the NXLog Platform User Guide.

Optional directives

ActiveFiles

Set this directive to specify the maximum number of concurrent files to be actively monitored. This directive is only relevant when using File directives with wildcards. If there are concurrent modifications to more files than the value of this directive, then the remaining modifications will only get noticed after the DirCheckInterval. Typically there are only a few active log sources appending data to log files, so the default value should be sufficient in most cases.

The default is 10.

CloseWhenIdle

Set this directive to TRUE to close files as soon as possible when there is no more data to read. Some applications request an exclusive lock on the log file when writing to or rotating it, and this directive can help if an application tries to reacquire the lock.

The default is FALSE; the module doesn’t close a file as soon as there is no more data to read.

DetectContentChanges

Set this directive to TRUE to detect if a file has been overwritten. This functionality is useful when files are replaced instead of recreated. The module compares the saved checksum with the current checksum whenever it detects file changes. If they do not match, the file is read from the beginning.

The default is FALSE; the module doesn’t perform any detection comparison for this situation.

This detection method should only be used if absolutely required, since calculating checksums before each file access affects performance.

DirCheckInterval

Set this directive to specify the interval, in seconds, at which the module checks monitored directories for modifications to files. You can use fractional seconds. For example, setting DirCheckInterval to 0.5 means the module checks twice every second.

If the PollInterval directive is set, the default of this directive is twice the value of PollInterval. Otherwise, the default is 2 seconds.

Increase the value of this directive if many files cannot be rotated out and the NXLog Agent process is causing a high CPU load.
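
As a rough sketch of such tuning, the instance below slows down directory scans and raises the number of actively monitored files. The values are illustrative assumptions rather than recommendations, and the instance name and path are hypothetical.

```
<Input many_files>
    Module              im_file
    File                '/path/to/log/*.log'
    # Check for new data every 2 seconds.
    PollInterval        2
    # Check the monitored directory for modifications every 10 seconds.
    DirCheckInterval    10
    # Actively monitor up to 20 files at the same time.
    ActiveFiles         20
</Input>
```

Without the explicit DirCheckInterval, the default here would be 4 seconds, twice the PollInterval.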

EscapeGlobPatterns

Set this directive to TRUE to escape the backslash character (\) with another backslash character, resulting in \\, in glob patterns or entries with wildcards. File and directory patterns on Windows do not require escaping and are processed as non-escaped even if this directive is set to TRUE.

The default is FALSE; the module will not escape the backslash character (\).

Exclude

Set this directive to specify a file, or files (using wildcards), to be ignored by the module. You can use this directive multiple times in a single im_file module instance, to ignore multiple files.

See Filepath definition for more details on using this directive.
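
The following sketch combines a wildcard File directive with two Exclude directives. The instance name, paths, and filenames are hypothetical.

```
<Input app_logs>
    Module     im_file
    File       '/path/to/log/*.log'
    # Ignore all debug logs and one specific file.
    Exclude    '/path/to/log/debug-*.log'
    Exclude    '/path/to/log/noisy.log'
</Input>
```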

FollowSymlinks

Set this directive to TRUE to treat symlinked directories and symlinked files as normal directories and files.

If both this directive and the Recursive directive are set to TRUE, the Recursive directive will have the same effect on symlinked directories as it has for regular directories. The Exclude directive works as usual with symlinks, except when the symlink’s name is defined in the File directive and the symlink’s target is specified in the Exclude directive.

The default is FALSE; the module ignores symlinks.

This directive works with symlinks created with the mklink command on Windows and the ln command on Unix-like operating systems.

InputType

This directive is identical to the InputType directive in the list of common module directives and also supports data converters.

The default is LineBased, meaning the module will use CRLF as the record terminator on Windows, or LF on Unix.

OnEOF

Set this block directive to specify one or more statements to execute when a file has been fully read (on end-of-file). Only one OnEOF block can be specified per im_file module instance.

The following directives are allowed inside this block directive:

Exec

Set this directive to specify the actions to execute after EOF has been detected and the grace period has passed. This directive is mandatory in the OnEOF block directive. The directive can be specified as a single directive or a block directive.

GraceTimeout

Set this directive to specify the time, in seconds, to wait before executing the actions configured in the Exec directive. The default is 1 second.
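
An OnEOF block might be written as in the sketch below, which logs a message 10 seconds after the end of a file is reached. It assumes that the log_info() core procedure exists in your version and that file_name() can be called from this context. The instance name and path are hypothetical.

```
<Input batch_files>
    Module    im_file
    File      '/path/to/log/*.log'
    <OnEOF>
        # Assumption: log_info() and file_name() are usable in this block.
        Exec            log_info("Finished reading " + file_name());
        # Wait 10 seconds after EOF before running the Exec statement.
        GraceTimeout    10
    </OnEOF>
</Input>
```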

NoEscape

This directive is deprecated. Use the EscapeGlobPatterns global directive instead.

Set this directive to TRUE to disable using the backslash (\) in file paths as an escape sequence, which is especially useful for file paths on Windows. The default is FALSE; backslash escaping is enabled, and the path separator on Windows must be escaped.

PollInterval

Set this directive to specify how often, in seconds, the module checks for new files and new log entries in the files. You can use fractional seconds. For example, setting PollInterval to 0.5 means the module checks twice every second.

The default is 1 second.

ReadFromLast

This boolean directive instructs the module on where to start reading events from the log source when NXLog Agent starts.

When TRUE, NXLog Agent will only read events logged after NXLog Agent started, unless SavePos is TRUE and a saved position for this log source exists in the cache file.

When FALSE, NXLog Agent will read all events from the log source, unless SavePos is TRUE and a saved position for this log source exists in the cache file.

The default is TRUE.

The following matrix shows the outcome of this directive in conjunction with the SavePos directive:

ReadFromLast  SavePos  Saved position  Outcome
------------  -------  --------------  ----------------------------------------------------------
TRUE          TRUE     Yes             Reads events from the saved position.
TRUE          TRUE     No              Reads events that are logged after NXLog Agent is started.
TRUE          FALSE    Yes             Reads events that are logged after NXLog Agent is started.
TRUE          FALSE    No              Reads events that are logged after NXLog Agent is started.
FALSE         TRUE     Yes             Reads events from the saved position.
FALSE         TRUE     No              Reads all events.
FALSE         FALSE    Yes             Reads all events.
FALSE         FALSE    No              Reads all events.

If the NoCache directive is TRUE, it overrides the SavePos directive. In this case, the module behaves as if SavePos is FALSE.
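
For instance, the combination in the sketch below reads a file from the beginning on the first run and resumes from the saved position on later runs. This corresponds to the FALSE/TRUE rows of the matrix. The instance name and path are hypothetical.

```
<Input full_history>
    Module          im_file
    File            '/path/to/log/file'
    # With no saved position, read all events from the file.
    ReadFromLast    FALSE
    # When a saved position exists, resume from it.
    SavePos         TRUE
</Input>
```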

ReadOrder

Set this directive to specify the reading order of the elements in a directory. The accepted values are none, CtimeOldestFirst, CtimeNewestFirst (Ctime is the file creation time), MtimeOldestFirst, MtimeNewestFirst (Mtime is the file modification time), NameAsc, and NameDesc (sorted by the ASCII codes of the characters in the name).

The default is none.
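
As a brief illustration, the sketch below reads the files in a directory starting with the least recently modified one. The instance name and path are hypothetical.

```
<Input backlog>
    Module       im_file
    File         '/path/to/log/*.log'
    # Read the files with the oldest modification time first.
    ReadOrder    MtimeOldestFirst
</Input>
```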

Recursive

Set this directive to TRUE to search within sub-directories of any directory specified in the File directive(s). You can use wildcards in combination with Recursive, but it will only apply the recursive action on the stem of the path. See the examples below.

The default is FALSE; the module will not search within sub-directories.

File Directive        Matches Directory  Matches Filename  Examples
--------------------  -----------------  ----------------  ------------------------------
/var/log/error.log    /var/log/*         error.log         /var/log/error.log
                                                           /var/log/apt/error.log

/var/log/*.log        /var/log/*         *.log             /var/log/error.log
                                                           /var/log/apt/history.log

/var/log/*/error.log  /var/log/*/*       error.log         /var/log/apt/error.log
                                                           /var/log/journal/error.log
                                                           /var/log/journal/tmp/error.log

/var/*/apt/error.log  /var/*/apt/*       error.log         /var/log/apt/error.log
                                                           /var/log/apt/tmp/error.log
                                                           /var/lib/apt/error.log
                                                           /var/lib/apt/log/error.log
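
The second row of the table corresponds to a configuration along the lines of the sketch below. The instance name is hypothetical.

```
<Input recursive_logs>
    Module       im_file
    # With Recursive enabled, *.log is matched in /var/log and its
    # sub-directories, for example /var/log/apt/history.log.
    File         '/var/log/*.log'
    Recursive    TRUE
</Input>
```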

RenameCheck

Set this directive to TRUE to monitor input files for file rotation and avoid re-reading the same content. The module considers a file rotated when it detects a new file with the same inode and size as another deleted input file.

The default is FALSE; the module considers renamed files to be new and will re-read the file content.

File systems can reuse the inode number of a deleted file. If a new log file has the same inode and size as a deleted file, it creates a false positive and the module will falsely detect it as a rotated or renamed file.

When the module is not able to read any more data from a file, it checks whether the opened file descriptor belongs to the same filename it opened originally. If the inodes differ, the module assumes the file was moved and reopens its input.

When using file rotation, it is better to use a naming scheme that does not match the wildcard specified in the File directive. This ensures rotated files are no longer monitored without relying on the RenameCheck directive.
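
A minimal sketch of this naming approach follows. It assumes that an external rotation tool renames app.log to app.log.1, which is a hypothetical scheme chosen for illustration.

```
<Input app_logs>
    Module    im_file
    # Matches app.log but not rotated copies such as app.log.1, because
    # the rotated names no longer end in .log.
    File      '/path/to/log/*.log'
</Input>
```

RenameCheck stays at its default because the rotated files fall outside the wildcard.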

SavePos

Set this directive to TRUE to save the position of the last processed event before NXLog Agent exits. On the next startup, the agent reads the saved position from the cache file and resumes from that point. Together with the ReadFromLast directive, this directive allows the agent to continue reading events from the saved position.

The default is TRUE; the position of the last read event is saved and will be read from the cache file on the next startup.

If the NoCache directive is TRUE, it overrides the SavePos directive. In this case, the module behaves as if SavePos is FALSE.

Functions

The following functions are exported by im_file.

type: string file_name()

Returns the full path and filename of the currently open file.

type: integer record_number()

Returns the number of records processed from the currently open file, including the current record, since the file was opened or last truncated.
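
Both functions might be used to enrich each record with its origin, as in the sketch below. The $FileName and $RecordNumber field names and the path are hypothetical choices.

```
<Input messages>
    Module    im_file
    File      '/path/to/log/*.log'
    <Exec>
        # Store the path of the file the current record was read from.
        $FileName = file_name();
        # Store the position of the current record within that file.
        $RecordNumber = record_number();
    </Exec>
</Input>
```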

Examples

Example 1. Basic im_file configuration

The following is a basic configuration that collects logs from files whose names start with file in a specific directory.

nxlog.conf
<Input messages>
    Module            im_file
    File              '/path/to/log/file*'
    FollowSymlinks    TRUE  (1)
</Input>
1 The FollowSymlinks directive is set to TRUE, so the module follows symbolic links to files.

Example 2. Parsing and converting log records

This configuration reads logs from a file and parses the records using a regular expression that looks for a timestamp, severity, and message. It converts records that match the regular expression to JSON and discards those that do not.

nxlog.conf
<Extension json>
    Module    xm_json
</Extension>

<Input messages>
    Module    im_file
    File      '/path/to/log/file'
    <Exec>
        if $raw_event =~ /(?x)^(\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d\+\d\d:\d\d),
                          (.+),(.+)$/
        {
            $EventTime = parsedate($1); (1)
            $Severity = $2;
            $Message = $3;
        }
        else
        {
            drop();
        }

        to_json(); (2)
    </Exec>
</Input>
1 The parsedate() function converts the timestamp to datetime.
2 The to_json() procedure converts the fields to JSON and writes the result to the $raw_event core field.

The following input sample matches the regular expression used by the configuration above.

Input sample
2026-01-05T14:03:40+01:00,INFO,The service started successfully
Output sample
{
  "EventReceivedTime": "2026-01-05T14:04:24.244343+01:00",
  "SourceModuleName": "messages",
  "SourceModuleType": "im_file",
  "Hostname": "SERVER-01",
  "EventTime": "2026-01-05T14:03:40.000000+01:00",
  "Severity": "INFO",
  "Message": "The service started successfully"
}