Character set conversion
It is recommended to normalize logs to UTF-8. The xm_charconv module provides two ways to convert character sets: the convert_fields() procedure, which converts an entire message (all event fields), and the convert() function, which converts a single string.
The following configuration demonstrates character set auto-detection. The input file may contain lines in different encodings. For each message, the convert_fields() procedure detects the character set encoding of the fields and converts them to UTF-8 as needed.
<Extension _charconv>
    Module              xm_charconv
    AutodetectCharsets  utf-8, euc-jp, utf-16, utf-32, iso8859-2
</Extension>

<Input filein>
    Module  im_file
    File    "tmp/input"
    Exec    convert_fields("auto", "utf-8");
</Input>

<Output fileout>
    Module  om_file
    File    "tmp/output"
</Output>

<Route r>
    Path    filein => fileout
</Route>
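When the source encoding is already known and only one string needs converting, the convert() function may be a better fit than auto-detection. The fragment below is a minimal sketch rather than a tested configuration. It assumes that every line of the input file is encoded in ISO 8859-2, and that convert() takes the source string, the source encoding, and the destination encoding, in that order. The file path is reused from the example above for illustration only.

```
<Extension _charconv>
    Module  xm_charconv
</Extension>

<Input filein>
    Module  im_file
    File    "tmp/input"
    # Assumption: all lines in this file are ISO 8859-2 encoded.
    # Only $raw_event is converted; any other fields are left as they are.
    Exec    $raw_event = convert($raw_event, "iso8859-2", "utf-8");
</Input>
```

Because no detection is involved here, the AutodetectCharsets directive is omitted. If the input can contain mixed encodings, the convert_fields("auto", "utf-8") approach is likely the safer choice.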