Case Study: Making RFC5424 Syslog + JSON Logs Work in Splunk

Situation 

 

The customer wanted to ingest logs from Harfang EDR through a heavy forwarder listening on a TCP port, using a custom source type (harfang-syslog). The log format combines an RFC5424-style syslog header with a JSON payload at the end.
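For orientation, a TCP input of this kind on the heavy forwarder might be declared roughly as follows. This is a sketch only: the case notes do not record the actual port, so 5514 is a placeholder, and the connection_host line is an optional assumption.

```ini
# inputs.conf on the heavy forwarder (sketch; port 5514 is a placeholder, not the customer's value)
[tcp://5514]
# Assign the custom source type to everything arriving on this port
sourcetype = harfang-syslog
# Optional assumption: use the sender's IP address as the host field
connection_host = ip
```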

Out of the box, Splunk did not parse these logs correctly, and most fields were missing. As a workaround, the fields were extracted with the following search:

| rex field=_raw "<(?<pri>\d+)>1\s(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?[+-]\d{2}:\d{2})\s(?<hostname>[^\s]+)\s(?<appname>[^\s]+)\s(?<procid>[^\s]+)\s(?<msgid>[^\s]+)\s(?<structured_data>\S+)\s(?<json_msg>\{.*\})" 

| spath input=json_msg

This extracted many of the required fields, but only at search time.

Approaches Tested 

After some research and hands-on testing, we identified three ways to simplify Splunk field extraction for these logs:  

Option 1: Extract all syslog header fields using props/transforms; parse the JSON at search time with spath.

  • This uses a custom regex for each field in transforms.conf. The JSON “tail” remains untouched, so you use spath to open it up at search time.
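A minimal sketch of how Option 1 might look, assuming search-time extraction wired up through a REPORT- setting. The transform name harfang_syslog_header is hypothetical, and for brevity a single regex (the same pattern as the rex search above) captures all header fields rather than one regex per field:

```ini
# props.conf (sketch; search-time settings normally live on the search head)
[harfang-syslog]
# Point the source type at the transform defined below
REPORT-harfang_syslog_header = harfang_syslog_header

# transforms.conf (sketch; the stanza name is hypothetical)
[harfang_syslog_header]
# Named capture groups become field names, so no FORMAT line is needed here
REGEX = ^<(?<pri>\d+)>1\s(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?[+-]\d{2}:\d{2})\s(?<hostname>\S+)\s(?<appname>\S+)\s(?<procid>\S+)\s(?<msgid>\S+)\s(?<structured_data>\S+)\s(?<json_msg>\{.*\})
```

With something like this in place, a search would only need the JSON step, for example: sourcetype=harfang-syslog | spath input=json_msg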

Option 2: Strip off the syslog part at index time with SEDCMD. 

  • Here, we remove the header before indexing. Only the JSON goes into Splunk, so field extraction becomes automatic. The downside is you lose syslog metadata unless it’s inside the JSON. 

Example for props.conf (deployed on the heavy forwarder, where index-time parsing takes place):

[harfang-syslog] 

SEDCMD-strip_syslog = s/^<\d+>1\s\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?[+-]\d{2}:\d{2}\s[^\s]+\s[^\s]+\s[^\s]+\s[^\s]+\s\S+\s//g 

Option 3: Change the log source to emit pure JSON. 

  • If you configure the sender to drop the syslog header entirely, Splunk can handle the JSON and its nested fields with zero additional config.
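If the sender is switched to pure JSON, the source type can also state the JSON handling explicitly instead of relying on defaults. A minimal sketch, assuming a hypothetical source type name harfang-json and search-time extraction:

```ini
# props.conf (sketch; harfang-json is a hypothetical source type name)
[harfang-json]
# Extract JSON fields, including nested ones, at search time
KV_MODE = json
```

INDEXED_EXTRACTIONS = json is the index-time alternative. Enabling both for the same source type is generally avoided, because fields can then show up twice.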

Result 

The customer picked Option 2 (strip the syslog header at index time), and it worked as expected: Splunk indexed only the JSON, and all fields were extracted automatically. No extra extraction was needed, and the syslog metadata was not missed for their use case. We closed the support case after confirming that everything looked good.
