DEV Community

ashleygraf_


Workarounds for inconsistently formatted logs in Splunk

Over the past year and a half, as I've learned to use Splunk to hunt down root causes, create sequence diagrams, make reports, and generate statistics, there are a few features I keep returning to that make inconsistently formatted logs easier to work with.

If your logs are any combination of fully parsed JSON that you can use directly, JSON blobs, or straight-up just text with predictable patterns, this may come in handy.

They are:

  • rex (regular expressions)
  • rename (renaming fields)
  • coalesce (combining fields)
  • joins (like SQL, on fields with matching names, which is why you need rename)

Regex

Yes, that's right, the tool everyone loves to hate: regular expressions. I use regular expressions in two key ways.
1) To find any unique identifiers that match a pattern and replace them with text representing that pattern. This is particularly useful for creating statistics if you don't have the URI as a field, because the normalized text gives you a field you can group on.

For this purpose, I've found it generally unnecessary to learn complex regex. Don't overcomplicate it. You will start off needing to know 5 symbols.

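any non-whitespace character (add + to match a whole word) \S
any digit (lowercase; uppercase \D means "not a digit") \d
one or more of the preceding symbol, until you meet the next symbol or anchor +
exactly n of the preceding symbol {n}
any characters, up to the next symbol or anchor .*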

Then add whatever flag you need for sed mode. I find that g (global) suffices for most searches.

rex field=<field> mode=sed "s/<regex>/<replacement>/<flags>"
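As a rough sketch of this first use, the search below fabricates a single event with makeresults and replaces every run of digits in a made-up path with a placeholder. The field name uri, the sample path, and the {id} placeholder are assumptions chosen for illustration, not values from a real index.

```
| makeresults
| eval uri="/api/users/12345/orders/987"
| rex field=uri mode=sed "s/\d+/{id}/g"
| stats count by uri
```

With the identifiers collapsed, uri should come out as /api/users/{id}/orders/{id}, so requests for different users and orders land in the same row of the stats output.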

2) to find text that matches a pattern, and collect it for further use.

rex field=<field> "anchor(?<new_field_name>\S+)anchor"
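A minimal sketch of this second use, again on a fabricated event. The log line, the orderId= and status anchors, and the order_id field name are all hypothetical. Note the + after \S: without it, the capture group would match only a single character.

```
| makeresults
| eval _raw="INFO checkout complete orderId=ABC123 status=shipped"
| rex field=_raw "orderId=(?<order_id>\S+) status"
| table order_id
```

Here the text between the two anchors is captured into order_id, which you can then use like any other field.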

Don't name the new field the same as an existing field if you are planning to use it in a coalesce.

Do name the new field the same as an existing field if you are planning to use it in a join.

If you need to combine data and join two searches, coalesce first, then join.

Rename

Once you are done retrieving your new field from the text, you can rename it so it looks nicer in your report or chart, or so that you can use it in a join. If you rename before you coalesce, the two fields end up with the same name and you can no longer coalesce them.

rename <field_name> as <new_field_name>
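For example, assuming a hypothetical extracted field called order_id, a report-friendly name containing a space needs to be wrapped in double quotes:

```
| makeresults
| eval order_id="ABC123"
| rename order_id as "Order ID"
| table "Order ID"
```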

Coalesce

This is useful for combining the values you have just extracted with regex with the data you already have from the fully parsed logs. It also comes in handy if a field used to have one name but now has a different one due to reasons, because you can still compare with historical data.

eval <field_name>=coalesce(<new_field>,<existing_field>)
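The sketch below shows the idea on two fabricated events: one that already has a parsed user_id field, and one where the value only exists inside free text. All field names and values here (user_id, raw_text, extracted_user_id, user) are made up for the example.

```
| makeresults count=2
| streamstats count as row
| eval user_id=if(row=1, "u-100", null())
| eval raw_text=if(row=2, "login ok user=u-200 ip=10.0.0.5", null())
| rex field=raw_text "user=(?<extracted_user_id>\S+)"
| eval user=coalesce(user_id, extracted_user_id)
| table row user_id extracted_user_id user
```

coalesce returns the first argument that is not null, so user is filled from user_id where the log was parsed and from extracted_user_id where it was not. This is also why the extracted field gets its own name: the two sources have to stay distinguishable until they are combined.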

Joins

This is useful for combining the results of two searches. You might use it to attach a field that is missing from some logs but present in others, by matching on a field that both sets of logs have in common.

There are two joins in Splunk: left (type=outer, or its synonym type=left) and inner (the default). This is how you do them.

Left (outer) join:

index="index" "phrase" | ...... | table ..... | join type=outer field [search index="index" "phrase" | ....... | rename ..... | table ..... ]

Inner join:

index="index" "phrase" | ...... | table ..... | join type=inner field [search index="index" "phrase" | ....... | rename ..... | table ..... ]
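To see the difference between the two types without touching a real index, here is a small sketch built entirely from makeresults. The field names request_id, req, and status and their values are hypothetical. The main search produces r-1 and r-2, while the subsearch only knows about r-1 and has to rename its field so the names match.

```
| makeresults count=2
| streamstats count as n
| eval request_id="r-".tostring(n)
| table request_id
| join type=outer request_id
    [| makeresults
     | eval req="r-1", status="500"
     | rename req as request_id
     | table request_id status]
```

With type=outer, both r-1 and r-2 are kept and r-2 simply has no status. Switching to type=inner should leave only r-1, since rows from the main search without a match are dropped.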
