How to classify log status on Datadog

trickstival profile image Patrick Stival ・3 min read

Datadog is a great tool for data analysis and log management. I started using it recently, and the tools it provides really impressed me.

First things first: you need to collect your logs before you can monitor them. There are many ways to do so, such as through preexisting integrations or by collecting them from files.

By default, every log message is classified as info, but you can change that status to any of the severity levels defined by syslog.

The possible log statuses are:

  • Emergency
  • Alert
  • Critical
  • Error
  • Warning
  • Notice
  • Informational
  • Debug
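
To make the ordering concrete, here is a minimal Python sketch of the syslog severity levels with their standard numeric codes. The class and helper names are my own for illustration and are not part of any Datadog API.

```python
from enum import IntEnum


class SyslogSeverity(IntEnum):
    """Syslog severity levels; lower numbers are more severe."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


def is_at_least(level: SyslogSeverity, threshold: SyslogSeverity) -> bool:
    """Return True when `level` is as severe as `threshold`, or more so."""
    return level <= threshold


print(is_at_least(SyslogSeverity.CRITICAL, SyslogSeverity.ERROR))  # True
print(is_at_least(SyslogSeverity.NOTICE, SyslogSeverity.WARNING))  # False
```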

Once you have collected your logs, it's time to classify them.
To do so, you have to create a pipeline.

Pipelines

Pipelines are workflows that hold one or more processors. Processors are predefined data manipulators that run in sequence, one after another. For this post we are going to use three processors:

  • Grok Parser
  • String Builder
  • Status Remapper
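
If it helps to think of a pipeline in code, the sketch below models one as an ordered list of functions, each receiving the output of the previous one. This is only a mental model, not how Datadog implements it, and `trim_message` and `add_length` are made-up processors chosen to show that order matters.

```python
from typing import Any, Callable, Dict, List

Log = Dict[str, Any]
Processor = Callable[[Log], Log]


def run_pipeline(log: Log, processors: List[Processor]) -> Log:
    """Apply each processor in order; each one sees the previous output."""
    for processor in processors:
        log = processor(log)
    return log


def trim_message(log: Log) -> Log:
    # Hypothetical processor: strip surrounding whitespace from the message.
    return {**log, "message": log["message"].strip()}


def add_length(log: Log) -> Log:
    # Hypothetical processor: record the message length in a new attribute.
    return {**log, "length": len(log["message"])}


result = run_pipeline({"message": "  disk full  "}, [trim_message, add_length])
print(result)  # {'message': 'disk full', 'length': 9}
```

Swapping the two processors would record a length of 13 instead, because the length would be measured before trimming.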

Creating a Pipeline

  1. Go to Logs > Configuration on the sidebar menu
    Configuration Sidebar Image

  2. Click on "New Pipeline" at the upper corner of the page
    New Pipeline Button Image

  3. Name your Pipeline
    Creating pipeline

  4. Once you have created the pipeline, you should be able to add processors:
    Create processor image

Parsing the logs

The next step is to parse the logs. For that purpose you can use the Grok Parser to extract information from your text. You can find more information about parsing rules in the Datadog log parsing documentation.

Choose the Grok Parser as the processor.
Then provide some log samples (you can get them from the Datadog Logs page) and write your own parsing rules.

A Parsing Rule is defined by the following structure:

<Name of the Rule> Pattern %{matcher1:attribute}...

You can also combine filters with matchers. A filter goes after the attribute name and post-processes the matched value:

<Name of the Rule> Pattern %{matcher1:attribute:filter1}

Read more about matchers and filters.

Check out the Grok Parser configuration:
Grok Parser Configuration

In most cases, just one rule should be enough. In the example there is just one rule called ImmaRule and we collect the attribute timestamp as a date and username as a word, by using matchers.
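
To get a feel for what such a rule extracts, here is a rough Python approximation using a regular expression with named groups. The rule text in the comment, the sample line, and the extra `action` attribute are all assumptions of mine, since the exact text of ImmaRule is not shown here, and Datadog's own Grok engine is not involved.

```python
import json
import re
from datetime import datetime
from typing import Dict, Optional

# Rough equivalent of a hypothetical rule such as:
#   ImmaRule %{date("yyyy-MM-dd HH:mm:ss"):timestamp} %{word:username} %{data:action}
LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<username>\w+) "
    r"(?P<action>.*)$"
)


def parse(line: str) -> Optional[Dict[str, str]]:
    """Return the extracted attributes, or None when the line does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    attrs = match.groupdict()
    # Validate the date and normalize it to ISO 8601.
    parsed = datetime.strptime(attrs["timestamp"], "%Y-%m-%d %H:%M:%S")
    attrs["timestamp"] = parsed.isoformat()
    return attrs


# Made-up sample line, for illustration only.
print(json.dumps(parse("2019-07-01 12:30:45 patrick logged in"), indent=2))
```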

Once your rules extract all the data you need, a JSON representation of the attributes and values parsed from your log samples shows up at the bottom of the screen:
JSON Representation of the extracted data.
Name your processor and click save.

Classifying the logs

The Grok Parser only carries forward the logs that matched the pattern you specified earlier. Since it's a pipeline, you can safely classify all of the filtered logs as errors, warnings, or any other status.

In this case I want to classify them as errors. For this purpose, we add a String Builder processor to the pipeline beneath the Grok Parser.
This way we can store the log level in a new attribute and later promote it to the official status attribute by using the Log Status Remapper processor.

So I'll add the String Builder processor and have it set the attribute level to the value error:
String builder processor

Then set the new level attribute as the status attribute by adding the Log Status Remapper processor after the String Builder in the pipeline:
Status remapper example
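
The combined effect of these two processors can be sketched in a few lines of Python. This is a simplified simulation under my own assumptions: the function names are hypothetical, and the real String Builder also supports templates that reference other attributes, which I leave out.

```python
from typing import Dict

Log = Dict[str, str]


def string_builder(log: Log, target: str, template: str) -> Log:
    """Simplified String Builder: write a constant template into `target`."""
    return {**log, target: template}


def status_remapper(log: Log, source: str) -> Log:
    """Simplified Status Remapper: use the `source` attribute as the status."""
    if source not in log:
        return log  # Nothing to remap, so the log is left untouched.
    return {**log, "status": log[source]}


# A parsed log starts out with the default status, info.
log = {"message": "login failed", "username": "patrick", "status": "info"}
log = string_builder(log, target="level", template="error")
log = status_remapper(log, source="level")
print(log["status"])  # error
```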

And finally, our pipeline is done!
Finished Pipeline
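
If you would rather keep the pipeline in version control than click through the UI, the same three processors can be described as JSON. The sketch below only builds and prints the payload. The field names follow my reading of Datadog's Logs Pipelines API and should be checked against the current API reference, and the pipeline name, filter query, log sample, and rule text are all made up.

```python
import json

# Hypothetical parsing rule; the exact text of ImmaRule is not shown in this post.
match_rules = (
    'ImmaRule %{date("yyyy-MM-dd HH:mm:ss"):timestamp} '
    "%{word:username} %{data:action}"
)

pipeline = {
    "name": "Classify errors",  # hypothetical pipeline name
    "is_enabled": True,
    "filter": {"query": "source:my-app"},  # hypothetical filter query
    "processors": [
        {
            "type": "grok-parser",
            "name": "Parse user events",
            "is_enabled": True,
            "source": "message",
            "samples": ["2019-07-01 12:30:45 patrick logged in"],  # made-up sample
            "grok": {"support_rules": "", "match_rules": match_rules},
        },
        {
            "type": "string-builder-processor",
            "name": "Set level to error",
            "is_enabled": True,
            "template": "error",
            "target": "level",
            "is_replace_missing": False,
        },
        {
            "type": "status-remapper",
            "name": "Use level as status",
            "is_enabled": True,
            "sources": ["level"],
        },
    ],
}

# Print the payload; sending it to Datadog is left out on purpose.
print(json.dumps(pipeline, indent=2))
```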

Now you should see your logs in the Log Explorer flagged with the level you chose:
Log explorer

Thanks for reading.
I hope this tutorial is useful to you.

Discussion

Great, thank you Patrick that's exactly what I needed!

In case it helps someone else, I use the following pipeline to process logs coming from the kube-dns pod (following docs.datadoghq.com/logs/processing...):

  • one Grok Parser with the two following rules (put them in step 2, "Define parsing rules"):
kube_dns %{regex("\\w"):level}%{date("MMdd HH:mm:ss.SSSSSS"):timestamp}\s+%{number:logger.thread_id} %{notSpace:logger.name}:%{number:logger.lineno}\] %{data:msg}

kube_dns_no_msg %{regex("\\w"):level}%{date("MMdd HH:mm:ss.SSSSSS"):timestamp}\s+%{number:logger.thread_id} %{notSpace:logger.name}:%{number:logger.lineno}\]
  • one Status Remapper with the attribute "level"

And here you go:
kube-dns logs
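
For anyone who wants to test the idea outside Datadog, here is a rough Python approximation of the two rules above, merged into one regular expression with an optional message. The sample line is made up, and the letter-to-status table is an assumption based on my understanding that the Status Remapper decides the status from how the value starts; it only covers the four severity letters that klog-style logs use.

```python
import re
from typing import Dict, Optional

# Approximates kube_dns and kube_dns_no_msg in a single pattern.
KLOG = re.compile(
    r"^(?P<level>\w)"
    r"(?P<timestamp>\d{4} \d{2}:\d{2}:\d{2}\.\d{6})"
    r"\s+(?P<thread_id>\d+) "
    r"(?P<name>\S+):(?P<lineno>\d+)\]"
    r"(?: (?P<msg>.*))?$"
)

# Assumed mapping from the klog severity letter to a status name.
STATUS_BY_LETTER = {"I": "info", "W": "warning", "E": "error", "F": "emergency"}


def parse(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the extracted attributes plus a status, or None on no match."""
    match = KLOG.match(line)
    if match is None:
        return None
    attrs = match.groupdict()
    # Unknown letters fall back to info, the default status.
    attrs["status"] = STATUS_BY_LETTER.get(attrs["level"], "info")
    return attrs


# Made-up sample line in the klog layout, for illustration only.
print(parse("W0524 09:15:30.123456       1 dns.go:48] cache is empty"))
```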

If anyone needs more details on this (pushing kube-dns logs to Datadog and then parsing them correctly), feel free to reach out to me.

Have a great day!