Metrics with Prometheus StatsD Exporter and Grafana

Kirk Lewis ・ 9 min read

This post explains how to convert StatsD metrics using the Prometheus statsd-exporter, and then visualise them with Grafana.

I explain from the perspective of an application which is instrumented with StatsD but is monitored by a Prometheus server. As such, I will use the terminal as a pretend application emitting metrics; Prometheus will monitor the application and collect any metrics; and finally Grafana will be used to draw a graph by using Prometheus as a data source.

Overview of Technologies Used

This section briefly covers why each technology is being used. To learn more, you can read about each using the links in the Recommended Reading section.

Metrics

During a system's runtime, measurements taken of any of its properties are known as metrics. These metrics are typically data points associated with some information that gives them context. Metrics are things like the number of memory bytes used, the number of website visitors, the time a task took to complete, etc.

StatsD metrics allow such information to be recorded and represented using a text format specification. Below are two StatsD metric lines each representing how long an API request has taken.

# metric format: <metric name>:<value>|ms
test_api_call.orders-api.timer./v1/orders:84|ms
test_api_call.orders-api.timer./v1/orders:95|ms

In the above example, two API calls to GET /v1/orders are measured at 84 and 95 milliseconds respectively. The |ms suffix is used with StatsD timer metrics. StatsD also supports other metric types like Counters, Gauges, Histograms and more.
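To make the line format concrete, here is a minimal Node.js sketch that splits a StatsD line into its name, value and type. The function name parseStatsdLine is hypothetical, and the sketch deliberately ignores optional parts of the protocol such as sample rates and tags.

```javascript
// Hypothetical helper: split "<metric name>:<value>|<type>" into its parts.
// Sketch only: sample rates (|@0.1) and tag extensions are not handled.
function parseStatsdLine(line) {
  // The value follows the LAST colon, so look for it from the end of the line.
  const colon = line.lastIndexOf(':');
  const pipe = line.indexOf('|', colon);
  if (colon < 1 || pipe < 0) {
    throw new Error(`not a StatsD line: ${line}`);
  }
  return {
    name: line.slice(0, colon),
    value: Number(line.slice(colon + 1, pipe)),
    type: line.slice(pipe + 1), // "ms" = timer, "c" = counter, "g" = gauge
  };
}

const parsed = parseStatsdLine('test_api_call.orders-api.timer./v1/orders:84|ms');
console.log(parsed.value, parsed.type); // 84 ms
console.log(parsed.name.split('.'));    // the four dot-delimited parts of the name
```

Splitting the name on dots gives test_api_call, orders-api, timer and /v1/orders, which are the parts the mapping rules work with later on.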

Why dot-delimited?

To record information about the metric, in this case the name of the API and its endpoint, the <metric name> part is dot-delimited so those details can be captured.

Later you will see how the statsd-exporter helps us capture the dot-delimited parts we are concerned with.

Prometheus for Monitoring

Prometheus can be used to monitor the performance of an instrumented system such as a microservice, a host server, etc. The system needs to expose its metrics in order for Prometheus to collect them - this is also known as scraping. The scraped metrics can then be visualised or queried using the Prometheus expression browser. Prometheus also has an Alertmanager which can send alerts about certain metrics being monitored - it has support for Slack, PagerDuty, email and more.

The following diagram shows Prometheus scraping an application's metrics, which are exposed on the /metrics path. Grafana can then use these metrics for visualisations.

Statsd-exporter

The statsd-exporter is used to convert StatsD metrics into Prometheus metrics. This is necessary because the StatsD text format represents metrics differently from Prometheus' exposition format, which is represented like this:

<time series name>{<label name>=<label value>,...}
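As a rough illustration of that notation, the sketch below renders a single sample line. formatSample is a hypothetical name, not part of any library. In the actual text exposition format, label values are wrapped in double quotes and the series is followed by its sample value.

```javascript
// Hypothetical helper: render one sample in Prometheus' text exposition notation.
function formatSample(name, labels, value) {
  const pairs = Object.entries(labels).map(([key, val]) => {
    // Backslashes, double quotes and newlines must be escaped inside label values.
    const escaped = String(val)
      .replace(/\\/g, '\\\\')
      .replace(/"/g, '\\"')
      .replace(/\n/g, '\\n');
    return `${key}="${escaped}"`;
  });
  const labelPart = pairs.length > 0 ? `{${pairs.join(',')}}` : '';
  return `${name}${labelPart} ${value}`;
}

console.log(formatSample('test_api_call_count',
  { api_name: 'orders-api', api_endpoint: '/v1/orders' }, 10));
// test_api_call_count{api_name="orders-api",api_endpoint="/v1/orders"} 10
```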

The following diagram shows how a StatsD metric is converted to Prometheus metrics by using an exporter's mapping rules.

It is very common for a system to record its metrics in a format different to Prometheus', so many exporters have been written to convert such metrics to Prometheus' time series notation.

Where did count, sum and quantile come from?

The StatsD-exporter automatically converts timer metrics into a Prometheus summary. The summary contains extra information like the count, sum and each quantile for the current observation. By default the quantiles are 0.5, 0.9 and 0.99.

In the image above, 10 API requests took 80 milliseconds each to respond. As such a summary is generated with these details:

  • _count of 10.
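  • _sum of 0.799999999999999 for that observation. The exporter reports timers in seconds, so this is 10 × 0.08 seconds, with the small shortfall coming from floating-point rounding.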
  • and each _quantile where 0.5 (50th percentile) is the median.
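The arithmetic behind those three values can be sketched as follows. This is only an illustration: summarise is a hypothetical function, the conversion from milliseconds to seconds is an assumption drawn from the _sum value quoted above, and the quantiles use a simple nearest-rank rule, whereas a real summary estimates quantiles over a stream of observations.

```javascript
// Hypothetical sketch: derive count, sum and quantiles from timer observations.
// Durations arrive in milliseconds and are stored in seconds (an assumption
// drawn from the _sum value quoted above).
function summarise(durationsMs, quantiles = [0.5, 0.9, 0.99]) {
  const seconds = durationsMs.map((ms) => ms / 1000).sort((a, b) => a - b);
  const sum = seconds.reduce((total, s) => total + s, 0);
  const result = { count: seconds.length, sum, quantiles: {} };
  for (const q of quantiles) {
    // Nearest rank: the smallest value with at least a fraction q of the data at or below it.
    const rank = Math.max(1, Math.ceil(q * seconds.length));
    result.quantiles[q] = seconds[rank - 1];
  }
  return result;
}

const tenCalls = summarise(Array(10).fill(80));
console.log(tenCalls.count);          // 10
console.log(tenCalls.sum);            // close to 0.8; repeated float addition can leave a tiny error
console.log(tenCalls.quantiles[0.5]); // 0.08
```

Because every request took the same time, all three quantiles come out at 0.08 seconds here.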

Now that we understand what each technology does in relation to the others, we can get started implementing our test application.


Configure Prometheus and Statsd-exporter

In this section the configuration files for both Prometheus and the statsd-exporter are created.

Create the Directory and Files

Create a directory named .prometheus to contain the files prometheus.yml and test-mapping.yml as follows:

# I have created mine in $HOME/.prometheus/
mkdir ~/.prometheus && cd $_
touch ./{prometheus,test-mapping}.yml

Run tree or find . to verify the files.

Prometheus Configuration

This gives us some basic configuration for the Prometheus server such as static service discovery, and how often to scrape each discovered service.

Add the following to the prometheus.yml file and save it.

global:
  scrape_interval:      15s
  evaluation_interval:  15s

scrape_configs:
  # optional: this makes the metrics about Prometheus itself available to us.
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # tells Prometheus to scrape metrics from an address on port 9123
  - job_name: 'test_metrics'
    static_configs:
      - targets: ['host.docker.internal:9123'] # see statsd-exporter further down
        labels: {'host': 'test-app-001'} # optional: just a way to identify the system exposing metrics

Prometheus also supports dynamic service discovery which is useful if you deploy applications on AWS ECS or other scalable cloud solutions.

Statsd-Exporter Configuration

This is the file used by the Prometheus statsd-exporter so it knows how to convert StatsD metrics into Prometheus metrics.

Add the following to the test-mapping.yml file and save it.

mappings:
    # usage: test_api_call.product-api.timer./v1/product
  - match: "test_api_call.*.timer.*"
    name: "test_api_call"
    labels:
        api_name: "$1"
        api_endpoint: "$2"
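To see what $1 and $2 end up holding, here is a rough approximation of the glob match in Node.js. It is not the exporter's own implementation: applyMapping is a hypothetical function, and it assumes that each * captures exactly one component containing no dots.

```javascript
// Hypothetical approximation of a glob mapping rule; not the exporter's own code.
// Assumption: each "*" captures exactly one component that contains no dots.
function applyMapping(rule, metricName) {
  const pattern = rule.match
    .split('*')
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')) // escape regex metacharacters
    .join('([^.]*)');
  const found = new RegExp(`^${pattern}$`).exec(metricName);
  if (!found) return null; // this rule does not apply to the metric
  const labels = {};
  for (const [label, template] of Object.entries(rule.labels)) {
    // Replace $1, $2, ... with the corresponding captured component.
    labels[label] = template.replace(/\$(\d+)/g, (_, n) => found[Number(n)] ?? '');
  }
  return { name: rule.name, labels };
}

const rule = {
  match: 'test_api_call.*.timer.*',
  name: 'test_api_call',
  labels: { api_name: '$1', api_endpoint: '$2' },
};
console.log(JSON.stringify(applyMapping(rule, 'test_api_call.product-api.timer./v1/product')));
// {"name":"test_api_call","labels":{"api_name":"product-api","api_endpoint":"/v1/product"}}
```

A metric name that does not fit the pattern, such as hello, returns null in this sketch and would be left to other rules or to the exporter's default handling.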

Startup the Statsd Exporter

Before visualising any metrics with Grafana, we can check whether our Statsd metrics are being converted correctly.

To start, run the official Prometheus statsd-exporter Docker image, and then send it metrics over port 8125.

Running

Make sure you are inside the .prometheus directory.

docker run --name=prom-statsd-exporter \
    -p 9123:9102 \
    -p 8125:8125/udp \
    -v $PWD/test-mapping.yml:/tmp/test-mapping.yml \
    prom/statsd-exporter \
        --statsd.mapping-config=/tmp/test-mapping.yml \
        --statsd.listen-udp=:8125 \
        --web.listen-address=:9102

What this does

  • listens for metrics sent to an address on port 8125 - e.g. echo "hello:1|c" | nc -w 0 -u 127.0.0.1 8125
  • converts Statsd metrics to Prometheus ones by using the test-mapping.yml file created earlier.
  • exposes the metrics for Prometheus to scrape - in this case on host port 9123, which is mapped to container port 9102.

After running the image you should see output like this:

Startup Prometheus

Now the Prometheus server can be started so it can scrape the metrics exposed on port 9123.

Recall that the prom-statsd-exporter container is exposing metrics on container port 9102, which was remapped to host port 9123.

To start, run the official Prometheus Docker image.

Running

Make sure you are inside the .prometheus directory.

docker run --name=prometheus \
    -p 9090:9090 \
    -v $PWD/prometheus.yml:/prometheus.yml \
    prom/prometheus \
        --config.file=/prometheus.yml \
        --log.level=debug \
        --web.listen-address=:9090 \
        --web.page-title='Prometheus - Test Metrics Demo'

What this does

  • runs Prometheus server and tells it to scrape metrics from the targets defined in prometheus.yml
  • allows us to visualise the metrics using the Prometheus expression browser and even query them using PromQL
  • allows Grafana to use Prometheus as a data source on port 9090

After running the image you should see output like this:

The above informs us that the prometheus.yml file was loaded, and Prometheus knows about the two targets to scrape.

Verify the Targets are Up and Running

Go to localhost:9090/targets. The endpoint for each target should be up.

We can now start sending metrics and Prometheus will scrape them.

Sending some Metrics

Now try sending a couple of metrics to the localhost address and port 127.0.0.1:8125 using the lines below.

echo "test_api_call.orders-api.timer./v1/orders:89|ms" | nc -w 0 -u 127.0.0.1 8125
echo "test_api_call.orders-api.timer./v1/orders:68|ms" | nc -w 0 -u 127.0.0.1 8125

Now go to http://localhost:9123/metrics to view the metrics. If you have updated your hosts file to resolve host.docker.internal to localhost then you can visit http://host.docker.internal:9123/metrics instead.

Search for test_api_call and you should see a Prometheus summary similar to the following.


Visualise the API Call Metrics in Grafana

This final section looks at starting up Grafana on port 3000, adding a Prometheus data source, and creating a Graph to visualise the API call metrics.

Start by running the official Grafana Docker image.

docker run -d --name=grafana -p 3000:3000 grafana/grafana

The default username and password for Grafana are both admin.

Adding a Prometheus Data Source to Grafana

To add a Prometheus data source and add a Graph you can follow the instructions on the Prometheus website, rather than me repeating the same instructions here.

Plotting the average request duration
Within the section "Creating a Prometheus Graph", where it mentions 'Entering a Prometheus expression' into the Query field, use the expression rate(test_api_call_sum[1m]) / rate(test_api_call_count[1m]) as seen below. This expression calculates the average request duration over the last minute using the Prometheus summary for test_api_call.
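To see why dividing the two rates yields an average, consider the simplified arithmetic below. averageDuration and the sample numbers are hypothetical, and the sketch ignores details of PromQL's rate() such as counter resets and extrapolation at the window edges.

```javascript
// Hypothetical sketch of rate(sum) / rate(count) between two scrapes.
// Each sample holds the summary's running totals at a given time (in seconds).
function averageDuration(earlier, later) {
  const elapsed = later.time - earlier.time;
  const sumRate = (later.sum - earlier.sum) / elapsed;       // seconds of latency added per second
  const countRate = (later.count - earlier.count) / elapsed; // requests per second
  return countRate === 0 ? NaN : sumRate / countRate;        // average seconds per request
}

// 60 requests arrived within the minute and together took 6 seconds.
const before = { time: 0, sum: 12.0, count: 150 };
const after = { time: 60, sum: 18.0, count: 210 };
console.log(averageDuration(before, after)); // 0.1, i.e. 100 ms per request
```

The elapsed time cancels out, so the result is simply the latency added during the window divided by the number of requests made in it.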

Optionally, you can also update the legend field to {{api_name}} - {{api_endpoint}}, also seen below. Doing so will display the legend like this: orders-api - /v1/orders.

Emit Metrics Continuously

To save us from sending metrics manually just to keep the graph updating, we can write a simple command-line statement which sends a metric every second. This will be done in two shells, in order to simulate API calls to our fictional orders-api and invoice-api.

Below, the shuf command is used to pick a random number of milliseconds between 50 and 150.

Shell 1: orders-api

while true; do echo -n "test_api_call.orders-api.timer./v1/orders:$(shuf -i 50-150 -n 1)|ms" | \
    nc -w 1 -u 127.0.0.1 8125; done

Shell 2: invoice-api

while true; do echo -n "test_api_call.invoice-api.timer./v1/invoice:$(shuf -i 50-150 -n 1)|ms" | \
    nc -w 1 -u 127.0.0.1 8125; done

Use Ctrl+C to stop either loop.
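If shuf or nc is not available on your system, roughly the same traffic can be produced from Node.js using only the built-in dgram module. This is a sketch under assumptions: randomTimerLine and startEmitting are hypothetical helpers, and the default host and port are the ones the statsd-exporter listens on in this post.

```javascript
const dgram = require('dgram');

// Build one timer line with a random duration between minMs and maxMs (inclusive).
function randomTimerLine(api, endpoint, minMs = 50, maxMs = 150) {
  const ms = minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
  return `test_api_call.${api}.timer.${endpoint}:${ms}|ms`;
}

// Hypothetical helper: send one line per second over UDP until stopped.
// Host and port default to the statsd-exporter address used in this post.
function startEmitting(api, endpoint, host = '127.0.0.1', port = 8125) {
  const socket = dgram.createSocket('udp4');
  const timer = setInterval(() => {
    socket.send(randomTimerLine(api, endpoint), port, host, (err) => {
      if (err) console.error(`send failed: ${err.message}`);
    });
  }, 1000);
  // Return a function that stops the loop and releases the socket.
  return () => {
    clearInterval(timer);
    socket.close();
  };
}

// Usage (commented out so that loading this file sends nothing):
// const stopOrders = startEmitting('orders-api', '/v1/orders');
// const stopInvoices = startEmitting('invoice-api', '/v1/invoice');
```

Calling the returned function, for example stopOrders(), plays the role of Ctrl+C for that loop.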

The Visualisation

After a few seconds you should see the graph being drawn. Some time later, in my case 5 minutes, your graph should look similar to the image below. Make sure the Dashboard's refresh rate is at least 5 seconds.

How do I emit metrics from a real application?

There are StatsD client libraries written for Go, Node, Python, Perl and more. See link below:
https://github.com/statsd/statsd/wiki#client-implementations
Some of the libraries also default the host to localhost and the port to 8125 so you can get up and running quickly.

Node StatsD Client Example

This example uses the NPM module node-statsd.

const StatsD = require('node-statsd');
const statsdClient = new StatsD(); // defaults to localhost:8125

statsdClient.timing('example.response_time', 80); // record 80ms response time
statsdClient.increment('example.site_visit');     // record a new site visitor count +1

Can I just emit metrics to Prometheus format instead of StatsD?

Yes, it is possible to instrument an application to record metrics in Prometheus format. This approach eliminates the need to use a Prometheus exporter sidecar and the extra work of creating a mapping file.

To instrument your application with Prometheus, use one of the official or unofficial client libraries.
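To give a feel for what direct instrumentation involves, here is a dependency-free sketch that exposes a single counter in Prometheus' text format. Everything in it is hypothetical (the metric name, renderMetrics, createMetricsServer). A real application would normally rely on a client library, which also handles labels, histograms and concurrency.

```javascript
const http = require('http');

// Hypothetical in-process counter; a real application would normally use a
// Prometheus client library rather than hand-written code like this.
const siteVisits = { name: 'example_site_visits_total', help: 'Number of site visits.', value: 0 };

// Render the counter in Prometheus' text exposition format.
function renderMetrics(counter) {
  return [
    `# HELP ${counter.name} ${counter.help}`,
    `# TYPE ${counter.name} counter`,
    `${counter.name} ${counter.value}`,
    '',
  ].join('\n');
}

// Build (but do not start) an HTTP server that exposes the counter on /metrics.
function createMetricsServer(counter) {
  return http.createServer((req, res) => {
    if (req.url === '/metrics') {
      res.writeHead(200, { 'Content-Type': 'text/plain; version=0.0.4' });
      res.end(renderMetrics(counter));
    } else {
      res.writeHead(404);
      res.end();
    }
  });
}

siteVisits.value += 1; // record a visit
console.log(renderMetrics(siteVisits));
```

Calling createMetricsServer(siteVisits).listen(8000) would serve the counter on a port of your choosing (8000 is only an example), which could then be added as another target in prometheus.yml.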


Conclusion

Getting StatsD metrics out of an application (the terminal, for this demo) and into Grafana only required the following:

  • two configuration files.
  • running statsd-exporter, Prometheus and Grafana.
  • emitting metrics over a host address and port with UDP using the Statsd line protocol.
  • setting Prometheus as a data source in Grafana and configuring a graph.

That was it! Most of this post explains some essentials which I think are very important to understand. Although the terminal is used as a pretend application to emit metrics, the principles are the same for a real application. So whether you write your application in Go, Node, Python, etc., you can just use a StatsD client library to emit metrics from it.

Recommended Reading

Thank you for reading!
