DEV Community

Lorraine for Next Tech


Tracking System Metrics with collectd

Collecting system metrics allows system administrators to monitor available resources, detect bottlenecks, and make informed decisions about servers and projects.

At Next Tech, we use system metrics we gather for many things, such as:

  • Detecting an issue, like a server that doesn't have enough resources.
  • Conversely, identifying where we could save money by reducing a server's resources.
  • Ensuring that all services (like a web server or database) are running.

In this tutorial we will go over how to install and configure collectd, an open source daemon that collects system performance statistics and provides ways to store and publish them. Then, using collectd's write_http plugin, we will send our metrics data to a Flask application using HTTP POST requests.

Let's get started!

Step 1: Load a Python environment

The easiest way to jump into a Python sandbox is using the Next Sandbox, which gives you access to an online computing environment in a couple of seconds. You can click here to launch one, or here to read about how you can install Python on your computer.

Step 2: Installing collectd

The first step is to install collectd. You can do this by entering the following commands in your terminal:

apt-get update
apt-get install collectd

Once collectd is installed successfully, move on to the next step — configuring the daemon!

Step 3: Configuring collectd

We need to configure collectd so that it knows what data to collect and how to send the values collected.

The collectd configuration file can be found at /etc/collectd/collectd.conf. If you know Vim, you can modify the configuration file directly by running:

vim /etc/collectd/collectd.conf

Otherwise, we will replace the config file with a symbolic link to a copy that is easily editable in the Sandbox. To do so, run the following three commands in your terminal:

cp /etc/collectd/collectd.conf /root/sandbox/collectd.conf
rm /etc/collectd/collectd.conf
ln -s /root/sandbox/collectd.conf /etc/collectd/collectd.conf

Now, open the file and take a look at the default collectd configuration. There are four sections in this file: Global, Logging, LoadPlugin Section, and Plugin Configuration.

Global

The first part of the file displays the Global Settings. The lines beginning with a hash (#) are commented out — we will remove some of these hashes so that these settings are as follows:

Hostname "localhost"
FQDNLookup true
BaseDir "/var/lib/collectd"
PluginDir "/usr/lib/collectd"
#TypesDB "/usr/share/collectd/types.db" "/etc/collectd/my_types.db"

AutoLoadPlugin false

CollectInternalStats false

Interval 10
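
To double-check which global settings are active after removing the hashes, a short script can list the uncommented top-level directives. This is a minimal sketch and not part of collectd: the function name active_settings is hypothetical, it only understands simple one-line "Key value" directives, and a directive that repeats (such as LoadPlugin) keeps only its last value.

```python
def active_settings(config_text):
    """Return top-level `Key value` directives that are not commented out."""
    settings = {}
    depth = 0  # nesting level inside <Plugin ...> style blocks
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out setting
        if line.startswith("</"):
            depth -= 1  # closing tag of a block
            continue
        if line.startswith("<"):
            depth += 1  # opening tag of a block
            continue
        if depth == 0:
            parts = line.split(None, 1)
            value = parts[1] if len(parts) > 1 else ""
            settings[parts[0]] = value.strip().strip('"')
    return settings


# Shortened sample, modeled on the settings shown above.
sample = '''
Hostname "localhost"
FQDNLookup true
#TypesDB "/usr/share/collectd/types.db"
Interval 10
<Plugin logfile>
    LogLevel "info"
</Plugin>
'''

print(active_settings(sample))
```

Here the commented-out TypesDB line and the options inside the Plugin block are skipped, so only Hostname, FQDNLookup, and Interval are reported.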

Logging

The next section of the configuration file displays plugins used for logging messages generated by the daemon when it is initialized and when loading or configuring other plugins.

For each plugin in this section (and the next), there is a LoadPlugin line in the configuration, followed by the plugin's options. Almost all of these lines are commented out in order to keep the default configuration lean.

Only one log plugin should be enabled. We will be using LogFile.

Remove the hash before the LoadPlugin logfile line, then uncomment the plugin's configuration block and change the File parameter so that the output is written to a file in the Sandbox. This section should look like this:

LoadPlugin logfile

<Plugin logfile>
    LogLevel "info"
    File "/root/sandbox/collectd.log"
    PrintSeverity true
</Plugin>

Make sure the default logging plugin syslog is either removed or commented out.

LoadPlugin section

The next section displays a list of features. By default the following plugins are enabled:

  • battery: collects the battery's charge, the drawn current and the battery's voltage.
  • cpu: collects the amount of time spent by the CPU in various states, e.g. executing user code, executing system code, waiting for IO operations and being idle.
  • df: collects file system usage information, i.e. how much space on a mounted partition is used and how much is available.
  • disk: collects performance statistics of hard disks and, where supported, partitions.
  • entropy: collects the available entropy on a system.
  • interface: collects information about the traffic (octets per second), packets per second and errors of interfaces (number of errors during one second).
  • irq: collects the number of times each interrupt has been handled by the operating system.
  • load: collects the system load (the number of runnable tasks in the run queue). These numbers give a rough overview of the utilization of a machine.
  • memory: collects physical memory utilization (Used, Buffered, Cached, Free).
  • processes: collects the number of processes, grouped by their state (e.g. running, sleeping, zombies, etc.). It can also gather detailed statistics about selected processes.
  • swap: collects the amount of memory currently written onto hard disk or whatever the system calls “swap”.
  • users: counts the number of users currently logged into the system.

Our collectd daemon will automatically collect data using these plugins. A full list of the available plugins and a short description of each can be found here.

We also want to enable the write_http plugin so that collectd will know where to send the data it collects:

  • write_http: sends values collected by collectd to a web server using HTTP POST requests.

Find this plugin in the list and remove the hash to enable it.

Plugin configuration

The final section shows the configuration for all the listed plugins. You can see all the plugin options in the collectd.conf(5) manual.

Find the plugin configuration for write_http. We will modify this to look like the following:

<Plugin "write_http">
    <Node "example">
        URL "http://127.0.0.1:5000"
        Format JSON
    </Node>
</Plugin>

Note that we specified the format of our output to be in JSON.
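
With Format JSON, each POST body is a JSON array of value lists. The sketch below parses one such body with Python's json module. The sample payload is invented for illustration: the field names follow collectd's JSON output, but the numbers are made up, and describe is a hypothetical helper.

```python
import json

# Illustrative payload only: the values and timestamp are invented.
sample_body = b'''[
  {"values": [0.12, 0.08, 0.05],
   "dstypes": ["gauge", "gauge", "gauge"],
   "dsnames": ["shortterm", "midterm", "longterm"],
   "time": 1546300800.0,
   "interval": 10.0,
   "host": "localhost",
   "plugin": "load",
   "plugin_instance": "",
   "type": "load",
   "type_instance": ""}
]'''


def describe(body):
    """Turn a write_http JSON body into readable 'name = value' lines."""
    lines = []
    for entry in json.loads(body):
        # dsnames and values are parallel lists: one name per value.
        for name, value in zip(entry["dsnames"], entry["values"]):
            lines.append(f'{entry["host"]}/{entry["plugin"]}/{name} = {value}')
    return lines


for line in describe(sample_body):
    print(line)
```

Each entry carries the host, the plugin that produced it, and parallel lists of data source names and values, which is why the helper pairs them up with zip.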

We will not be modifying any of the other plugin configurations but feel free to do so on your own.

Step 4: Verifying configuration

It is important to restart collectd whenever the configuration file is changed. Run the following command in your terminal to do so:

systemctl restart collectd

You can also run this command to check the status of collectd:

systemctl status collectd

If all is working, you should see Active: active (running) in the output.

You should also be able to see that collectd was initialized and plugins were loaded in your collectd.log file.
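
If the log grows long, a short script can pull out just the warnings and errors. This is a rough sketch that assumes each line has the layout "[timestamp] [severity] message"; check a few lines of your own collectd.log before relying on it. The sample lines are invented, and problems is a hypothetical helper name.

```python
import re

# Assumed line layout: "[timestamp] [severity] message"
LOG_LINE = re.compile(r"^\[(?P<time>[^\]]+)\] \[(?P<severity>\w+)\] (?P<message>.*)$")


def problems(log_text):
    """Return (severity, message) pairs for warning and error lines."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match["severity"] in ("warning", "error"):
            found.append((match["severity"], match["message"]))
    return found


# Invented sample lines, for illustration only.
sample_log = """\
[2019-01-01 12:00:00] [info] Initialization complete, entering read-loop.
[2019-01-01 12:00:10] [error] write_http plugin: request failed
"""

print(problems(sample_log))
```

To use it on the real file, read /root/sandbox/collectd.log into a string and pass that string to problems.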

Finally, we can verify whether there are issues in the configuration file by running the following:

collectd -t ; echo $?

This command tests the configuration and then exits. The echo should print 0, the exit status that indicates the test passed.
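
The same exit-status check can be scripted, for example to refuse a restart when the configuration is broken. Below is a minimal sketch using Python's subprocess module; config_ok is a hypothetical helper, and it assumes the collectd binary is on your PATH.

```python
import subprocess
import sys


def config_ok(command=("collectd", "-t")):
    """Run a config test command; return True only if it exits with status 0."""
    try:
        result = subprocess.run(list(command), capture_output=True, text=True)
    except FileNotFoundError:
        # The binary is not installed or not on PATH.
        print(f"command not found: {command[0]}", file=sys.stderr)
        return False
    if result.returncode != 0:
        # Show whatever the command printed so the problem is easy to spot.
        print(result.stdout + result.stderr, file=sys.stderr)
    return result.returncode == 0
```

Calling config_ok() with no arguments runs collectd -t; the command is a parameter only so that the helper can be tried out with any other program.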

Step 5: Creating a Flask app

We've gotten collectd to track our metrics properly. Now let's create a web application using Flask so that collectd can send this data to it via HTTP POST requests!

Flask is a powerful microframework for creating web applications with Python. It comes with a built-in development server and is well suited to building RESTful web services. Its route decorator, which binds a function to a URL, accepts the allowed HTTP methods as an argument, which makes it convenient for building APIs.

First, let's install Flask by running the following:

pip3 install Flask

Now, create a new file called flask_app.py.

There are three steps to write this program:

  1. Create a WSGI application instance, as every application in Flask needs one to handle requests.
  2. Define a route method which associates a URL and the function which handles it.
  3. Activate the application's server.

Copy the following code into the flask_app.py file in your directory:

from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def get_data():
    print(request.data)
    return 'This is working!'

This snippet covers the first two steps: we created a WSGI application instance using the Flask class, and then used the route() decorator to map the path '/' to the function get_data, which processes the request.

Within the route() decorator, we specified the request methods as GET and POST. Then, our get_data function prints the raw body of the incoming request using request.data.

Continue to the final step in our lesson, where we activate the application's server and see our data!

Step 6: Running a Flask app

To run your Flask application, you need to first tell the terminal what application to work with by exporting the FLASK_APP environment variable:

export FLASK_APP=flask_app.py

Then, execute the following to enable the development environment, including the interactive debugger and reloader (Flask 2.3 and later no longer read FLASK_ENV; on those versions, export FLASK_DEBUG=1 instead):

export FLASK_ENV=development

Finally, we can run the application with the following:

flask run

After the app starts running, you should see your collectd data arriving in your terminal as JSON output!
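
If nothing shows up, it can help to test the endpoint without collectd by sending a hand-made payload. The sketch below does this with Python's standard urllib; the payload is invented, the helper names build_request and post_sample are hypothetical, and the URL assumes the app is listening on Flask's default port as configured above.

```python
import json
import urllib.request


def build_request(url="http://127.0.0.1:5000/"):
    """Build a POST request carrying one made-up value list as JSON."""
    payload = [{
        "values": [42],
        "dstypes": ["gauge"],
        "dsnames": ["value"],
        "time": 1546300800.0,
        "interval": 10.0,
        "host": "localhost",
        "plugin": "users",
        "plugin_instance": "",
        "type": "users",
        "type_instance": "",
    }]
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_sample(url="http://127.0.0.1:5000/"):
    """Send the sample request and return (status code, response text)."""
    with urllib.request.urlopen(build_request(url), timeout=5) as response:
        return response.status, response.read().decode("utf-8")
```

With flask run active in another terminal, calling post_sample() should make the sample payload appear in the Flask output, which confirms the route works independently of collectd.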

Summary

In this tutorial we covered:

  1. Installing collectd
  2. Configuring multiple collectd plugins
  3. Building a basic Flask application
  4. Receiving data from collectd inside of Flask

We hope you enjoyed this tutorial and learned something new! If you have any comments or questions, don't hesitate to drop us a note below.

This tutorial was extracted from a lesson on Next Tech. If you're interested in exploring the other courses we have, come take a look!
