DEV Community

Julian

Alertmanager configuration with PyYAML

What my nightmares were like trying to use multiple configuration files with Alertmanager 💀👀

I recently started working on monitoring some of the internal tools and services at We Provide. I decided to tinker with the "Prometheus, Grafana and Alertmanager" stack as a first attempt.

Obviously, there were some hiccups... (literally and figuratively) 💩

I over-engineered...

Originally I started thinking about a tool I could write, something like am-stitcher (ssooo original, not like amtool, at all...), but... I realized that I was over-engineering something that was basically a matter of merging multiple YAML files into one.

Google, here I come, gimme something useful or copy-pastable please! ❤️
I didn't really find anything sufficient for this use-case in terms of literally merging.

Some of the tools I came across did what I wanted them to do, but didn't give me control over the order in which the files were merged. Others did, but required me to specify every file individually instead of letting me search a directory recursively.
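
For what it's worth, the "search a directory recursively, in a predictable order" part is only a few lines of Python. This is a rough sketch, not what any of those tools do: find_yaml_files is a name I made up, and sorting by relative path is just one assumption about what a sensible merge order looks like.

```python
from pathlib import Path


def find_yaml_files(root):
    """Return every .yml/.yaml file below root, sorted by its relative path."""
    root = Path(root)
    files = [
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix in {".yml", ".yaml"}
    ]
    # Sorting on the POSIX-style relative path keeps the order stable across
    # machines, so prefixes like 10-routes.yml decide what gets merged first.
    return sorted(files, key=lambda path: path.relative_to(root).as_posix())
```

Numbering the file names (10-..., 20-...) would then be enough to control the order.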

PyYAML with pyyaml-include 💌 💕
I found an article about !include in YAML and, naive as I am, I thought it was something that would work in YAML natively... But, of course, it didn't!

Support for !include is added by the pyyaml-include extension, which adds this "directive" to PyYAML.
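
To give an idea of what such a tag does under the hood, here is a minimal sketch that uses nothing but PyYAML's add_constructor hook. It is not how pyyaml-include is implemented, and make_loader is a made-up name. I'm assuming included paths are resolved relative to a single base directory.

```python
import os

import yaml


def make_loader(base_dir):
    """Build a SafeLoader subclass that resolves a custom !include tag."""

    class IncludeLoader(yaml.SafeLoader):
        pass

    def include(loader, node):
        # The tag's value is a plain scalar holding the path to include.
        path = os.path.join(base_dir, loader.construct_scalar(node))
        with open(path, encoding="utf-8") as handle:
            # Parse the included file with the same loader, so nested
            # !include tags keep working.
            return yaml.load(handle, Loader=IncludeLoader)

    # Registered on the subclass only; yaml.SafeLoader itself is untouched.
    IncludeLoader.add_constructor("!include", include)
    return IncludeLoader
```

Loading a document that contains "receivers: !include receivers.yml" with yaml.load(text, Loader=make_loader(some_dir)) would then put the parsed content of receivers.yml under the receivers key.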

So... I tried installing PyYAML and the pyyaml-include extension using pip, hoping that I could run PyYAML straight from the command line and that it would automatically pick up the pyyaml-include extension. But I couldn't find a way to use PyYAML from the CLI alone...

I should really stop being so naive... Don't you think so too? 😬😅

What I ended up writing was a small Python script that imports PyYAML and uses pyyaml-include to parse !include "directives".

Don't be fooled, this was a lot of copy-paste in combination with trial and error! 🔥

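import os
import sys

import yaml
# Written against pyyaml-include 1.x, which ships the yamlinclude module.
from yamlinclude import YamlIncludeConstructor

# Register the !include tag; included paths are resolved relative to the working directory.
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.FullLoader, base_dir=os.getcwd())

# Read the whole document from stdin; !include tags are resolved while loading.
data = yaml.load(sys.stdin.read(), Loader=yaml.FullLoader)

# Write the merged result to stdout in block style.
yaml.dump(data, sys.stdout, default_flow_style=False)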

I decided to use the standard input (sys.stdin) and output (sys.stdout) streams to read and write. I didn't want to bother with using arguments when I didn't even know the Python syntax 😅

python parse-yaml.py < alertmanager.yml > merged.yml

Shall we add a tiny bit of automation?! 🙌

At We Provide we use Bitbucket for most things and GitHub for open source (obviously...), so the automation ended up in a Bitbucket pipeline.

The pipeline consists of three steps: "Parse & Merge", "Validate" and "Deploy". I think those are kinda self-explanatory, don't you? 😊

pipelines:
  default:
    - step:
        name: Parse & Merge
        image: python:3.5.1
        script:
          - pip install pyyaml pyyaml-include
          - python parse-yaml.py < alertmanager.yml > merged.yml
        artifacts:
          - merged.yml
    - step:
        name: Validate
        image: serializator/amtool
        script:
          - amtool check-config merged.yml
    - step:
        name: Deploy
        image: eeacms/rsync
        script:
          - rsync merged.yml $USER@$HOST:$ALERTMANAGER_YAML
          - curl -L -X POST $HOST$RELOAD_PATH

As you can see, one of the images is serializator/amtool. Originally I tried using the amtool-docker image from Cisco Metacloud (Anthony Rogliano). But the fact that it was built on top of golang:1.9-alpine3.6 caused issues with the usage of /bin/bash instead of /bin/sh in the Bitbucket pipeline.

^ If you know a solution for this, or if I overlooked something in the Bitbucket pipeline that I could've configured, please let me know! ❤️

What I ended up doing was forking the repository and building the image on top of golang:1.13-stretch instead of golang:1.9-alpine3.6.

^ I thought about the increase in size by not using the alpine tag, but it's negligible in my opinion, looking at the use-case.

I'm in love with the "/-/reload" endpoint 🙈🙊
I thought I'd have to SSH in and use Docker Compose to restart the Alertmanager service, but luckily it didn't end up being that ugly.

Somewhere in 2016 the /-/reload endpoint was introduced by ZhenyangZhao, and I'm glad he made that pull request! I don't know how I'd live with myself knowing that one of my deployments didn't have cURL in it...

So, without much hesitation, I quickly wrote one line more to send a POST request to the /-/reload endpoint, configured some stuff using environment variables, and pushed it to master!
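
If cURL ever isn't available in an image, the same request can be made with Python's standard library. This is only a sketch: build_reload_request and reload_alertmanager are names I made up, and the base URL in the usage note is an assumption (9093 is Alertmanager's default port).

```python
import urllib.request


def build_reload_request(base_url):
    """Build a POST request for Alertmanager's /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    # method="POST" plays the role of curl's -X POST.
    return urllib.request.Request(url, method="POST")


def reload_alertmanager(base_url, timeout=10):
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError when the server answers with an
    error status, so a failed reload does not go unnoticed.
    """
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling reload_alertmanager("http://localhost:9093") would then do what the curl line in the pipeline does.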

I didn't know that this kinda happiness existed! 😊🔥
I finally lay down in bed, closed my eyes and, with a happy smile on my face, began dreaming about the many files, folders and !include's I was going to write! 😆

I hope you got something useful out of this, and if not, I hope you at least enjoyed reading it!

I thank you, @yamadashy for this emoji cheatsheet ❤️
