Aleksi Waldén for Polar Squad

Prometheus Observability Platform: Prometheus

Prometheus is an open-source application for monitoring and alerting. It stores the data fed into it in a time series database (TSDB). It is designed to scrape metrics from its targets' HTTP endpoints using a pull model. Workloads that need to push metrics can send them to a Pushgateway, which Prometheus then scrapes like any other target. Prometheus has its own query language, PromQL, for querying the data stored in its TSDB.
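As a rough illustration of the pull model, the sketch below writes a minimal `prometheus.yml` with a single scrape job. The job name, target address, and interval are hypothetical placeholders and not values from this setup.

```shell
# Write a minimal Prometheus configuration with one scrape job.
# The job name and target address are hypothetical placeholders.
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 30s        # how often Prometheus pulls from each target

scrape_configs:
  - job_name: example-app     # hypothetical job name
    metrics_path: /metrics    # the default path, shown here for clarity
    static_configs:
      - targets: ["app.example.internal:8080"]   # hypothetical target
EOF
```

A standalone server could be started against this file with `prometheus --config.file=prometheus.yml`. In Kubernetes you would normally not write this file by hand, since the operator generates the scrape configuration.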

The Prometheus server component can evaluate the data it holds and create alerts based on PromQL queries. These alerts are then sent to a component called Alertmanager, where you can set up routing for them, for example to Slack or email.
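An alert is essentially a PromQL expression plus a duration and some labels. The following is a minimal sketch of a rule file; the group name, alert name, duration, and severity label are illustrative choices, not something this setup prescribes.

```shell
# Write an example alerting rule file.
# The quoted 'EOF' keeps the shell from expanding $labels inside the template.
cat > alert-rules.yml <<'EOF'
groups:
  - name: example-alerts            # hypothetical group name
    rules:
      - alert: InstanceDown         # illustrative alert name
        expr: up == 0               # PromQL: the target failed its last scrape
        for: 5m                     # must hold for 5 minutes before firing
        labels:
          severity: critical        # a label Alertmanager can route on
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF
```

If you have `promtool` installed, `promtool check rules alert-rules.yml` should validate the file. With kube-prometheus-stack, rules like this are usually wrapped in a PrometheusRule resource instead of being mounted as a plain file.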

The default data retention period is 15 days, but it can be set as low as 2 hours and has no upper limit. The local storage that Prometheus uses cannot be clustered or replicated, which is why it is not advisable to use Prometheus itself for long-term storage of metrics. The local TSDB is also prone to corruption, so it is useful to be able to drop it without losing metrics. This is why we want to use the remote write capability to forward the metrics to a more robust long-term storage solution.
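Retention can be tuned through the Helm chart values. The sketch below assumes the chart exposes `prometheus.prometheusSpec.retention` and `retentionSize`, which is worth confirming with `helm show values` for your chart version; the two-day and 5GB figures are only examples.

```shell
# Write a small values override that shortens local retention.
# Both values are illustrative; pick what fits your disk and remote storage.
cat > retention-values.yaml <<'EOF'
prometheus:
  prometheusSpec:
    retention: 2d          # keep local data for two days
    retentionSize: 5GB     # optional cap on local TSDB size
EOF

# To apply it to the release installed in the demo (needs a running cluster):
# helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack \
#   --namespace prometheus --reuse-values -f retention-values.yaml
```

Keeping local retention short makes sense once remote write is in place, because the local TSDB then only acts as a buffer.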

In Kubernetes, we have kube-prometheus-stack. This is a Helm chart that contains the Prometheus components for Kubernetes, including the Prometheus Operator. Prometheus scrapes node-level metrics from a node exporter running on each node, and uses a custom resource definition (CRD) called ServiceMonitor to scrape metrics from pods behind a service. There is also a CRD called PodMonitor for pods that do not have a service in front of them.
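To make the ServiceMonitor idea concrete, here is a minimal sketch of one. The application name, its `app` label, the `metrics` port name, and the `default` namespace are all hypothetical; the `release` label assumes the chart's default behaviour of only selecting monitors labelled with the Helm release name.

```shell
# Write an example ServiceMonitor manifest for a hypothetical application.
cat > servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app                    # hypothetical name
  namespace: prometheus
  labels:
    release: kube-prometheus-stack     # assumed: matches the Helm release name
spec:
  namespaceSelector:
    matchNames:
      - default                        # namespace where the Service lives
  selector:
    matchLabels:
      app: example-app                 # label on the Service, not on the pods
  endpoints:
    - port: metrics                    # name of the Service port to scrape
      interval: 30s
EOF

# With a running cluster: kubectl apply -f servicemonitor.yaml
```

Note that `port` refers to the name of a port on the Service, so the Service itself has to define a named port for this to match anything.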

Prometheus has a concept of exporters. These are libraries and servers that translate metrics from third-party systems into the Prometheus metrics format. The most widely used one is the node exporter, which collects hardware and OS metrics exposed by Linux kernels.
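For a sense of what "Prometheus metrics" look like on the wire, the sketch below writes a hand-made sample in the text exposition format that an exporter serves on its `/metrics` endpoint. `node_load1` is a metric the node exporter provides, but the value here is made up.

```shell
# Write a hand-made sample of the Prometheus text exposition format.
# The value is invented for illustration.
cat > sample-metrics.txt <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
EOF

# Show only the sample lines, skipping the HELP and TYPE comments.
grep -v '^#' sample-metrics.txt
```

Each sample is a metric name, optional labels in curly braces, and a numeric value, which is all Prometheus needs to build a time series from repeated scrapes.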

In Prometheus, we can use the remote_write block to forward data to another destination that accepts Prometheus metrics. If we want to chain remote writing from one Prometheus to another, the receiving server must have its remote write receiver enabled (the `--web.enable-remote-write-receiver` flag in recent versions). Remote write supports several authentication methods, such as OAuth2, for cases where the long-term storage requires authentication.
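A remote_write block with OAuth2 might look roughly like the following. The URLs, client ID, and secret path are hypothetical placeholders; only the field names come from the Prometheus configuration format.

```shell
# Write an example remote_write fragment that authenticates with OAuth2.
# Every URL, ID, and path below is a hypothetical placeholder.
cat > remote-write.yml <<'EOF'
remote_write:
  - url: https://metrics.example.com/api/v1/write
    oauth2:
      client_id: prometheus
      client_secret_file: /etc/prometheus/secrets/oauth2-client-secret
      token_url: https://auth.example.com/oauth2/token
EOF
```

Reading the secret from a file with `client_secret_file` keeps it out of the main configuration, which is convenient when the secret is mounted from a Kubernetes Secret.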

With Prometheus, we want a server as close as possible to the physical servers, so that we get the lowest network latency between the targets and the Prometheus server. In the case of data centres, we can set up a Prometheus server in each data centre zone and have it pull metrics from targets in that zone, using telemetry agents such as OpenTelemetry, or act as a remote_write target for workloads that push metrics. This leads to multiple Prometheus servers across data centre regions and zones, so applications need to be aware of where they are located and which Prometheus instance they are supposed to connect to.

Demo

To test out Prometheus we can use minikube to run a local Kubernetes cluster and then the kube-prometheus-stack Helm chart to install Prometheus into the minikube cluster.

Prerequisites: minikube, kubectl, and Helm installed locally.

First, we need to start our minikube cluster. I set the Kubernetes version to v1.26.3 here, which is the version these examples have been tested with.

minikube start --kubernetes-version=v1.26.3

You can validate that you have a connection to your minikube cluster by running:

kubectl cluster-info

Then we want to add the prometheus-community Helm repository and install the kube-prometheus-stack chart into the prometheus namespace.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --create-namespace --namespace prometheus --set grafana.enabled=false --set alertmanager.enabled=false

Notice that we are disabling Grafana and Alertmanager for now as we will be installing them manually in the coming parts.

To test that Prometheus is operational, we can port forward the Prometheus service locally:

kubectl port-forward -n prometheus services/kube-prometheus-stack-prometheus 9090:9090

We can now navigate to http://localhost:9090 to access the Prometheus web UI. Notice that some of the targets are giving errors.
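The same information is available from the Prometheus HTTP API, which can be handy for a quick check from the terminal. The helper below is a small sketch; the function name `prom_query` and the `PROM_URL` variable are made up for this example, and it assumes the port-forward is still running in another terminal.

```shell
# Base URL of the port-forwarded Prometheus; override PROM_URL if yours differs.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: run an instant PromQL query against the HTTP API.
# --get sends the URL-encoded query as a query string parameter.
prom_query() {
  curl -s --get "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

With the port-forward active, `prom_query 'up == 0'` should return the targets whose last scrape failed, which corresponds to the targets shown with errors in the web UI.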

(Screenshot: the Prometheus web UI.)

We have now installed a simple kube-prometheus-stack on our cluster and are ready to tackle the next steps.

Next part: Prometheus Observability Platform: Long-term storage