Naineel Soyantar

Enhancing Cyclops: Adding Cache Metrics to Prometheus – My First OSS Contribution

Hola, mis amigos!

Recently, I came across Cyclops, a developer-friendly Kubernetes tool.

What is Cyclops?

Cyclops is a tool for managing Kubernetes (K8s) clusters and offers a GUI solution for developers to deploy and manage these clusters, making the management of containerized applications super easy! It also allows developers and DevOps teams to create custom Helm charts for deploying any application with specific requirements.

Prerequisites for using Cyclops

Cyclops is an easy-to-use tool, but it helps to know a bit about the technologies it builds on and the problems it solves:

  • Kubernetes - for container orchestration of your application.
  • Docker - for containerization of your application.
  • Minikube - for managing the local Kubernetes cluster.

Good First Issues on Cyclops GitHub

As a newcomer to Cyclops, I wanted to make a meaningful contribution. While browsing the repository, I came across a "good first issue" that seemed like a perfect starting point. I followed CONTRIBUTING.md to set up Cyclops locally, and after searching through the issues I found a good one to work on (this).

The Task

Task Description

The task was to extend the existing Prometheus monitor to include cache metrics from the templateCache utilization.

I began by understanding the requirements and the codebase: I looked for the application's entry point and built up a picture of its workflow.

The ristretto.Cache is created in the internal > template > cache > inmemory.go file. The first step was to set the Metrics field to true when initializing the cache, which ensures that ristretto starts collecting cache metrics.
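In ristretto, this is a flag on the config passed to ristretto.NewCache. A minimal sketch of what that initialization looks like (the NumCounters, MaxCost, and BufferItems values here are illustrative, not the actual values used in inmemory.go):

```go
cache, err := ristretto.NewCache(&ristretto.Config{
    NumCounters: 1e7,     // number of keys to track frequency of (illustrative)
    MaxCost:     1 << 30, // maximum cost of cache (illustrative)
    BufferItems: 64,      // number of keys per Get buffer (illustrative)
    Metrics:     true,    // the change: enable collection of cache metrics
})
if err != nil {
    panic(err)
}
```

With Metrics set to true, the cache exposes a Metrics field whose counters (hits, misses, evictions, and so on) we can read later.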

The next step was to include the cache metrics in the existing monitor:

m := Monitor{
    ModulesDeployed: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "modules_deployed",
        Help:      "No of modules Inc or Dec",
        Namespace: "cyclops",
    }),
}

I identified a few key metrics provided by ristretto and included them in the monitor as Gauge metrics. A Gauge represents a single numerical value that can go up and down arbitrarily, making it a good choice for the cache metrics.

The monitor now contained:

m := Monitor{
    ModulesDeployed: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "modules_deployed",
        Help:      "No of modules Inc or Dec",
        Namespace: "cyclops",
    }),
    CacheHits: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "cache_hits",
        Help:      "No of cache hits",
        Namespace: "cyclops",
    }),
    CacheMisses: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "cache_misses",
        Help:      "No of cache misses",
        Namespace: "cyclops",
    }),
    CacheKeysAdded: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "cache_keys_added",
        Help:      "No of cache keys added",
        Namespace: "cyclops",
    }),
    CacheCostAdded: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "cache_cost_added",
        Help:      "No of cache cost added",
        Namespace: "cyclops",
    }),
    CacheKeysEvicted: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "cache_keys_evicted",
        Help:      "No of cache keys evicted",
        Namespace: "cyclops",
    }),
    CacheCostEvicted: prometheus.NewGauge(prometheus.GaugeOpts{
        Name:      "cache_cost_evicted",
        Help:      "No of cache cost evicted",
        Namespace: "cyclops",
    }),
}

Then I created a slice of the metrics and looped over it to register each one with the Prometheus registry, which stores the metrics during the controller's runtime:

metricsList := []prometheus.Collector{
    m.ModulesDeployed,
    m.CacheHits,
    m.CacheMisses,
    m.CacheKeysAdded,
    m.CacheCostAdded,
    m.CacheKeysEvicted,
    m.CacheCostEvicted,
}

for _, metric := range metricsList {
    if err := metrics.Registry.Register(metric); err != nil {
        logger.Error(err, "unable to connect prometheus")
        return Monitor{}, err
    }
}

Next, I created an update method on the monitor that copies the cache's current metrics into the Prometheus gauges, so the latest values are exposed on the /metrics route.

func (m *Monitor) UpdateCacheMetrics(cache *ristretto.Cache) {
    cacheMetrics := cache.Metrics

    m.CacheHits.Set(float64(cacheMetrics.Hits()))
    m.CacheMisses.Set(float64(cacheMetrics.Misses()))
    m.CacheKeysAdded.Set(float64(cacheMetrics.KeysAdded()))
    m.CacheCostAdded.Set(float64(cacheMetrics.CostAdded()))
    m.CacheKeysEvicted.Set(float64(cacheMetrics.KeysEvicted()))
    m.CacheCostEvicted.Set(float64(cacheMetrics.CostEvicted()))
}

Now, as per the requirements, the only thing remaining was to write a cron-style job that periodically updates the cache metrics in the Prometheus monitor at a fixed interval. If you don't know about cron jobs, here's an amazing article to understand this interesting topic.

func StartCacheMetricsUpdater(m *Monitor, cache *ristretto.Cache, interval time.Duration, logger logr.Logger) {
    go func() {
        ticker := time.NewTicker(interval)
        defer ticker.Stop()

        logger.Info("starting cache metrics updater")

        for range ticker.C {
            m.UpdateCacheMetrics(cache)
        }
    }()
}

The StartCacheMetricsUpdater function takes a pointer to the Prometheus monitor, a pointer to the cache, a time interval, and a logger. It starts a goroutine that, on every tick of the interval, calls the UpdateCacheMetrics method to refresh the cache metrics in the Prometheus monitor.

Finally, in the main.go file, I called StartCacheMetricsUpdater to kick off the periodic updating of cache metrics in the Prometheus monitor.

Before raising a PR for this feature, I thoroughly tested the code to ensure it didn't break any existing functionality. This might be the most important step when contributing.

Conclusion

Contributing to the Cyclops project by adding cache metrics to Prometheus was a rewarding experience. It allowed me to deepen my understanding of both Cyclops and Prometheus while making an improvement to the tool. I learned the importance of thoroughly understanding the existing codebase, strictly following the contribution guidelines, and testing changes rigorously.

And I'll sign off now. See you soon!
