Caio Campos Borges Rosa for Woovi

Observability with Elasticsearch, Kibana, and APM

Why do we invest in Observability at Woovi?

At Woovi we focus on speed and innovation. In highly dynamic distributed systems, troubleshooting can be challenging, and valuable time is easily wasted on debugging. Observability addresses this pain point by providing insight into data and processes, minimizing debugging costs and enabling more accurate planning. It gives us visibility into the inner workings of our systems, helping us identify issues promptly and efficiently while freeing up time to build new and innovative solutions.

Elasticsearch

Elasticsearch is a versatile and scalable search and analytics engine designed to handle large volumes of data. It provides near-real-time search capabilities and supports complex querying, making it an ideal solution for a wide range of use cases. Built on top of the Apache Lucene library, Elasticsearch uses a distributed architecture to ensure high availability, fault tolerance, and scalability. It stores data in the form of JSON documents, allowing for efficient indexing and retrieval. It also offers features such as automatic sharding, replication, and distributed document storage, making it suitable for both small-scale applications and enterprise-level deployments.
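To make that concrete, here is a minimal sketch (not from the original setup) of indexing and searching a JSON document with the official @elastic/elasticsearch Node.js client; the payments index, its fields, and the local cluster address are hypothetical:

import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' }); // assumption: local cluster

const main = async () => {
  // Store a JSON document; Elasticsearch indexes every field for search.
  await client.index({
    index: 'payments',
    document: { pixKey: 'abc-123', amountInCents: 1050, createdAt: new Date().toISOString() },
    refresh: true, // make the document searchable immediately (handy for demos)
  });

  // Query it back with a structured search.
  const result = await client.search({
    index: 'payments',
    query: { match: { pixKey: 'abc-123' } },
  });

  console.log(result.hits.hits);
};

main();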

APM

The APM (Application Performance Monitoring) module of the Elastic Stack provides insights into the performance of your applications and services, storing the collected data in Elasticsearch. APM works by instrumenting your application code and capturing detailed information about transactions, spans, and errors.

To use APM, you need to integrate an APM agent into your application. Agents are available for various programming languages and frameworks. Once integrated, the agent automatically collects performance data from your application and sends it to an APM Server, which stores it in the Elasticsearch cluster.
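For a Node.js service, starting the agent is a one-time setup at the very top of the entry point, before the rest of the application is loaded. A minimal sketch, assuming the elastic-apm-node agent and placeholder connection settings:

import apm from 'elastic-apm-node';

// The agent must be started before the rest of the application is imported,
// so it can instrument frameworks (Koa, HTTP, Redis, etc.) automatically.
apm.start({
  serviceName: 'woovi-server',                       // hypothetical service name
  serverUrl: process.env.ELASTIC_APM_SERVER_URL,     // APM Server endpoint
  secretToken: process.env.ELASTIC_APM_SECRET_TOKEN, // auth token for the APM Server
  environment: process.env.NODE_ENV,
});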

Here we use an event-driven architecture, so our resources are divided into servers and workers. One example of an APM setup on the server side is labeling the body of each request.

import apm from 'elastic-apm-node';

// Koa middleware: attach the raw request body to the transaction as a label
// so every request can be searched by its payload later.
app.use(async (ctx, next) => {
  ctx.req.body = ctx.request.body;

  apm.setLabel('requestBody', ctx.request.rawBody);

  await next();
});

This is the simplest form of indexing with APM: you set a label and a value for something you want to monitor, giving you a well-structured stream of data to search. In our case, we label the request bodies of all our API endpoints so we can take a data-driven approach to debugging in production. Out of the box, we also get request, latency, throughput, and error statistics for every endpoint.

Request Statistics

The APM module in Elasticsearch also supports distributed tracing, which allows you to follow a request's journey across multiple services and systems, providing insights into the end-to-end performance and metadata of your application.

Request trace
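The trace context can be carried across our queues as well. The sketch below is an assumption about how that could look with elastic-apm-node and Bull (the queue name, payload, and handler are hypothetical): the producer stores the current traceparent in the job data, and the worker starts its transaction as a child of it, so the HTTP request and the background job appear as a single trace.

import apm from 'elastic-apm-node';
import Queue from 'bull';

const chargeQueue = new Queue('charges'); // hypothetical queue

// Producer: attach the W3C trace context of the active transaction to the job.
export const enqueueChargeCreated = async (chargeId: string) => {
  const traceparent = apm.currentTraceparent;
  await chargeQueue.add('CHARGE_CREATED', { chargeId, traceparent });
};

// Worker: continue the same distributed trace.
chargeQueue.process('CHARGE_CREATED', async (job) => {
  const transaction = apm.startTransaction('CHARGE_CREATED', 'worker', {
    childOf: job.data.traceparent ?? undefined,
  });

  try {
    // ...handle the event here...
  } finally {
    transaction?.end();
  }
});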

Kibana

Kibana is a powerful data visualization and exploration tool that works in conjunction with Elasticsearch. It provides a user-friendly web interface for searching, analyzing, and visualizing data stored in Elasticsearch, making it easier to understand and derive insights from large datasets. Alongside APM, it is a powerful tool for visualization and live monitoring.

A great approach to application data visualization is to understand the team's needs and the data profile before creating labels. That way we ensure not only that we have all the data we need, but also that we keep storage under control, since our indexes take up space and add to infrastructure costs.

One of the problems Kibana and APM solve for us at Woovi is on the worker side of our application: how do we debug events across multiple queues, with multiple data profiles, as fast as possible and with confidence in the data? We introduced a label called JobOriginator. It is set on every job created by our event emitters, so we can see where an event was originally created and, from that, query the data for that event at specific points in its lifetime.

import apm from 'elastic-apm-node';
import { Job, JobOptions, Queue } from 'bull';

export const createJobBull = async <T extends Record<string, any> = any>(
  name: string,
  data?: T,
  options?: JobOptions,
  originator?: string,
  queue: Queue = queues.DEFAULT, // queues is our registry of Bull queues
): Promise<Job<T>> => {
  // Wrap job creation in a span so it shows up in the APM trace.
  const span = apm.startSpan('createJobBull');

  if (span) {
    span.setLabel('jobName', name);

    if (data) {
      span.setLabel('jobData', JSON.stringify(data));
    }

    if (originator) {
      span.setLabel('JobOriginator', originator);
    }

    if (options) {
      span.setLabel('jobOptions', JSON.stringify(options));
    }

    span.end();
  }

  return queue.add(name, data, options);
};

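For context, a call to this helper could look like the following (the job name, payload, and originator are hypothetical):

await createJobBull(
  'SEND_RECEIPT_EMAIL',         // job name
  { transactionId: 'abc-123' }, // job data
  { attempts: 3 },              // Bull JobOptions
  'transaction-completed',      // JobOriginator: where the event was emitted
);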

Now we can search live, using the labels we set to query specific cases by origin, data, resource, and time.
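In the Kibana UI this is typically just a KQL filter such as labels.JobOriginator : "payment-created". The same search can also be run programmatically; here is a minimal sketch with the official client, assuming the default APM data streams of an 8.x stack and a hypothetical originator value:

import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' }); // assumption: local cluster

// Find every span created by a given originator in the last 15 minutes.
export const findSpansByOriginator = async (originator: string) => {
  const result = await client.search({
    index: 'traces-apm*', // default APM data stream pattern in 8.x (assumption)
    query: {
      bool: {
        filter: [
          { term: { 'labels.JobOriginator': originator } },
          { range: { '@timestamp': { gte: 'now-15m' } } },
        ],
      },
    },
  });

  return result.hits.hits;
};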

Query by JobOriginator

If you want to work with us, we are hiring!
