Anton Goncharov

Distributed Logs and Tracing with Spring, Apache Camel, OpenTelemetry and Grafana: Example

I've built an Apache Camel & OpenTelemetry demo project. It shows how to achieve distributed tracing across Camel routes and other components using OpenTelemetry, the CNCF project formed by merging OpenCensus and OpenTracing into a single specification. The project uses Spring Boot as the base service framework, Loki to store logs, Tempo to store traces, and Grafana for visualization.

GitHub: anton-goncharov / camel-opentelemetry-grafana-demo

An Apache Camel & OpenTelemetry demo project. Uses Spring Boot as a base service framework, Loki as a log storage, Tempo as a trace storage, Grafana as a visualization software.

Introduction

Apache Camel added OpenTelemetry support in the 3.5 release. The official documentation doesn't focus on how to set everything up for log and trace visibility, so this example aims to fill that gap.

OpenTelemetry is an open-source project that resulted from the merger of OpenCensus and OpenTracing. Its purpose is to make applications observable by providing a standardized set of tools for capturing and exporting metrics, logs and traces. It's one of the most active CNCF projects these days.

Using Camel in tandem with OpenTelemetry instrumentation allows us to have distributed tracing across different routes in one or more services, and to link application logs with these traces, which makes it a great alternative to ELK and similar solutions.

This logs & traces view is what we're aiming to achieve in the end:

image / grafana / dashboard

Important note: this example uses Apache Camel 3.10 (the latest version available as of June 2021), which depends on version 0.15.0 of the OpenTelemetry SDK for Java. The OpenTelemetry specification and its tools are developing rapidly and are already at 1.x releases. The Camel tracker shows plans to upgrade the dependency, but for now make sure 0.15.0 is used so that this example runs smoothly.

How it works

The project is a Spring Boot web application named 'hello-service' that serves two endpoints: /hello responds with "Hello, ${name}", and /dispatch passes the request to a downstream service.

The docker-compose config spins up 5 container instances:

  • hello-service (1) (container "dispatch") accepts requests and dispatches them to hello-service (2)
  • hello-service (2) (container "hello") handles the dispatched requests
  • loki stores logs
  • tempo stores traces
  • grafana is a web dashboard that visualizes the Loki and Tempo data

The two interacting instances, hello-service (1) and hello-service (2), demonstrate how OpenTelemetry traces requests across different containers.

The architecture overview:

image / architecture

Both 'hello' services send logs to Loki via loki-docker-driver and traces to Tempo via the OpenTelemetry exporter. Grafana uses Loki and Tempo as data sources. Collected logs are linked to the trace data by using traceId assigned by the OpenTelemetry instrumentation.
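
To make the cross-container part more concrete: with W3C Trace Context propagation, the calling service passes the trace to the downstream service in a traceparent HTTP header, and the downstream span reuses the same trace id. The sketch below parses such a header in plain Java. It assumes the agent is using W3C Trace Context propagation (worth verifying for version 0.15.0); the TraceParent class and the sample header value are illustrative only and are not part of the demo project.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for a W3C Trace Context "traceparent" header value. */
final class TraceParent {
    // Layout: version-traceId-parentSpanId-flags, all lowercase hex.
    private static final Pattern FORMAT =
            Pattern.compile("^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;       // shared by every span of one request
    final String parentSpanId;  // the caller's span
    final boolean sampled;      // lowest bit of the flags byte

    private TraceParent(String traceId, String parentSpanId, boolean sampled) {
        this.traceId = traceId;
        this.parentSpanId = parentSpanId;
        this.sampled = sampled;
    }

    /** Returns the parsed header, or empty if the value is missing or malformed. */
    static Optional<TraceParent> parse(String header) {
        if (header == null) {
            return Optional.empty();
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        // All-zero ids are invalid according to the specification.
        if (m.group(2).equals("0".repeat(32)) || m.group(3).equals("0".repeat(16))) {
            return Optional.empty();
        }
        boolean sampled = (Integer.parseInt(m.group(4), 16) & 0x01) == 1;
        return Optional.of(new TraceParent(m.group(2), m.group(3), sampled));
    }

    public static void main(String[] args) {
        // Sample value only; real headers are generated by the instrumentation.
        String header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
        TraceParent.parse(header).ifPresent(tp ->
                System.out.println("traceId=" + tp.traceId + " parentSpanId=" + tp.parentSpanId));
    }
}
```

The traceId printed here is the same kind of value that shows up in the log lines of both containers, which is what lets Grafana join them into one trace.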

Service

OpenTelemetry instrumentation dynamically captures telemetry from a number of popular Java frameworks. To plug it in, add opentelemetry-javaagent-all.jar as a javaagent of the JAR application. This preparation is done in the Dockerfile.

For the Camel integration it's important that we use opentelemetry-javaagent-all.jar version 0.15.0 as camel-opentelemetry depends on this OTel Java SDK release.

ADD /agent/opentelemetry-javaagent-all.jar /etc/agent/opentelemetry-javaagent-all.jar

ENTRYPOINT ["java", "-javaagent:/etc/agent/opentelemetry-javaagent-all.jar" , "-jar", "observable-service.jar"]

For Camel 3.5+, add the camel-opentelemetry-starter dependency. Then, in a RouteBuilder class, set up the tracer as follows:

OpenTelemetryTracer ott = new OpenTelemetryTracer();
ott.init(this.getContext());

The docker-compose configuration has a few items to highlight.

  • OTEL_EXPORTER_OTLP_ENDPOINT is the endpoint to which the OpenTelemetry data is sent. In our case it's Tempo's receiver.
  • We don't send metrics in this demo, so the metrics exporter is disabled (OTEL_METRICS_EXPORTER=none) to suppress the related warnings.
  • The downstream endpoint (SRV_DOWNSTREAM_ENDPOINT) points to the address of the second service to show how distributed tracing works.
  • The logging section states that the container's stdout should be piped to Loki using the loki-docker-driver.

dispatch-service:
  image: hello-service:1.0.0
  container_name: dispatch
  networks:
    - "tempo-net"
  ports:
    - "8080:8080"
  environment:
    - OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:55680
    - OTEL_EXPORTER_OTLP_INSECURE=true
    - OTEL_METRICS_EXPORTER=none
    - OTEL_RESOURCE_ATTRIBUTES=service.name=dispatcher
    # sends requests to the 'hello-service' container
    - SRV_DOWNSTREAM_ENDPOINT=http://hello:8080/hello
  logging:
    driver: loki
    options:
      loki-url: "http://localhost:3100/loki/api/v1/push"

The second service instance has a similar setup, except that it has no downstream service.

In application.properties, the logging.pattern setting formats log output to include traceId and spanId. This is necessary for the system to index logs and link them to traces.

Loki

Loki is a scalable log aggregation system to use with Grafana.

The Loki config is pretty much default. It exposes port 3100 so all the other docker containers can send logs there.

loki:
  hostname: "loki"
  image: grafana/loki:2.2.0
  ports:
    - "3100:3100"
  networks:
    - "tempo-net"
  command: "-config.file=/etc/loki/local-config.yaml"

Tempo

Tempo is a scalable trace storage. It has receivers for Jaeger, Zipkin and OpenTelemetry.

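The tempo-config.yaml file is mounted from the project. It disables auth and specifies that the Tempo instance should expose an OpenTelemetry receiver. The default OTel port for the versions used here is 55680 (newer OpenTelemetry releases default to 4317 for OTLP over gRPC).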

tempo:
    hostname: "tempo"
    image: grafana/tempo:latest
    networks:
      - "tempo-net"
    volumes:
      - ./tempo-config.yaml:/etc/tempo-config.yaml
    command: "-config.file=/etc/tempo-config.yaml"

Grafana

Grafana is visualization and monitoring software that can connect to a wide variety of data sources and query them.

How to launch

I used Maven 3.8.1 and JDK 11 to create the demo, so it should work with these versions and later.

Steps to launch and try:

1. Prepare Docker

Install the Loki driver for Docker (it streams logs from a container's standard output):

docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions

Create a network for the demo services:

docker network create tempo-net

2. Pack the service with Maven

mvn package

3. Build the container image

docker build . -t hello-service:1.0.0

4. Run the docker-compose script to bring up the demo

docker-compose up

5. Open a web browser and navigate to the Grafana dashboard: http://localhost:3000. Go to the Configuration->Data Sources menu item.

6. Add a new Tempo data source (the storage for traces)

image / tempo

7. Add a new Loki data source (the storage for logs)

image / loki

Configure a "Derived Field" in the Loki data source to link logs to traces

image / loki / derived field
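
As a rough illustration of what the derived field does: Grafana runs a regular expression over each log line and exposes the captured group as a field that links to the Tempo data source. The sketch below reproduces that extraction in plain Java. The pattern traceId=(\w+) and the sample log line are assumptions made for this example; the regex has to match whatever your logging.pattern actually emits.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Mimics a Loki derived field: pull the trace id out of a log line with a regex. */
final class TraceIdExtractor {
    // Hypothetical pattern; adjust it to the format produced by your logging.pattern.
    private static final Pattern TRACE_ID = Pattern.compile("traceId=(\\w+)");

    /** Returns the first captured trace id, or empty if the line has none. */
    static Optional<String> extract(String logLine) {
        Matcher m = TRACE_ID.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up log line in a format that carries traceId and spanId.
        String line = "2021-06-20 10:15:30 INFO traceId=4bf92f3577b34da6a3ce929d0e0e4736 "
                + "spanId=00f067aa0ba902b7 - Dispatching request";
        System.out.println(extract(line).orElse("<no trace id>"));
    }
}
```

If the extraction returns nothing for your real log lines, the derived field in Grafana will not produce a link either, so this is a quick way to check a regex before pasting it into the data source settings.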

8. Send a sample request

curl "http://localhost:8080/dispatch?name=Ada"
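
The same request can be sent with the HttpClient built into JDK 11, which is handy for scripting a few calls to generate traces. This is a minimal sketch, assuming the dispatch container is reachable on localhost:8080 as in the docker-compose file; the DispatchClient class and the dispatchUri helper are hypothetical names, not part of the demo project.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Sends a sample request to the /dispatch endpoint of the demo service. */
final class DispatchClient {

    /** Builds the /dispatch URI, URL-encoding the name query parameter. */
    static URI dispatchUri(String baseUrl, String name) {
        String encoded = URLEncoder.encode(name, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/dispatch?name=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(dispatchUri("http://localhost:8080", "Ada"))
                .GET()
                .build();
        // Requires the docker-compose stack to be running.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```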

9. Once everything above is set up and executed, browse to the Grafana UI at http://localhost:3000. Open the "Explore" view and switch to the Loki view using the dropdown at the top.

Searching for {container_name="dispatch"} will output the recent logs for the hello-service (1) running in the dispatch container.

Note that each log entry now has a traceId, a global identifier that tracks a request across interacting services, and a spanId that identifies a local unit of work (e.g. an individual route or a handler within a service).

Click on an entry that has a traceId to expand its details. Next to the "traceId" field there's a button that takes us to the corresponding trace.

image / grafana / log entry

Voilà, we've arrived at the distributed trace for this request. Each span has its corresponding log entries grouped under the "Logs" box on this panel.

image / grafana / trace

The combined split view in Grafana allows browsing the logs and traces side by side.

image / grafana / dashboard



Top comments (3)

Alex Almeida

Hi Anton,

Really nice article, well detailed.

I'm trying to create an example in RoR, and I've downloaded the OTel example from Grafana's GitHub. But when I try to send my spans to the OTel collector, I get an error. I guess it happens because I'm sending to the wrong port.

In this article you have set OTEL_EXPORTER_OTLP_ENDPOINT=tempo:55680

So you don't need to send the spans to the OTel collector; you can send them directly to the Tempo instance, is that right?

Thx!

moose

Really good article. I'm going to check on using this with Gradle. I could use something like this.

Anton Goncharov

Thanks!