Say you run one app, or several, and all of them emit logs. How do you know what is happening inside them?
There are usually two approaches:
- Logging: collects logs from multiple applications in one place so you can search them and extract insights. It is the older approach and remains useful.
- Tracing: follows individual requests through your applications and shows where the time is spent; you can build accurate metrics on top of traces for monitoring and alerting.
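To make the difference concrete, here is a minimal sketch using only Python's standard library. The `log` and `span` helpers and the field names (`trace_id`, `duration_ms`) are hypothetical and chosen for illustration; real tracing libraries define their own formats.

```python
import json
import time
import uuid
from contextlib import contextmanager

finished_spans = []  # stand-in for a tracing backend


def log(message, **fields):
    """Logging: record that something happened, as one JSON line."""
    line = json.dumps({"ts": time.time(), "message": message, **fields})
    print(line)
    return line


@contextmanager
def span(name, trace_id):
    """Tracing: record how long an operation took within a request."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        finished_spans.append(
            {"trace_id": trace_id, "name": name, "duration_ms": duration_ms}
        )


trace_id = uuid.uuid4().hex  # one ID shared by every step of a request
with span("checkout", trace_id):
    log("payment accepted", trace_id=trace_id, order_id=42)
```

The log line tells you what happened; the span tells you how long it took and which request it belonged to. Sharing the same `trace_id` is what lets a tool show both side by side.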
Some tools handle logging, some handle tracing, and some do both!
Here are the open-source tools for logs and tracing that you should know:
1. Quickwit (Logs and Tracing)
Quickwit is an open-source, distributed search engine designed for log management and analytics at large scale.
Quickwit is a direct alternative to Elasticsearch with improved performance, especially in cloud-native and large-scale distributed environments, and it focuses on optimizing storage and search efficiency.
Usually, you would use a tool like OpenTelemetry, Fluent Bit, or Odigos (an auto-instrumentation tracing tool) to collect logs and traces, send them to Quickwit, and then visualize them with Jaeger (traces) or Grafana (logs and traces).
Fun fact: both Elasticsearch and Kibana dropped their Apache 2.0 license for more restrictive ones (the Elastic License and SSPL), which drew massive backlash from the community.
Quickwit is licensed under AGPL v3, which is much friendlier to the FOSS (free and open-source software) community.
⭐️ Support the open-source community: star Quickwit ⭐️
2. Grafana (Logs and Traces)
Grafana, combined with Loki and Tempo, is an open-source alternative to the ELK stack. For logs and traces, you have to set up two backends: Loki (logs) and Tempo (traces), both maintained by Grafana Labs.
Once you have ingested your logs and traces into Loki and Tempo, you need a visualization tool to search your data: that's where Grafana comes in!
Grafana allows you to query, visualize, alert, and understand your metrics no matter where they are stored. Create, explore, and share dashboards with your team and foster a data-driven culture.
3. Odigos (Tracing)
Odigos generates traces for applications running in Kubernetes (k8s) without any code changes. Those traces can then be forwarded to a backend such as Quickwit or Elasticsearch (Odigos has many more integrations).
In case you don't know, OpenTelemetry is an open standard for generating, collecting, and exporting telemetry such as logs and traces; it includes a wire protocol called OTLP. Odigos follows the standard, so you can send your traces to any backend that supports OpenTelemetry!
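To give a feel for what "supporting OpenTelemetry" means on the wire, here is a rough sketch of a single span in the JSON encoding of OTLP, built with Python's standard library only. The service name, span name, and scope name are hypothetical, and the sketch only builds the payload; in practice an SDK or an agent like Odigos produces and sends it for you.

```python
import json
import secrets
import time


def build_otlp_trace_payload(service_name, span_name, duration_ms):
    """Build one span in the OTLP JSON shape: resourceSpans > scopeSpans > spans."""
    end_ns = time.time_ns()
    start_ns = end_ns - int(duration_ms * 1_000_000)
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-example"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 2,  # SPAN_KIND_SERVER
                    # 64-bit integers are encoded as strings in the JSON form
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


payload = build_otlp_trace_payload("checkout-service", "GET /cart", duration_ms=12.5)
body = json.dumps(payload)
# OTLP over HTTP conventionally uses port 4318 and the path /v1/traces;
# the exact URL depends on your collector or backend, so check its docs.
print(body[:80])
```

Because every compliant backend accepts this same shape, switching from one trace store to another is mostly a matter of changing the destination URL.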
5. Jaeger (Tracing)
Unlike Prometheus, Jaeger focuses on tracing.
Jaeger supports the propagation of context information across a distributed system, ensuring that trace data is correctly associated across the network of services.
Jaeger on its own is not designed to store high volumes of data, so you have to pair it with a powerful storage engine like Quickwit or Elasticsearch. In such a setup, Jaeger can scale with your services, making it suitable for both small- and large-scale systems.
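The context propagation mentioned above can be sketched in a few lines. This example assumes the W3C Trace Context `traceparent` header (`version-traceid-spanid-flags`), which OpenTelemetry SDKs use by default; whether your own Jaeger setup uses this header or another format depends on the client library, so treat that part as an assumption. The helper names are hypothetical.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent():
    """Start a new trace: version 00, random trace and span IDs, sampled flag 01."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(incoming):
    """Continue an incoming trace: keep the trace ID, mint a new span ID."""
    match = TRACEPARENT_RE.match(incoming)
    if match is None:
        return new_traceparent()  # malformed header: start a fresh trace
    version, trace_id, _parent_span_id, flags = match.groups()
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


# Service A starts the trace and calls service B with this header.
header_from_a = new_traceparent()
# Service B continues the same trace when it calls service C.
header_from_b = child_traceparent(header_from_a)
print(header_from_a)
print(header_from_b)
```

The trace ID stays the same across every hop while each service adds its own span ID, which is how a tracing backend stitches spans from different services into one request timeline.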
6. SigNoz (Logs and Tracing)
SigNoz offers log and trace management.
You can visualize traces and logs in a single pane of glass.
You can find the root cause of a problem by drilling into the exact traces behind it and inspecting detailed flame graphs of individual requests.
7. Keep (Alerts)
Keep connects to all your current observability tools, databases, and communication channels and aggregates everything into one platform, providing top-level alerting when something goes wrong.
8. Uptrace (Logs and Tracing)
Uptrace is an OpenTelemetry-based observability platform that ingests logs and traces. You can monitor your application and search your logs.
Thanks to its OpenTelemetry support and its many integrations, it's easy to collect your data and send it to Uptrace. Note that you will need to set up PostgreSQL and ClickHouse databases.
9. HyperDX (Logs and Tracing)
HyperDX is an open-source observability platform that allows you to search logs and analyze your traces. You can debug complex errors and user issues in one place, without jumping between multiple tools.
10. Prometheus (Metrics)
Interestingly enough, this project is named after the movie Prometheus. I am kidding, of course, but that was my first assumption (it's a good movie anyway).
While it might look like Prometheus and Elasticsearch are similar, they are actually very different.
Prometheus focuses only on metrics from your infrastructure (think CPU, memory usage, disk usage…) and is not well suited to high-cardinality metrics. Quickwit is more focused on logs and traces; Elasticsearch can do logs, traces, and metrics!
In practice, these tools tend to work hand in hand.
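On the high-cardinality point: Prometheus stores one time series per unique combination of label values, so the series count is the product of the number of values each label can take. The sketch below, with a hypothetical metric name and hypothetical labels, shows how quickly that grows once a label like a user ID is added.

```python
from itertools import product

# Hypothetical label values for one metric, http_requests_total.
labels = {
    "method": ["GET", "POST"],
    "status": ["200", "404", "500"],
}


def series_for(metric, labels):
    """List every time series (one per label combination) in Prometheus text notation."""
    names = sorted(labels)
    series = []
    for values in product(*(labels[name] for name in names)):
        pairs = ",".join(f'{name}="{value}"' for name, value in zip(names, values))
        series.append(metric + "{" + pairs + "}")
    return series


low = series_for("http_requests_total", labels)
print(len(low))  # 2 methods x 3 statuses = 6 series
print(low[0])    # http_requests_total{method="GET",status="200"}

# Adding a per-user label multiplies the series count by the number of users.
labels_with_user = {**labels, "user_id": [str(i) for i in range(10_000)]}
print(len(series_for("http_requests_total", labels_with_user)))  # 60000 series
```

This is why per-request details such as user IDs usually belong in logs or traces rather than in metric labels.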
Prometheus ships with a bare-bones UI, which is fine, but it pairs best with Grafana dashboards.
Prometheus also has its own query language, called PromQL (Prometheus Query Language).
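As a small illustration of PromQL, the sketch below builds the URL for an instant query against the Prometheus HTTP API (`/api/v1/query`). The server address, the metric name `http_requests_total`, and its `status` label are assumptions for the example; the code only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

PROMETHEUS_URL = "http://localhost:9090"  # assumption: a local server on the default port


def instant_query_url(promql, base_url=PROMETHEUS_URL):
    """Build the URL for the Prometheus instant-query HTTP endpoint."""
    params = urlencode({"query": promql})
    return f"{base_url}/api/v1/query?{params}"


# Per-second request rate over the last 5 minutes, summed per status code.
# http_requests_total and its status label are hypothetical names.
query = "sum by (status) (rate(http_requests_total[5m]))"
url = instant_query_url(query)
print(url)
```

The same query string can be pasted into the Prometheus UI or into a Grafana panel, which is usually the more convenient way to explore it.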
Let's connect on X? :)
I'm here
Do you use some other excellent tools for logging and tracing?
Let me know about them in the comments :)
Top comments (32)
Thanks for mentioning SigNoz @nevodavid
Just to add, SigNoz also supports metrics. Our key value prop is to have metrics, logs and traces in a single application.
Here's our GitHub repo link, in case anyone is going through this: github.com/signoz/signoz
Awesome! Nice to know.
Thanks for the overview! Magic happens when you integrate Quickwit + Grafana and Jaeger!
True dat!
Awesome!
So happy I managed to help!
Thank you for mentioning Keep, @nevodavid!
Thank you for building Keep!
Ooh! Great tools in here!!
Thank you so much!!
Nice article !!
great list
❤️
Super cool tools - thanks for sharing!
Thank you for reading!
Great list!
Thank you so much!
What about #4? Also, what are your thoughts on how integration/e2e testing and detailed analysis of code coverage can play a part in this? It won't help with observing a production system (so maybe it's not related to your article at all), but it can tell you how your code is interacting with other systems and ensure nothing changes about those assumptions. I'm wondering how the tools you're mentioning can play a part in analyzing a run of a test suite, by viewing the captured behavior afterwards.
Great List!
As someone who works on a legacy project, it's so important to know what's going on. I believe adding observability into legacy applications is one of the first steps to improve the application so you can refactor at a later time.