OpenTelemetry is a Cloud Native Computing Foundation (CNCF) incubating project aimed at standardizing the way we instrument applications for generating telemetry data (logs, metrics, and traces). It provides a vendor-agnostic observability framework: a set of tools, APIs, and SDKs to instrument applications.
Modern-day software systems are built with many pre-built components taken from the open source ecosystem like web frameworks, databases, HTTP clients, etc. Instrumenting such a software system in-house can pose a huge challenge. And using vendor-specific instrumentation agents can create a dependency. OpenTelemetry provides an open source instrumentation layer that can instrument various technologies like open source libraries, programming languages, databases, etc.
With OpenTelemetry instrumentation, you can collect four telemetry signals:
- Traces help track user requests across services.
- Metrics are measurements at specific timestamps to monitor performance like server response times, memory utilization, etc.
- Logs are text records (structured or unstructured) containing information about activities and operations within an operating system, application, server, etc.
- Baggage helps to pass information with context propagation between two process boundaries.
Before diving deep into OpenTelemetry logs, let's take a brief look at OpenTelemetry itself.
OpenTelemetry provides instrumentation libraries for your application. The development of these libraries is guided by the OpenTelemetry specification. The OpenTelemetry specification describes the cross-language requirements and design expectations for all OpenTelemetry implementations in various programming languages.
OpenTelemetry libraries can be used to generate logs, metrics, and traces. You can then collect these signals with OpenTelemetry Collector or send them to an observability backend of your choice. In this article, we will focus on OpenTelemetry logs. OpenTelemetry can be used to collect log data and process it. Let's explore OpenTelemetry logs further.
Among the three main telemetry signals (logs, metrics, and traces), logs have the longest legacy. A log usually captures an event and stores it as a text record. Developers use log data to debug applications. There are many types of log data:
- Application logs contain information about events that have occurred within a software application.
- System logs contain information about events that occur within the operating system itself.
- Network logs: devices in the networking infrastructure provide various logs based on their network activity.
- Web server logs: popular web servers like Apache and NGINX produce log files that can be used to identify performance bottlenecks.
Most of these logs are either computer generated or generated by using some well-known logging libraries. Most programming languages have built-in logging capabilities or well-known logging libraries. For OpenTelemetry logs to be successful, OpenTelemetry aims to provide integrations for existing libraries while providing improvements and integrations with other observability signals.
OpenTelemetry takes a different approach to logs than it does to traces and metrics. To be successful, it needs to support the existing legacy of logs and logging libraries, and this is the main design philosophy of OpenTelemetry logs. But it is not limited to this: over time, OpenTelemetry aims to integrate logs better with other signals.
For a robust observability framework, all telemetry signals should be easily correlated. But most application owners have disparate tools to collect each telemetry signal and no way to correlate the signals. Current logging solutions don’t support integrating logs with other observability signals.
Existing logging solutions also don’t have any standardized way to propagate and record the request execution context. Without request execution context, collected logs are disconnected sets of records from different components of a software system. Having contextual log data helps draw quicker insights. OpenTelemetry aims to collect log data with request execution context and to correlate it with other observability signals.
OpenTelemetry provides various receivers and processors for collecting first-party and third-party logs directly via OpenTelemetry Collector or via existing agents such as FluentBit so that minimal changes are required to move to OpenTelemetry for logs.
First-party applications are built in-house and use existing logging libraries. The logs from these applications can be pushed to OpenTelemetry with little to no changes in the application code. OpenTelemetry provides a trace_parser with which you can add context IDs to your logs to correlate them with other signals.
In OpenTelemetry, there are two important context IDs for context propagation.
- Trace ID: A trace is a complete breakdown of a transaction as it travels through different components of a distributed system. Each trace gets a trace ID that helps to correlate all events connected with a single trace.
- Span ID: A trace consists of multiple spans. Each span represents a single unit of logical work in the trace data. Spans have span IDs that are used to represent the parent-child relationship.
Correlating your logs with traces can help drive deeper insights. If you don’t have request context like traceId and spanId in your logs, you might want to add them for easier correlation with metrics and traces.
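As a rough illustration of what "adding request context to your logs" can look like, here is a minimal sketch using only Python's standard logging module. The logger name, the field layout, and the randomly generated IDs are assumptions made for the example; in a real service the IDs would come from the active span of your tracing SDK rather than from a random generator.

```python
import io
import logging
import secrets


class TraceContextFilter(logging.Filter):
    """Attach a trace ID and span ID to every record passing through."""

    def __init__(self, trace_id: str, span_id: str) -> None:
        super().__init__()
        self.trace_id = trace_id
        self.span_id = span_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = self.trace_id
        record.span_id = self.span_id
        return True  # never drop the record, only enrich it


def build_logger(stream, trace_id: str, span_id: str) -> logging.Logger:
    """Return a logger whose lines carry trace_id and span_id fields."""
    logger = logging.getLogger(f"checkout.{span_id}")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
    ))
    logger.addHandler(handler)
    logger.addFilter(TraceContextFilter(trace_id, span_id))
    return logger


# Random stand-ins: a 16-byte trace ID and an 8-byte span ID, hex-encoded.
trace_id = secrets.token_hex(16)
span_id = secrets.token_hex(8)

buffer = io.StringIO()
log = build_logger(buffer, trace_id, span_id)
log.info("payment authorized")
line = buffer.getvalue().strip()
print(line)
```

Once every log line carries these two fields, a log backend can look up the matching trace and span for any given line.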
There are two ways to collect application logs:
Via File or Stdout Logs
Here, the logs of the application are directly collected by the OpenTelemetry Collector using receivers like the filelog receiver. Operators and processors are then used to parse them into the OpenTelemetry log data model.
For advanced parsing and collecting capabilities, you can also use something like FluentBit or Logstash. The agents can push the logs to the OpenTelemetry collector using protocols like FluentForward/TCP/UDP, etc.
Directly to OpenTelemetry Collector
In this approach, you modify the logging library used by the application to use the logging SDK provided by OpenTelemetry and forward the logs directly from the application to OpenTelemetry. This approach removes the need for agents or any intermediary medium but loses the simplicity of having the log file locally.
Logs emitted by third-party applications running on the system are known as third-party application logs. The logs are typically written to stdout, files, or other specialized mediums, for example, Windows event logs for applications.
These logs can be collected using the OpenTelemetry filelog receiver and then processed.
OpenTelemetry provides operators to process logs. An operator is the most basic unit of log processing. Each operator fulfills a single responsibility, such as adding an attribute to a log field or parsing JSON from a field. Operators are then chained together in a pipeline to achieve the desired result.
For example, a user may parse log lines using regex_parser and then use trace_parser to parse the spanId from the logs.
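The chaining idea can be sketched in plain Python. This is only a conceptual illustration, not the Collector's implementation: the two functions borrow the names regex_parser and trace_parser to show the "one responsibility per operator" pattern, and the log line layout and the dictionary shape of an entry are assumptions made for the example.

```python
import re
from typing import Any, Callable, Dict, List

Entry = Dict[str, Any]
Operator = Callable[[Entry], Entry]

# Hypothetical line layout: "<level> trace_id=<hex> span_id=<hex> <message>"
LINE_RE = re.compile(
    r"^(?P<level>\w+) trace_id=(?P<trace_id>[0-9a-f]{32}) "
    r"span_id=(?P<span_id>[0-9a-f]{16}) (?P<message>.*)$"
)


def regex_parse(entry: Entry) -> Entry:
    """Split the raw body into named attributes (the regex_parser idea)."""
    match = LINE_RE.match(str(entry["body"]))
    if match is None:
        return entry  # leave unparseable lines untouched
    attributes = dict(entry.get("attributes", {}))
    attributes.update(match.groupdict())
    return {**entry, "attributes": attributes}


def trace_parse(entry: Entry) -> Entry:
    """Copy trace fields from attributes to top-level fields (the trace_parser idea)."""
    attributes = entry.get("attributes", {})
    promoted = {
        key: attributes[key]
        for key in ("trace_id", "span_id")
        if key in attributes
    }
    return {**entry, **promoted}


def run_pipeline(entry: Entry, operators: List[Operator]) -> Entry:
    """Feed the entry through each operator in order."""
    for operator in operators:
        entry = operator(entry)
    return entry


raw = {
    "body": "INFO trace_id=4bf92f3577b34da6a3ce929d0e0e4736 "
            "span_id=00f067aa0ba902b7 payment authorized"
}
parsed = run_pipeline(raw, [regex_parse, trace_parse])
print(parsed["trace_id"], parsed["span_id"])
```

Each function does one job, so reordering or swapping steps only means changing the list passed to run_pipeline, which mirrors how operators are composed in a pipeline.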
OpenTelemetry also provides processors for processing logs. Processors are used at various stages of a pipeline. Generally, a processor pre-processes data before it is exported (e.g. modify attributes or sample) or helps ensure that data makes it through a pipeline successfully (e.g. batch/retry).
Processors are also helpful when you have multiple receivers for logs and you want to parse or transform the logs collected from all the receivers. Some well-known log processors are:
- Batch Processor
- Memory Limiter Processor
- Attributes Processor
- Resource Processor
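To give a feel for what batching buys you, here is a small Python sketch of the general technique: records are buffered and handed to an exporter in groups instead of one at a time. It is a simplified model under stated assumptions, not the Batch Processor's actual code; the class name, the export callback, and the size-only flush rule are all hypothetical, and a timeout-based flush is left out for brevity.

```python
from typing import Callable, List


class Batcher:
    """Buffer log records and hand them to `export` in groups."""

    def __init__(self, export: Callable[[List[str]], None], max_batch_size: int) -> None:
        self._export = export
        self._max_batch_size = max_batch_size
        self._buffer: List[str] = []

    def add(self, record: str) -> None:
        """Queue one record and flush as soon as the batch is full."""
        self._buffer.append(record)
        if len(self._buffer) >= self._max_batch_size:
            self.flush()

    def flush(self) -> None:
        """Export whatever is buffered, then start a fresh buffer."""
        if self._buffer:
            self._export(self._buffer)
            self._buffer = []


exported: List[List[str]] = []
batcher = Batcher(export=exported.append, max_batch_size=3)
for i in range(7):
    batcher.add(f"log line {i}")
batcher.flush()  # push out the partial final batch
print([len(batch) for batch in exported])  # prints [3, 3, 1]
```

Seven records leave the process in three export calls instead of seven, which is the main reason batching is commonly placed just before an exporter.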
OpenTelemetry provides instrumentation for generating logs. You then need a backend for storing, querying, and analyzing your logs. SigNoz, a full-stack open source APM, is built to support OpenTelemetry natively. It uses ClickHouse, a columnar database, for storing logs effectively. Big companies like Uber and Cloudflare have shifted to ClickHouse for log analytics.
The logs tab in SigNoz has advanced features like a log query builder, search across multiple fields, structured table view, JSON view, etc.
You can also view logs in real-time with live tail logging.
With the advanced Log Query Builder, you can filter logs quickly by mixing and matching fields.
SigNoz can be installed on macOS or Linux computers in just three steps by using a simple install script.
The install script automatically installs Docker Engine on Linux. However, on macOS, you must manually install Docker Engine before running the install script.
    git clone -b main https://github.com/SigNoz/signoz.git
    cd signoz/deploy/
    ./install.sh
You can visit our documentation for instructions on how to install SigNoz using Docker Swarm and Helm Charts.
You can also check out the documentation for logs here.
System logs are the logs generated by the operating system. With SigNoz, you can collect your syslog logs and run queries on top of them. The OpenTelemetry Collector provides a syslogreceiver to receive system logs.
In this example, we will configure rsyslog to forward our system logs to a TCP endpoint of the OpenTelemetry Collector and use the syslog receiver in the collector to receive and parse the logs. Below are the steps to collect syslogs. Note that you need to install SigNoz to follow these steps.
- Modify the docker-compose.yaml file present inside deploy/docker/clickhouse-setup to expose a port, in this case 54527, so that we can forward syslogs to this port.
    ...
    otel-collector:
      image: signoz/signoz-otel-collector:0.55.0-rc.3
      command: ["--config=/etc/otel-collector-config.yaml"]
      volumes:
        - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
      ports:
        - "54527:54527"
    ...
- Add the syslog receiver to otel-collector-config.yaml, the file mounted into the collector container in the docker-compose.yaml snippet above.
    receivers:
      syslog:
        tcp:
          listen_address: "0.0.0.0:54527"
        protocol: rfc3164
        location: UTC
        operators:
          - type: move
            from: attributes.message
            to: body
    ...
Here we are collecting the logs and moving message from attributes to body using operators that are available.
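If the effect of the move operator is not obvious from the config, the following Python sketch mimics it on a plain dictionary. The shape of the entry (a body holding the raw line plus parsed fields such as hostname, appname, and message under attributes) is an assumption made for illustration, and the move function is a hypothetical helper, not part of any OpenTelemetry library.

```python
import copy
from typing import Any, Dict


def move(entry: Dict[str, Any], source: str, target: str) -> Dict[str, Any]:
    """Move a value between dotted paths such as "attributes.message" and "body"."""
    result = copy.deepcopy(entry)  # keep the caller's entry unchanged

    # Walk to the parent of the source field and remove the value from it.
    *parents, leaf = source.split(".")
    container = result
    for key in parents:
        container = container[key]
    value = container.pop(leaf)

    # Walk to the parent of the target field (creating it if needed) and store the value.
    *parents, leaf = target.split(".")
    container = result
    for key in parents:
        container = container.setdefault(key, {})
    container[leaf] = value
    return result


# Assumed shape of an entry after syslog parsing, before the move.
entry = {
    "body": "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8",
    "attributes": {
        "hostname": "mymachine",
        "appname": "su",
        "message": "'su root' failed for lonvick on /dev/pts/8",
    },
}
moved = move(entry, "attributes.message", "body")
print(moved["body"])
```

After the move, the body holds only the human-readable message, while the remaining parsed fields stay available as attributes for filtering.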
You can read more about operators here.
For more configurations that are available for syslog receiver please check here.
- Next, we will modify our pipeline inside otel-collector-config.yaml to include the receiver we have created above.
    service:
      ....
      logs:
        receivers: [otlp, syslog]
        processors: [batch]
        exporters: [clickhouselogsexporter]
- Now we can restart the otel collector container so that new changes are applied and we can forward our logs to port 54527.
- Modify your rsyslog.conf file present inside /etc/ by running sudo vim /etc/rsyslog.conf and adding this line at the end:
*.* action(type="omfwd" target="0.0.0.0" port="54527" protocol="tcp")
For production use cases, it is recommended to use something like:
*.* action(type="omfwd" target="0.0.0.0" port="54527" protocol="tcp" action.resumeRetryCount="10" queue.type="linkedList" queue.size="10000")
This gives you retries and a queue, which decouples the sending from the other logging actions.
The value of target might vary depending on where SigNoz is deployed. Since it is deployed on the same host, I am using 0.0.0.0. For more help, you can visit here.
- Now restart your rsyslog service by running
sudo systemctl restart rsyslog.service
- You can check the status of service by running
sudo systemctl status rsyslog.service
- If there are no errors, your logs will be visible on the SigNoz UI.
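If you would like to check the collector side without waiting for rsyslog traffic, a short Python script can push one hand-built RFC 3164 style line to the exposed port. This is a sketch under a few assumptions: the collector is reachable on 127.0.0.1:54527, it accepts newline-terminated lines over TCP, and the tag otel-test and both function names are made up for the example.

```python
import socket
from datetime import datetime, timezone

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_rfc3164(message: str, hostname: str, tag: str, when: datetime,
                   facility: int = 1, severity: int = 6) -> str:
    """Build an RFC 3164 style line: <PRI>Mmm dd hh:mm:ss HOST TAG: MSG."""
    priority = facility * 8 + severity  # user-level (1), informational (6) -> 14
    # RFC 3164 pads single-digit days with a space rather than a zero.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{priority}>{timestamp} {hostname} {tag}: {message}"


def send_test_message(host: str = "127.0.0.1", port: int = 54527) -> None:
    """Send one newline-terminated line to the collector's TCP syslog port."""
    line = format_rfc3164("hello from python", socket.gethostname(),
                          "otel-test", datetime.now(timezone.utc))
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((line + "\n").encode("utf-8"))


# Formatting can be checked offline with a fixed timestamp.
sample = format_rfc3164("hello from python", "myhost", "otel-test",
                        datetime(2023, 1, 5, 9, 3, 7))
print(sample)

# Call send_test_message() once the collector is running and the port is exposed.
```

The timestamp is generated in UTC to line up with the location: UTC setting in the receiver configuration; adjust the host if SigNoz runs on a different machine.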
The goal of OpenTelemetry is to make log data have a richer context, making it more valuable to application owners. With OpenTelemetry you can correlate logs with traces and correlate logs emitted by different components of a distributed system.
Standardizing log correlation with traces and metrics, adding support for distributed context propagation for logs, unification of source attribution of logs, traces and metrics will increase the individual and combined value of observability information for legacy and modern systems.
Source: OpenTelemetry website
To get started with OpenTelemetry logs, install SigNoz and start experimenting by sending some log data to SigNoz. SigNoz also provides traces and metrics. So you can have all three telemetry signals under a single pane of glass.