Ashok Nagaraj

Getting started with OpenTelemetry for Python

Observability is the ability to measure the internal states of a system by examining its outputs. A system is considered “observable” if its current state can be estimated using only information from its outputs, namely sensor data.
In the context of microservices, observability allows teams to:

  • Monitor modern systems more effectively
  • Find and connect effects in a complex chain and trace them back to their cause
  • Enable visibility for system administrators, IT operations analysts and developers into the entire architecture

3 pillars of observability

  1. Metrics - A metric is a numeric value measured over an interval of time and includes specific attributes such as timestamp, name, KPIs and value. Unlike logs, metrics are structured by default, which makes them easier to query and to optimize for storage, so you can retain them for longer periods.

  2. Logs - A log is a text record of an event that happened at a particular time. It includes a timestamp that tells when the event occurred and a payload that provides context. Logs come in three formats: plain text, structured and binary.

  3. Traces - A trace represents the end-to-end journey of a request through a distributed system. As a request moves through the host system, every operation performed on it, called a “span”, is encoded with important data about the microservice performing that operation.
    By viewing traces, each of which includes one or more spans, you can track a request's course through a distributed system and identify the cause of a bottleneck or breakdown.
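To make the three signal types concrete, here is a minimal sketch in plain Python (standard library only, no OpenTelemetry). The class names `MetricPoint` and `Span`, the function `handle_request` and the step names are hypothetical, chosen only for illustration; real telemetry libraries use richer data models.

```python
import logging
import time
from dataclasses import dataclass, field


@dataclass
class MetricPoint:
    """A metric: a named numeric value with a timestamp and attributes."""
    name: str
    value: float
    timestamp: float
    attributes: dict = field(default_factory=dict)


@dataclass
class Span:
    """One operation in a trace; spans sharing a trace_id form the trace."""
    trace_id: str
    name: str
    start: float
    end: float = 0.0

    @property
    def duration(self) -> float:
        return self.end - self.start


def handle_request(trace_id: str):
    """Emit one metric point, one log record and two spans for a fake request."""
    spans = []
    for step in ("auth", "db_query"):  # hypothetical operations
        span = Span(trace_id=trace_id, name=step, start=time.perf_counter())
        time.sleep(0.01)  # stand-in for real work
        span.end = time.perf_counter()
        spans.append(span)

    # Log: a timestamped text record whose payload gives context.
    logging.getLogger("demo").info("request handled trace_id=%s", trace_id)

    # Metric: a structured numeric value that is cheap to store and query.
    metric = MetricPoint("requests_total", 1, time.time(), {"route": "/"})
    return metric, spans


metric, spans = handle_request("abc123")
slowest = max(spans, key=lambda s: s.duration)
print(metric.name, metric.value, [s.name for s in spans], slowest.name)
```

Picking the slowest span by duration is the same idea, in miniature, as using a trace to locate a bottleneck.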

Documentation source


Instrumentation with Python

Let us start with a simple Flask server.

$ pip install flask
import datetime
import flask

######################
## initialization
######################
app = flask.Flask(__name__)
start = datetime.datetime.now()

######################
## routes
######################
@app.route('/', methods=['GET'])
def root():
  return flask.jsonify({'message': 'flask app root/'})

@app.route('/healthz', methods=['GET'])
def healthz():
  now = datetime.datetime.now()
  return flask.jsonify({'message': f'up and running since {(now - start)}'})

if __name__ == '__main__':
  app.run(debug=True, host='0.0.0.0', port=5000)

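Let us add the OpenTelemetry (otel) libraries. The opentelemetry-instrument command used later is provided by the opentelemetry-distro package, and opentelemetry-bootstrap installs the instrumentation packages that match the libraries already present (here, Flask).

$ pip install opentelemetry-api opentelemetry-sdk opentelemetry-distro
$ opentelemetry-bootstrap -a install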

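Now start instrumenting. Let us obtain a tracer and a meter, and add a counter metric that records the number of times /healthz is called. The request spans themselves come from the Flask auto-instrumentation when the app is launched with opentelemetry-instrument.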

import datetime

import flask
from opentelemetry import trace
from opentelemetry import metrics

######################
## initialization
######################
app = flask.Flask(__name__)
start = datetime.datetime.now()

tracer = trace.get_tracer(__name__)
meter = metrics.get_meter(__name__)

hltz_counter = meter.create_counter('healthz_count', description='Number of /healthz requests')

######################
## routes
######################
@app.route('/', methods=['GET'])
def root():
  return flask.jsonify({'message': 'flask app root/'})

@app.route('/healthz', methods=['GET'])
def healthz():
  now = datetime.datetime.now()
  hltz_counter.add(1)
  return flask.jsonify({'message': f'up and running since {(now - start)}'})

if __name__ == '__main__':
  app.run(debug=True, host='0.0.0.0', port=5000)

Run the instrumented code (flask run looks for a file named app.py by default, or for the module named in FLASK_APP)

$ opentelemetry-instrument --traces_exporter console --metrics_exporter console flask run
 * Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
...

Send some traffic

$ curl localhost:5000
{"message":"flask app root/"}

$ curl localhost:5000/healthz
{"message":"up and running since 0:00:53.605913"}

Observe the terminal and check for healthz_count

127.0.0.1 - - [13/Oct/2022 09:16:54] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [13/Oct/2022 09:16:58] "GET /healthz HTTP/1.1" 200 -
{
    "name": "/healthz",
    "context": {
        "trace_id": "0x7d30b2042efe9a4661cc427352119754",
        "span_id": "0x479211d157c16733",
        "trace_state": "[]"
    },
    "kind": "SpanKind.SERVER",
    "parent_id": null,
    "start_time": "2022-10-13T03:50:31.090144Z",
    "end_time": "2022-10-13T03:50:31.090545Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.server_name": "127.0.0.1",
        "http.scheme": "http",
        "net.host.port": 5000,
        "http.host": "localhost:5000",
        "http.target": "/healthz",
        "net.peer.ip": "127.0.0.1",
        "http.user_agent": "curl/7.79.1",
        "net.peer.port": 50286,
        "http.flavor": "1.1",
        "http.route": "/healthz",
        "http.status_code": 200
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.13.0",
            "telemetry.auto.version": "0.34b0",
            "service.name": "unknown_service"
        },
        "schema_url": ""
    }
}
{"resource_metrics": [{"resource": {"attributes": {"telemetry.sdk.language": "python", "telemetry.sdk.name": "opentelemetry", "telemetry.sdk.version": "1.13.0", "telemetry.auto.version": "0.34b0", "service.name": "unknown_service"}, "schema_url": ""}, "scope_metrics": [{"scope": {"name": "app", "version": "", "schema_url": ""}, "metrics": [{"name": "healthz_count", "description": "Number of /healthz requests", "unit": "", "data": {"data_points": [{"attributes": {}, "start_time_unix_nano": 1665632818794016000, "time_unix_nano": 1665632825058633000, "value": 2}], "aggregation_temporality": 2, "is_monotonic": true}}], "schema_url": ""}], "schema_url": ""}]}

We have successfully generated traces and metrics (it sometimes takes a few seconds for them to show up).
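Spotting healthz_count in a long JSON line by eye is tedious, so here is a small sketch that pulls the counter value out of the console metrics output. It assumes the exporter prints one JSON object per line, as in the output above; the sample line is an abbreviated copy of that output, and the helper name `find_metric_values` is hypothetical.

```python
import json

# Abbreviated copy of the console metrics line shown above.
line = (
    '{"resource_metrics": [{"scope_metrics": [{"scope": {"name": "app"}, '
    '"metrics": [{"name": "healthz_count", "data": {"data_points": '
    '[{"attributes": {}, "value": 2}], "is_monotonic": true}}]}]}]}'
)


def find_metric_values(payload: dict, metric_name: str) -> list:
    """Return the value of every data point recorded for metric_name."""
    values = []
    for resource_metric in payload.get("resource_metrics", []):
        for scope_metric in resource_metric.get("scope_metrics", []):
            for metric in scope_metric.get("metrics", []):
                if metric.get("name") == metric_name:
                    points = metric.get("data", {}).get("data_points", [])
                    values.extend(point["value"] for point in points)
    return values


print(find_metric_values(json.loads(line), "healthz_count"))  # [2]
```

A metric name that never appears simply yields an empty list, which makes the helper easy to use in a quick smoke check.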
