DEV Community

CodingBlocks

Open Telemetry – Instrumentation and Metrics

Reviews

Huge thanks to

  • Bill B101 for the review in iTunes
  • Donnie Clayton for the Spotify review

Metrics and Instrumentation in Open Telemetry

  • Metric – a measurement of a service captured at runtime
  • Metric event – a metric captured at a specific moment – contains both the measurement and the time at which it was captured, as well as any accompanying metadata
    • Indicators of availability and performance
    • These can provide insight into user experience and impacts on the business
    • Can be used for alerting or for triggering actions such as scaling infrastructure
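As a rough illustration of the "metric event" idea above (a measurement, the time it was captured, and some metadata), here is a minimal Python sketch. The `MetricEvent` class, the `should_scale_up` helper, the `queue.depth` name, and the threshold are all made up for this example; none of them are OpenTelemetry types.

```python
from dataclasses import dataclass, field
import time


@dataclass
class MetricEvent:
    """One measurement captured at a specific moment (illustrative only)."""
    name: str
    value: float
    timestamp: float = field(default_factory=time.time)  # when it was captured
    attributes: dict = field(default_factory=dict)       # accompanying metadata


def should_scale_up(events, threshold):
    """Return True when the most recent event's value exceeds the threshold."""
    if not events:
        return False
    latest = max(events, key=lambda event: event.timestamp)
    return latest.value > threshold


# Hypothetical queue-depth readings taken a minute apart.
events = [
    MetricEvent("queue.depth", 12, timestamp=100.0, attributes={"queue": "orders"}),
    MetricEvent("queue.depth", 85, timestamp=160.0, attributes={"queue": "orders"}),
]
print(should_scale_up(events, threshold=50))
```

In a real system the alerting or scaling decision would usually live in the metrics backend rather than in application code; the point here is only what a single event carries.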

How Metrics Work in Open Telemetry

  • Meter Provider – typically a singleton type of implementation – one per application that shares the application’s lifecycle
    • The first step in using metering in open telemetry
    • Basically a factory for creating meters
    • In some languages, the application meter is initialized for you
  • Meter – this creates different types of metric instruments that capture service measurements at runtime
  • Metric Exporter – sends metric data to consumers
    • Consumers – standard out, Open Telemetry Collector, or open sourced vendor collectors
    • Here’s a list of a number of available collectors 
      https://opentelemetry.io/ecosystem/registry/?language=collector
  • Metric Instruments – these are the “things” that capture measurements and are identified by a name, kind, unit and description (name and kind are required; unit and description are optional)
    • The name, unit and description are chosen by the developer, OR can be one of several semantic conventions 
      https://opentelemetry.io/docs/specs/semconv/general/metrics/
    • Kind is one of the following
      • Counter – a value that increases over time, can never go down – similar to the odometer in your car
      • Asynchronous counter – same as a counter, but it is collected once per export – useful when you only have access to the aggregated value rather than each individual increment
      • UpDownCounter – a counter that can both increase and decrease – an example would be the number of items in a queue
      • Asynchronous UpDownCounter – same as an UpDownCounter except that it is collected once per export
      • Gauge – measures the current value at the time it is read – can fluctuate up and down like the fuel gauge in your car or your speedometer
      • Histogram – a client side aggregation of values, useful for things like request latencies and other statistical types of measurements – how many requests took less than 500ms
  • Aggregations – a large number of measurements are combined into exact or estimated statistics about what occurred during a time window
    • They mention the OTLP – Open Telemetry Protocol – it transports aggregated metrics
    • The Open Telemetry API also provides default aggregations for each instrument type – these are overridable using views
    • Where request tracing’s purpose is to provide context of a given request, metrics are intended to provide aggregated statistical information
    • Some examples of metrics and their use cases
      • Total number of bytes read by a service, per protocol
      • Total number of bytes read and bytes per request
      • Duration of a system call
      • Request sizes for trending purposes
      • CPU or memory usage during a process
      • Average balance values of an account
      • Current number of active requests
  • Views
    • Give the developer the ability to customize the output provided by the SDK
    • Customize which metrics are to be processed or ignored
    • Customize aggregation and what attributes you want to be available on those metrics
  • Language Support
    • Stable: C++, C#, Go, Java, JavaScript, PHP, Python
    • Experimental: Erlang/Elixir, Swift
    • Alpha: Rust
    • Maybe Never: Ruby
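To make the provider → meter → instrument → exporter flow described above a bit more concrete, here is a toy, standard-library-only Python model. It is a sketch of the concepts, not the OpenTelemetry SDK: the method names loosely mirror the real API, but every class here is hand-rolled, the `boundaries` parameter and the `export()` method are assumptions of this sketch, and the metric names are made up.

```python
import bisect


class Counter:
    """Monotonic sum: like an odometer, it can never go down."""

    def __init__(self, name, unit="", description=""):
        self.name, self.unit, self.description = name, unit, description
        self.total = 0

    def add(self, amount):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.total += amount

    def collect(self):
        return {"sum": self.total}


class UpDownCounter:
    """Sum that may rise or fall, e.g. the number of items in a queue."""

    def __init__(self, name, unit="", description=""):
        self.name, self.unit, self.description = name, unit, description
        self.total = 0

    def add(self, amount):
        self.total += amount

    def collect(self):
        return {"sum": self.total}


class Histogram:
    """Client-side aggregation: counts values per bucket instead of storing them."""

    def __init__(self, name, boundaries, unit="", description=""):
        self.name, self.unit, self.description = name, unit, description
        self.boundaries = sorted(boundaries)
        self.bucket_counts = [0] * (len(self.boundaries) + 1)
        self.count = 0
        self.sum = 0

    def record(self, value):
        # bisect_left makes each bucket upper-inclusive: 500 lands in (100, 500].
        self.bucket_counts[bisect.bisect_left(self.boundaries, value)] += 1
        self.count += 1
        self.sum += value

    def collect(self):
        return {"count": self.count, "sum": self.sum,
                "bucket_counts": list(self.bucket_counts)}


class Meter:
    """Creates instruments and remembers them so they can be collected."""

    def __init__(self, name):
        self.name = name
        self.instruments = []

    def _register(self, instrument):
        self.instruments.append(instrument)
        return instrument

    def create_counter(self, name, **kwargs):
        return self._register(Counter(name, **kwargs))

    def create_up_down_counter(self, name, **kwargs):
        return self._register(UpDownCounter(name, **kwargs))

    def create_histogram(self, name, boundaries, **kwargs):
        return self._register(Histogram(name, boundaries, **kwargs))


class MeterProvider:
    """Factory for meters; one per application in this sketch."""

    def __init__(self, exporter):
        self._exporter = exporter  # any callable that accepts the snapshot
        self._meters = {}

    def get_meter(self, name):
        if name not in self._meters:
            self._meters[name] = Meter(name)
        return self._meters[name]

    def export(self):
        snapshot = {instrument.name: instrument.collect()
                    for meter in self._meters.values()
                    for instrument in meter.instruments}
        self._exporter(snapshot)
        return snapshot


provider = MeterProvider(exporter=print)  # "standard out" as the consumer
meter = provider.get_meter("checkout")
requests = meter.create_counter("http.requests", unit="1")
queue = meter.create_up_down_counter("queue.items", unit="1")
latency = meter.create_histogram("http.latency", boundaries=[100, 500, 1000], unit="ms")

for ms in (42, 180, 500, 730, 1200):
    requests.add(1)
    latency.record(ms)
queue.add(3)
queue.add(-1)

snapshot = provider.export()
```

Adding the first two histogram buckets answers the "how many requests took 500ms or less" style of question without keeping every individual latency around, which is the appeal of aggregating on the client side.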

Automatic Instrumentation

  • Available in: .NET, Java, JavaScript, PHP, Python
  • With minimal configuration, Open Telemetry can start gathering and exporting metrics for your application
    • A service name is a required configuration, but there are several other options you can set
      • Data source specific config, exporter config, propagator config, resource config
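As a sketch of what that configuration can look like, the snippet below builds the environment for a Python app launched under the `opentelemetry-instrument` wrapper. The `OTEL_*` variable names come from the OpenTelemetry environment variable conventions; the service name, the endpoint, the resource attribute value, `app.py`, and the `build_otel_env` helper are placeholders assumed for this example. The launch itself is left commented out since it needs the OpenTelemetry packages installed.

```python
import os


def build_otel_env(service_name, otlp_endpoint="http://localhost:4317"):
    """Return environment variables for auto-instrumentation (illustrative helper)."""
    if not service_name:
        raise ValueError("a service name is required")
    return {
        "OTEL_SERVICE_NAME": service_name,              # the one required setting
        "OTEL_METRICS_EXPORTER": "otlp",                # exporter config
        "OTEL_EXPORTER_OTLP_ENDPOINT": otlp_endpoint,   # where the collector listens
        "OTEL_PROPAGATORS": "tracecontext,baggage",     # propagator config
        "OTEL_RESOURCE_ATTRIBUTES": "service.version=1.0.0",  # resource config
    }


otel_env = build_otel_env("checkout-service")
env = {**os.environ, **otel_env}
command = ["opentelemetry-instrument", "python", "app.py"]

# To actually launch the app (assumes the OpenTelemetry distro is installed):
# import subprocess
# subprocess.run(command, env=env, check=True)
print(otel_env["OTEL_SERVICE_NAME"])
```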

Manual Instrumentation

  • Obviously this means you’ll have more control over what you want to gather metrics on, what to export, etc
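To give a feel for what "more control" means, here is a small standard-library sketch in which you decide exactly which function gets measured and what gets recorded for it. The `instrumented` decorator, the in-memory `call_counts` and `durations_ms` stores, and the `orders.lookup` name are all hypothetical; with the real SDK you would record into a counter and a histogram created from a meter instead.

```python
import functools
import time

# Stand-ins for real instruments: a call counter and a list of durations.
call_counts = {}
durations_ms = {}


def instrumented(metric_name):
    """Decorator that records a call count and duration for the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record even when the call raises, so failures are still measured.
                elapsed = (time.perf_counter() - start) * 1000
                call_counts[metric_name] = call_counts.get(metric_name, 0) + 1
                durations_ms.setdefault(metric_name, []).append(elapsed)
        return wrapper
    return decorator


@instrumented("orders.lookup")
def lookup_order(order_id):
    # Hypothetical business logic you chose to measure.
    return {"id": order_id, "status": "shipped"}


lookup_order(1)
lookup_order(2)
print(call_counts)
```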

Resources

Tip of the Week
