The OTel Collector is a component that receives, processes and exports telemetry data (signals) from sources to observability backends such as Elasticsearch, Cassandra, Datadog, New Relic and others.
- Avoids resource contention thanks to its scalable nature and variety of deployment modes
- Customizable at multiple levels, with no need to constantly reboot/reload the pipeline
- Tolerant to network partitions for the most part
Use a collector when you need to
- provide a common ingestion point for a variety of signals such as metrics and traces
- collect signals from multiple sources such as applications, infrastructure, clusters, frameworks and databases
- apply transformations to the signal data before storing it in the backend
- enrich signals with additional metadata
- filter out signals based on predefined criteria
- send signal data to multiple observability backends
- build a loosely coupled, scalable pipeline for signal data flow
- Create a configuration file, say config.yaml; details here
- Run the collector
$ docker pull otel/opentelemetry-collector:latest
$ docker run [-d] -v $(pwd)/config.yaml:/etc/otelcol/config.yaml [port-config] otel/opentelemetry-collector:latest
# port configuration
- "1888:1888" # pprof extension
- "8888:8888" # Prometheus metrics exposed by the collector
- "8889:8889" # Prometheus exporter metrics
- "13133:13133" # health_check extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP http receiver
- "55679:55679" # zpages extension
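The port list above follows Docker Compose syntax, so the same run command can be expressed as a Compose file. The following is a minimal sketch, not a definitive setup: the service name `otel-collector` and the file name `docker-compose.yaml` are assumptions, and only a subset of the ports is mapped.

```yaml
# docker-compose.yaml (sketch; the service name "otel-collector" is an assumption)
services:
  otel-collector:
    image: otel/opentelemetry-collector:latest
    volumes:
      # mount the local config over the path used in the docker run command
      - ./config.yaml:/etc/otelcol/config.yaml
    ports:
      - "13133:13133" # health_check extension
      - "4317:4317"   # OTLP gRPC receiver
      - "4318:4318"   # OTLP http receiver
```

With this file in place, `docker compose up -d` should start the collector in the background, equivalent to passing `-d` to `docker run`.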
The configuration has 3 (or 4) types of components, all of which need to be enabled in the service section of the configuration.
Note: At least one service > pipeline is mandatory
- Receivers - describe how the collector gets data in; they can be PUSH or PULL based (eg: host metrics, application metrics, zipkin traces).
- Processors - run on the data in transit and optionally massage, transform and filter out data (eg: filter, batch, samplers).
- Exporters - specify how data is sent out to one or more configured backends; they can be PUSH or PULL based (eg: file, jaeger, prometheus). In production environments they generally carry authentication details.
- Extensions (optional) - provide additional capabilities to the collector without requiring direct access to signal data (eg: health_check, pprof).
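To make the four component types concrete, here is a minimal config.yaml sketch. It assumes a recent collector image that ships the `otlp` receiver, `batch` processor, `debug` exporter and `health_check` extension; the choice of components is illustrative rather than the author's own configuration.

```yaml
# config.yaml (sketch): one component of each type, wired into one pipeline
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # bind to all interfaces so the mapped Docker port is reachable
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:                          # group signals into batches before export

exporters:
  debug:                          # print received signals to the collector's console
    verbosity: detailed

extensions:
  health_check:
    endpoint: 0.0.0.0:13133

service:
  extensions: [health_check]      # extensions are enabled here, not inside a pipeline
  pipelines:
    traces:                       # at least one pipeline is mandatory
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```

A component that is defined at the top level but not referenced under `service` is not enabled, which is why every entry above appears twice.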
More info: configuration wiki
- Agent: A collector instance running on the same node as the application (as a binary, sidecar or daemonset).
- Gateway: One or more instances running centrally as a standalone service. It can offer advanced capabilities such as simple load balancing, tail-based sampling and independent scaling, and generally acts as the receiver for the agents.
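The agent-to-gateway relationship can be sketched as an agent config that receives OTLP locally and forwards everything to a gateway over OTLP. The gateway hostname below is hypothetical, and plaintext transport inside the cluster is an assumption made to keep the sketch short.

```yaml
# agent.yaml (sketch): runs next to the application and forwards to the gateway
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:

exporters:
  otlp:
    # hypothetical gateway address; replace with the real service name
    endpoint: otel-gateway.observability.svc.cluster.local:4317
    tls:
      insecure: true              # assumption: no TLS between agent and gateway

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

The gateway would run its own config with an `otlp` receiver on the same port and the backend-specific exporters, so agents stay free of backend credentials.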
Gist to working demo
Note: the metrics exporter does not seem to work as described in the official documentation
OTel Deployment patterns
Source of all the information below: CNCF presentation
Basic - instrument and send to a collector
Used when the application is instrumented with the OTel SDK and signals are sent to a predefined collector.
Basic - fanout
Used when signals are (processed and) sent to multiple destinations. Useful when multiple views/perspectives of the same data are to be generated (eg: one from Jaeger, one from Datadog).
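Fanout amounts to listing more than one exporter in a pipeline. The sketch below assumes both destinations accept OTLP; the exporter names `otlp/jaeger` and `otlphttp/vendor` and both endpoints are hypothetical placeholders.

```yaml
# fanout (sketch): every trace is delivered to both exporters
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317                         # hypothetical Jaeger OTLP endpoint
    tls:
      insecure: true                              # assumption: plaintext in a local setup
  otlphttp/vendor:
    endpoint: https://otlp.vendor.example:4318    # hypothetical second backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger, otlphttp/vendor]
```

The `type/name` form lets the same exporter type be configured more than once under different names.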
The collector works as an intermediate proxy and massages the data before passing it on to the destination; used when common processors are to be applied to the incoming signals.
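As an illustration of such common processors, the sketch below enriches every signal with an environment attribute and drops health-check spans. It assumes the contrib distribution of the collector (the `resource` and `filter` processors are not expected in the core image); the attribute values, the `/healthz` route and the backend endpoint are made up for the example.

```yaml
# normalizing collector (sketch): enrich, filter, then forward
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  resource/env:
    attributes:
      - key: deployment.environment
        value: staging                 # illustrative value
        action: upsert                 # add the attribute or overwrite an existing one
  filter/health:
    error_mode: ignore
    traces:
      span:
        # spans matching this condition are dropped
        - 'attributes["http.route"] == "/healthz"'
  batch:

exporters:
  otlp:
    endpoint: backend.example:4317     # hypothetical destination

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [resource/env, filter/health, batch]   # applied in this order
      exporters: [otlp]
```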
Workloads send signals to an OTel collector sidecar, which forwards them to a collector residing in a central namespace (which processes them and sends them to the destination). The advantages of this pattern are a decoupled central collector, an easily customizable sidecar and implicit load balancing.
The collector is deployed as a daemonset; while this eases management, multi-tenancy and scaling requirements are hard to customize.
A central load-balancing collector is used to route all signals from a given source to the same backend collector (similar to how session affinity is handled). The idea behind the implementation is that any given collector should be able to provide the full picture of a source application on its own.
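One way to sketch this pattern is with the `loadbalancing` exporter from the contrib distribution, which is an assumption about the implementation rather than something the source specifies. The DNS name of the backend collectors is hypothetical.

```yaml
# load-balancing collector (sketch): keeps each service's signals on one backend collector
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  loadbalancing:
    routing_key: service               # same service name always maps to the same backend
    protocol:
      otlp:
        tls:
          insecure: true               # assumption: plaintext inside the cluster
    resolver:
      dns:
        # hypothetical headless service that resolves to all backend collectors
        hostname: otel-backends.observability.svc.cluster.local

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
```

Setting `routing_key: traceID` instead would keep all spans of one trace together, which is the usual choice when the backend collectors do tail-based sampling.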
A common OTel collector is deployed on a central cluster and acts as the final stop before writing to destinations. It is useful in regulatory scenarios where a common point of control needs to be established.
Multiple destinations are generally involved; the OTel collector processes the signals and sends them to different destinations based on filtering tags.
One OTel collector per signal type (eg: one for metrics, one for traces). Useful for establishing separate observability pipelines per signal. Note: A PUSH based collector can be scaled easily, while scaling a PULL based one (prometheus) is not straightforward given the idempotency semantics.
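A per-signal split can be sketched as two separate config files, each defining a single pipeline. The `---` separator below only marks the boundary between the two files; each collector instance would be started with its own file. The `prometheus` receiver assumes the contrib distribution, and the scrape target and backend endpoints are hypothetical.

```yaml
# traces-collector.yaml (sketch): PUSH based, can be scaled by adding replicas
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  otlp:
    endpoint: traces-backend.example:4317      # hypothetical
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
---
# metrics-collector.yaml (sketch): PULL based, scrapes targets on a schedule
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: app
          scrape_interval: 30s
          static_configs:
            - targets: ["app:9090"]           # hypothetical scrape target
exporters:
  otlp:
    endpoint: metrics-backend.example:4317     # hypothetical
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [otlp]
```

Running several replicas of the metrics collector with the same scrape config would scrape each target once per replica, which illustrates why the PULL based side is harder to scale.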