o11y: OpenTelemetry, Prometheus, Loki, and Tempo on EKS [Lab Session]

Loki overview

[Figure: Loki overview. Source: Grafana Loki documentation]

Tempo architecture

[Figure: Tempo architecture. Source: Tempo documentation]


Lab architecture

[Figure: Lab architecture]


  • Deploy EKS:

tofu init
tofu plan --var-file variables.tfvars
tofu apply --var-file variables.tfvars

  • IAM role for Loki and Tempo:

For Loki and Tempo to write the data they collect to an S3 bucket, we need to create a role that carries a policy with the required permissions. The pods assume this role (through sts:AssumeRoleWithWebIdentity, as the role's trust policy shows) via an annotation that we set in the values.yaml files for Loki and Tempo when deploying their Helm charts.

Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowLokiandTempoBucketonK8s",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:DeleteObject",
                "s3:GetObjectVersion",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::<loki-bucket-name>/*",
                "arn:aws:s3:::<loki-bucket-name>",
                "arn:aws:s3:::<tempo-bucket-name>/*",
                "arn:aws:s3:::<tempo-bucket-name>"
            ]
        }
    ]
}
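
The policy lists each bucket twice because object-level actions such as s3:GetObject match the "/*" ARN, while s3:ListBucket matches the bare bucket ARN. As a minimal sketch, the same document can be generated for any set of buckets. The function name and the bucket names in the usage lines are hypothetical; the actions and the Sid are copied from the policy above.

```python
import json

# Actions copied from the policy above.
ACTIONS = [
    "s3:PutObject",
    "s3:GetObject",
    "s3:AbortMultipartUpload",
    "s3:ListBucket",
    "s3:DeleteObject",
    "s3:GetObjectVersion",
    "s3:ListMultipartUploadParts",
]


def build_bucket_policy(bucket_names):
    """Return the policy document as a dict for the given bucket names."""
    resources = []
    for name in bucket_names:
        # Object-level actions match the "/*" ARN,
        # s3:ListBucket matches the bare bucket ARN.
        resources.append(f"arn:aws:s3:::{name}/*")
        resources.append(f"arn:aws:s3:::{name}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowLokiandTempoBucketonK8s",
                "Effect": "Allow",
                "Action": ACTIONS,
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names, for illustration only.
    buckets = ["my-loki-bucket", "my-tempo-bucket"]
    print(json.dumps(build_bucket_policy(buckets), indent=4))
```

The printed JSON can be saved to a file and attached to the role in whatever way the rest of the infrastructure is managed.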

Getting the OIDC provider to use in the role

aws iam list-open-id-connect-providers
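
The command returns a JSON document with an OpenIDConnectProviderList array of ARNs. The part of each ARN after "oidc-provider/" is the issuer string that the role uses both in the Federated principal and as the prefix of the ":sub" condition key. A small sketch for extracting it follows; the function name is hypothetical, and the account number and provider ID in the sample are placeholders, not real values.

```python
import json


def oidc_issuers(cli_output):
    """Return the issuer part of every provider ARN in the CLI's JSON output."""
    providers = json.loads(cli_output)["OpenIDConnectProviderList"]
    issuers = []
    for provider in providers:
        # ARN shape: arn:aws:iam::<ACCOUNT>:oidc-provider/<issuer>
        _, _, issuer = provider["Arn"].partition(":oidc-provider/")
        if issuer:
            issuers.append(issuer)
    return issuers


if __name__ == "__main__":
    # Placeholder output in the shape the AWS CLI prints with --output json.
    sample = json.dumps(
        {
            "OpenIDConnectProviderList": [
                {
                    "Arn": "arn:aws:iam::111122223333:oidc-provider/"
                    "oidc.eks.us-east-2.amazonaws.com/id/EXAMPLEID"
                }
            ]
        }
    )
    for issuer in oidc_issuers(sample):
        print(issuer)
```

In practice the input would come from `aws iam list-open-id-connect-providers --output json`, for example read from standard input.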

Role (trust policy)

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::<ACCOUNT>:oidc-provider/oidc.eks.<REGION>.amazonaws.com/id/<ID>"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.<REGION>.amazonaws.com/id/<ID>:sub": [
                        "system:serviceaccount:o11y:loki-sa",
                        "system:serviceaccount:o11y:tempo-sa"
                    ]
                }
            }
        }
    ]
}
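
The two ":sub" entries follow the pattern system:serviceaccount:<namespace>:<service-account-name>, so only pods running under loki-sa or tempo-sa in the o11y namespace can assume the role. The sketch below rebuilds the trust policy from its parts and also builds the service account annotation that points at the role. `eks.amazonaws.com/role-arn` is the annotation key EKS reads for this purpose; the function names, the account number, the provider ID, and the role name `o11y-s3-role` are hypothetical.

```python
import json


def build_trust_policy(account, issuer, namespace, service_accounts):
    """Return the role trust policy for the given service accounts."""
    subjects = [
        f"system:serviceaccount:{namespace}:{name}" for name in service_accounts
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account}:oidc-provider/{issuer}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                # Only tokens issued to these service accounts are accepted.
                "Condition": {"StringEquals": {f"{issuer}:sub": subjects}},
            }
        ],
    }


def role_annotation(account, role_name):
    """Return the service account annotation that references the role."""
    return {"eks.amazonaws.com/role-arn": f"arn:aws:iam::{account}:role/{role_name}"}


if __name__ == "__main__":
    # Placeholder account, provider ID, and role name.
    issuer = "oidc.eks.us-east-2.amazonaws.com/id/EXAMPLEID"
    policy = build_trust_policy("111122223333", issuer, "o11y", ["loki-sa", "tempo-sa"])
    print(json.dumps(policy, indent=4))
    print(json.dumps(role_annotation("111122223333", "o11y-s3-role"), indent=4))
```

The annotation is what goes under the service account settings in each chart's values.yaml; the exact key path depends on the chart.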

  • S3 buckets:

Create the S3 buckets that Loki and Tempo will write to, using the same bucket names as in the policy above.

  • Connect to EKS:

aws eks update-kubeconfig --region us-east-2 --name pegasus

Now, from the o11y stack repository, we can start the Helm deployments.

  • Create the namespace:

kubectl create namespace o11y

  • Deploy Promtail:

cd promtail
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm upgrade --install promtail grafana/promtail --values values.yaml -n o11y

  • Deploy Loki:

cd loki
helm upgrade --install loki grafana/loki-distributed --values values.yaml -n o11y

  • Deploy Tempo:

cd tempo
helm upgrade --install tempo grafana/tempo-distributed --values values.yaml -n o11y

  • Deploy Prometheus:

cd prometheus-grafana
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack --values values.yaml -n o11y
kubectl apply -f istio-ingress-prometheus.yaml
kubectl apply -f istio-ingress-grafana.yaml

In this case, we also create the ingress for Prometheus and for Grafana.

Everything is now running in the cluster:

kubectl get nodes
kubectl get pods -n o11y

  • DNS records (Route 53):

In the external DNS, we create records that point to the public NLB we use as the entry point. They cover Prometheus, Grafana, and the apps we will use to generate data.

  • Deploy the example API:

cd fastapi-observability
kubectl apply -f mono_manifest.yaml

This API uses the OpenTelemetry library to generate the spans for the operations that make up its traces.
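
To make the span and trace relationship concrete, here is a conceptual model using only the Python standard library. It is not the OpenTelemetry API and not the example API's code: the `Span` class, the helper functions, and the operation names are hypothetical. What it mirrors is the data model, in which every span in a trace shares one 16-byte trace ID, has its own 8-byte span ID, and points at its parent span.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed operation; spans that share a trace_id form a trace."""

    name: str
    trace_id: str
    parent_id: Optional[str] = None
    # 8 random bytes rendered as 16 hex characters.
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))


def start_trace(name):
    """Create a root span with a fresh 16-byte trace ID."""
    return Span(name=name, trace_id=secrets.token_hex(16))


def start_child(parent, name):
    """Create a span in the same trace, linked to its parent span."""
    return Span(name=name, trace_id=parent.trace_id, parent_id=parent.span_id)


if __name__ == "__main__":
    # Hypothetical operations for a single request.
    root = start_trace("GET /")
    child = start_child(root, "call downstream service")
    for span in (root, child):
        print(span.name, span.trace_id, span.span_id, span.parent_id)
```

A tracing backend such as Tempo groups spans by trace ID and uses the parent links to rebuild the call tree shown in the trace view.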

  • Prometheus targets:

The Prometheus UI shows that it is already scraping the metrics the example API exposes at /metrics. This works because the API's deployment manifest carries these annotations:

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/path: "/metrics"
  prometheus.io/port: "8000"
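
These prometheus.io/* annotations are a convention rather than something Prometheus interprets on its own: they take effect only when a scrape job has relabeling rules that read them, which this lab's values.yaml is assumed to provide. The sketch below models what such rules typically compute for a pod. The function name and the pod IP are hypothetical, and the sketch assumes the port annotation is always set.

```python
def scrape_url(pod_ip, annotations):
    """Return the URL a scrape job would target, or None if scraping is off."""
    # Pods are only scraped when they opt in explicitly.
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    # Fall back to the conventional metrics path when none is given.
    path = annotations.get("prometheus.io/path", "/metrics")
    # This sketch assumes the port annotation is present.
    port = annotations["prometheus.io/port"]
    return f"http://{pod_ip}:{port}{path}"


if __name__ == "__main__":
    annotations = {
        "prometheus.io/scrape": "true",
        "prometheus.io/path": "/metrics",
        "prometheus.io/port": "8000",
    }
    # Hypothetical pod IP.
    print(scrape_url("10.0.1.23", annotations))
```

If a target does not show up in the Prometheus UI, comparing the pod's annotations against this logic is a quick first check.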

  • Grafana:

In Grafana, we import the dashboard we are going to use. Log in with user: admin, password: prom-operator (the kube-prometheus-stack chart defaults). The dashboard JSON is at fastapi-observability/etc/fastapi-dashboard.json.

[Screenshot: imported dashboard]

Data sources

These data sources are created when the Prometheus stack is deployed. Their values are set under 'additionalDataSources' in the Helm chart's values.yaml.

  • Generating data with k6:

With k6, we send requests to the example API so that data is generated and metrics are captured for us to visualize.

request.js

import { check } from 'k6';
import http from 'k6/http';

export const options = {
  scenarios: {
    constant_request_rate: {
      executor: 'constant-arrival-rate',
      rate: 100,
      timeUnit: '1s', // 100 iterations per second, i.e. 100 RPS
      duration: '120s',
      preAllocatedVUs: 100, // size of the VU pool initialized before the test starts
      maxVUs: 100, // upper limit on VUs; equal to preAllocatedVUs, so no extra VUs are added mid-test
    },
  },
};

export function test(params) {
  const res = http.get('https://app-a.pauloponciano.digital');
  check(res, {
    'is status 200': (r) => r.status === 200,
  });
}

export default function () {
  test();
}

k6 run request.js --vus=200 --duration=60m
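
As a rough sizing check for the constant-arrival-rate scenario in request.js, the arithmetic can be sketched as below. The function names and the 0.25 s average iteration time are assumptions for illustration. Note also that k6 gives command-line flags higher precedence than the options exported by the script, so when --vus and --duration are passed as in the command above, it is worth confirming in the k6 output which executor actually ran.

```python
import math


def expected_iterations(rate, time_unit_s, duration_s):
    """Iterations started by a constant arrival rate over the whole run."""
    return int(rate * duration_s / time_unit_s)


def vus_needed(rate, time_unit_s, avg_iteration_s):
    """VUs that must be busy at once to sustain the rate (Little's law)."""
    return math.ceil(rate / time_unit_s * avg_iteration_s)


if __name__ == "__main__":
    # Values from the scenario in request.js: 100 per 1s for 120s.
    print(expected_iterations(100, 1, 120))
    # Assumed average iteration time of 0.25s, for illustration only.
    print(vus_needed(100, 1, 0.25))
```

With the scenario's values, the run starts 12000 iterations, one HTTP request each. The pool of 100 VUs is enough as long as an iteration takes no more than about one second on average; beyond that, k6 would start dropping iterations because maxVUs is also 100.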

  • Visualization:

Still in Grafana, the FastAPI Observability dashboard shows the metrics, traces, and logs of the example API.


References:

Grafana Labs - FastAPI Observability

https://github.com/blueswen/fastapi-observability

OpenTelemetry demo app with Grafana, Loki, Prometheus, Tempo (Grafana Office Hours #06) - YouTube

Happy building!
