DEV Community

Discussion on: Care to share some painfully funny debugging stories?

Raphael Habereder • Edited

When we switched to Kubernetes a few weeks back, we had quite a few services to migrate from our Docker Compose setup.

So we happily migrated as fast as we could for our review, but one microservice didn't behave. Traefik, our ingress controller, just didn't see it. The ingress route was picked up fine, but pings and curls went nowhere. Our deployment was stuck in some kind of void.

I kid you not, I debugged this for a whole week straight. I did everything, from updating software versions to tearing down and setting up the whole environment again multiple times (thankfully this is completely automated by now).

I finally gave up and migrated a new service, just to have something to show in our review. I wrote my three k8s files and, as I expected, it worked smoothly. Which bugged me even more!

So I tackled the broken service again, and did a stupid vimdiff to compare it with a working service. Then it struck me, in colorful diff text. A label was wrong...

Since not everyone is familiar with Kubernetes, here are the 3 most important files of a Kubernetes deployment:

  • deployment.yaml
  • service.yaml
  • ingress.yaml

For Kubernetes to know which pods a service should send traffic to, the service's selector has to match the labels on the deployment's pod template. The labels are the glue.

Example:

kind: Deployment
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
<snip>
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  selector:
    app: prometheus
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
spec:
  <snip>
    services:
    - name: prometheus

Guess what happens if the service up there does not have app: prometheus as its selector but ms: prometheus.
Kubernetes has no idea it belongs to the deployment with

matchLabels:
  app: prometheus

and routes requests to /dev/null. Guess what the reason was? A colleague had copied templates of those three files from some blog without checking whether the labels were correct. And, dumbass that I am, I expected them to be correct and looked everywhere else.
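
To make the mismatch concrete, here is a minimal Python sketch of how an equality-based selector is compared against pod labels. It is not Kubernetes' actual code; the function name selector_matches and the variable names are made up for illustration, and the label values are taken from the example above.

```python
# Minimal sketch of equality-based label selection.
# Illustration only: selector_matches is a hypothetical helper,
# not part of Kubernetes or any client library.

def selector_matches(selector: dict, labels: dict) -> bool:
    """Return True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Labels from the Deployment's pod template in the example above.
pod_labels = {"app": "prometheus"}

good_selector = {"app": "prometheus"}  # Service selector that matches the pods
bad_selector = {"ms": "prometheus"}    # the copy-pasted selector from this story

print(selector_matches(good_selector, pod_labels))  # True
print(selector_matches(bad_selector, pod_labels))   # False
```

With the bad selector nothing matches, so the service ends up with no pods behind it. On a real cluster, something like kubectl get endpoints prometheus -n monitoring would most likely have shown an empty endpoint list and saved me the week.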

So that was my nightmare for a week.