Applications and systems writing log output to some location on the operating system or a system path has been the savior of every Sysadmin and Developer since the beginning of computing.
In Kubernetes, it’s no different.
Without proper logging and overall log consumption, you’ll never be able to troubleshoot what’s happening in your Kubernetes environment.
In this blog post, you’ll learn about what logging is, different logging methodologies in Kubernetes, and how you can implement them in production.
What Is Logging?
Have you ever seen an error pop up on your screen when you were trying to install something or use a specific tool/platform?
Have you ever had to go on a computer and look at the output for a particular application?
Those are logs.
Logs contain information about the system, the operating system, the application, and many other aspects, as long as whatever you're working with supports logging. What's meant by that is every application has the ability to send out logs, but not all of them do. It's up to the developer to implement the functionality to save or output logs for the app.
Logging has been a lifesaver for almost everyone in the technology space. Whether you’re a Sysadmin, a cloud engineer, a developer, or anything in between, it’s almost guaranteed that you’ve looked at logs at some point to solve a specific problem.
In short, logs are the events that occur on a computer or an app.
Logs In Kubernetes
In Kubernetes, you’re going to care about two different types of logs:
- Cluster logs
- Kubernetes resource/object logs
Cluster logs are in regard to how the cluster is performing: whether the control plane is healthy, whether the worker nodes are healthy, and, depending on how they're installed, how the Kubernetes components are performing (Etcd, the Scheduler, the API Server, etc.). For example, in standard Kubeadm deployments, Etcd runs as a Pod, so its output would be considered a "Kubernetes resource/object log". However, if Etcd is configured across clusters running on its own servers, it may not be bootstrapped as a Pod, and its output would therefore be considered a "cluster log".
Kubernetes resource/object logs are any Kubernetes resource that’s running. You can get logs and events for various Kubernetes resources, but the bulk of what engineers usually look at are Pod and container logs as this is the primary location to give you information on the running application.
The quickest way to check logs for Pods is with the `kubectl logs pod_name` command. However, `kubectl logs` only works for Pods and no other Kubernetes resource.
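To make this concrete, here are a few common `kubectl logs` invocations; the Pod, container, and label names below are hypothetical placeholders.

```shell
# Stream (follow) logs from a single Pod
kubectl logs -f my-app-pod

# Logs from a specific container in a multi-container Pod
kubectl logs my-app-pod -c my-sidecar

# Logs from the previous (crashed) instance of a container
kubectl logs my-app-pod --previous

# Logs from all Pods matching a label selector
kubectl logs -l app=my-app --all-containers=true
```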
Different Types Of Logging Methods In Kubernetes
When you’re thinking about your logging strategy for a Kubernetes environment, there are a few different ways that you can output logs:
- Application Forwarding
- Sidecar
- Node Agent Forwarding
Let’s break these down.
Application Forwarding is done inside of the application code. For example, let's say you're writing a frontend app. Inside of the app, you can specify logic that says "send the logs from here to this system". Although that may sound straightforward, it's arguably the worst way to manage logs. Why? Because you're creating a dependency between the code and the logging system. What if the logging system changes? Are you going to implement this for each and every app, and for each piece of each app? This method should be avoided unless you have a ridiculously compelling reason to use it. For example, a compelling reason is a legacy application that already has the logging functionality embedded into it (although you should be planning how to change that functionality in the future).
The Sidecar method is when you implement a log aggregator inside a Kubernetes Pod. You end up having two containers in the same Pod: one container is the app itself, and the other is the logging system you're using. This is an "alright" method, and should definitely be used instead of Application Forwarding, but there's a better method - Node Agent Forwarding.
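As a rough sketch of the sidecar pattern (the image names and log path here are hypothetical), the Pod below runs the app container alongside a log-shipping container, with both sharing an `emptyDir` volume so the shipper can read what the app writes:

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  volumes:
  - name: app-logs
    emptyDir: {}
  containers:
  - name: app
    image: my-app:latest          # hypothetical app image that writes to /var/log/app
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  - name: log-shipper
    image: my-log-shipper:latest  # hypothetical log aggregator image
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
      readOnly: true
EOF
```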
The Node Agent Forwarding method uses a Pod that runs on each Kubernetes Worker Node. The job of the Pod is to read the log files of the containerized apps and send them to whichever logging tool/platform you're using. This is the best method for a few reasons: 1) You aren't implementing a sidecar, so each application Pod keeps doing the one job it should technically be doing. 2) It segregates the workload processes instead of creating dependencies, and in turn you have one Kubernetes resource (the log agent Pod) doing one job (sending logs).
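The node agent pattern is usually implemented as a DaemonSet, so exactly one agent Pod lands on every worker node. Below is a minimal sketch; the agent image is a placeholder (real tools like Fluent Bit or Fluentd ship their own manifests), and it mounts the node's `/var/log` directory so the agent can read every Pod's log files:

```shell
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - name: agent
        image: my-log-agent:latest   # placeholder for Fluent Bit, Fluentd, etc.
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
EOF
```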
Kubernetes Audit Logs
The thing about logs is that logs can be considered pretty much any output. Events, app logs, terminal output, exit events, and a lot of other pieces. Because of that, we can’t cover everything without this blog post turning into a whitepaper, so let’s stick with the most important.
To kick off the hands-on portion of this blog post, you’ll start with Kubernetes audit logs.
Kubernetes audit logs give you the ability to capture and view the activity that Kubernetes resources generate against the API server. For example, when you create a Kubernetes Deployment, a lot is going on.
- The container image is being pulled from a registry.
- The Pods are being scheduled on worker nodes.
- The Pods are either starting or failing and if they’re failing, there’s a reason as to why.
- Scaling on the Pods is done.
Because of that detail, you may want to specify to collect certain logs or maybe you want all of them.
For example, below is the `Policy` Kubernetes resource being used to create an audit policy that collects everything it possibly can from every single Kubernetes resource.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
```
First, let's see how this is done in an on-prem Kubernetes cluster bootstrapped with Kubeadm, as it's quite different compared to cloud services. Then, you'll see how auditing is configured for cloud services.
First, create a new YAML file for the policy.

```
sudo vim /etc/kubernetes/kubeadmpolicy.yaml
```
Next, specify a policy. For example, the policy below collects request/response-level logs for Pods, metadata for Pod logs and statuses, and skips logging for authenticated users hitting non-resource URLs like `/api` and `/version`.
```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/log", "pods/status"]
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
      - "/api*"
      - "/version"
```
Once saved, you need to update the API server. When bootstrapping a Kubernetes cluster with Kubeadm, all Manifests for the Pods that make up the control plane are under `/etc/kubernetes/manifests`. Open up the kube-apiserver Manifest.

```
sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
```
Add the following flags under the `command` section of the kube-apiserver configuration, which specify where you want the logs to go and the policy that you created above.

```yaml
    - --audit-log-path=/var/log/audit.log
    - --audit-policy-file=/etc/kubernetes/kubeadmpolicy.yaml
```
Next, specify the audit policy and the log file as volume mounts.

```yaml
    - mountPath: /etc/kubernetes/kubeadmpolicy.yaml
      name: audit
      readOnly: true
    - mountPath: /var/log/audit.log
      name: audit-log
      readOnly: false
```
The last step is to specify the `hostPath` volumes for the mount paths that you created previously.

```yaml
  - hostPath:
      path: /etc/kubernetes/kubeadmpolicy.yaml
      type: File
    name: audit
  - hostPath:
      path: /var/log/audit.log
      type: FileOrCreate
    name: audit-log
```
Once complete, restart the Kubelet.

```
sudo systemctl restart kubelet
```
You should now be able to see the logs at the following location.

```
tail -f /var/log/audit.log
```
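Each line in the audit log is a JSON-encoded audit event, so you can slice it with standard tools. As a sketch, assuming `jq` is installed on the node, this pulls out who did what to which resource:

```shell
# Show the user, verb, and resource for the last 20 audit events
sudo tail -n 20 /var/log/audit.log \
  | jq -r '[.user.username, .verb, .objectRef.resource // "-"] | @tsv'
```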
Azure Kubernetes Service
By default, auditing is enabled on AKS and an Audit Policy is in place to collect everything from all Kubernetes resources.
However, the functionality to view the audit logs isn’t on by default. To turn it on, do the following.
First, go to your AKS cluster and under Monitoring, click on Diagnostic settings. Then, click the blue + Add diagnostic setting button.
Under the Diagnostic settings, click on the Kubernetes Audit category. At this point, you’ll have an option on where you want to save the logs. For the purposes of this section, you can choose the Send to Log Analytics workspace option.
Now that the auditing is turned on, you can open up the analytics workspace and run a query to retrieve the events.
For example, the query below retrieves everything for Kubernetes audit logs.

```
AzureDiagnostics
| where Category == "kube-audit"
| project log_s
```
You can see an example of the output below.
AWS Elastic Kubernetes Service
For AWS, audit logging is not on by default. You can turn it on when you’re creating a Kubernetes cluster and then the audit logs will go to CloudWatch by default.
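If the cluster already exists, you can also enable audit logging after the fact with the AWS CLI. A sketch, assuming a cluster named `my-cluster` in `us-east-1`:

```shell
# Enable the audit log type on an existing EKS cluster;
# the logs will land in the /aws/eks/my-cluster/cluster CloudWatch log group
aws eks update-cluster-config \
  --region us-east-1 \
  --name my-cluster \
  --logging '{"clusterLogging":[{"types":["audit"],"enabled":true}]}'
```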
Kubernetes Events
In this section, you're going to learn how to retrieve Kubernetes Events, which are, in a way, logs in themselves.
An event is a Kubernetes resource/object that occurs when a change happens to another Kubernetes resource/object, whether it's Pods, Services, Nodes, etc.
You can run the `kubectl get events` command to see what's happening inside of a cluster.

```
kubectl get events
LAST SEEN   TYPE     REASON                    OBJECT              MESSAGE
5m1s        Normal   NodeHasSufficientMemory   node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeHasSufficientMemory
5m1s        Normal   NodeHasNoDiskPressure     node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeHasNoDiskPressure
5m1s        Normal   NodeHasSufficientPID      node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeHasSufficientPID
5m1s        Normal   NodeReady                 node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeReady
5m11s       Normal   NodeNotReady              node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeNotReady
```
However, there isn't a whole lot going on because the `kubectl get events` command was run on a new cluster. Let's throw some traffic at it to generate some events.
Run the following YAML, which deploys an Nginx Kubernetes Deployment.

```
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        ports:
        - containerPort: 80
EOF
```
If you run the `kubectl get events` command again, you'll see a few more events as the Pods for the Nginx Deployment were created and the Nginx container image was pulled down from the registry.

```
mike@stdkubeadmcp:~$ kubectl get events
LAST SEEN   TYPE     REASON                    OBJECT                                   MESSAGE
1s          Normal   Scheduled                 pod/nginx-deployment-574db6c95f-fr8qb    Successfully assigned default/nginx-deployment-574db6c95f-fr8qb to stdkubeadmwn
1s          Normal   Pulling                   pod/nginx-deployment-574db6c95f-fr8qb    Pulling image "nginx:latest"
1s          Normal   Scheduled                 pod/nginx-deployment-574db6c95f-llzzd    Successfully assigned default/nginx-deployment-574db6c95f-llzzd to stdkubeadmwn
1s          Normal   Pulling                   pod/nginx-deployment-574db6c95f-llzzd    Pulling image "nginx:latest"
1s          Normal   SuccessfulCreate          replicaset/nginx-deployment-574db6c95f   Created pod: nginx-deployment-574db6c95f-fr8qb
1s          Normal   SuccessfulCreate          replicaset/nginx-deployment-574db6c95f   Created pod: nginx-deployment-574db6c95f-llzzd
1s          Normal   ScalingReplicaSet         deployment/nginx-deployment              Scaled up replica set nginx-deployment-574db6c95f to 2
6m16s       Normal   NodeHasSufficientMemory   node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeHasSufficientMemory
6m16s       Normal   NodeHasNoDiskPressure     node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeHasNoDiskPressure
6m16s       Normal   NodeHasSufficientPID      node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeHasSufficientPID
6m16s       Normal   NodeReady                 node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeReady
6m26s       Normal   NodeNotReady              node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeNotReady
```
As you can imagine, running `kubectl get events` can end up retrieving a ton of data, depending on how busy the cluster is.
To get a more specific event list, you can target a specific Kubernetes Resource.
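For example, `kubectl get events` supports field selectors and sorting, which helps narrow things down; the namespace name below is a placeholder.

```shell
# Only events about Pods
kubectl get events --field-selector involvedObject.kind=Pod

# Only Warning events in a given namespace
kubectl get events -n my-namespace --field-selector type=Warning

# Sort events by most recent
kubectl get events --sort-by='.lastTimestamp'
```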
In the example below, the `kubectl describe` command is used to see the events (and many other details) of the Kubernetes Deployment.

```
kubectl describe deployment nginx-deployment
```
If you scroll down on the output of the `describe` command, you'll see something similar to the output below, which shows the events for the resource.

```
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  38s   deployment-controller  Scaled up replica set nginx-deployment-574db6c95f to 2
```
Logging in Kubernetes is much like any other method of logging - collect events, view them, and utilize them to troubleshoot or understand what’s happening in a system and in an application. The thing to remember here is to ensure that you send the logs to a proper location that can be utilized by the entire team.