Applications and systems writing log output to some location on the operating system or a system path has been the savior of every Sysadmin and Developer since the beginning of computing.
In Kubernetes, it’s no different.
Without proper logging and overall log consumption, you’ll never be able to troubleshoot what’s happening in your Kubernetes environment.
In this blog post, you’ll learn about what logging is, different logging methodologies in Kubernetes, and how you can implement them in production.
What Is Logging?
Have you ever seen an error pop up on your screen when you were trying to install something or use a specific tool/platform?
Have you ever had to go on a computer and look at the output for a particular application?
Those are logs.
Logs contain information about the system, the operating system, the application, and many other aspects, as long as whatever you're working with supports logging. What's meant by that is every application has the ability to send out logs, but not all of them do. It's up to the developer to implement the functionality to save or output logs for the app.
Logging has been a lifesaver for almost everyone in the technology space. Whether you’re a Sysadmin, a cloud engineer, a developer, or anything in between, it’s almost guaranteed that you’ve looked at logs at some point to solve a specific problem.
In short, logs are the events that occur on a computer or an app.
Logs In Kubernetes
In Kubernetes, you’re going to care about two different types of logs:
- Cluster logs
- Kubernetes resource/object logs
Cluster logs are in regard to how the cluster is performing: whether the control plane is healthy, whether the worker nodes are healthy, and, depending on how they're installed, how the Kubernetes components are performing (Etcd, the Scheduler, the API Server, etc.). For example, in standard Kubeadm deployments, Etcd runs as a Pod, so its output would be considered a "Kubernetes resource/object log". However, if Etcd is configured across clusters running on its own servers, it may not be bootstrapped as a Pod, and its output would therefore be considered a "cluster log".
Kubernetes resource/object logs are any Kubernetes resource that’s running. You can get logs and events for various Kubernetes resources, but the bulk of what engineers usually look at are Pod and container logs as this is the primary location to give you information on the running application.
The quickest way to check logs for Pods is with the `kubectl logs pod_name` command. However, `kubectl logs` only works for Pods and no other Kubernetes resource.
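To make this concrete, here are a few common `kubectl logs` invocations; the Pod, container, and label names below are hypothetical placeholders.

```shell
# Stream (follow) logs from a single Pod
kubectl logs -f my-app-pod

# Logs from a specific container in a multi-container Pod
kubectl logs my-app-pod -c my-sidecar

# Logs from the previous (crashed) instance of a container
kubectl logs my-app-pod --previous

# Logs from all Pods matching a label selector
kubectl logs -l app=my-app --all-containers=true
```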
Different Types Of Logging Methods In Kubernetes
When you’re thinking about your logging strategy for a Kubernetes environment, there are a few different ways that you can output logs:
- Application Forwarding
- Sidecar
- Node Agent Forwarding
Let’s break these down.
Application Forwarding is done inside of the application code. For example, let's say you're writing a frontend app. Inside of the app, you can specify logic that says "send the logs from here to this system". Although that may sound straightforward, it's arguably the worst way to manage logs. Why? Because you're creating a dependency between the code and the logging system. What if the logging system changes? Are you going to implement this for each and every app, and for each piece of each app? This method should be avoided unless you have a ridiculously compelling reason to use it. For example, a compelling reason is a legacy application that already has the logging functionality embedded into it (although you should be planning how to change that functionality in the future).
The Sidecar method is when you implement a log aggregator inside a Kubernetes Pod. You end up having two containers in the same Pod: one container is the app itself, and the other is the logging system you're using. This is an "alright" method, and should definitely be used instead of Application Forwarding, but there's a better method - Node Agent Forwarding.
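As a rough sketch of the sidecar pattern (the image names and log path here are hypothetical), the Pod below runs the app container alongside a log-shipping container, with both sharing an `emptyDir` volume so the shipper can read what the app writes:

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  volumes:
  - name: app-logs
    emptyDir: {}
  containers:
  - name: app
    image: my-app:latest          # hypothetical app image that writes to /var/log/app
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  - name: log-shipper
    image: my-log-shipper:latest  # hypothetical log aggregator image
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
      readOnly: true
EOF
```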
The Node Agent Forwarding method uses a Pod that runs on each Kubernetes Worker Node. The job of the Pod is to read the log files of the containerized apps and send them to whichever logging tool/platform you're using. This is the best method for a few reasons: 1) You aren't implementing a sidecar, so each application Pod keeps doing the one job it should technically be doing. 2) It segregates the workload processes instead of creating dependencies, and in turn you have one Kubernetes resource (the log agent Pod) doing one job (sending logs).
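The node agent pattern is usually implemented as a DaemonSet, so exactly one agent Pod lands on every worker node. Below is a minimal sketch; the agent image is a placeholder (real tools like Fluent Bit or Fluentd ship their own manifests), and it mounts the node's `/var/log` directory so the agent can read every Pod's log files:

```shell
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - name: agent
        image: my-log-agent:latest   # placeholder for Fluent Bit, Fluentd, etc.
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
EOF
```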
Kubernetes Audit Logs
The thing about logs is that logs can be considered pretty much any output. Events, app logs, terminal output, exit events, and a lot of other pieces. Because of that, we can’t cover everything without this blog post turning into a whitepaper, so let’s stick with the most important.
To kick off the hands-on portion of this blog post, you’ll start with Kubernetes audit logs.
Kubernetes audit logs give you the ability to capture and view the activity that Kubernetes resources generate against the API server. For example, when you create a Kubernetes Deployment, a lot is going on.
- The container image is being pulled from a registry.
- The Pods are being scheduled on worker nodes.
- The Pods are either starting or failing and if they’re failing, there’s a reason as to why.
- Scaling on the Pods is done.
Because of that detail, you may want to specify to collect certain logs or maybe you want all of them.
For example, below is the `Policy` Kubernetes resource being used to create an audit policy that collects everything it possibly can from every single Kubernetes resource.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
```
First, let's see how this is done in an on-prem Kubernetes cluster bootstrapped with Kubeadm, as it's quite different compared to cloud services. Then, you'll see how auditing is configured for cloud services.
First, create a new YAML file for the policy.

```
sudo vim /etc/kubernetes/kubeadmpolicy.yaml
```
Next, specify a policy. For example, the policy below collects request/response-level logs for Pods, metadata for Pod logs and statuses, and skips logging for authenticated users hitting non-resource URLs like `/api` and `/version`.
```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/log", "pods/status"]
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
      - "/api*"
      - "/version"
```
Once saved, you need to update the API server. When bootstrapping a Kubernetes cluster with Kubeadm, all Manifests for the Pods that make up the control plane are under `/etc/kubernetes/manifests`. Open up the kube-apiserver Manifest.

```
sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
```
Add the following flags under the `command` section of the kube-apiserver configuration, which specify where you want the logs to go and the policy that you created above.

```yaml
    - --audit-log-path=/var/log/audit.log
    - --audit-policy-file=/etc/kubernetes/kubeadmpolicy.yaml
```
Next, specify the audit policy and the log file as volume mounts.

```yaml
    - mountPath: /etc/kubernetes/kubeadmpolicy.yaml
      name: audit
      readOnly: true
    - mountPath: /var/log/audit.log
      name: audit-log
      readOnly: false
```
The last step is to specify the `hostPath` volumes for the mount paths that you created previously.

```yaml
  - hostPath:
      path: /etc/kubernetes/kubeadmpolicy.yaml
      type: File
    name: audit
  - hostPath:
      path: /var/log/audit.log
      type: FileOrCreate
    name: audit-log
```
Once complete, restart the Kubelet.

```
sudo systemctl restart kubelet
```
You should now be able to see the logs at the following location.

```
tail -f /var/log/audit.log
```
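Each line in the audit log is a JSON-encoded audit event, so you can slice it with standard tools. As a sketch, assuming `jq` is installed on the node, this pulls out who did what to which resource:

```shell
# Show the user, verb, and resource for the last 20 audit events
sudo tail -n 20 /var/log/audit.log \
  | jq -r '[.user.username, .verb, .objectRef.resource // "-"] | @tsv'
```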
Azure Kubernetes Service
By default, auditing is enabled on AKS and an Audit Policy is in place to collect everything from all Kubernetes resources.
However, the functionality to view the audit logs isn’t on by default. To turn it on, do the following.
First, go to your AKS cluster and under Monitoring, click on Diagnostic settings. Then, click the blue + Add diagnostic setting button.
Under the Diagnostic settings, click on the Kubernetes Audit category. At this point, you’ll have an option on where you want to save the logs. For the purposes of this section, you can choose the Send to Log Analytics workspace option.
Now that the auditing is turned on, you can open up the analytics workspace and run a query to retrieve the events.
For example, the query below retrieves everything for Kubernetes audit logs.

```
AzureDiagnostics
| where Category == "kube-audit"
| project log_s
```
You can see an example of the output below.
AWS Elastic Kubernetes Service
For AWS, audit logging is not on by default. You can turn it on when you’re creating a Kubernetes cluster and then the audit logs will go to CloudWatch by default.
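If the cluster already exists, you can also enable audit logging after the fact with the AWS CLI. A sketch, assuming a cluster named `my-cluster` in `us-east-1`:

```shell
# Enable the audit log type on an existing EKS cluster;
# the logs will land in the /aws/eks/my-cluster/cluster CloudWatch log group
aws eks update-cluster-config \
  --region us-east-1 \
  --name my-cluster \
  --logging '{"clusterLogging":[{"types":["audit"],"enabled":true}]}'
```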
Kubernetes Events
In this section, you're going to learn how to retrieve Kubernetes Events, which are, in a way, logs in themselves.
An event is a Kubernetes resource/object that occurs when a change happens to another Kubernetes resource/object, whether it's Pods, Services, Nodes, etc.
You can run the `kubectl get events` command to see what's happening inside of a cluster.

```
kubectl get events
LAST SEEN   TYPE     REASON                    OBJECT              MESSAGE
5m1s        Normal   NodeHasSufficientMemory   node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeHasSufficientMemory
5m1s        Normal   NodeHasNoDiskPressure     node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeHasNoDiskPressure
5m1s        Normal   NodeHasSufficientPID      node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeHasSufficientPID
5m1s        Normal   NodeReady                 node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeReady
5m11s       Normal   NodeNotReady              node/stdkubeadmwn   Node stdkubeadmwn status is now: NodeNotReady
```
However, there isn't a whole lot going on because the `kubectl get events` command was run on a new cluster. Let's throw some traffic at it to generate some events.
Run the following YAML, which deploys an Nginx Kubernetes Deployment.

```
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        ports:
        - containerPort: 80
EOF
```
If you run the `kubectl get events` command again, you'll see a few more events as the Pods for the Nginx Deployment were created and the Nginx container image was pulled down from the registry.

```
mike@stdkubeadmcp:~$ kubectl get events
LAST SEEN   TYPE     REASON                    OBJECT                                   MESSAGE
1s          Normal   Scheduled                 pod/nginx-deployment-574db6c95f-fr8qb    Successfully assigned default/nginx-deployment-574db6c95f-fr8qb to stdkubeadmwn
1s          Normal   Pulling                   pod/nginx-deployment-574db6c95f-fr8qb    Pulling image "nginx:latest"
1s          Normal   Scheduled                 pod/nginx-deployment-574db6c95f-llzzd    Successfully assigned default/nginx-deployment-574db6c95f-llzzd to stdkubeadmwn
1s          Normal   Pulling                   pod/nginx-deployment-574db6c95f-llzzd    Pulling image "nginx:latest"
1s          Normal   SuccessfulCreate          replicaset/nginx-deployment-574db6c95f   Created pod: nginx-deployment-574db6c95f-fr8qb
1s          Normal   SuccessfulCreate          replicaset/nginx-deployment-574db6c95f   Created pod: nginx-deployment-574db6c95f-llzzd
1s          Normal   ScalingReplicaSet         deployment/nginx-deployment              Scaled up replica set nginx-deployment-574db6c95f to 2
6m16s       Normal   NodeHasSufficientMemory   node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeHasSufficientMemory
6m16s       Normal   NodeHasNoDiskPressure     node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeHasNoDiskPressure
6m16s       Normal   NodeHasSufficientPID      node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeHasSufficientPID
6m16s       Normal   NodeReady                 node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeReady
6m26s       Normal   NodeNotReady              node/stdkubeadmwn                        Node stdkubeadmwn status is now: NodeNotReady
```
As you can imagine, running `kubectl get events` can end up retrieving a ton of data, depending on how busy the cluster is.
To get a more specific event list, you can target a specific Kubernetes Resource.
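For example, `kubectl get events` supports field selectors and sorting, which helps narrow things down; the namespace name below is a placeholder.

```shell
# Only events about Pods
kubectl get events --field-selector involvedObject.kind=Pod

# Only Warning events in a given namespace
kubectl get events -n my-namespace --field-selector type=Warning

# Sort events by most recent
kubectl get events --sort-by='.lastTimestamp'
```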
In the example below, the `kubectl describe` command is used to see the events (and many other details) of the Kubernetes Deployment.

```
kubectl describe deployment nginx-deployment
```
If you scroll down on the output of the `describe` command, you'll see something similar to the output below, which shows the events for the resource.

```
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  38s   deployment-controller  Scaled up replica set nginx-deployment-574db6c95f to 2
```
Logging in Kubernetes is much like any other method of logging - collect events, view them, and utilize them to troubleshoot or understand what’s happening in a system and in an application. The thing to remember here is to ensure that you send the logs to a proper location that can be utilized by the entire team.