Securing all aspects of Kubernetes is important, yet one of the largest entry points for attackers is Pods that aren’t configured properly. That’s why Kubernetes Pods have so many different options from a security perspective.
Whether you’re thinking about policy enforcement, what users are allowed to access Pods, what Service Accounts can run Pods, or what traffic can flow through to Pods, no Kubernetes environment is ready until certain steps are taken.
In this blog post, you’ll learn about those exact steps so that your environment mitigates as many risks as possible.
💡 We won’t talk about RBAC, code, or cluster security in this blog post. It’ll be all about the Pods themselves.
Why Securing Pods Is Important
There are two entry points into a Kubernetes environment for a bad actor:
- Pods
- Cluster
💡 This will focus on the Pod piece, but if you look at my “The 4C’s Of Kubernetes Security” blog post, you’ll learn about the cluster piece.
Pods are one of the easiest ways to break into a system because there are three levels to consider: the container itself within the Pod, how the Pod was deployed, and the Pod itself.
The container is where one piece of the application is running. It’s also a great entry point via the base image that was used. Base images are pulled in by whatever method you use to build a container image (like a Dockerfile or a Cloud Native Buildpack), and unfortunately, a lot of them have unresolved security issues. To test this out, feel free to run a scan against a pre-built container image and see for yourself. If a container image has a vulnerability, it could be used to interact with your environment.
How a Pod is deployed is absolutely crucial. As an example, you have two methods of authenticating and authorizing a Pod deployment - the `default` Service Account and a dedicated Service Account. If you don’t specify a Service Account within your Kubernetes Manifest, the Pod(s) are deployed with the `default` Service Account, which the cluster creates automatically (hence the name). Therefore, if the `default` Service Account gets compromised, so does every single deployment that relies on it.
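If you want to see what avoiding the `default` Service Account looks like in practice, here’s a minimal sketch (the names are hypothetical) of a dedicated Service Account, with API token auto-mounting disabled, referenced from a Pod spec:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: webapp-sa # hypothetical name
automountServiceAccountToken: false # don't mount an API token unless the app needs one
---
apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  serviceAccountName: webapp-sa # use the dedicated account, not default
  containers:
    - name: webapp
      image: nginx:1.25 # pin a version instead of latest
```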
Third is the Pod itself. A Pod is an entry point in its own right. It can be used to run scripts via containers or to authenticate to other parts of the cluster, which is why authentication, authorization, and proper Service Accounts are so important. An attacker who can run a sidecar container within a Pod could take down your environment.
In the next few sections, you’ll see a few methods that you can use to properly deploy a Pod in a secure fashion.
SecurityContext
The SecurityContext is a set of security settings that can be added at both the Pod level and the Container level (you’ll typically see both).
The breakdown of each implementation that you can add is as follows:
- `runAsNonRoot`: Run as a user that’s not root/admin. This is the best practice from a security perspective.
- `runAsUser` (or `runAsGroup`): Specify the user/group you want to run the Pod/Container as.
- `fsGroup`: Changes the group of all files in a volume when it’s mounted to a Pod.
- `allowPrivilegeEscalation`: Controls whether a process can gain more privileges than its parent. This is a huge attack point, so you’ll want it set to `false`.
- `privileged`: Runs a container with privileged permissions, which are the same permissions as the host (think admin).
- `readOnlyRootFilesystem`: Mounts the container’s filesystem as read-only (no write capabilities).
You can also set SELinux, AppArmor, and Seccomp capabilities. You can learn more about those in the Kubernetes SecurityContext documentation.
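To make the Pod-level versus Container-level distinction concrete, here’s a minimal sketch (the values are illustrative) with user and group settings at the Pod level and the restrictive flags on the container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secured-pod # hypothetical name
spec:
  securityContext: # Pod level: applies to every container in the Pod
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000 # group ownership applied to mounted volumes
  containers:
    - name: app
      # the image must support running as a non-root user
      image: nginxinc/nginx-unprivileged:1.25
      securityContext: # Container level: overrides/extends the Pod level
        allowPrivilegeEscalation: false
        privileged: false
```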
Essentially, the SecurityContext, aside from Network Policies (which you’ll learn about later), is the absolute best way to secure Pods.
Demo
Let’s jump into some hands-on.
Below is a Deployment object/kind that has all of the security features we would like via the SecurityContext.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
        - name: nginxdeployment
          image: nginx:latest
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            privileged: false
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
          ports:
            - containerPort: 80
```
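Deploy the manifest (the filename is whatever you saved it as):

```bash
kubectl apply -f nginx-deployment.yaml
```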
However, sometimes it may not work out as you want from a security perspective based on what base image you’re using.
For example, chances are you’ll see an error like the one below.
```bash
kubectl get pods --watch

NAME                                READY   STATUS   RESTARTS      AGE
nginx-deployment-7fdff64ddd-ntpx9   0/1     Error    2 (27s ago)   29s
nginx-deployment-7fdff64ddd-rpwwl   0/1     Error    2 (26s ago)   29s
```
Digging in a bit deeper, you’ll notice that the error comes from the `readOnlyRootFilesystem` setting.
```bash
kubectl logs nginx-deployment-7fdff64ddd-ntpx9

/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2024/06/02 15:00:04 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (30: Read-only file system)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (30: Read-only file system)
```
Although the read-only setting is what we want from a security perspective, some applications need write access. Engineers must understand this trade-off and mitigate as much as possible, as shown in the sketch below.
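One way to keep the read-only root filesystem and still give nginx the write access it needs is to mount writable `emptyDir` volumes at just the paths it writes to. This is a sketch: the `/var/cache/nginx` path comes from the fatal error above, and `/var/run` (for the pid file) is a common addition for this image, so adjust the list to whatever your own logs complain about:

```yaml
    spec:
      containers:
        - name: nginxdeployment
          image: nginx:latest
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            privileged: false
          volumeMounts:
            - name: nginx-cache
              mountPath: /var/cache/nginx # the mkdir() that failed above
            - name: nginx-run
              mountPath: /var/run # nginx writes its pid file here
      volumes:
        - name: nginx-cache
          emptyDir: {}
        - name: nginx-run
          emptyDir: {}
```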
Alternatively, if you remove the `readOnlyRootFilesystem` line entirely, you should now see the Pods running.
```bash
kubectl get pods --watch

NAME                               READY   STATUS    RESTARTS   AGE
nginx-deployment-b68647d85-ddwkn   1/1     Running   0          4s
nginx-deployment-b68647d85-q2lzx   1/1     Running   0          4s
```
Pod Security Standards
PSS, or Pod Security Standards, are a set of standards that you should follow when deploying Pods. The goal of PSS is to cover the typical spectrum of security needs.
There are three standards:
- Privileged
- Baseline
- Restricted
Privileged is an unrestricted policy. It’s considered the “free-for-all” policy.
Baseline is a middle ground between privileged and restricted. It prevents known escalations for Pods.
Restricted is heavy-duty enforcement. It follows all of the security best practices for Pods.
Source: https://kubernetes.io/docs/concepts/security/pod-security-standards/
You can dive deeper into this in the Pod Security Standards documentation linked above.
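These standards are enforced by the built-in Pod Security Admission controller, which you configure per Namespace via labels. For example (the Namespace name is illustrative):

```bash
kubectl create namespace webapp
kubectl label namespace webapp \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted
```

With the `enforce` label set to `restricted`, Pods that violate the Restricted standard are rejected at admission time.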
Policy Enforcement
By default, Pods have free rein to do whatever they want, and more importantly, so do engineers. For example, an engineer can deploy a Kubernetes Manifest to production with a container image that uses the `latest` tag, which means they could accidentally deploy an `alpha` or `beta` build. There’s nothing stopping anyone from doing this, which could be detrimental to your environment.
Because of that, there must be a way to have blockers and enforcers that not only disallow security issues but overall bad practices that could ultimately lead to misconfigurations and therefore become security issues.
Policy Enforcement allows you to configure policies that the Admission Controller enforces, blocking any request that violates them.
The two popular Policy Enforcers right now are:
- Open Policy Agent (OPA) with Gatekeeper enabled. Gatekeeper is the middle-ground between OPA and Kubernetes because OPA doesn’t know “how to speak” Kubernetes and vice-versa. Think of Gatekeeper like a Shim.
- Kyverno, which is Kubernetes-native, so it doesn’t require a Shim. Kyverno now works outside of Kubernetes as well, although when it was originally created, it was only for Kubernetes. (See the sample Kyverno policy sketched after this list.)
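As a point of comparison with the Gatekeeper demo below, here’s a sketch of a Kyverno ClusterPolicy that blocks privileged containers. It’s modeled on Kyverno’s published disallow-privileged-containers policy, so treat the exact pattern as an approximation rather than a drop-in:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce # block requests instead of just auditing them
  rules:
    - name: privileged-containers
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privileged containers are not allowed."
        pattern:
          spec:
            containers:
              # =() marks securityContext as optional; if present, privileged must be false
              - =(securityContext):
                  =(privileged): "false"
```

No Rego required, which is the main appeal of the Kubernetes-native approach.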
Demo
Let’s take a look at how Policy Enforcement works with OPA.
First, add the Gatekeeper repo.
```bash
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
```
Install Gatekeeper.
```bash
helm install gatekeeper/gatekeeper --name-template=gatekeeper --namespace gatekeeper-system --create-namespace
```
Once Gatekeeper is installed, you can start configuring policies.
The first step is creating a Config. The Config tells Gatekeeper which resources it’s allowed to manage policies on. In this case, you’re telling Gatekeeper it can create and manage policies for Pods.
```yaml
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: "gatekeeper-system"
spec:
  sync:
    syncOnly:
      - group: ""
        version: "v1"
        kind: "Pod"
```
Next is the policy itself. The ConstraintTemplate below creates a policy to block privileged containers via the SecurityContext.
💡 The policy is written in Rego, which is the policy language for OPA.
```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: blockprivcontainers
  annotations:
    description: Block Pods from using privileged containers.
spec:
  crd:
    spec:
      names:
        kind: blockprivcontainers # this must match metadata.name above
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8spspprivileged

        violation[{"msg": msg, "details": {}}] {
          c := input_containers[_]
          c.securityContext.privileged
          msg := sprintf("Privileged container is not allowed: %v, securityContext: %v", [c.name, c.securityContext])
        }

        input_containers[c] {
          c := input.review.object.spec.containers[_]
        }

        input_containers[c] {
          c := input.review.object.spec.initContainers[_]
        }
```
Once the ConstraintTemplate (policy) is written, you can enforce it with the object below, whose kind is the one the ConstraintTemplate you just created defines.
```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: blockprivcontainers
metadata:
  name: blockprivcontainers
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    annotation: "priv-containers"
```
To test whether this works, run the following Deployment, which sets `privileged: true` in the securityContext. It should fail.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
        - name: nginxdeployment
          image: nginx:1.23.1
          ports:
            - containerPort: 80
          securityContext:
            privileged: true
```
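One subtlety to be aware of: because the constraint matches the Pod kind rather than Deployments, the `kubectl apply` of the Deployment itself can succeed while the Pods its ReplicaSet tries to create are denied. If you don’t see a rejection on apply, check the ReplicaSet’s events for the violation message, for example:

```bash
kubectl describe replicaset -l app=nginxdeployment
kubectl get events --field-selector reason=FailedCreate
```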
Delete the previous Deployment and run the following Deployment, which should pass because `privileged` is set to `false`.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
        - name: nginxdeployment
          image: nginx:1.23.1
          ports:
            - containerPort: 80
          securityContext:
            privileged: false
```
Network Policies
In the SecurityContext section, you learned how to secure Pods at the Pod level itself: who can run Pods, how Pods run, and what access Pods have.
In the Policy Enforcement section, you learned how to set rules for Pods.
NetworkPolicies are similar to both the SecurityContext and the Policy Enforcement piece in terms of what Pods can and can’t do, except Network Policies manage this at the network level. The idea with Network Policies is that you manage all traffic from both the Ingress and Egress layers.
By default, the internal Kubernetes network is flat, which means all Pods can talk to each other regardless of whether they’re in the same Namespace. Therefore, it’s crucial that you configure Network Policies.
In the sub-section below you’ll find a demo of how Network Policies work, but if you want to see other examples, here’s a link that can provide more information and more demos.
Demo
Run the following Pods:
```bash
kubectl run busybox1 --image=busybox --labels app=busybox1 -- sleep 3600
kubectl run busybox2 --image=busybox --labels app=busybox2 -- sleep 3600
```
Obtain the IP address of the Pods.
```bash
kubectl get pods -o wide
```
Run a ping from busybox2 against busybox1.

```bash
kubectl exec -ti busybox2 -- ping -c3 ip_of_busybox_one
```
You should see that the ping works just fine and there’s 0 packet loss.
Next, let’s configure a Network Policy that denies all ingress traffic to busybox1.
```bash
kubectl apply -f - <<EOF
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: web-deny-all
spec:
  podSelector:
    matchLabels:
      app: busybox1
  ingress: []
EOF
```
Run the ping again.
```bash
kubectl exec -ti busybox2 -- ping -c3 ip_of_busybox_one
```
You should now see that there’s 100% packet loss. Keep in mind that Network Policies are only enforced if your cluster’s network plugin supports them (for example, Calico or Cilium); with a plugin that doesn’t, the policy is silently ignored.
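Network Policies aren’t only for denying everything. If, for instance, you wanted busybox1 to accept traffic from busybox2 and nothing else, a policy like this sketch would do it:

```yaml
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-from-busybox2
spec:
  podSelector:
    matchLabels:
      app: busybox1
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: busybox2
```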
All Traffic Is Via An API
Remember, all Kubernetes requests run through the Kubernetes API, which is served by the API Server. Because of that, every request that comes in for a particular workload must pass through the Admission Controllers.
Admission Controllers are used to either mutate or validate an API request when it comes in from an engineer or another entity. If the API request isn’t allowed, it gets blocked. For example, if a policy says that a Pod cannot use the `latest` container image tag, the request won’t make it past the Admission Controller. It’s all about validation.
Policy enforcers like OPA and Kyverno work by configuring policies that stop a request from passing admission if it doesn’t meet the policy’s guidelines.
Essentially, Admission Controllers either allow a request or deny a request due to a policy that’s in place.
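This is also why the Rego in the Gatekeeper demo reads from `input.review.object`: the policy is evaluated against the AdmissionReview payload that the API Server sends to the admission webhook. A heavily trimmed sketch of that payload (rendered as YAML, most fields omitted):

```yaml
apiVersion: admission.k8s.io/v1
kind: AdmissionReview
request:
  operation: CREATE
  object: # this is what the Rego sees as input.review.object
    kind: Pod
    spec:
      containers:
        - name: nginxdeployment
          securityContext:
            privileged: true # the field the policy checks
```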