Peter Benjamin (they/them)

Posted on Jul 19, 2018 • Edited on Jul 25, 2018

Kubernetes Security Best-Practices

#docker #kubernetes #security

Overview
Secure Baseline
Authentication
Authorization
Admission Controls
Impersonation
Pod Security Policies
Network Policies
Additional Security Measures
Additional References
Conclusion

Overview

So, you are all-in on container technology. You may be spinning up a container for your application and another container for the database locally on your machine, but then what? You need systems to build your containers. You need systems to deploy your containers. You need systems to run your containers. How would you manage containers at this scale?

Enter, Kubernetes!

Kubernetes is the new Application Server. Kubernetes provides an API interface for dev teams, ops teams, and even security teams to interact with applications and the platform.

Kubernetes is a fast-moving open-source project with constant progress being made. So, my goal in this article is to cover some common security mistakes I have observed and offer some general best-practices around securing Kubernetes clusters and workloads.

Legend:

Icon	Meaning
❌	Not Recommended
🗒️	Rationale
✅	Recommendation
⚠️	Warning

Secure Baseline

✅ Ensure that your underlying hosts are hardened and secure. I recommend CIS benchmarks as a starting point.

✅ Ensure that Docker itself is configured per security best-practices. Check out my previous article: Docker Security Best-Practices

✅ Ensure that you're starting off with Kubernetes with a secure baseline.

Center for Internet Security (CIS) maintains documentation available for free.
The fine folks at Aqua Security also open-sourced an automated checker based on CIS recommendations. Check it out: kube-bench

# recommended - on master node
$ kubectl run                         \
    --rm                              \
    -it                               \
    kube-bench-master                 \
    --image=aquasec/kube-bench:latest \
    --restart=Never                   \
    --overrides="{ \"apiVersion\": \"v1\", \"spec\": { \"hostPID\": true, \"nodeSelector\": { \"kubernetes.io/role\": \"master\" }, \"tolerations\": [ { \"key\": \"node-role.kubernetes.io/master\", \"operator\": \"Exists\", \"effect\": \"NoSchedule\" } ] } }"               \
    -- master                         \
    --version 1.8

# recommended - on worker nodes
$ kubectl run                         \
    --rm                              \
    -it                               \
    kube-bench-node                   \
    --image=aquasec/kube-bench:latest \
    --restart=Never                   \
    --overrides="{ \"apiVersion\": \"v1\", \"spec\": { \"hostPID\": true } }" \
    -- node                           \
    --version 1.8

Authentication

Most interactions with Kubernetes are done by talking to the control plane, specifically the kube-apiserver component of the control plane. Requests pass through 3 steps in the kube-apiserver before the request is served or rejected: Authentication, Authorization, and Admission Control. Once requests pass these 3 steps, kube-apiserver communicates with Kubelets over the network. Therefore, Kubelets also have to check authentications and authorizations as well.

The behavior of kube-apiserver and kubelet can be controlled or modified by launching them with certain command-line flags. The full list of supported command-line flags are documented here:

Let's examine some common Kubernetes authentication security mistakes:

❌ By default, anonymous authentication is enabled

🗒️ Kubernetes allows anonymous authentication out-of-the-box. There are a combination of kube-apiserver settings that render anonymous authentication safe, like --authorization-mode=RBAC because you would need to explicitly grant RBAC privileges to system:anonymous user and system:unauthenticated group.
Note: Granting RBAC privileges to * user or * group do not include anonymous users.

✅ Disable anonymous authentication by passing the --anonymous-auth=false flag

# not recommended - on master node
$ kube-apiserver       \
    <... other flags>  \
    --anonymous-auth=true

# recommended - on master node
$ kube-apiserver       \
    <... other flags>  \
    --anonymous-auth=false

# not recommended - on worker nodes
$ kubelet             \
    <... other flags> \
    --anonymous-auth=true

# recommended - on worker nodes
$ kubelet             \
    <... other flags> \
    --anonymous-auth=false

❌ Running kube-apiserver with --insecure-port=<PORT>

🗒️ In older versions of Kubernetes, you could run kube-apiserver with an API port that does not have any protections around it

✅ Disable insecure port by passing the --insecure-port=0 flag. In recent versions, this has been disabled by default with the intention of completely deprecating it

# not recommended
$ kube-apiserver         \
    <... other flags>    \
    --insecure-port=6443

# recommended
$ kube-apiserver      \
    <... other flags> \
    --insecure-port=0

✅ Prefer OpenID Connect or X509 Client Certificate-based authentication strategies over the others when authenticating users

🗒️ Kubernetes supports different authentication strategies:

X509 client certs: decent authentication strategy, but you'd have to address renewing and redistributing client certs on a regular basis
Static Tokens: avoid them due to their non-ephemeral nature
Bootstrap Tokens: same as static tokens above
Basic Authentication: avoid them due to credentials being transmitted over the network in cleartext
Service Account Tokens: should not be used for end-users trying to interact with Kubernetes clusters, but they are the preferred authentication strategy for applications & workloads running on Kubernetes
OpenID Connect (OIDC) Tokens: best authentication strategy for end users as OIDC integrates with your identity provider (e.g. AD, AWS IAM, GCP IAM ...etc)

# recommended - OIDC
$ kube-apiserver                                                        \
    <... other flags>                                                   \
    --oidc-issuer-url="https://domain/.well-known/openid-configuration" \
    --oidc-client-id="example"

# recommended - X509 cert
$ kube-apiserver                               \
    <... other flags>                          \
    --client-ca-file=/path/to/ca.crt           \
    --tls-cert-file=/path/to/server.crt        \
    --tls-private-key-file=/path/to/server.key

Authorization

❌ By default, authorization mode is to always authorize all requests to kube-apiserver

✅ Enable Role-Based Access Controls

🗒️ Kubernetes supports different authorization strategies, but Role-Based Access Control (RBAC) was introduced in v1.8, which allows administrators to define what users and service accounts can or cannot do across Kubernetes clusters as a whole or within specific Kubernetes namespaces

# not recommended
$ kube-apiserver      \
    <... other flags> \
    --authorization-mode=AlwaysAllow

# recommended
$ kube-apiserver      \
    <... other flags> \
    --authorization-mode=RBAC

❌ By default, the default service account is automatically mounted into the file system of all containers in Kubernetes

🗒️ The default service account is a valid service account token that can be used to query the Kubernetes API as an authenticated workload. With RBAC enabled, this is still a problem, but not as big as a problem as without RBAC enabled. Without RBAC, this token can be used to do virtually anything in the kubernetes cluster

✅ Disable auto-mounting of the default service account token

# recommended
$ kubectl patch serviceaccount default -p "automountServiceAccountToken: false"

Admission Controls

Admission controllers are pieces of code that intercept requests to the Kubernetes API in the 3rd and last step of the authentication/authorization process. The full list of admission control options are documented here:

Admission Controllers

✅ Configure admission control to deny privilege escalation via launching interactive shells on or attaching to privileged containers

🗒️ There are some valid use-cases where you need to run privileged containers to interact with the kernel of the underlying host, but this means users can potentially kubectl attach or kubectl exec into those privileged containers.

# recommended
kube-apiserver       \
    <...other flags> \
    --admission-control=...,DenyEscalatingExec

✅ Configure admission control to enable Pod Security Policies

🗒️ Pod Security Policies are security rules that pods have to abide by in order to be accepted and scheduled on your cluster.

⚠️ Make sure you have PodSecurityPolicy objects (i.e. yaml files) ready to be applied once you turn this on, otherwise no pod will be scheduled. See Pod Security Policy recommendation in this article for examples.

# recommended
$ kube-apiserver      \
    <... other flags> \
    --addmission-control=...,PodSecurityPolicy

# example PodSecurityPolicy
$ kubectl create -f- <<EOF 
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: example
spec:
  privileged: false  # Don't allow privileged pods!
EOF

# let's try to request a privileged pod to be scheduled on our cluster
$ kubectl create -f- <<EOF
apiVersion: v1
kind: Pod
metadata:
  name:      pause
spec:
  containers:
    - name:  pause
      image: k8s.gcr.io/pause
EOF

# throws error as expected
Error from server (Forbidden): error when creating "STDIN": pods "privileged" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]

✅ Configure admission control to always pull images

🗒️ Kubernetes will cache container images that have been pulled from any registry (private or public). Any cached container images can be re-used by any pod on the cluster. Therefore, pods could gain unauthorized access to potentially sensitive information. Forcing Kubernetes to always pull images will require that the credentials needed to pull down images from private registries are provided by each requesting resource

# recommended
$ kube-apiserver      \
    <... other flags> \
    --admission-control=...,AlwaysPullImages

Impersonation

Kubernetes has a feature that allows any user to impersonate any other user on the Kubernetes cluster. This feature is nice for debugging, but can have unintended security implications if not controlled properly.

To demonstrate the implications:

$ kubectl drain test-node
Error from server (Forbidden): User "foo" cannot get nodes at the cluster scope. (get nodes test-node)

$ kubectl drain test-node --as=admin --as-group=system:admins
node "test-node" cordoned
node "test-node" drained

You can read more about it here:

User Impersonation

✅ Limit who can impersonate and what they can do as impersonated users

# not recommended
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: impersonator
rules:
# allows users to impersonate any "users", "groups", and "serviceaccounts"
- apiGroups: [""]
  resources: ["users", "groups", "serviceaccounts"]
  verbs: ["impersonate"]


# recommended
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: limited-impersonator
rules:
# Can impersonate the group "developers"
- apiGroups: [""]
  resources: ["groups"]
  verbs: ["impersonate"]
  resourceNames: ["developers"]

Pod Security Policies

Pod Security Policies define a set of conditions that a pod must abide by in order to be accepted into the system.

✅ Disallow privileged containers

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false

✅ Disallow sharing of the host process ID namespace
⚠️ if hostPID is set to true and a container is granted the SYS_PTRACE capability, it is possible to escalate privileges outside the container

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  hostPID: false

✅ Disallow sharing of the host IPC namespace (i.e. memory)

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  hostIPC: false

✅ Disallow sharing of the host network stack (i.e. access to loopback, localhost, snooping on network traffic on local node)

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  hostNetwork: false

✅ Whitelist allowable volume types

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  # It's recommended to allow the core volume types.
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'

✅ Require containers to run as non-root user

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535

✅ Set the defaultAllowPrivilegeEscalation to false

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  defautlAllowPrivilegeEscalation: false

✅ Apply Security Enhanced Linux (seLinux), seccomp, or apparmor profiles

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    # applying default seccomp and apparmor profiles
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'docker/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default'

Full Pod Security Policy Example

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName:  'docker/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default'
spec:
  privileged: false
  # Required to prevent escalations to root.
  defautlAllowPrivilegeEscalation: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  requiredDropCapabilities:
    - ALL
  # Allow core volume types.
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  readOnlyRootFilesystem: false

Network Policies

Kubernetes allows users to deploy their choice of network add-ons.

✅ Choose a network add-on that allows you to leverage Network Policies, like Calico or Canal, which gives you networking via Flannel and network policies via Calico.

# Canal Example 
$ kube-apiserver                 \
    <... other flags>            \
    --cluster-cidr=10.244.0.0/16 \
    --allocate-node-cidrs=true

# install RBAC
$ kubectl apply -f \
    https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/canal/rbac.yaml

# install Calico
$ kubectl apply -f \
    https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/canal/canal.yaml

Additional Security Measures

All recommendations up to this point are just to get you up and running.
In this section, I am going to cover some additional security measures!

✅ Kubernetes Secrets are useful for some limited use-cases. I wouldn't rely on it as a secrets management solution. Instead, consider Hashicorp Vault for that.

✅ Continuously scan for security vulnerabilities in your containers with Anchore or Clair.

✅ Keep your infrastructure up-to-date on security patches, or run Kubernetes on OSes that keep themselves up-to-date (e.g. CoreOS)

✅ Only deploy authorized container images that you've analyzed, scanned, and signed (i.e. Software Supply Chain Security). Grafeas, TUF, and Notary can help here.

✅ Limit direct access to the Kubernetes nodes.

✅ Avoid noisy neighbor problems. Define resource quotas.

✅ Monitor and log everything with Prometheus and Grafana. Sysdig Falco will detect and alert on anomalous container behavior, like shell execution in a container, container privilege escalation, spawning of unexpected child processes, mounting sensitive paths, system binaries making network connections ...etc

Additional References

Conclusion

Containers and Orchestrators are not inherently more or less secure than traditional virtualization technologies. If anything, I personally think containers and orchestrators have the potential to revolutionize the security industry and truly enable Security at the speed of DevOps.

If you feel like I missed something, got some details wrong, or just want to say hi, please feel free to leave a comment below or reach out to me on GitHub 🐙, Twitter 🐦, or LinkedIn 🔗

Latest comments (6)

Techyrajput • Jun 29 '20

It is a great article😊😊😊

Daniel Albuschat • May 23 '19

Hey Peter, this is a great article (I have not yet completely read it, though).
Do you have a practical advice for authentication that does not have strong prerequisistes like an OpenID provider?

Peter Benjamin (they/them) • May 24 '19

Hi Daniel 👋 -

Thank you for the feedback.

With regards to authentication strategies, if integrating with OpenID providers is not an option, then I would say the second most favorable approach is certificate-based authentication.

Although, you can generate long-lived certificates, that is considered bad practice, because a compromised certificate would leave your clusters potentially exposed or compromised for the duration/lifetime of the certificate.

The better practice would be to limit the lifetime of these client certificates to as short of a period as you can feasibly make them. But, now, the operational challenge will be to make the certificate renewal process easy/self-serviceable/automated for your Kubernetes users.

Hope this answers your question.

Daniel Albuschat • May 24 '19

Peter, thanks for your reply!
I recently thought that using service-account with tokens for auth are a good idea, until I learned that the tokens are ephemeral. I considered this, because I learned the hard way that exposing a cert in the "default" setup opens up your cluster for good, and renewing the root CA is a laaarge PITA. However, if the advice is to use short-lifetime certificate, I wonder how that is better than using service-account tokens, with regular renewal?

Peter Benjamin (they/them) • May 24 '19 • Edited

If we're referring to users authenticating to Kubernetes clusters, then service accounts are not very well suited for that use-case. Sure, you can make them work. The official docs demonstrate how you can generate service accounts for "long running jobs outside the cluster" 1. Having said that, service accounts are more suited and intended for authenticating workloads running on Kubernetes.

As for how client-side certificates compare to service accounts, service account tokens have some limitations that are not present with client-side certificates:

Service accounts are namespaced. That is to say, they're limited to the namespace the service account was created in.
Service account tokens are stored as Kubernetes Secrets. Any user who can query Kubernetes Secrets can authenticate as that service account.
The Kubernetes docs distinguishes between User accounts and Service accounts 2.

Kilmore • Mar 2 '19

Good article. I had an issue with the node selector for the master in your example. Below is what worked for me.

Thanks!

kubectl run \
--rm \
-it \
kube-bench-master \
--image=aquasec/kube-bench:latest \
--restart=Never \
--overrides="{ \"apiVersion\": \"v1\", \"spec\": { \"hostPID\": true, \"nodeSelector\": { \"node-role.kubernetes.io/master\": \"\" }, \"tolerations\": [ { \"key\": \"node-role.kubernetes.io/master\", \"operator\": \"Exists\", \"effect\": \"NoSchedule\" } ] } }" \
-- master \
--version 1.8

DEV Community