Poonam Pawar for Kubernetes Community Days Chennai

Posted on May 18, 2023

Kubernetes: Storage & Security

#kubernetes #security #devops #kcdchennai

Introduction✍️

It's time to share Secrets!🤫

Obviously not mine.😜

In this blog, let's talk about storage, storage classes, security policies, network policies, security layers, and everything else that belongs in the store room and the security room of Kubernetes. These topics play a crucial role in building k8s clusters. Let's see how!

Storage🛢️

Volumes🎚️

In Kubernetes, pods are not permanent. A pod is destroyed once its work is done, and a new one may come up in its place. Just like with Docker containers, the data processed inside a pod is deleted when the pod is destroyed. This causes data loss, which is a big problem.
To resolve this problem, volumes come in.
We attach a volume to the pod. Now, whenever the pod writes data, that data is stored in the volume. If the pod is deleted, the data is still alive.
The data generated by the pod now lives in the volume!

Let's create a simple pod definition file with a volume, for a single-node k8s cluster:

apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator
spec:
  containers:
  - image: alpine
    name: alpine
    command: ["/bin/sh","-c"]
    args: ["shuf -i 0-100 -n 1 >> /opt/number.out;"]
    volumeMounts:
    - mountPath: /opt
      name: data-volume
  volumes:
  - name: data-volume
    hostPath:
      path: /data
      type: Directory

In the above yaml file, we generate a random number in a given range and append it to /opt/number.out. volumeMounts mounts the volume inside the container at /opt, so whatever the container writes there lands in the volume. hostPath gives the directory on the node where the data is actually stored. With type: Directory, the /data directory must already exist on the node.

This is how, even after the pod is deleted, we still have our data in the /data dir on the node.

Note: This is only for single-node k8s clusters.

When you work on a large-scale project, you will not have a single-node cluster. With hundreds of pods running at a time, giving /data as the volume path to every pod is not recommended in a multi-node cluster.
This is because the pods would use the /data directory on whichever node they run on, and those directories are separate and do not share data.
So we need proper storage solutions. Kubernetes supports different standard storage solutions like NFS, GlusterFS and many more.

You will get a basic idea of how to define them in the further topics.
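As a rough sketch of what swapping hostPath for a shared storage solution can look like, here is the same pod with an NFS volume. The server name and export path are made-up placeholders, so replace them with whatever your own NFS server exposes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator
spec:
  containers:
  - image: alpine
    name: alpine
    command: ["/bin/sh","-c"]
    args: ["shuf -i 0-100 -n 1 >> /opt/number.out;"]
    volumeMounts:
    - mountPath: /opt
      name: data-volume
  volumes:
  - name: data-volume
    nfs:
      server: nfs.example.internal   # hypothetical NFS server address
      path: /exports/data            # hypothetical exported directory
```

Because the data lives on the NFS server and not on a node's local disk, the pod sees the same files no matter which node it is scheduled on.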

Persistent Volumes📻

When we create volumes this way, we have to configure them in every pod definition file, so all the storage configuration has to be repeated within each file.
Now imagine we are running hundreds of pods: every time users want to deploy a pod, they would have to configure storage for it in their environment.
Doing this every time is not a good practice.
Here, Persistent Volumes come in. A persistent volume is part of a cluster-wide pool of storage volumes, configured by an administrator, to be used by the users deploying applications on the cluster.
Users can now request storage from this pool using persistent volume claims.

Let's create a definition yaml file named pv-definition.yaml for this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv1
spec:
  accessModes:
    - ReadWriteOnce
  capacity: 
    storage: 1Gi
  awsElasticBlockStore:
    volumeID: <volume-id>
    fsType: ext4

There can be different kinds of accessMode like,

ReadOnlyMany
ReadWriteOnce
ReadWriteMany

awsElasticBlockStore is one of the supported storage solutions we talked about above for multi-node clusters. It identifies the storage by a specific volume ID and filesystem type instead of just a /data dir path on a node.

Now run,

kubectl create -f pv-definition.yaml

To check the created persistent volumes, you know what command to use now:

kubectl get persistentvolume

Persistent Volume Claims📑

After creating the persistent volume, it's time to create a persistent volume claim to make the storage available to a pod.
Persistent volumes and persistent volume claims are two separate objects in k8s.
An administrator creates a persistent volume and a user creates a persistent volume claim to use the storage.
Kubernetes binds a persistent volume to a claim based on the claim's requests and the properties set on the volume.
Kubernetes checks that the volume has sufficient capacity while binding it to the claim.
If there are multiple possible matches, you can use labels & selectors to bind the claim to a specific persistent volume.

Let's create a claim definition file named pvc-definition.yaml now,

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources: 
    requests:
      storage: 500Mi

When this claim is created, Kubernetes looks for a persistent volume that satisfies it, such as the pv1 we created above.

kubectl create -f pvc-definition.yaml

The accessModes match, and then Kubernetes checks the storage capacity. The claim asks for 500Mi and the volume offers 1Gi, so pv1 is a match, since no other volume is available.

So the claim is bound to the volume.

To check the claim, use the get command:

kubectl get persistentvolumeclaim
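Once the claim is bound, a pod can use it by name. Here is a minimal sketch that reuses the random number pod from the Volumes section and swaps its hostPath volume for the claim; only the volumes section changes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: random-number-generator
spec:
  containers:
  - image: alpine
    name: alpine
    command: ["/bin/sh","-c"]
    args: ["shuf -i 0-100 -n 1 >> /opt/number.out;"]
    volumeMounts:
    - mountPath: /opt
      name: data-volume
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: myclaim   # name of the PersistentVolumeClaim created above
```

The pod does not need to know which persistent volume backs the claim; Kubernetes resolves that through the binding.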

Storage Classes🗑️

We have created persistent volumes and persistent volume claims, but before creating a volume on a cloud provider such as Google Cloud, you must first have created the disk.
You have to provision the disk manually whenever your application needs storage on Google Cloud, and then manually create a persistent volume file using the same name you gave the disk when creating it.
This whole process is called static provisioning of volumes.

To automate this process fully, we have Storage Class.

You just have to define a provisioner, like Google storage, and then everything is handled by the provisioner. It automatically provisions the storage and attaches it to the pod when a claim is made.
This is called dynamic provisioning of volumes.
Now you don't have to create the persistent volume manually.
There are many storage provisioners, such as AWS EBS, AzureFile, AzureDisk, CephFS and many more.

Take a look at a storage class definition file as sc-definition.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: google-storage
provisioner: kubernetes.io/gce-pd

After defining this file, set the storage class name in the pvc file to the same name you defined in the sc file, google-storage in this case.
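For illustration, this is roughly how the earlier myclaim definition would look once it points at the storage class. The only new field is storageClassName.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: google-storage   # must match metadata.name of the StorageClass
  resources:
    requests:
      storage: 500Mi
```

With this in place, the provisioner creates the disk and the persistent volume for you when the claim is created.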

You can create different kinds of classes in storage using different types of disks like

Silver🔘 Storage Class with the standard disk,
Gold🟠 Storage Class with SSD drives and
Platinum⚪ Storage Class with SSD drives and replication.
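A sketch of how these three tiers could be written for the gce-pd provisioner follows. The class names silver, gold and platinum are just the labels used above, and the parameter values assume Google Compute Engine persistent disks.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: silver
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard           # standard disk
  replication-type: none
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gold
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd                # SSD drive
  replication-type: none
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: platinum
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd                # SSD drive
  replication-type: regional-pd   # replicated disk
```

A claim then picks its tier simply by setting storageClassName to silver, gold or platinum.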

StatefulSets🏗️

A StatefulSet creates pods based on a template, just like a Deployment.
It can scale up and down as required, and it can perform rolling updates and rollbacks.
Say you want to deploy the master pod first, then have worker-1 come up completely and start running, and only after that have worker-2 come up and join. A StatefulSet can help you achieve this.
When a pod goes down and a new pod comes up in its place, it gets the same name as the pod it replaces.
It maintains a stable identity for each of its pods, which helps to maintain the order in which your pods are deployed.
If you want to use StatefulSets only for the stable pod identities and not for the sequential deployment order, you can turn the ordering off with a small change in the YAML file: setting podManagementPolicy to Parallel.
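To make that last point concrete, here is a minimal sketch of a StatefulSet with ordering switched off. The mysql names are only illustrative; the one field that matters is podManagementPolicy.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  podManagementPolicy: Parallel   # default is OrderedReady (one pod at a time, in order)
  serviceName: mysql-h
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql
```

The pods still get stable names such as mysql-0, mysql-1 and mysql-2, but they are started and terminated in parallel.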

Note: If you have already learned this topic in my previous blog then feel free to skip this topic😇. Otherwise, continue learning!✊

You don't have to use a StatefulSet, though; it depends entirely on what your application needs. If you have servers that must start in a particular order, or you need stable names for your pods, then it is the right choice.

You can create a StatefulSet yaml file just like a deployment file with a few changes, the main one being kind: StatefulSet. Take a look:

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql
  replicas: 3
  selector:
      matchLabels:
          app: mysql
  serviceName: mysql-h

Then run:

kubectl create -f statefulset-definition.yml

To scale up or down, use the scale command with the number of replicas you want:

kubectl scale statefulset mysql --replicas=5

This is how you can work with StatefulSets.
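One thing the StatefulSet above relies on is the headless Service named in serviceName, which gives each pod its stable DNS name. A minimal sketch of such a Service could look like this, assuming the default MySQL port 3306:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql-h          # must match serviceName in the StatefulSet
spec:
  clusterIP: None        # "None" is what makes the Service headless
  selector:
    app: mysql
  ports:
  - port: 3306           # default MySQL port (assumption for this example)
```

With it, each pod becomes reachable under a name like mysql-0.mysql-h inside its namespace.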

Security🕵️

RBAC (Role-Based Access Control)🛂

As the name suggests, RBAC is used to define roles with a set of permissions and bind them to the users or groups who are granted those permissions.

Simply put, it grants permissions for who can do what.

Let's quickly see how to create a role-definition file:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: [""] # "" indicates the core API group
  resources: ["configmaps"]
  verbs: ["create"]

In the above file, we create a developer role that can get, list, create and delete pods, and can create ConfigMaps.

Now run the create command to create the role:

kubectl create -f developer-role.yaml

Now, link the user to that role. For this, create another object called a RoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: devuser-developer-binding
subjects:
- kind: User
  name: dev-user
  apiGroup: rbac.authorization.k8s.io
roleRef: 
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io

Now run the create command to bind this:

kubectl create -f devuser-developer-binding.yaml

To view the created roles, run the get command:

kubectl get roles

To see the bindings you have created run:

kubectl get rolebindings

To view the role in detail, run the describe command:

kubectl describe role developer

To check whether you are allowed to perform a particular action, run:

kubectl auth can-i create deployments

This way, you can check any permission you want to know about, such as delete nodes.

Pod Security Policies (PSPs)📦

A Pod Security Policy is a cluster-level resource that controls security-sensitive parts of the pod specification, such as whether a pod may run privileged containers or use the host network.
Yes! You guessed it right. It ties in with RBAC: a user or service account must be granted permission to use a policy.
It defines a set of conditions that a pod must meet in order to be accepted into the cluster.
When a pod is created, Kubernetes checks the pod spec against the allowed policies and rejects the pod if it does not comply, so it never gets from creating to running state.

Note: Removed feature
PodSecurityPolicy was deprecated in Kubernetes v1.21, and removed from Kubernetes in v1.25.

Instead of using PodSecurityPolicy, you can enforce similar restrictions on Pods using either or both:

Pod Security Admission
a 3rd party admission plugin, that you deploy and configure yourself

The Pod Security Standards define three different policies (Privileged, Baseline and Restricted) to broadly cover the security spectrum.

These policies are cumulative and range from highly-permissive to highly-restrictive.
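With Pod Security Admission, these standards are switched on per namespace using labels. Here is a small sketch, assuming a hypothetical namespace called demo-apps: it enforces the Baseline policy and only warns and audits against Restricted.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo-apps   # hypothetical namespace name
  labels:
    pod-security.kubernetes.io/enforce: baseline    # reject pods that violate Baseline
    pod-security.kubernetes.io/warn: restricted     # warn the user about Restricted violations
    pod-security.kubernetes.io/audit: restricted    # record Restricted violations in the audit log
```

Starting with warn and audit before moving enforce to a stricter level is a gentle way to find out which workloads would break.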

Learn more in the Kubernetes documentation on Pod Security Standards!

Secrets🧑‍💻

Finally, it's time to decrypt the secrets in Kubernetes.

Never tell your secrets to anyone. Ok!

But what is meant by secrets in k8s?

Let's say you have a simple web application that connects to a database and displays a success message on the screen once it is connected.
At first, you hard-code the user name and password into the source code. Then you realize that it's not a good idea to provide credentials like this.
Next, you create a ConfigMap object and put those values inside its yaml file. But a ConfigMap stores data in plain text, and that's again not a good idea.

So here, Secrets come in. A Secret is used to store sensitive information that you don't want to share with anybody.

It is similar to a ConfigMap, but it stores the information in base64-encoded format instead of plain text. Keep in mind that base64 is an encoding, not encryption: anyone who can read the Secret can decode it.

So first encode your data. To do so, run:

$ echo -n 'mysql' | base64
bXlzcWw=

To decode the text, do the reverse:

$ echo -n 'bXlzcWw=' | base64 --decode
mysql

To check secrets, run:

kubectl get secrets

To get detailed information on secrets, run:

kubectl describe secrets

Now, create a secret-definition.yaml file:

apiVersion: v1
kind: Secret
metadata:
  name: myapp-secret
data:
  DB_Host: bXlzcWw=
  DB_User: cm9vdA==
  DB_Password: cGFzd3Jk
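If you would rather not encode the values by hand, a Secret also accepts a stringData section, and Kubernetes does the base64 encoding for you when the object is stored. A sketch of the same secret written that way, using the decoded forms of the three values above:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secret
type: Opaque
stringData:            # plain-text input, encoded into data by Kubernetes
  DB_Host: mysql
  DB_User: root
  DB_Password: paswrd
```

Either form produces the same secret in the cluster, so pick one; a file with plain-text values needs the same care as any other file containing credentials.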

Now configure this secret with a pod-definition.yaml file:

apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
  labels:
    name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    ports:
      - containerPort: 8080
    envFrom:
      - secretRef: 
          name: myapp-secret

Now run:

kubectl create -f pod-definition.yaml

This is how you can create your own secret and inject it into the pod by adding the envFrom section to the pod-definition file.

Every key in the secret becomes an environment variable in the container, so you can list as many variables in the secret as you need. The name under secretRef must match the name you set in the secret-definition file. In this case, it is myapp-secret.
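If the container only needs one value from the secret, you can pick a single key with secretKeyRef and skip envFrom. A minimal sketch using the same pod and secret names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: simple-webapp-color
spec:
  containers:
  - name: simple-webapp-color
    image: simple-webapp-color
    env:
    - name: DB_Password          # environment variable name inside the container
      valueFrom:
        secretKeyRef:
          name: myapp-secret     # the Secret object
          key: DB_Password       # the key inside that Secret
```

This keeps the other keys of the secret out of the container's environment.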

Network Policies📋

In Kubernetes, network policies set the rules for which network traffic is allowed to and from pods.
They define how the pods in a cluster may communicate with each other.
They are a powerful tool for securing network traffic in k8s clusters.
They can allow traffic from one specific pod to another.
They can restrict traffic to a specific set of ports and protocols.
They are enforced by the cluster's network plugin, so the plugin must support network policies.
They can be applied to all pods in a namespace or to individual pods selected by labels.

A simple network policy yaml file:


    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: demo-network-policy
    spec:
      podSelector:
        matchLabels:
          app: my-app
      policyTypes:
      - Ingress
      ingress:
      - from:
        - podSelector:
            matchLabels:
              app: allowed-app
        ports:
        - protocol: TCP
          port: 80

For the deployment of the Network Policy YAML file, use the kubectl apply command:

kubectl apply -f demo-network-policy.yaml

This is how you can define your own network policies.
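A common companion to a policy like the one above is a default-deny rule for the namespace. The sketch below selects every pod in the namespace and lists no ingress rules, so all incoming traffic is blocked unless another policy allows it. The policy name is just an example.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # example name
spec:
  podSelector: {}              # empty selector = all pods in the namespace
  policyTypes:
  - Ingress                    # no ingress rules listed, so nothing is allowed in
```

Network policies are additive, so with both in place the pods labelled app: my-app still accept TCP traffic on port 80 from allowed-app, and everything else is denied.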

TLS (Transport Layer Security)🌐

Every communication over the internet must be secure. Otherwise, there is a high risk of attackers intercepting the data we transfer.
Likewise, in a Kubernetes cluster there is a set of master nodes and worker nodes that communicate with each other constantly.
The communication between all the nodes, components and the API server must be secured and encrypted.
When administrators communicate with the master node via kubectl or through the APIs directly, everything must be fully secure.
The communication between the servers and the clients must be secure and encrypted.
To fulfill all these requirements we need security certificates. We all know about TLS certificates and how they work.
TLS relies on asymmetric cryptography, where each end has its own key pair: a public key that acts as the lock and a private key that opens it.

Now let's understand how TLS can be defined in Kubernetes.

In Kubernetes there are two sides: the server side and the client side. Both sides must have certificates signed by a CA (Certificate Authority) to verify their identity. Let's look at both of them.

Server Certificates of the Servers🔏

KUBE-API Server - It exposes the HTTP API that other components and external users use to manage the cluster, so it is an important component that must be well secured.
It has a certificate and key pair named apiserver.cert and apiserver.key
ETCD Server - It stores all the information about the cluster, so generate a certificate and key pair for it as well. The naming conventions are etcdserver.cert and etcdserver.key
KUBELET Server - It runs on the worker nodes and also exposes HTTP API endpoints that other components interact with. Its certificate and key are kubelet.cert and kubelet.key

Client Certificate of the Clients🔐

The Administrator - Admins are clients who access the cluster's services. They also require a certificate and key pair for access, named admin.cert and admin.key
The Scheduler - It is another client, which talks to the kube-api server to schedule pods as required. So it also needs to be verified to talk to the server. Its files are named scheduler.cert and scheduler.key
KUBE-CONTROLLER MANAGER - It also communicates with the kube-api server, and it needs the same kind of certificate for authentication: controller-manager.cert and controller-manager.key
KUBE-PROXY - Another client-side component, with its own certificate and key: kube-proxy.cert and kube-proxy.key

Now it's time to generate a certificate for our cluster. There are many tools to do so, like EasyRSA, OpenSSL, CFSSL and many more. We will be using OpenSSL here.

Create a private key using the openssl command:

$ openssl genrsa -out apiserver.key 2048

It will generate a key called apiserver.key

Now use the req command to generate a certificate signing request (CSR) for that key:

$ openssl req -new -key apiserver.key -out apiserver.csr -subj "/CN=kube-apiserver"

Then sign the request with the cluster CA to produce the certificate:

$ openssl x509 -req -in apiserver.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -out apiserver.crt -days 365

Now, create a secret to store the private key and the certificate:

$ kubectl create secret tls apiserver-certs --key=apiserver.key --cert=apiserver.crt -n kube-system
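For reference, the command above produces a Secret of type kubernetes.io/tls with two fixed keys, tls.crt and tls.key. Written out as a manifest it looks roughly like this; the angle-bracket values are placeholders, not real data.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: apiserver-certs
  namespace: kube-system
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded contents of apiserver.crt>   # placeholder
  tls.key: <base64-encoded contents of apiserver.key>   # placeholder
```

When this secret is mounted as a volume, the files appear under the mount path as tls.crt and tls.key.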

Now edit the API server manifest to use the TLS certificates:

  $ vi /etc/kubernetes/manifests/kube-apiserver.yaml

  spec:
    containers:
    - name: kube-apiserver
      volumeMounts:
      - mountPath: /etc/kubernetes/pki/apiserver
        name: apiserver-certs
        readOnly: true
    volumes:
    - name: apiserver-certs
      secret:
        secretName: apiserver-certs
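One caveat worth checking on your own cluster: on kubeadm-based setups the API server runs as a static pod, and static pods cannot reference other API objects such as Secrets. There, the certificate and key are read from the host and passed in through flags. A hedged sketch of the relevant part of the manifest, assuming the files were copied to the default kubeadm location /etc/kubernetes/pki:

```yaml
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # paths below assume the default kubeadm pki directory on the host
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
```

Only the two TLS flags are shown here; a real manifest carries many more flags, which should be left as they are.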

Restart the kubelet so that the k8s API server picks up the new config:

$ systemctl restart kubelet

This is how you can generate a certificate and configure TLS for the Kubernetes API server.

Thank you!🖤
