Kubernetes has transformed the way organisations deploy applications, and data is an integral part of those applications that can't be left behind.
In a previous article, we discussed setting up a Fully Private EKS on Fargate Cluster that meets the security requirements of certain regulated industries. This post is a follow-up in which we add persistent storage and host StatefulSets on that fully private Fargate cluster, while continuing to adhere to compliance requirements.
Originally published on my blog: https://k8s-dev.github.io
EFS support for EKS on Fargate is a recent feature; see the release blog post.
StatefulSet is a Kubernetes construct that helps manage stateful applications while maintaining guarantees around the ordering of deployment and the uniqueness of the pods it creates. Its sticky pod identity is particularly useful when hosting databases on Kubernetes, such as MySQL with read replicas.
A StatefulSet is very similar to a Deployment, but the main difference is that a StatefulSet maintains a sticky identity for its pods: they are created in order and persist across rescheduling cycles. Some features of StatefulSets are:
- Stable (persistent across pod rescheduling), unique network identifiers
- Stable (persistent across pod rescheduling) storage
- Ordered, graceful deployment and scaling
- Ordered, automated rolling updates
Elastic File System (EFS) is a scalable, fully managed shared file system on AWS that integrates with EKS on Fargate to provide persistent storage. EFS is highly elastic: it automatically grows and shrinks on demand, while encrypting data at rest and in transit. In this solution we will use an EFS VPC endpoint, which is ideal for security-sensitive workloads running on AWS Fargate.
- A Fully Private EKS on Fargate Cluster, set up by following the previous article
Kubernetes supports persistent storage via the Container Storage Interface (CSI) standard. Application pods running on Fargate use the EFS CSI driver to access the EFS file system through standard Kubernetes APIs. EFS supports encryption of data at rest, and when using the EFS CSI driver all data in transit is encrypted by default, which is a compliance requirement in most regulated industries.
Whenever a pod running on Fargate is terminated and relaunched, the CSI driver reconnects the EFS file system, even when the pod is relaunched in a different AWS Availability Zone. This makes EFS an appealing option for persistent storage. The EFS CSI driver comes pre-installed with the Fargate stack, so support for EFS is provided out of the box, and updates are managed transparently by AWS.
The EFS integration with EKS on Fargate leverages three Kubernetes constructs: StorageClass, Persistent Volume (PV) and Persistent Volume Claim (PVC). This allows a segregation of duties while operating the cluster: a storage admin (or cluster admin) configures the StorageClass and EFS and creates PVs from them, and the developer team then uses these available PVs to create PVCs as and when required to deploy applications.
While a Kubernetes StorageClass allows volumes to be created both dynamically and statically, in this experiment we will use the EFS CSI driver to create the volume statically, which at the time of writing is the only provisioning mode the EFS CSI driver supports.
Before proceeding, ensure you have an EKS on Fargate cluster up and running.
The EFS VPC endpoint provides secure connectivity to the Amazon EFS API without requiring an internet gateway, NAT instance, or virtual private network (VPN) connection. Follow the guide at https://docs.aws.amazon.com/efs/latest/ug/efs-vpc-endpoints.html to set up a VPC endpoint for EFS in the same region as the EKS on Fargate cluster.
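As a sketch, the endpoint can also be created from the CLI; all IDs below are placeholders for your own VPC, subnets, and security group:

```shell
# Create an interface VPC endpoint for the EFS API in the cluster's VPC.
# vpc-id, subnet-ids and security-group-ids are placeholders; substitute
# the values from your environment.
aws ec2 create-vpc-endpoint \
  --vpc-endpoint-type Interface \
  --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.us-east-1.elasticfilesystem \
  --subnet-ids subnet-01b3ae56696b33747 subnet-0e639397d1f12500a \
  --security-group-ids sg-0123456789abcdef0 \
  --private-dns-enabled
```

With private DNS enabled, the standard EFS API hostname resolves to the endpoint's private IPs inside the VPC, so no code changes are needed.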
First, we need to create an EFS file system in the same AWS Region as the EKS on Fargate cluster. Follow the EFS getting started guide, and configure EFS in the same private subnets used while creating the EKS on Fargate cluster. Enable EFS encryption for this experiment.
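For reference, the same can be done from the CLI. This is a sketch with placeholder IDs; the file system ID returned by the first command is what you would pass to the second:

```shell
# Create an encrypted EFS file system in the cluster's region.
aws efs create-file-system \
  --encrypted \
  --region us-east-1 \
  --tags Key=Name,Value=fargate-statefulset-efs

# Create one mount target per private subnet used by the cluster.
# File system, subnet and security-group IDs below are placeholders.
aws efs create-mount-target \
  --file-system-id fs-0123456789abcdef0 \
  --subnet-id subnet-01b3ae56696b33747 \
  --security-groups sg-0123456789abcdef0
```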
Because EFS is mounted over NFS, we need to add a rule allowing NFS traffic (TCP port 2049) to the EKS on Fargate security group.
Also make sure to add the EKS on Fargate security group to the EFS mount-target configuration.
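The NFS rule from the two steps above can be added with a single CLI call; both security-group IDs here are placeholders for your environment:

```shell
# Allow inbound NFS (TCP 2049) from the EKS cluster security group
# to the security group attached to the EFS mount targets.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0efs0000000000000 \
  --protocol tcp \
  --port 2049 \
  --source-group sg-0eks0000000000000
```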
Finally, note down the file system ID after successful creation; we will need it later when creating the persistent volume.
Fargate allows hands-free management of Kubernetes clusters. To direct EKS to schedule pods on Fargate, we need to create a Fargate profile, which is a combination of a namespace and labels. In this experiment we create a profile with the namespace 'efs-statefulset'; all of the objects, including the pods, service, persistent volume and persistent volume claim, will be created in this namespace.
As mentioned in the previous article, make sure to run these commands from the bastion host created in the public subnet of the EKS on Fargate VPC.
```shell
aws eks create-fargate-profile \
  --fargate-profile-name fargate-statefulset \
  --region us-east-1 \
  --cluster-name private-fargate-cluster \
  --pod-execution-role-arn arn:aws:iam::1234567890:role/private-fargate-pod-execution-role \
  --subnets subnet-01b3ae56696b33747 subnet-0e639397d1f12500a subnet-039f4170f8a820afc \
  --selectors namespace=efs-statefulset
```
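Profile creation takes a minute or two. Before deploying workloads, you can confirm the profile is ACTIVE with a query like the following (names match the create command above):

```shell
aws eks describe-fargate-profile \
  --cluster-name private-fargate-cluster \
  --fargate-profile-name fargate-statefulset \
  --region us-east-1 \
  --query 'fargateProfile.status' \
  --output text
```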
A StorageClass is the equivalent of a storage profile in Kubernetes; it provides a way for admins to specify the type of storage, quality of service, backup policy, or any other arbitrary policy.
A Persistent Volume is a Kubernetes resource that allows data to be stored persistently across the life-cycle of a pod. A PV can be created dynamically or statically using a StorageClass. A PVC is a request for storage by a pod; through the PVC object, Kubernetes abstracts away the implementation details of the PV.
Let's use the following .yaml to roll out the StorageClass, PV and PVC, along with a sample application based on the amazon-linux:2 image. All the application does is keep appending the date and time to a file hosted on the persistent volume in EFS.
Ensure you have pushed the amazon-linux:2 container image to the ECR repo prior to deploying the StatefulSet. Steps to create, tag and push images to ECR are covered in the previous post.
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
  namespace: efs-statefulset
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <EFS filesystem ID>
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
  namespace: efs-statefulset
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: efs-app-sts
  namespace: efs-statefulset
spec:
  selector:
    matchLabels:
      app: test-efs
  serviceName: efs-app
  replicas: 3
  template:
    metadata:
      namespace: efs-statefulset
      labels:
        app: test-efs
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: linux
          image: 01234567890.dkr.ecr.us-east-1.amazonaws.com/amazon-linux2
          command: ["/bin/sh"]
          args: ["-c", "while true; do echo $(date -u) >> /efs-data/out.txt; sleep 5; done"]
          volumeMounts:
            - name: efs-storage
              mountPath: /efs-data
      volumes:
        - name: efs-storage
          persistentVolumeClaim:
            claimName: efs-claim
```
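Note that the PV and PVC above reference `storageClassName: efs-sc` without defining the StorageClass itself. If it does not already exist in your cluster, a minimal definition for the EFS CSI driver (assuming the standard `efs.csi.aws.com` provisioner name) would look like:

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
```

Since the volume is provisioned statically, the StorageClass here mainly serves to bind the PVC to the matching PV.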
Deploy the yaml using the command:
```shell
kubectl apply -f <file.yaml>
```
Once deployed, we can see the status of the PV and PVC by using the following commands:
```shell
$ kubectl get pv -n efs-statefulset
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
efs-pv   5Gi        RWX            Retain           Bound    efs-statefulset/efs-claim   efs-sc                  17s

$ kubectl get pvc -n efs-statefulset
NAME        STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
efs-claim   Bound    efs-pv   5Gi        RWX            efs-sc         48s
```
This shows that the PV is allocated and the PVC is in the Bound state. The StatefulSet has also been rolled out, and its pods are in the Running state.
```shell
$ kubectl get statefulsets -n efs-statefulset -o wide
NAME          READY   AGE   CONTAINERS   IMAGES
efs-app-sts   3/3     27m   linux        1234567890.dkr.ecr.us-east-1.amazonaws.com/amazon-linux2
```
To appreciate the way a StatefulSet maintains sticky identity, we will perform a few experiments.
Scale exercise: let's scale the current StatefulSet down to 2 replicas. This deletes one running pod, and we will see that efs-app-sts-2 is terminated first, as it was the last one to be deployed.
```shell
$ kubectl scale sts efs-app-sts --replicas=2 -n efs-statefulset
statefulset.apps/efs-app-sts scaled

$ kubectl get pods -n efs-statefulset -w
NAME            READY   STATUS        RESTARTS   AGE
efs-app-sts-0   1/1     Running       0          29m
efs-app-sts-1   1/1     Running       0          28m
efs-app-sts-2   0/1     Terminating   0          5s
```
To see ordered pod re-creation, let's do another experiment. In one terminal execute the pod delete command, while in the other terminal put a watch on the pods. Observe in the watch output that once the pods are gone, the StatefulSet controller recreates them strictly in order: efs-app-sts-0 is recreated and Running before efs-app-sts-1 is even scheduled.
```shell
$ kubectl delete pods --selector app=test-efs -n efs-statefulset
pod "efs-app-sts-0" deleted
pod "efs-app-sts-1" deleted
```
In second terminal, execute the following command:
```shell
$ kubectl get pods -n efs-statefulset -w
NAME            READY   STATUS              RESTARTS   AGE
efs-app-sts-0   1/1     Running             0          33m
efs-app-sts-1   1/1     Running             0          32m
efs-app-sts-0   1/1     Terminating         0          34m
efs-app-sts-1   1/1     Terminating         0          33m
efs-app-sts-0   0/1     Terminating         0          34m
efs-app-sts-1   0/1     Terminating         0          33m
efs-app-sts-1   0/1     Terminating         0          33m
efs-app-sts-1   0/1     Terminating         0          33m
efs-app-sts-0   0/1     Terminating         0          34m
efs-app-sts-0   0/1     Terminating         0          34m
efs-app-sts-0   0/1     Pending             0          0s
efs-app-sts-0   0/1     Pending             0          1s
efs-app-sts-0   0/1     Pending             0          63s
efs-app-sts-0   0/1     ContainerCreating   0          63s
efs-app-sts-0   1/1     Running             0          73s
efs-app-sts-1   0/1     Pending             0          1s
efs-app-sts-1   0/1     Pending             0          2s
efs-app-sts-1   0/1     Pending             0          57s
efs-app-sts-1   0/1     ContainerCreating   0          57s
efs-app-sts-1   1/1     Running             0          68s
```
Et voilà! After the pods were deleted, the StatefulSet controller recreated them in the same order, and each pod reattached to its EFS volume.
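To confirm that the data written before the deletion survived, read the shared file from one of the recreated pods (the file path matches the StatefulSet manifest shown earlier):

```shell
# Timestamps written by the old pods should still be present in the file
# on the EFS volume, even though the pods were deleted and recreated.
kubectl exec efs-app-sts-0 -n efs-statefulset -- tail -n 5 /efs-data/out.txt
```

Because the volume is backed by EFS rather than local storage, the same check succeeds even if the pod lands in a different Availability Zone.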
For more information about StatefulSet, see the Kubernetes Documentation.
- In this article we created a private EFS endpoint and used it to host persistent data with a StatefulSet in Kubernetes.
- Given the private deployment and encryption support, this setup solves some of the compliance challenges faced by BFSI and other regulated sectors.
Please let me know if you had challenges replicating this in your own AWS environment. Happy Learning!