Hello again! In this part of the series, we'll finally get our hands dirty using Skaffold to build, push and deploy applications on Kubernetes. We'll deploy a Postgres database on our local Minikube cluster. Along the way, we'll learn Kubernetes concepts such as ConfigMaps, Secrets, Persistent Volumes and Persistent Volume Claims, StatefulSets, and Services.
At this point in the series, it's worth recalling why we're using Skaffold in the first place. The main reason is that we want to get rid of the Docker Compose setup commonly used to spin up a local development environment. At the end of the day, we want to deploy our app to Kubernetes using Kubernetes manifests instead of Docker Compose, and we want to use the same Kubernetes manifests both for local development and for deploying to production. This is where Skaffold comes in. It takes care of, for example, hot-reloading Docker images during local development: whenever your code changes, Skaffold re-builds the images and re-deploys them to your local Kubernetes cluster. This is how we can build truly Kubernetes-native applications. Of course, Skaffold can do much more than hot reloading for local development.
In Part 1, we installed all dependencies required for this tutorial. Most notably, we'll need a Skaffold installation and a Kubernetes cluster. I assume you're using Minikube, but you could also use other local clusters such as Docker Desktop or even a remote cluster.
You can find all code accompanying this series in this GitHub repository.
Configuring Skaffold
First, we'll need to configure Skaffold by creating a skaffold.yaml file in our repository. I suggest you take a quick glance at Skaffold's documentation to get an overview of its configuration.
Like any resource definition in Kubernetes, skaffold.yaml has apiVersion, kind, and metadata fields. We therefore add the following to skaffold.yaml:
# skaffold.yaml
apiVersion: skaffold/v2beta4
kind: Config
metadata:
  name: learning-local-kubernetes-development-with-skaffold
The meat of Skaffold is in the build and deploy fields. For now, we assume there are no artifacts to build, so we add the following below the definition above:
build:
  artifacts: []
In deploy, we tell Skaffold where to find the Kubernetes manifests and how to process them. We'll tell Skaffold to deploy with kubectl, to look for manifests in a folder called k8s/, and to use the minikube context for deployment with the following configuration:
# skaffold.yaml
deploy:
  kubectl:
    manifests:
      - k8s/*.yaml
  kubeContext: minikube
Both manifests and kubeContext are set to the above values by default, but I think it's always better to be explicit about such things. Instead of deploying bare Kubernetes manifests with kubectl, you could also tell Skaffold to process your manifests with kustomize or use helm charts.
Instead of writing skaffold.yaml yourself, you can also use the skaffold init command to auto-generate the build and deploy config.
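If you want to try the auto-generation route, running Skaffold's init command from the repository root lets it inspect the project and propose a configuration you can review before committing (the exact output depends on what it finds in your repository):
# Inspect the project and interactively generate a skaffold.yaml
$ skaffold init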
Postgres deployment
ConfigMap and Secrets
We'll put our Kubernetes manifest for the Postgres deployment in k8s/postgres.yaml. We'll first include non-confidential Postgres configuration data in a ConfigMap. By storing configuration in a ConfigMap, we can easily re-use the configuration in other services that use our Postgres cluster.
For Postgres, we'll need to define the Postgres user name and the database to use. We add these as configuration variables POSTGRES_USER and POSTGRES_DB in k8s/postgres.yaml:
# k8s/postgres.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-configuration
data:
  POSTGRES_DB: "django-db"
  POSTGRES_USER: "postgres-user"
We'll see later how to use this configuration.
Postgres also wants us to configure the password used to access the database. Such confidential data should be stored as Kubernetes Secrets. Kubernetes wants us to base64-encode our secrets, so we'll have to do that first. Assuming we choose "super-secret" as our password, here's how to base64-encode it:
$ echo -n "super-secret" | base64
c3VwZXItc2VjcmV0
# Or with Python:
$ python3 -c "import base64; print(base64.b64encode(b'super-secret').decode())"
c3VwZXItc2VjcmV0
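As a quick sanity check (not required for the deployment), you can decode the value again to confirm it round-trips:
# Decode to verify (on macOS, the decode flag may be -D instead of --decode)
$ echo -n "c3VwZXItc2VjcmV0" | base64 --decode
super-secret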
For the purposes of this tutorial, I'll simply add the base64-encoded secret to postgres.yaml:
# k8s/postgres.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
type: Opaque
data:
  # This should **not** be in version control
  password: c3VwZXItc2VjcmV0
In any real-world use, you probably wouldn't add secrets like this to version control. You would put the secrets in their own secrets.yaml file and keep it out of version control, or read secrets at deployment time from services such as Vault.
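As a rough sketch of that approach, you could drop the Secret manifest entirely and create the Secret imperatively at deployment time; kubectl takes care of the base64 encoding for you. The name and key below match the manifest above:
# Create the Secret out-of-band instead of committing it to git
$ kubectl create secret generic postgres-credentials \
    --from-literal=password=super-secret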
At this point, we can see if our Skaffold deployment works by running skaffold dev. Skaffold should discover postgres.yaml and deploy our ConfigMap and Secret. If you then open the Minikube dashboard with
$ minikube dashboard
you should find the ConfigMap and Secret among the created resources. If you modify your deployment manifests, Skaffold should automatically take care of updating the deployment, so from now on you can keep skaffold dev running in the background.
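If you prefer the command line over the dashboard, you can run the same check with kubectl; the resource names below are the ones defined in the manifests above:
$ kubectl get configmap postgres-configuration
$ kubectl get secret postgres-credentials
# Inspect the ConfigMap's contents
$ kubectl describe configmap postgres-configuration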
Persistent volumes and volume claims
Next, we'll set up volumes for our Postgres installation. This part is not required if you only use Postgres for local development with small data volumes. If you're deploying Postgres to production on Kubernetes, you need to think carefully about where and how to persist data.
The persistent volume (PV) documentation describes a PV as "a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes." Storage classes can be used to provision storage on-demand from, say, Amazon Elastic Block Store or Google Cloud Persistent Disk. In this tutorial, we'll provision local storage upfront without storage classes.
Specifically, we'll provision storage from the host with a hostPath persistent volume. As the docs explain, "In a production cluster, you would instead provision a network resource like a Google Compute Engine persistent disk, an NFS share, or an Amazon Elastic Block Store volume."
Here's how to use hostPath to provision storage from the single-node cluster's host:
# k8s/postgres.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt1/postgres-data
We specify that a persistent volume called postgres-pv with 1 Gi of capacity should be provisioned at /mnt1/postgres-data on the cluster's node. The access mode is ReadWriteOnce, meaning that the volume can be mounted for reading and writing by a single node.
Once we get Postgres running, we can ssh into our Minikube node with minikube ssh and run ls /mnt1/postgres-data to browse the data stored by Postgres.
Now that we have provisioned storage, we need to create a persistent volume claim to request access to it. Volume claims are similar to how pods request resources from nodes: instead of requesting CPU or memory, a volume claim requests a specific storage size ("5 GB", say) and a specific access mode. Let's create a volume claim to request 500 Mi from the postgres-pv volume:
# k8s/postgres.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  labels:
    type: local
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  volumeName: postgres-pv
Now we can use this claim, named postgres-pvc, to mount the claimed storage into our Postgres pod.
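Once Skaffold has applied these manifests, you can check that the claim was actually bound to the volume; the STATUS column should read Bound for both:
$ kubectl get pv postgres-pv
$ kubectl get pvc postgres-pvc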
Stateful set
We're ready to declare how to deploy Postgres itself. Just for fun, we'll deploy Postgres using a StatefulSet. A StatefulSet manages not only the deployment and scaling of pods but also "provides guarantees about the ordering and uniqueness of these pods". In particular, a StatefulSet provides unique network identifiers, persistent storage, graceful deployment and scaling, and automated rolling updates. For example, if you were deploying Apache Cassandra, an open-source distributed database, on Kubernetes, a StatefulSet would guarantee that your pods have stable identifiers such as cassandra-1, cassandra-2, etc., which you could use to communicate between pods even after rescheduling. See here for an example of how to deploy Cassandra on Kubernetes.
In our case of Postgres, we are dealing with a centralized, non-distributed database, so we'll only have one pod in our StatefulSet. However, StatefulSet is still a very useful and powerful concept for any stateful application.
We'll declare StatefulSet as follows:
# k8s/postgres.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-statefulset
spec:
  serviceName: "postgres"
  replicas: 1
  selector:
    matchLabels:
      app: postgres # has to match .spec.template.metadata.labels
  template: ...
As expected, we use kind: StatefulSet. In spec, we declare serviceName to be postgres. You can read here how serviceName controls the DNS for pods in the StatefulSet. Do not confuse this with how you would use a Service to expose Postgres to other applications in the cluster or to the outside world. In the next section, we'll see how to use a Service to expose the StatefulSet as a network service.
After serviceName, we set replicas to one as we're dealing with a non-replicated, centralized database. The .spec.selector field defines which pods belong to the StatefulSet.
The spec for the pods in the StatefulSet is defined in .spec.template as follows:
# k8s/postgres.yaml
...
spec:
  ...
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:12
          envFrom:
            - configMapRef:
                name: postgres-configuration
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
          ports:
            - containerPort: 5432
              name: postgresdb
          volumeMounts:
            - name: postgres-volume-mount
              mountPath: /var/lib/postgresql/data
          readinessProbe:
            exec:
              command:
                - bash
                - "-c"
                - "psql -U$POSTGRES_USER -d$POSTGRES_DB -c 'SELECT 1'"
            initialDelaySeconds: 15
            timeoutSeconds: 2
          livenessProbe:
            exec:
              command:
                - bash
                - "-c"
                - "psql -U$POSTGRES_USER -d$POSTGRES_DB -c 'SELECT 1'"
            initialDelaySeconds: 15
            timeoutSeconds: 2
      volumes:
        - name: postgres-volume-mount
          persistentVolumeClaim:
            claimName: postgres-pvc
We first set the pod label app: postgres to match the selector in the StatefulSet. In .spec.template.spec, we define our pods to have a single container running the postgres:12 image from Docker Hub. The pod uses environment variables from the postgres-configuration ConfigMap and the postgres-credentials Secret, both of which we defined above. We then set the pod to expose port 5432. In volumeMounts, we mount a volume named postgres-volume-mount at /var/lib/postgresql/data, the folder used by Postgres for storing data. This volume is defined in .spec.template.spec.volumes, where we use the postgres-pvc volume claim defined above.
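To convince yourself that the volume mount works, you can list the data directory inside the running pod. The pod name postgres-statefulset-0 below follows the StatefulSet naming convention of <statefulset-name>-<ordinal>:
$ kubectl exec -it postgres-statefulset-0 -- ls /var/lib/postgresql/data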
Finally, we also set a readinessProbe as well as a livenessProbe to let Kubernetes monitor the pod's health. You can read more about probes here.
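Probe failures show up as events on the pod, so a quick way to see whether the probes are passing is to describe the pod (again assuming the pod name postgres-statefulset-0):
# Readiness/liveness events appear near the bottom of the output
$ kubectl describe pod postgres-statefulset-0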
Service
Having declared our StatefulSet, we finally need to define a Service to expose the StatefulSet as a network service to other applications. Compared to the above, this is simple enough:
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
spec:
  ports:
    - port: 5432
      name: postgres
  type: NodePort
  selector:
    app: postgres
In our Service, we expose port 5432 from the pod as a NodePort. You can read more about the different Service types here. Choosing NodePort lets us contact the service from outside the cluster. In production, you of course want to consider very carefully whether your database should be exposed outside the cluster.
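If you do want to try connecting from outside the cluster, something along these lines should work with Minikube; the host and port below are illustrative, and the psql invocation assumes you have a local psql client installed:
# Print the URL (host:port) Minikube assigned to the NodePort service
$ minikube service postgres-service --url
http://192.168.99.100:31123
# Connect with a local psql client using the host and port from above
$ psql -h 192.168.99.100 -p 31123 -U postgres-user -d django-db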
Did it work?
If you had skaffold dev --port-forward running throughout this tutorial, you should now have Postgres running in your Minikube cluster. You can verify this by, for example, running minikube dashboard to browse the resources in Minikube and, hopefully, see everything glowing green. If anything went wrong, please let me know in the comments!
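If you'd rather verify from the command line, a rough check could look like this; it assumes the resource names used throughout this post:
$ kubectl get statefulsets,services
$ kubectl get pods -l app=postgres
# Run a trivial query inside the Postgres pod
$ kubectl exec -it postgres-statefulset-0 -- psql -U postgres-user -d django-db -c 'SELECT 1'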
Conclusion
That concludes the tutorial on how to get Postgres running on Kubernetes. In the next and final part, we'll finally deploy the Django application we built in Part 2, backed by Postgres. See you then!
Top comments (5)
I have the weirdest issue .. I'm trying to make a deployment for postgresql (not for django but for a side-project idea I have) based on this tutorial, and it works fine, but if I change the POSTGRES_DB or the POSTGRES_USER at all the pod is never ready and it keeps restarting. Any clue what it could be?
Weird! I guess you have tried tearing down the whole stack and only then changing the configuration? If it uses old volumes or containers, it could mean the new database and user are never created. Maybe you could also try looking into container logs to understand the error?
I'm using docker desktop for windows to run kubernetes .. the only thing that seems to work is to use the option to completely reset all the settings to the initial defaults and start over again! And yes I've tried looking at container logs, I forget what they said but it didn't really solve it for me.
I admit I only really understand the basics of kubernetes, I can't even find where docker desktop puts the volumes (it uses WSL aka "windows subsystem for linux" and think it somehow handles them internally) but I suspect if I could find them and delete them after deleting the pod, then changing the user or pass would work.. but what do I know!
Have you published a 4th part where you deploy the Django app? I don't see it linked from here.
Sorry not yet, but you can find the Kubernetes yaml for the app in the repository.