I recently began running RocketChat on Kubernetes, and a key component of that deployment was a MongoDB replica set. Running MongoDB reliably calls for both replication and regular backups, and Kubernetes provides accessible interfaces for both.
In my case, I wanted both regular and one-off backup capability, and the Kubernetes Jobs resource provided a quick way to get it. I wanted my Job pod to write its dumpfiles to a persistent data store, so I first set up a PersistentVolume and an accompanying PersistentVolumeClaim:
```yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: mongo-dump-pv-volume
  labels:
    type: local
    app: mongo-dump
spec:
  storageClassName: manual
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/mnt/kube-data/mongo-dumps"
    type: DirectoryOrCreate
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mongo-dump-pv-claim
  labels:
    app: mongo-dump
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
```
If you use a cloud provider like AWS or Azure, a PersistentVolume can be backed by provisioned block storage, so your dump data inherits whatever durability guarantees your provider makes. The example above uses a hostPath volume, so the data persists only as long as the node, and that path on it, exist.
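For example, on a cloud provider you can skip the manual PersistentVolume entirely and let a StorageClass provision the disk dynamically. Here's a sketch, assuming your cluster exposes a StorageClass named `gp2` (an AWS-style name — substitute whatever your provider offers):

```yaml
# PVC using dynamic provisioning; the StorageClass name "gp2" is an
# assumption -- substitute your provider's class name.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: mongo-dump-pv-claim
  labels:
    app: mongo-dump
spec:
  storageClassName: gp2
  accessModes:
    # Cloud block devices typically attach to one node at a time,
    # hence ReadWriteOnce rather than ReadWriteMany.
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
```

Note that most cloud block storage only supports ReadWriteOnce access, which is fine here since only the backup pod mounts it.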
The Job itself has a pod template, much like other Kubernetes resources such as Deployments, where you request resources for your pod. It will look like this:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: mongodb-backup
  labels:
    app: mongo-dump
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
        - name: mongodump
          image: mongo
          command: ["mongodump", "--host", "mongo-service:27017", "--db", "your_db", "--out", "/dump"]
          volumeMounts:
            - mountPath: /dump
              name: mongo-dumps
      volumes:
        - name: mongo-dumps
          persistentVolumeClaim:
            claimName: mongo-dump-pv-claim
      restartPolicy: OnFailure
```
You'll see we attach the volume as we normally might, then run the mongodump command, which writes its output to the mount path.
If you have authentication enabled on your Mongo service, or use a hosted offering like MongoDB Atlas, you can use Secrets, as you normally might, to pass credentials through to the Job so they're stored securely rather than in the Job spec itself:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: mongodb-backup
  labels:
    app: mongo-dump
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
        - name: mongodump
          image: mongo
          env:
            - name: MONGO_CONN_STRING
              valueFrom:
                secretKeyRef:
                  name: mongo-auth
                  key: connstring
            - name: MONGO_DB
              value: "my_db"
          # Note the $(VAR) syntax: Kubernetes expands env var references
          # in command/args itself; plain $VAR would not be expanded,
          # since the command doesn't run through a shell.
          command: ["mongodump", "--host", "$(MONGO_CONN_STRING)", "--db", "$(MONGO_DB)", "--out", "/dump"]
          volumeMounts:
            - mountPath: /dump
              name: mongo-dumps
      volumes:
        - name: mongo-dumps
          persistentVolumeClaim:
            claimName: mongo-dump-pv-claim
      restartPolicy: Never
```
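The mongo-auth Secret referenced by that secretKeyRef can be created with a manifest like the following sketch; the connection string value here is a placeholder, not a real endpoint:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mongo-auth
type: Opaque
# stringData lets you supply values as plain text;
# Kubernetes base64-encodes them on write.
stringData:
  connstring: "mongo-service:27017"  # placeholder -- use your real host or connection string
```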
After you apply this Job spec, you can monitor your progress:

```shell
kubectl get pods -l app=mongo-dump
```

then monitor the logs for that pod name:

```shell
kubectl logs $POD_NAME
```
If you see your Job failing, restartPolicy defines the behavior in this area: in the declaration above it is set to Never, so failed pods won't restart, but you can use OnFailure to attempt a retry, and options like backoffLimit and activeDeadlineSeconds to define the number of retries and backoff timing.
CronJobs are another type of Job supported in Kubernetes (since 1.7), and their spec is similar, but includes your typical cron syntax for defining when you'd like the Job to run, along with related scheduling behavior.
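As a sketch, the backup Job above could be wrapped in a CronJob that runs nightly; the schedule, image, and names mirror the earlier examples, and the apiVersion shown is the beta one from that era of Kubernetes, so check what your cluster serves:

```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: mongodb-backup
spec:
  schedule: "0 2 * * *"   # standard cron syntax: daily at 02:00
  jobTemplate:
    spec:
      backoffLimit: 5
      template:
        spec:
          containers:
            - name: mongodump
              image: mongo
              command: ["mongodump", "--host", "mongo-service:27017", "--db", "your_db", "--out", "/dump"]
              volumeMounts:
                - mountPath: /dump
                  name: mongo-dumps
          volumes:
            - name: mongo-dumps
              persistentVolumeClaim:
                claimName: mongo-dump-pv-claim
          restartPolicy: OnFailure
```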