Docker is a ubiquitous tool for us while building Offen, a fair and open source web analytics software. It is foundational to our development setup, and we also use it to deploy our own Offen instance to production.
One thing we found missing was a simple and lightweight tool for taking and managing remote backups of Docker volumes. This is why we wrote our own tool, offen/docker-volume-backup. In this post I'd like to introduce the tool and show how to use it to automatically back up the Docker volumes in your own setup.
Volumes are Docker's way of managing persistent data. As Docker containers themselves are ephemeral, volumes can be mounted into the container's filesystem, enabling you to persist data beyond the lifecycle of a container. Volumes are commonly used for storing database data or similar.
For example, this is how you would use a Docker volume to persist data for an Offen container:
    docker volume create offen_data
    docker run -d \
      -v offen_data:/var/opt/offen \
      offen/offen:latest
In the running container, data stored in /var/opt/offen will be persisted in the offen_data volume and can be reused in other containers.
offen/docker-volume-backup is designed to run as a sidecar next to an application container and periodically back up volumes to any S3-compatible storage (e.g. AWS S3 itself, or alternatives like MinIO or Ceph). It can run on any schedule you wish, and it can also rotate away old backups after a configured retention period.
If needed, it can temporarily stop and restart your running containers to ensure backup integrity.
Using Alpine as the base image, and the MinIO client instead of the AWS CLI for uploading files to the remote storage, keeps the image small and lightweight.
The easiest way of managing such a setup is using docker-compose. A compose file that backs up its volumes would look something like this:
    version: '3'

    services:
      offen:
        image: offen/offen:latest
        volumes:
          - db:/var/opt/offen
        labels:
          - docker-volume-backup.stop-during-backup=true

      backup:
        image: offen/docker-volume-backup:v1.0.2
        # Ideally, those values should go into an `env` file or Docker secrets
        # as they contain credentials. It's easier to spell them out here
        # in the context of this tutorial though.
        environment:
          # A backup is taken each day at 2AM
          BACKUP_CRON_EXPRESSION: "0 2 * * *"
          # Backups are stored with a timestamp appended
          BACKUP_FILENAME: "offen-db-%Y-%m-%dT%H-%M-%S.tar.gz"
          # Backups older than 7 days will be pruned.
          # If this value is not given, backup will be kept forever.
          BACKUP_RETENTION_DAYS: "7"
          # Credentials for your storage backend
          AWS_ACCESS_KEY_ID: "<YOUR_ACCESS_KEY>"
          AWS_SECRET_ACCESS_KEY: "<YOUR_SECRET_KEY>"
          AWS_S3_BUCKET_NAME: "my-backups"
          # If given, backups are encrypted using GPG
          GPG_PASSPHRASE: "<SOME_KEY>"
        volumes:
          # This allows the tool to stop and restart all
          # containers labeled as docker-volume-backup.stop-during-backup
          - /var/run/docker.sock:/var/run/docker.sock:ro
          # All volumes mounted to /backup/<some-name> will be backed up
          - db:/backup/offen-db:ro

    volumes:
      db:
Of course, you can also run the image using plain Docker commands:
    docker volume create offen_data
    docker run -d \
      -v offen_data:/var/opt/offen \
      -l docker-volume-backup.stop-during-backup=true \
      offen/offen:latest
    docker run -d \
      -v offen_data:/backup/offen-db:ro \
      -v /var/run/docker.sock:/var/run/docker.sock:ro \
      --env-file backup.env \
      offen/docker-volume-backup:v1.0.2
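The backup.env file passed via --env-file is not shown above. A minimal sketch of how it could be created follows, reusing the variable names and placeholder values from the compose example. The values are placeholders, and the restrictive file mode is my own suggestion rather than something the tool requires.

```shell
# Write the settings for the backup container to backup.env.
# docker's --env-file format is one KEY=VALUE pair per line. Values are
# taken literally, so they are not wrapped in quotes here.
cat > backup.env <<'EOF'
BACKUP_CRON_EXPRESSION=0 2 * * *
BACKUP_FILENAME=offen-db-%Y-%m-%dT%H-%M-%S.tar.gz
BACKUP_RETENTION_DAYS=7
AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY>
AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_KEY>
AWS_S3_BUCKET_NAME=my-backups
GPG_PASSPHRASE=<SOME_KEY>
EOF

# The file holds credentials, so restrict it to the current user
# (a suggestion, not a requirement of the tool).
chmod 600 backup.env
```

Keeping the file out of version control is a good idea for the same reason.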
Instead of running the backups on a regular schedule, you can also execute the command in a running container yourself:
docker exec <container_ref> backup
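If you don't have the container reference at hand, it can be looked up by image. The following is a small sketch of that idea. The function name backup_now and the variable BACKUP_IMAGE are made up for this example, and it assumes the backup container was started from the image tag used in this post.

```shell
# Image reference taken from the examples in this post. Adjust it if you
# run a different tag.
BACKUP_IMAGE="offen/docker-volume-backup:v1.0.2"

# backup_now (hypothetical helper): find running containers created from
# BACKUP_IMAGE and run the `backup` command in each of them.
backup_now() {
  ids=$(docker ps -q --filter "ancestor=$BACKUP_IMAGE")
  if [ -z "$ids" ]; then
    echo "no running container found for $BACKUP_IMAGE" >&2
    return 1
  fi
  for id in $ids; do
    docker exec "$id" backup
  done
}
```

Calling backup_now then triggers a backup right away, independent of the configured schedule.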
To recover from a backup, download and untar the backup file, then copy its contents back into the Docker volume using a one-off container created for just that purpose:
    docker run -d \
      --name backup_restore \
      -v offen_data:/backup_restore \
      alpine
    docker cp <location_of_your_unpacked_backup> backup_restore:/backup_restore
    docker stop backup_restore && docker rm backup_restore
The volume is now ready to use in other containers. Alternatively, you can use a one-off volume created beforehand.
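The "download and untar" step mentioned above is not spelled out, so here is a rough sketch of the unpacking part. The helper name unpack_backup is made up for this example, and it assumes that encrypted backups carry a .gpg suffix. Check the actual file names in your bucket before relying on that. Downloading the file is left to whichever S3 client you prefer.

```shell
# unpack_backup ARCHIVE DEST [PASSPHRASE]   (hypothetical helper)
# Decrypts ARCHIVE first when its name ends in .gpg (assumed naming),
# then extracts the gzipped tarball into DEST.
unpack_backup() {
  archive=$1
  dest=$2
  passphrase=${3:-}

  mkdir -p "$dest"

  case "$archive" in
    *.gpg)
      decrypted=${archive%.gpg}
      # Symmetric decryption with the passphrase given as GPG_PASSPHRASE.
      gpg --batch --yes --pinentry-mode loopback \
        --passphrase "$passphrase" \
        --output "$decrypted" --decrypt "$archive"
      archive=$decrypted
      ;;
  esac

  tar -xzf "$archive" -C "$dest"
}
```

After something like `unpack_backup ./my-backup.tar.gz ./restore`, inspect the directory layout inside ./restore and use the matching subdirectory as the source of the docker cp command.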
Knowing you have remote backups around in case of unexpected infrastructure glitches helps you move forward with confidence and without too much worry. I hope this article has shown that adding them to your Docker setup is only a matter of configuring an additional container, and that it helps you get going with your backups so you can focus on your product.
Written by Frederik Ring