Maxime Guilbert

How to set up a DR for your K8s cluster with Velero?

All the definitions are at the bottom.

Depending on how critical your project is and/or on your SLA (Service Level Agreement), you may need a Disaster Recovery (DR) plan to keep your services up and running.

This is where Velero comes in: a simple tool to back up some (or all) of the elements in your cluster and to restore from one of those backups.

With Velero, you can back up a complete cluster or be more granular and back up by namespace, for example.

NOTE: It can also be used for cluster migrations.


Installation

First, you need to install the Velero CLI in a dedicated pod.

Download the release you want with curl, extract it, and make the binary executable.

# Download the Velero release
curl -L -o /tmp/velero.tar.gz https://github.com/vmware-tanzu/velero/releases/download/v1.5.1/velero-v1.5.1-linux-amd64.tar.gz 

# Extract the archive
tar -C /tmp -xvf /tmp/velero.tar.gz

# Move the binary to /usr/local/bin and make it executable
mv /tmp/velero-v1.5.1-linux-amd64/velero /usr/local/bin/velero
chmod +x /usr/local/bin/velero

# Test the velero command, and it should work!
velero --help
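
If you expect to upgrade later, it can help to keep the version in one place instead of repeating it in the URL and in the folder name. The snippet below is only a sketch: the function name velero_release_url and the variables VELERO_VERSION and VELERO_PLATFORM are hypothetical, and the URL pattern is simply the one used in the commands above.

```shell
# Hypothetical variables: override them to target another release or platform.
VELERO_VERSION="${VELERO_VERSION:-v1.5.1}"
VELERO_PLATFORM="${VELERO_PLATFORM:-linux-amd64}"

# Build the release URL following the pattern used in the commands above.
velero_release_url() {
    printf 'https://github.com/vmware-tanzu/velero/releases/download/%s/velero-%s-%s.tar.gz\n' \
        "$VELERO_VERSION" "$VELERO_VERSION" "$VELERO_PLATFORM"
}

# Print the URL; it can then be passed to curl as shown earlier.
velero_release_url
```

The extracted folder follows the same naming (velero-$VELERO_VERSION-$VELERO_PLATFORM), so the mv command can reuse the two variables.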

Set your backup storage

Plugins List

Depending on what you need and where your cluster is deployed, you will configure your backup storage in a different way.

At the time of writing, 13 plugins are available, and only the following 5 are supported by Velero:

Plugin installation

Now, we will continue with AWS.

1/ First, we need to create the S3 bucket and create credentials for Velero.

For this demo, our bucket will be called "mg-demo-velero" and will be in "ca-central-1"

2/ Then, create a file with the AWS Credentials

cat > /tmp/credentials-velero <<EOF
[default]
aws_access_key_id=$AWS_ACCESS_ID
aws_secret_access_key=$AWS_ACCESS_KEY
EOF
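
Since this file contains secrets, you may want it to be readable by its owner only. Here is a minimal sketch of the same step with restricted permissions. The placeholder key values and the CREDS_FILE variable are assumptions for illustration, not real credentials.

```shell
# Placeholder values for illustration only; export your real keys instead.
AWS_ACCESS_ID="${AWS_ACCESS_ID:-EXAMPLEKEYID}"
AWS_ACCESS_KEY="${AWS_ACCESS_KEY:-examplesecret}"
CREDS_FILE="${CREDS_FILE:-/tmp/credentials-velero}"

# umask 077 inside a subshell: a newly created file is readable by its owner only,
# and the umask of the calling shell is left untouched.
(
    umask 077
    cat > "$CREDS_FILE" <<EOF
[default]
aws_access_key_id=$AWS_ACCESS_ID
aws_secret_access_key=$AWS_ACCESS_KEY
EOF
)

# Also tighten the permissions in case the file already existed.
chmod 600 "$CREDS_FILE"
```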

3/ Install the AWS plugin using the credentials file and the other information about your bucket

# --bucket: name of your bucket
# --backup-location-config / --snapshot-location-config: region where your bucket is created
# --secret-file: path to your credentials file
velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.1.0 \
    --bucket mg-demo-velero \
    --backup-location-config region=ca-central-1 \
    --snapshot-location-config region=ca-central-1 \
    --secret-file /tmp/credentials-velero
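
Before running the install command, it can be worth checking that the values it needs are really there. The following is a hypothetical pre-flight check, not something provided by Velero: the function name preflight and the variables BUCKET, REGION and SECRET_FILE are illustrative, with defaults taken from this demo.

```shell
# Illustrative variables, defaulting to the values used in this demo.
BUCKET="${BUCKET:-mg-demo-velero}"
REGION="${REGION:-ca-central-1}"
SECRET_FILE="${SECRET_FILE:-/tmp/credentials-velero}"

# Return 0 only if every value needed by the install command looks usable.
preflight() {
    [ -n "$BUCKET" ] || { echo "bucket name is empty" >&2; return 1; }
    [ -n "$REGION" ] || { echo "region is empty" >&2; return 1; }
    [ -r "$SECRET_FILE" ] || { echo "cannot read $SECRET_FILE" >&2; return 1; }
}

if preflight; then
    echo "ready to run velero install"
fi
```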

4/ Check the list of backups

velero backup get

At this point, if this is your first time using Velero, you will not see any backups, which is normal.

But if you already use this bucket to store backups, you should see something like this:

NAME               STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
test-backup        Completed   0        0          2021-07-21 14:45:51 -0700 PDT   29d       default            <none>
...

Life cycle commands

Now that Velero is fully set up, let's go through the commands that cover its main features.

Simple Backup

Create Backup

velero backup create [backup name] [options]

Options

| Parameter | Default | Description | Example |
| --- | --- | --- | --- |
| --ttl [DURATION] | 30d | How long the backup is retained | --ttl 24h0m0s (keep the backup for 24 hours only) |
| --include-cluster-resources=[boolean] | false | Whether to include cluster-scoped resources in the backup | --include-cluster-resources=true |
| --include-namespaces [namespaces] | | Comma-separated list of namespaces to include in the backup | --include-namespaces test,default |
| --exclude-namespaces [namespaces] | | Comma-separated list of namespaces to exclude from the backup | --exclude-namespaces test,default |
| --include-resources [resource names] | | Comma-separated list of resources to include in the backup | --include-resources storageclasses |
| --exclude-resources [resource names] | | Comma-separated list of resources to exclude from the backup | --exclude-resources storageclasses |
| --ordered-resources '[resources]' | | Back up the listed resources of a given kind in the given order. Kinds are separated by semicolons; within a kind, resource names are separated by commas. | --ordered-resources 'pods=ns1/pod1,ns1/pod2;persistentvolumes=pv4,pv8' |
| --selector [labels] | | Comma-separated list of labels a resource must have to be included in the backup | --selector app=elasticsearch-master,env=test |

Examples

velero backup create backup1 --include-cluster-resources=true --ordered-resources 'pods=ns1/pod1,ns1/pod2;persistentvolumes=pv4,pv8' --include-namespaces=ns1
velero backup create backup2 --ordered-resources 'statefulsets=ns1/sts1,ns1/sts0' --include-namespaces=ns1
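
Backup names have to be unique, so for per-namespace backups it can be convenient to derive the name from the namespace and a timestamp. The sketch below only uses options from the table above. The function names and the RUN switch are hypothetical: with RUN=echo the command is printed instead of being executed, which is handy to review it first.

```shell
# Hypothetical switch: leave empty to execute, set to "echo" to only print the command.
RUN="${RUN:-}"

# $1 = namespace, $2 = optional timestamp (defaults to the current UTC time)
backup_name() {
    printf '%s-%s\n' "$1" "${2:-$(date -u +%Y%m%d%H%M%S)}"
}

# Back up a single namespace and keep the backup for 24 hours.
# $1 = namespace, $2 = optional timestamp
backup_namespace() {
    $RUN velero backup create "$(backup_name "$1" "$2")" \
        --include-namespaces "$1" \
        --ttl 24h0m0s
}

# Dry run in a subshell: prints the velero command without executing it.
(RUN=echo; backup_namespace ns1)
```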

Delete backup

velero backup delete [backup name]

Examples

velero backup delete backup1

List backups

velero backup get

Examples

velero backup get

NAME               STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
test-backup        Completed   0        0          2021-07-21 14:45:51 -0700 PDT   29d       default            <none>

Get logs

velero backup logs [backup name]

Examples

velero backup logs backup1

Schedule backup

Velero can also automate your backups with schedules. Depending on your project and your needs, a schedule can run hourly, daily, and so on.

Create

velero schedule create [schedule name] --schedule="[schedule]" [options]

Schedule

You can use either a cron expression or the @every shorthand. The two following examples each create a backup every 6 hours.

velero schedule create test1 --schedule="0 */6 * * *"

velero schedule create test2 --schedule="@every 6h"

Options

| Parameter | Default | Description | Example |
| --- | --- | --- | --- |
| --ttl [DURATION] | 30d | How long each backup is retained | --ttl 24h0m0s (keep each backup for 24 hours only) |
| --include-namespaces [namespaces] | | Comma-separated list of namespaces to include in the backup | --include-namespaces test,default |
| --exclude-namespaces [namespaces] | | Comma-separated list of namespaces to exclude from the backup | --exclude-namespaces test,default |
| --include-resources [resource names] | | Comma-separated list of resources to include in the backup | --include-resources storageclasses |
| --exclude-resources [resource names] | | Comma-separated list of resources to exclude from the backup | --exclude-resources storageclasses |
| --selector [labels] | | Comma-separated list of labels a resource must have to be included in the backup | --selector app=elasticsearch-master,env=test |
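
Since every backup created by a schedule is kept for the duration of its --ttl, the interval and the TTL together decide roughly how many backups sit in your bucket at any given time. Here is a small back-of-the-envelope sketch; the function name retained_backups is hypothetical and the result is only an approximation.

```shell
# Approximate number of backups kept by one schedule once it has run for a while.
# $1 = interval between two backups, in hours
# $2 = TTL of each backup, in hours
retained_backups() {
    echo $(( $2 / $1 ))
}

# A backup every 6 hours with the default 30-day TTL (720 hours) → 120
retained_backups 6 720
```

With a 24h TTL instead, the same schedule keeps about 4 backups.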

Delete

velero delete schedule [schedule name]

Examples

velero delete schedule test1

List

velero get schedules

Recovery

From Backup

To restore from a backup.

velero restore create [Name of the restore] --from-backup [Name of the backup] [options]

Options

| Parameter | Default | Description | Example |
| --- | --- | --- | --- |
| --include-namespaces [namespaces] | | Comma-separated list of namespaces to include in the restore | --include-namespaces test,default |
| --exclude-namespaces [namespaces] | | Comma-separated list of namespaces to exclude from the restore | --exclude-namespaces test,default |
| --include-resources [resource names] | | Comma-separated list of resources to include in the restore | --include-resources storageclasses |
| --exclude-resources [resource names] | | Comma-separated list of resources to exclude from the restore | --exclude-resources storageclasses |
| --selector [labels] | | Comma-separated list of labels a resource must have to be included in the restore | --selector app=elasticsearch-master,env=test |

Examples

velero restore create restore1 --from-backup backup1

# Create a restore with a default name ("backup1-<timestamp>") from backup "backup1"
velero restore create --from-backup backup1

From Schedule

To restore from the latest backup created by a schedule.

velero restore create [Name of the restore] --from-schedule [Name of the schedule] [options]

Options

| Parameter | Default | Description | Example |
| --- | --- | --- | --- |
| --include-namespaces [namespaces] | | Comma-separated list of namespaces to include in the restore | --include-namespaces test,default |
| --exclude-namespaces [namespaces] | | Comma-separated list of namespaces to exclude from the restore | --exclude-namespaces test,default |
| --include-resources [resource names] | | Comma-separated list of resources to include in the restore | --include-resources storageclasses |
| --exclude-resources [resource names] | | Comma-separated list of resources to exclude from the restore | --exclude-resources storageclasses |
| --selector [labels] | | Comma-separated list of labels a resource must have to be included in the restore | --selector app=elasticsearch-master,env=test |
| --allow-partially-failed | | Also consider partially failed backups when looking for the latest backup of the schedule | --allow-partially-failed |

Examples

# As from a backup, if you don't specify a restore name, one will be generated
velero restore create --from-schedule schedule-1
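
To make "the last backup of a schedule" more concrete: backups created by a schedule are named after it, followed by a timestamp. The sketch below only illustrates the idea and is not how Velero selects the backup internally. It assumes names of the form "<schedule name>-<YYYYMMDDhhmmss>", so that sorting them alphabetically also sorts them by date. The function name latest_backup and the sample names are hypothetical.

```shell
# Read backup names on stdin and print the most recent one for a given schedule.
# Assumption: names look like "<schedule>-<YYYYMMDDhhmmss>".
# $1 = schedule name
latest_backup() {
    grep "^$1-[0-9]\{14\}\$" | sort | tail -n 1
}

# Sample names for illustration (the NAME column of "velero backup get" has this shape).
printf '%s\n' \
    test1-20210721060000 \
    test1-20210721120000 \
    test2-20210721130000 \
    | latest_backup test1
```

Here the function prints test1-20210721120000: the backup of test2 is more recent, but it belongs to another schedule.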

List

To list all the restores that have been performed.

velero restore get

Describe

Gives you more information about specific restores.

velero restore describe [Restore name 1] [Restore name 2] ...

Example

velero restore describe restore1 restore2

Logs

To get the logs of a specific restore. Useful for troubleshooting.

velero restore logs [Restore name 1]

Example

velero restore logs restore1

Exclude specific resources from backup

To exclude a specific resource from all your backups, you can add the label velero.io/exclude-from-backup=true.

kubectl label -n [namespace] [resource]/[name] velero.io/exclude-from-backup=true


Links

Velero

Tutorial

  • Full tutorial with AWS and Azure (by That DevOps Guy):
  • VMWare webinar about Velero :

Definitions

DR - Disaster Recovery

Definition based on the VMware one.

Disaster recovery is an organization's method of:

  • regaining access to and functionality of its IT infrastructure
  • keeping a backup of its data

after events like a natural disaster or a cyber attack.

SLA - Service Level Agreement

Definition from Atlassian.

An SLA (service level agreement) is an agreement between provider and client about measurable metrics like uptime, responsiveness, and responsibilities.


In my opinion, Velero is a good and simple tool which will help us a lot! Being able to do backups and restores so quickly is really amazing!

I hope it will help you!
