Ben Fellows for AWS Community Builders

Introduction to backups in AWS

Yesterday I ran a live discussion introducing backups in AWS. A link to the video is at the end of this article.

In the video I explained how to plan your backups and which AWS tools you can use to configure them.

I'm going to summarise the main points here for anyone who would rather not watch the recording of the livestream.

Planning

Start with a company-wide SLA that states, for any given application or database, the acceptable downtime, the acceptable time until recovery, and the point in time you must be able to recover to.

These are the areas you need to plan for in an enterprise cloud backup solution:

  • Normal required hours of service (e.g. 7am-7pm)
  • Acceptable uptime as a percentage (e.g. 95%, 99%, 99.99%)
  • Recovery Point Objective (RPO): the point in time you need to be able to recover to, which sets how much recent data you can afford to lose.
  • Recovery Time Objective (RTO): the maximum acceptable time until service is restored.

Once you have negotiated and agreed these with the business, you can design a scalable solution.

Because each enterprise and business unit may have different objectives, I found it simpler to define two categories of application:
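To make the uptime percentages and the RPO in the list above more concrete, here is a minimal Python sketch. It assumes a 365-day year of round-the-clock service, and the function names `allowed_downtime` and `meets_rpo` are hypothetical helpers made up for illustration, not part of any AWS tool.

```python
from datetime import timedelta

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year, 24x7 service


def allowed_downtime(uptime_percent: float) -> timedelta:
    """Return the yearly downtime budget implied by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # Round to whole seconds so floating-point noise does not leak into the result.
    seconds = round(SECONDS_PER_YEAR * (100 - uptime_percent) / 100)
    return timedelta(seconds=seconds)


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """A backup interval longer than the RPO can lose more data than agreed."""
    return backup_interval <= rpo


if __name__ == "__main__":
    for pct in (95, 99, 99.99):
        print(f"{pct}% uptime allows about {allowed_downtime(pct)} of downtime per year")
    # A daily backup cannot satisfy a 4-hour RPO.
    print(meets_rpo(timedelta(hours=24), timedelta(hours=4)))
```

If your service hours are narrower (for example 7am-7pm), the yearly budget shrinks accordingly, so treat the numbers as an upper bound.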

  • Mission Critical
  • Normal Business Hours

Mission Critical applications are your website, phone systems, email, production databases etc.
Normal Business Hours applications might be HRIS, Payroll, Finance.

Again, this depends on the type of organisation you are. If you operate across several time zones, most of your applications might be mission critical.

Schedule Planning

You need to understand what has to happen for each type of data you back up. For example, you might have EC2 servers with files stored on EBS volumes, and SQL Server Enterprise running on EC2 with EBS volumes. You may also have an open source app or website backed by an RDS instance.

Give each backup task its own schedule. The details will depend on the solution you have in place. Co-ordinate the timing of backups with each other, and consider the impact each one will have on running systems.

This spreadsheet gives you a sample schedule for the backup and restore process.
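One way to sanity-check a schedule is to look for backup tasks that load the same system at the same time. The sketch below is only an illustration: the `BackupTask` fields, the task names, and the server names are all hypothetical, and it assumes every task starts and finishes on the same day (no windows that cross midnight).

```python
from dataclasses import dataclass
from datetime import time
from itertools import combinations


@dataclass(frozen=True)
class BackupTask:
    name: str     # label for the task (hypothetical)
    target: str   # the system the task puts load on
    start: time   # daily start time
    end: time     # expected finish time, same day


def overlaps(a: BackupTask, b: BackupTask) -> bool:
    """Two windows overlap when each one starts before the other ends."""
    return a.start < b.end and b.start < a.end


def find_conflicts(tasks):
    """Return name pairs of tasks that hit the same target at the same time."""
    return [
        (a.name, b.name)
        for a, b in combinations(tasks, 2)
        if a.target == b.target and overlaps(a, b)
    ]


if __name__ == "__main__":
    schedule = [
        BackupTask("ebs-snapshot", "sql-server-01", time(1, 0), time(2, 0)),
        BackupTask("sql-full-backup", "sql-server-01", time(1, 30), time(3, 0)),
        BackupTask("rds-snapshot", "website-db", time(1, 0), time(1, 30)),
    ]
    for first, second in find_conflicts(schedule):
        print(f"{first} overlaps with {second}")
```

The same structure could be extended to check tasks against your normal hours of service, so that heavy backups stay outside them.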

https://bit.ly/awsexp-backupspreadsheet

Tools

Let’s look at combining Amazon Data Lifecycle Manager (Amazon DLM) with an S3-based solution for SQL Server backups.

Amazon DLM covers EBS volumes and AMIs, whereas the database solution backs up to S3 and relies on a lifecycle that sits within S3.

Amazon DLM automates the creation of snapshots and manages their retention based on policies you set. For example, you can set up a policy that snapshots all the EBS volumes for your website on a daily schedule. You can also have multiple schedules per policy, so a single policy could take daily, weekly and monthly snapshots.

DLM can automatically copy the tags of the source EBS volumes to the snapshots and also add extra tags. I have provided a link to an AWS CloudFormation template you can use for a simple daily backup policy at the end of this article.

For a custom solution, you can use S3 as the target location for files you want to back up. A common use case is backing up SQL Server backup files to S3. S3 can be configured with lifecycle rules that delete or archive files.
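If you prefer the API to CloudFormation, a policy like the one described above might be sketched with boto3 roughly as follows. This is not the template linked in this article. The tag key and value, the schedule times, the retention counts, the extra tag, and the function names are all assumptions chosen for illustration, and the IAM role ARN has to be one you supply.

```python
def daily_weekly_policy_details(tag_key: str, tag_value: str) -> dict:
    """Build DLM PolicyDetails with a daily and a weekly snapshot schedule."""
    extra_tags = [{"Key": "CreatedBy", "Value": "dlm"}]  # hypothetical extra tag
    return {
        "ResourceTypes": ["VOLUME"],
        # Only volumes carrying this tag are snapshotted.
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [
            {
                "Name": "daily",
                "CopyTags": True,          # copy the volume's tags to each snapshot
                "TagsToAdd": extra_tags,   # plus additional tags
                "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
                "RetainRule": {"Count": 7},   # assumed: keep the last 7 daily snapshots
            },
            {
                "Name": "weekly",
                "CopyTags": True,
                "TagsToAdd": extra_tags,
                "CreateRule": {"CronExpression": "cron(0 4 ? * SUN *)"},
                "RetainRule": {"Count": 4},   # assumed: keep the last 4 weekly snapshots
            },
        ],
    }


def create_policy(role_arn: str, tag_key: str, tag_value: str) -> str:
    """Create the policy in your account and return its ID (needs AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3 installed

    dlm = boto3.client("dlm")
    response = dlm.create_lifecycle_policy(
        ExecutionRoleArn=role_arn,
        Description="Daily and weekly EBS snapshots",
        State="ENABLED",
        PolicyDetails=daily_weekly_policy_details(tag_key, tag_value),
    )
    return response["PolicyId"]
```

Keeping the policy details in a plain function makes it easy to review the schedules before anything is created in your account.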
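As a rough illustration of lifecycle rules for backup files, the boto3 sketch below archives objects under a prefix to Glacier and later deletes them. The prefix, the day counts, the rule ID, and the function names are assumptions for this example, not values from the video.

```python
def backup_lifecycle_rules(prefix: str, archive_after_days: int, delete_after_days: int) -> dict:
    """Build an S3 lifecycle configuration that archives, then deletes, backups."""
    if delete_after_days <= archive_after_days:
        raise ValueError("delete_after_days must be greater than archive_after_days")
    return {
        "Rules": [
            {
                "ID": "archive-then-delete-backups",  # hypothetical rule name
                "Filter": {"Prefix": prefix},         # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [{"Days": archive_after_days, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket (needs AWS credentials).

    Note: this call replaces any lifecycle configuration already on the bucket.
    """
    import boto3  # imported here so the builder above works without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=config)
```

For example, `backup_lifecycle_rules("sqlbackups/", 30, 365)` would describe moving SQL Server backup files to Glacier after 30 days and deleting them after a year.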

Auditing

One of the things I didn’t cover in the video was auditing. It is essential that you audit that your backups are actually running. This can be done with some simple code that checks for S3 objects by date and also checks for EBS snapshots.

For example, I use some Python code that handles the auditing of backups and ensures the correct files exist. This can be run as a scheduled task daily, weekly or monthly. I would recommend running it as often as possible. AWS Lambda can also run it on a scheduled event. (More on that in another post.)
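The code I use is not reproduced here, but a minimal sketch of the idea could look like this. It assumes a backup counts as healthy when at least one S3 object under a prefix, and at least one completed EBS snapshot for a volume, is newer than a maximum age. The function names and the one-day default are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def has_recent(timestamps, max_age: timedelta, now=None) -> bool:
    """True if any timezone-aware timestamp is within max_age of now."""
    now = now or datetime.now(timezone.utc)
    return any(now - ts <= max_age for ts in timestamps)


def audit_s3_backups(objects, max_age: timedelta, now=None) -> bool:
    """objects: the 'Contents' list from S3 list_objects_v2."""
    return has_recent((obj["LastModified"] for obj in objects), max_age, now)


def audit_snapshots(snapshots, max_age: timedelta, now=None) -> bool:
    """snapshots: the 'Snapshots' list from EC2 describe_snapshots."""
    completed = (s["StartTime"] for s in snapshots if s["State"] == "completed")
    return has_recent(completed, max_age, now)


def run_audit(bucket: str, prefix: str, volume_id: str,
              max_age: timedelta = timedelta(days=1)) -> dict:
    """Query AWS and report whether recent backups exist (needs AWS credentials)."""
    import boto3  # imported here so the checks above work without boto3 installed

    s3 = boto3.client("s3")
    ec2 = boto3.client("ec2")
    # Only the first page (up to 1,000 keys) is checked in this sketch.
    objects = s3.list_objects_v2(Bucket=bucket, Prefix=prefix).get("Contents", [])
    snapshots = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )["Snapshots"]
    return {
        "s3_ok": audit_s3_backups(objects, max_age),
        "snapshots_ok": audit_snapshots(snapshots, max_age),
    }
```

Because the checks take plain lists, they can be tested without touching AWS, and `run_audit` could be called from a scheduled job or a Lambda handler.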

Conclusion

Plan your backups, streamline your process and audit your backups. Test your restore process regularly to make sure your backup works as intended.

Video: https://www.twitch.tv/videos/1452480056
CloudFormation for DLM Policy: https://bit.ly/teemdlmcfn
