Discussion on: How does your organization handle data backups?

We use RDS for all our databases, which lets you restore to a point as recent as about 5 minutes ago, and you can restore from a snapshot in a reasonable amount of time (restoring 20 GB is quick, 5 TB takes a smidge longer). The big caveat here is that if you accidentally delete an RDS instance (happened once in a dev environment), by default all the automated backups disappear along with it 🤬.
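One way to soften that caveat, sketched here rather than taken from our actual setup: whenever an instance is deleted through the API, ask for a final snapshot and keep the automated backups. The helper below only builds the keyword arguments for boto3's `delete_db_instance`; the function name `safe_delete_kwargs` and the snapshot naming scheme are made up for illustration.

```python
from datetime import datetime, timezone


def safe_delete_kwargs(instance_id, now=None):
    """Build arguments for boto3's rds.delete_db_instance that leave a way back.

    Hypothetical helper: the "-final-<timestamp>" naming is only an example.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return {
        "DBInstanceIdentifier": instance_id,
        # Take one last snapshot instead of skipping it.
        "SkipFinalSnapshot": False,
        "FinalDBSnapshotIdentifier": f"{instance_id}-final-{stamp}",
        # Keep the automated backups for the rest of their retention period.
        "DeleteAutomatedBackups": False,
    }


# Usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("rds").delete_db_instance(**safe_delete_kwargs("dev-db"))
```

Turning on deletion protection for the instance is another guard worth considering, since it blocks the delete call outright.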

We haven't gone as far as doing cross-region replication of an RDS instance. We did it once as a test to make sure our Terraform scripts could spin up an environment in another region, so we know we can do it if needed. We haven't needed it yet 🤞

For static sites hosted on S3 + CloudFront, we use S3 cross-region replication, with CloudFront handling the failover. This is a particularly inexpensive and easy way to get a resilient static site, and history has shown S3 craps out about once a year.
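To give a rough idea of the CloudFront half of this, here is a sketch of an origin group in the shape boto3 expects under the `OriginGroups` key of a `DistributionConfig`. It is not our real config (that lives in Terraform); the function name and the origin IDs are placeholders, and it assumes both S3 buckets are already defined as origins on the distribution.

```python
def failover_origin_group(group_id, primary_origin_id, secondary_origin_id,
                          status_codes=(500, 502, 503, 504)):
    """Build an OriginGroups value that fails over from one origin to another.

    Hypothetical helper: the IDs passed in must match origins that already
    exist on the distribution.
    """
    codes = list(status_codes)
    return {
        "Quantity": 1,
        "Items": [
            {
                "Id": group_id,
                # Responses with these status codes from the primary trigger
                # a retry against the secondary origin.
                "FailoverCriteria": {
                    "StatusCodes": {"Quantity": len(codes), "Items": codes}
                },
                # Order matters: the first member is the primary.
                "Members": {
                    "Quantity": 2,
                    "Items": [
                        {"OriginId": primary_origin_id},
                        {"OriginId": secondary_origin_id},
                    ],
                },
            }
        ],
    }


groups = failover_origin_group("site-failover", "s3-primary", "s3-replica")
```

The cache behavior's `TargetOriginId` then points at the group ID instead of a single bucket. S3 replication itself needs versioning enabled on both the source and destination buckets.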

EC2 instances are baked or configured with Chef/Bash, and send logs to Splunk, so we don't care about any data on the host.

I find it useful to think of backups/recovery in terms of:

how fast you want to restore the data
how recent that data can be
how much you want to pay
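
Those three questions can be turned into a small comparison, sketched below. Everything here is illustrative: the class and function names are invented, and the numbers are placeholders rather than measurements or real prices. The idea is just to filter out options that are too slow to restore or too stale, then take the cheapest of what is left.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupOption:
    name: str
    restore_minutes: float   # how fast you can restore the data
    data_age_minutes: float  # how old the restored data may be
    monthly_cost: float      # how much you pay


def cheapest_option(options, max_restore_minutes, max_data_age_minutes):
    """Return the cheapest option meeting both limits, or None if none does."""
    acceptable = [
        o for o in options
        if o.restore_minutes <= max_restore_minutes
        and o.data_age_minutes <= max_data_age_minutes
    ]
    return min(acceptable, key=lambda o: o.monthly_cost) if acceptable else None


# Placeholder numbers, for illustration only.
options = [
    BackupOption("nightly snapshot", restore_minutes=60, data_age_minutes=1440, monthly_cost=10),
    BackupOption("point-in-time restore", restore_minutes=60, data_age_minutes=5, monthly_cost=40),
    BackupOption("cross-region replica", restore_minutes=5, data_age_minutes=1, monthly_cost=200),
]

choice = cheapest_option(options, max_restore_minutes=90, max_data_age_minutes=15)
print(choice.name if choice else "nothing fits")
```

Tightening either limit usually pushes you toward the more expensive rows, which is the trade-off in a nutshell.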