The idea of doing fire drills is great. We lost our production database a few years ago when both disks on the server failed at the same time, and there weren't any recent backups. We lost about six months' worth of data. The only reason we recovered anything at all was a backup I had made a few months earlier to do some testing with. The year after that we had another failure (I'm not sure how that one turned out). Luckily, I had my databases stored on a different server at the time. I think this post contains some well-thought-out tips, and I will definitely share it with our IT team.
I'm the CTO at DoSomething.org, the largest tech company exclusively for young people and social change. I love building software, engineering culture, and diverse, happy teams.
Thanks, Tim! Another great practice we started last summer (also based on a suggestion from CJ Rayhill) was to keep a spreadsheet of every critical system and specify how it gets backed up, whether the backups are automatic, and how old the most recent backup is (especially if it's not yet automated).
Our DevOps lead reviews this weekly and updates a cell on the spreadsheet with the timestamp of the latest review. This has helped expose and track our backup practices across very different systems.
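If it helps anyone picture the weekly review, here is a minimal sketch of the same check in Python. It is not our actual tooling: the `BackupRecord` fields, the system names, and the seven-day threshold are all made up for illustration, and in practice the rows would come from the spreadsheet rather than being hard-coded.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackupRecord:
    """One row of the (hypothetical) backup inventory spreadsheet."""
    system: str                      # name of the critical system
    method: str                      # how it gets backed up
    automated: bool                  # whether the backup runs automatically
    last_backup: Optional[datetime]  # None if no backup is known


def stale_systems(records: List[BackupRecord],
                  now: datetime,
                  max_age: timedelta = timedelta(days=7)) -> List[str]:
    """Return systems whose latest backup is missing or older than max_age."""
    return [
        r.system
        for r in records
        if r.last_backup is None or now - r.last_backup > max_age
    ]


# Example rows; every value here is invented for the sketch.
now = datetime(2024, 1, 15)
inventory = [
    BackupRecord("production-db", "nightly dump", True, datetime(2024, 1, 14)),
    BackupRecord("wiki", "manual export", False, datetime(2023, 11, 1)),
    BackupRecord("crm", "vendor snapshot", True, None),
]

print(stale_systems(inventory, now))
```

Treating a missing timestamp as stale is a deliberate choice in this sketch: a system nobody has recorded a backup for is exactly the one the review should surface.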