So, at the company where I work, I've become the go-to person for some tasks because I've been doing them well, and I think I've earned some trust when it comes to automating things and helping develop new architectures and deployment strategies. We're a small but strong team of 4 developers: 2 back end, 2 front end, and everybody takes part in management.
This week I'm responsible for the product update for one of our biggest customers. The whole team is supporting me, but I'm doing most of the hard work alone. The task? Deploying an update to a huge product that is already running. We have to integrate the new version of our product without taking down or breaking the current one.
So we discussed strategies and then wrote some scripts over the week. We wrote an SQL migrator and an rsync uploader, both really complex due to the product's size and the old architecture currently running on the server.
Then today I started the process by backing up the current product version. I started a mysqldump on the remote SQL server (I know, a dump is not a backup) and an rsync to archive the running product to our local server. This wasn't in the script, but I guess it should have been. And you know what? It wouldn't hurt anybody, right?
Well, the backup processes crashed the server. I just don't know how. 600+ rsync processes started running on the server. I made an accidental DDoS.
The team was very supportive and helped solve the problem. We still don't know what really happened. The hosting provider said the server crashed because it ran low on memory. To be honest, it was like a fork bomb on the server, but nobody knows how or why. It was 20 minutes of outage until we solved the issue and restarted all the services.
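For anyone wondering what guard rails around that backup step could look like, here is a minimal Python sketch. It is not what we ran, and since the real cause is still unknown, it is only a guess at what might have helped: a lock so a second copy of the job refuses to start, plus low CPU and I/O priority and a bandwidth cap. All the function names, the lock path idea, and the default limit are made up for illustration. It assumes a Linux box with `nice`, `ionice`, `rsync`, and `mysqldump` installed, and note that `nice`/`ionice` only affect the machine where the command is launched, not the other end of the transfer.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Hold an exclusive, non-blocking lock so a second copy of the job refuses to start."""
    handle = open(lock_path, "w")
    try:
        # Raises BlockingIOError right away if another holder already has the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        handle.close()  # closing the file releases the lock


def throttled(cmd):
    """Prefix a command with lowest CPU priority and the idle I/O class (Linux)."""
    return ["nice", "-n", "19", "ionice", "-c", "3", *cmd]


def rsync_archive_cmd(source, dest, bwlimit_kbps=5000):
    """rsync in archive mode with a transfer-rate cap (hypothetical default)."""
    return throttled(["rsync", "-a", f"--bwlimit={bwlimit_kbps}", source, dest])


def mysqldump_cmd(database):
    """mysqldump that avoids table locks on InnoDB and streams rows instead of buffering them."""
    return throttled(["mysqldump", "--single-transaction", "--quick", database])
```

The idea would be to wrap the whole backup in `with single_instance(...)` and pass the command lists to `subprocess.run(cmd, check=True)`, so an accidental second launch fails loudly instead of piling up processes.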
The team still trusts me, and I'll adopt a different strategy tomorrow. I think I'll laugh at this some day, but for now I'm just veeeery upset and ashamed of myself.
This post is part of my process of dealing with it, but also a way to ask: how do you guys deal with messes like this?