DEV Community

Discussion on: Developer Fears: Breaking Production

Piotr Szymaszek

Had a project with test and production instances running on the same server, with deployments done through Ansible playbooks. One playbook copied the production database and uploaded files to the test instance. That playbook was written by an ex co-worker and it had worked well for 3+ years. I had never touched it, since there was no need.

One day, while copying production to test, something broke badly: database tables were missing, there were random errors, and I had no idea what had happened. It turned out that Ansible had, in one of its minor versions, added an option to the MySQL module which adds USE <database-name> to the dumped file. Of course the copying script ran as the root database user, so instead of overwriting the test database with the production one, the import was overwriting the production database, because both databases were on the same server. It happened while the application was running. I also ran the script 3 times before we finally figured out what the hell had happened. Fortunately I was able to set things straight eventually, but it was bad.
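A guard against this kind of surprise could be a pre-import check on the dump file itself. The sketch below is only an illustration, not what the original playbook did: it assumes the dump is plain SQL text, and the function name `foreign_databases` and the database names are made up for the example. It looks for USE statements that point at any database other than the one you intend to import into.

```python
import re

# Matches a statement such as "USE `prod_db`;" at the start of a line.
# Assumption: the dump is plain SQL text, one statement per line.
_USE_STATEMENT = re.compile(r"^\s*USE\s+`?([^`;\s]+)`?\s*;", re.IGNORECASE | re.MULTILINE)


def foreign_databases(dump_sql, target_db):
    """Return database names selected by USE statements that differ from target_db."""
    found = []
    for name in _USE_STATEMENT.findall(dump_sql):
        if name != target_db and name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    # Hypothetical dump content and database names, for illustration only.
    dump = "-- dump\nUSE `prod_db`;\nCREATE TABLE t (id INT);\n"
    offenders = foreign_databases(dump, "test_db")
    if offenders:
        print("Refusing to import, dump switches to:", ", ".join(offenders))
```

A check like this would not replace proper privileges, but it would have stopped the import before the first statement ran.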

Lesson 1: verify the scripts you use for deployment. If something can possibly break anything, sooner or later it will.
Lesson 2: do not be lazy, do not use root privileges unless absolutely necessary. Had that copying script been using specific database users instead of root, there would have been a connection error and no damage done.
Lesson 3: keep up with changes in the tools you are using, especially Ansible, since it has a lot of essential modules (like mysql_db) that come with no backward compatibility guarantee.
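
For Lesson 2, here is a rough sketch of what "specific database users" could look like, written as a small helper that builds the GRANT statements. Everything named here (`scoped_grants`, the database and account names) is hypothetical, and it assumes the account already exists. The read-only privilege list follows what mysqldump is documented to need for a basic dump; your setup may need more.

```python
import re

# Only allow simple identifiers, since the values are interpolated into SQL.
_SAFE = re.compile(r"[A-Za-z0-9_]+")


def scoped_grants(prod_db, test_db, user, host="localhost"):
    """Build GRANT statements: read-only on production, full access on test only."""
    for value in (prod_db, test_db, user, host):
        if not _SAFE.fullmatch(value):
            raise ValueError(f"unexpected characters in {value!r}")
    account = f"'{user}'@'{host}'"
    return [
        # Enough to dump production, not enough to change it.
        f"GRANT SELECT, LOCK TABLES, SHOW VIEW, TRIGGER ON `{prod_db}`.* TO {account};",
        # The import may do anything, but only inside the test database.
        f"GRANT ALL PRIVILEGES ON `{test_db}`.* TO {account};",
    ]


if __name__ == "__main__":
    for statement in scoped_grants("app_prod", "app_test", "copier"):
        print(statement)
```

With an account like this, an import that suddenly switches to the production database fails with a permission error instead of dropping tables.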