Ok, so here is the situation: you work in a team, and each developer works with a local environment. Or you have multiple environments. Or... We don't care. The only thing is, you have to handle different versions of databases, and you don't know how to do this safely and efficiently.
Different versions of a database may differ in terms of data or in terms of structure. In this article, we will focus on the management of different versions of a database with different structures by using relational database vocabulary with
MySQL examples. However, this solution can work for any kind of database, like NoSQL.
Here we assume you work in a team, and you have a tool to share the code between people (ex: GIT, SVN).
Let's create a script in the project where you put all of the database evolution. That can be a SQL script, a code in any programming language that you connect to the database, or another solution. This is not important, the only thing is you may execute database requests.
When a developer wants to change the database structure, he will write a block of code (one or multiple requests) at the end of this file, and share the file with other people. Is important to keep this file up to date and to regularly run it on all environment.
At this point, if you execute the file several times, you will certainly encounter some errors. This is problematic cause you need to run this file each time it is updated. So you have to add condition on each block, and store into the database which block is already executed. To do so, yo can create a table in your database to store each evolution with a unique ID. Finally, you will run the block only if the ID is not already stored on this table.
That's all. The following example provides an illustration of the process.
This example is based on a development project that uses MySQL database to store some application data. There are two developers that develop in local environments on their laptop (so 2 development environments), and there is one production environment. For each environment there is one database. Databases already exist, but we want to be sure that we have the same version (the same structure) of the database on each environment.
Generally in a project, the database evolves with the code of the project. So if you update the database you have to update the code, and vice versa.
First, let's create a table to store evolution. We have to execute this request on all databases:
CREATE TABLE db_evol( id_evol VARCHAR(10) PRIMARY KEY NOT NULL );
Now, let's create a file
db_evol.sql in the project, with our first evolution. Here we use MySQL procedure, but we could use others.
delimiter // -- evol1 create procedure evol() begin set @id_evol := "evol1"; SET @count:= (select COUNT(*) FROM db_evol WHERE id_evol= @id_evol); if @count = 0 then -- This 2 lines correspond to the DB evolution ALTER TABLE table1 ADD new_col VARCHAR(255); UPDATE table1 set new_col = 'default value'; INSERT INTO db_evol(id_evol) VALUES(@id_evol); SELECT CONCAT(@id_evol, " executed"); else SELECT CONCAT(@id_evol, " skipped"); end if; end// call evol()// drop procedure evol//
Next step is the second developer who wants to do update the database. First, I check if the file
db_evol.sql is up to date. If there are changes, I will run the file. After that, I will write a second evolution at the end of the file and test the script. If it's ok, I will update the file of the project.
After several evolutions, all developers agree to update the production environment. So they will update the code and run the script
db_evol.sql on the production.
This article provides an overview on how to handle database versions on multiple environments. As we have seen, there is not only one solution: you will face different cases. In order to comeback to the initial state in case of errors, we recommend you to add transaction or error handling in your script.