This happens all the time when dealing with new clients:
Me: How do you update the database?
Client: Check the 'sql' directory to see if there's new SQL in there. Run that.
Me: Manually? Are you serious?
Client: Yes. We're used to it.
Me: Um, OK. So how do you back out the change if there's a problem?
Client: There's hardly ever a problem, but we fix it by hand.
And things go downhill from there. I frequently meet clients with insane database migration strategies. The "dump SQL in a directory" for people to apply is an annoyingly common strategy. There's no clear way to roll it back and, you can't declare dependencies. If you're using a database like MySQL or Oracle, if it contains DDL changes, those aren't transaction safe, so they really should be in their own migration, but they're not. I even had one client where they emailed developers to let them know which SQL to apply.
A few clients have an in-house database migration strategy involving numbered migrations. It often looks like this:
... 213-up-add-index-on-address-state.sql 213-down-add-index-on-address-state.sql 214-up-add-customer-notes-table.sql 214-down-add-customer-notes-table.sql 215-up-add-sales-tax-sproc.sql 215-down-add-sales-tax-sproc.sql
That, at least, can allow devs to back out changes (but tricky if your DDL isn't transaction-safe), but it's amazing when you get to migration 215 and you have eight developers, four of whom need to make a database change and they're arguing over who gets number 216. Yes, I've seen this happen more than once.
With a naïve numbering strategy, you can't declare dependencies, you get numbering conflicts, you really can't "tag" a particular migration for deployment, and so on.
Or there are the migration strategies which allow migrations to be written in your favorite programming language. Those are often nice, but can't always leverage the strength of the database, often write very poor SQL, and make it hard for other teams not using the language to write migrations.
There's a better way.
The sqitch (pronounced "skitch", not "skwitch") Web site describes sqitch as:
Sensible database-native change management for framework-free development and dependable deployment.
That is, uh, quite the mouthful. But it's accurate. It has great documentation with tutorials for Postgres, SQLite, MySQL, Firebird, Exasol, Oracle, Snowflake, and Vertica. Out of the box, sqitch offers sane, easy-to-use database change management. Since it's both free and open source, it's also easy to hook into and customize, if needed.
App/Sqitch version v1.1.1
Sqitch is a database change management application. It currently supports PostgreSQL 8.4+, SQLite 3.7.11+, MySQL 5.0+, Oracle 10g+ Firebird 2.0+, Vertica 6.0+, Exasol 6.0+ and Snowflake.
What makes it different from your typical migration approaches? A few things:
Sqitch is not tied to any framework, ORM, or platform. Rather, it is a standalone change management system with no opinions about your database engine, application framework, or development environment.
Database changes may declare dependencies on other changes -- even on changes from other Sqitch projects. This ensures proper order of execution, even when you've committed changes to your VCS out-of-order.
Sqitch manages changes and dependencies via a plan…
I won’t cover setting up sqitch. It's pretty easy and the tutorials handle it well. Instead, I'll explain how I get teams up-and-running quickly with it.
However, there is one small change I recommend. By default, in the current directory, sqitch will add
verify directories. Your SQL will go into those directories. I prefer to minimize the number of top-level directories in my projects, so after I have things set up, I usually run a command similar to this (the following assumes PostgreSQL):
sqitch engine alter pg --top-dir sql
That tells the
sqitch program that when you're using the PostgreSQL engine, create
sql/verify directories. Thus, you only have one top-level
sql directory for managing your sqitch files.
And for the sake of what follows, we'll assume that we have a
acme_test target that we run our tests against and that we've set that target to be the default (that will make more sense when you've read the docs).
Also, please note that my usage pattern is not quite the same as what's taught in the tutorials. Instead, it's designed to be an easy workflow for any developer to understand.
After you have sqitch set up and have your initial schema added to sqitch, you can add a change. The basic pattern is "create a branch", "add sql changes", "modify code as needed", "commit" and merge back.
To be more specific, let's say that you want to add a
title column to the
customers table (note that the name passed to the
sqitch add command is arbitrary, but I recommend you pick a naming convention and stick to it).
- Create a branch in your source control
sqitch add customers/title
sql/revert/customers/titleto add your sql
sqitch deployto deploy those changes
- Edit your code if needed
- Run your tests
- Commit your changes
- Merge back to your main branch
(Note: for the first and last steps, if you're using
git, see my easy git workflow)
When doing this, it's also a good idea to revert your the sqitch change(s) you've added and redeploy. This makes it easier to spot the case where your
revert file doesn't properly revert the
sqitch rebase --onto @HEAD^ -y
Alternatively, if you're not comfortable with the
alias bounce='sqitch revert --to @HEAD^ -y && sqitch deploy'
Every change goes through the same pattern. It's almost like working with git, with a steady queue of changes adding up. This is a simpler pattern than what is explained in the sqitch docs, but you don't have to explain reworking or rebasing changes. There are some presentations you might want to watch if you'd like to learn more.
Sqitch was also highlighted on FLOSS Weekly:
First and foremost, you get to write your database changes in SQL, not in some "DSL" that you're provided with. You can leverage the full power of your database. Often, I find SQL generators produce poor SQL, or simply won't produce the SQL that I need. When was the last time your DSL let you create optimizer hints?
However, if you do need more than just SQL, it's easy enough to write sqitch middleware to intercept the call to the database with your own wrapper:
sqitch engine add pg --client /path/to/my/middleware
I've done this for a client who wanted to write database changes using Percona's excellent (and free) pt-online-schema-change tool.
Also, what happens if you get a conflict with
git and the
sqitch.plan file? It's easy to have a bad rebase and fix it incorrectly. Internally,
sqitch uses checksums to determine the changes you've applied and their order, so a bad rebase won't allow you to accidentally apply the wrong changes.
I've barely scratched the surface of what you can do with
sqitch. It's amazing how many other database change management systems get this wrong. When I switch teams over to
sqitch, most of their database development pain just goes away. If you're having trouble with database migrations, try
sqitch. You won't regret it.
Cover Photo by Tobias Fischer