Aaron Harris for Kite

Posted on Oct 23, 2019 • Originally published at kite.com

Django Database Migrations: A Comprehensive Overview

by Damian Hites

The Django web framework is designed to work with an SQL-based relational database backend, most commonly PostgreSQL or MySQL. If you’ve never worked directly with a relational database before, managing how your data is stored and accessed, and keeping it consistent with your application code, is an important skill to master.

You’ll need a contract between your database schema (how your data is laid out in your database) and your application code, so that when your application tries to access data, the data is where your application expects it to be. Django provides an abstraction for managing this contract in its ORM (Object-Relational Mapping).

Over your application’s lifetime, it’s very likely that your data needs will change. When this happens, your database schema will probably need to change as well. Effectively, your contract (in Django’s case, your Models) will need to change to reflect the new agreement, and before you can run the application, the database will need to be migrated to the new schema.

Django’s ORM comes with a system for managing these migrations to simplify the process of keeping your application code and your database schema in sync.

Django’s database migration solution

Django’s migration tool automates much of what would otherwise be a manual migration process, while also tracking your migrations and the state of your database. Let’s take a look at the three-step migration process with Django’s migration tool.

1. Change the contract: Django’s ORM

In Django, the contract between your database schema and your application code is defined using the Django ORM. You define a data model using Django ORM’s models and your application code interfaces with that data model.

When you need to add data to the database or change the way the data is structured, you create a new model or modify an existing one. Then you can make the required changes to your application code and update your unit tests, which should verify your new contract (given enough test coverage).

2. Plan for change: generate migrations

Django maintains the contract largely through its migration tool. Once you make changes to your models, a single command, makemigrations, detects those changes and generates migration files for you.

3. Execute: apply migrations

Finally, another command, migrate, applies any unapplied migrations to the database. Run this command any time you deploy your code to the production environment. Ideally, your deploy scripts run the migration command right before pushing your new code live.
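As a rough sketch of that ordering, a deploy script could apply migrations first and only then switch to the new code. The `deploy` function and the `./restart_app.sh` script are hypothetical names used for illustration; `--noinput` is the migrate flag that suppresses interactive prompts.

```python
import subprocess


def deploy(run=subprocess.run):
    """Apply migrations first, then make the new code live."""
    # check=True raises CalledProcessError if migrate fails, so the
    # restart step below is never reached with a half-migrated database.
    run(["./manage.py", "migrate", "--noinput"], check=True)
    # Hypothetical placeholder for whatever step puts the new code live.
    run(["./restart_app.sh"], check=True)
```

Passing `run` as a parameter is only a convenience for this sketch: it lets the ordering be exercised with a stand-in function instead of real subprocesses.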

Tracking changes with Django

Django takes care of tracking migrations for you. Each generated migration file has a unique name that serves as an identifier. When a migration is applied, Django records it in a database table of applied migrations, which ensures that only unapplied migrations are run.
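The idea of a tracking table can be illustrated with a toy runner built on Python's standard sqlite3 module. This is a conceptual sketch, not Django's implementation: the `applied_migrations` table, the `apply_pending` function, and the migration names are all made up for the example.

```python
import sqlite3


def apply_pending(conn, migrations):
    """Run each (name, sql) pair at most once, recording names in a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    newly_applied = []
    for name, sql in migrations:
        if name in applied:
            continue  # already recorded, so it is skipped
        conn.execute(sql)
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        newly_applied.append(name)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
steps = [
    ("0001_initial", "CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT)"),
    ("0002_add_published_on", "ALTER TABLE post ADD COLUMN published_on TEXT"),
]
first_run = apply_pending(conn, steps[:1])  # only the first migration exists yet
second_run = apply_pending(conn, steps)     # the first one is skipped this time
```

Running the full list a second time applies only the migration that has not been recorded yet, which is the behavior the tracking table is there to provide.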

The migration files that Django generates should be included in the same commit as their corresponding application code, so that your code is never out of sync with your database schema.

Rolling back with Django

Django can roll back to a previous migration. Auto-generated operations have built-in support for being reversed. For a custom operation, it’s on you to make sure the operation can be reversed, so that this functionality is always available.
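To make "reversible" concrete, here is a toy model of the idea using only the standard library. It is not how Django implements operations; the `CreateTable` class and the helper functions are invented for this sketch. Each operation pairs a forward step with a backward step, and rolling back replays the backward steps in reverse order.

```python
import sqlite3


class CreateTable:
    """A toy reversible operation: create a table, or drop it again."""

    def __init__(self, name, columns_sql):
        self.name = name
        self.columns_sql = columns_sql

    def forwards(self, conn):
        conn.execute(f"CREATE TABLE {self.name} ({self.columns_sql})")

    def backwards(self, conn):
        conn.execute(f"DROP TABLE {self.name}")


def apply_all(conn, operations):
    for op in operations:
        op.forwards(conn)


def roll_back(conn, operations):
    # Undo in reverse order so dependent tables are dropped first.
    for op in reversed(operations):
        op.backwards(conn)


def table_names(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
ops = [
    CreateTable("post", "id INTEGER PRIMARY KEY, title TEXT"),
    CreateTable("feedback", "id INTEGER PRIMARY KEY, post_id INTEGER REFERENCES post(id)"),
]
apply_all(conn, ops)
after_apply = table_names(conn)
roll_back(conn, ops)
after_roll_back = table_names(conn)
```

An operation that defined only `forwards` could not take part in `roll_back`, which is the situation a custom operation without a reverse puts you in.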

A simple Django database migrations example

Now that we have a basic understanding of how migrations are handled in Django, let’s look at a simple example of migrating an application from one state to the next. Let’s assume we have a Django project for our blog and we want to make some changes.

First, we want to allow for our posts to be edited before publishing to the blog. Second, we want to allow people to give feedback on each post, but we want to give them a curated list of options for that feedback. In anticipation of those options changing, we want to define them in our database rather than in the application code.

The initial Django application

For the purposes of demonstration, we’ll set up a very basic Django project called Foo:

django-admin startproject foo

Within that project, we’ll set up our blogging application. From inside the project’s base directory:

./manage.py startapp blog

Register the new application with our project in foo/settings.py by adding blog to INSTALLED_APPS:

INSTALLED_APPS = [
    ...
    'blog',
]

In blog/models.py we can define our initial data model:

from django.db import models

class Post(models.Model):
    slug = models.SlugField(max_length=50, unique=True)
    title = models.CharField(max_length=50)
    body = models.TextField()

In our simple application, the only model we have represents a blog post. It has a slug for uniquely identifying the post, a title, and the body of the post.

Now that we have our initial data model defined, we can generate the migrations that will set up our database:

./manage.py makemigrations

Notice that the output of this command indicates that a new migration file was created at blog/migrations/0001_initial.py containing a command to CreateModel name='Post'.

If we open the migration file, it will look something like this:

# Generated by Django 2.2 on 2019-04-21 18:04

from django.db import migrations, models

class Migration(migrations.Migration):
    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='Post',
            fields=[
                ('id', models.AutoField(
                    auto_created=True, 
                    primary_key=True, 
                    serialize=False, 
                    verbose_name='ID'
                )),
                ('slug', models.SlugField(unique=True)),
                ('title', models.CharField(max_length=50)),
                ('body', models.TextField()),
            ],
        ),
    ]

Most of the migration’s contents are pretty easy to make sense of. This initial migration was auto-generated, has no dependencies, and has a single operation: create the Post Model.

Now let’s set up an initial SQLite database with our data model:

./manage.py migrate

The default Django configuration uses SQLite3, so the above command generates a file called db.sqlite3 in your project’s root directory. Using the SQLite3 command line interface, you can inspect the contents of the database and of certain tables.

To enter the SQLite3 command line tool run:

sqlite3 db.sqlite3

Once in the tool, list all tables generated by your initial migration:

sqlite> .tables

Django comes with a number of initial models that will result in database tables, but the two that we care about right now are blog_post, the table corresponding to our Post Model, and django_migrations, the table Django uses to track migrations.

Still in the SQLite3 command line tool, you can print the contents of the django_migrations table:

sqlite> select * from django_migrations;

This will show all migrations that have run for your application. If you look through the list, you’ll find a record indicating that the 0001_initial migration was run for the blog application. This is how Django knows that your migration has been applied.
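The same check can be done from Python with the standard sqlite3 module instead of the command line tool. This is a small sketch: the function name `applied_migrations` is made up, and the query relies on the `app` and `name` columns of Django's django_migrations table.

```python
import sqlite3


def applied_migrations(db_path="db.sqlite3"):
    """Return the (app, name) pairs recorded in django_migrations."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT app, name FROM django_migrations ORDER BY id"
        ).fetchall()
    finally:
        conn.close()
    return rows
```

Called from the project's root directory after the first migrate, the returned list should include ("blog", "0001_initial") alongside the entries for Django's built-in applications.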

Changing the Django data model

Now that the initial application is set up, let’s make changes to the data model. First, we’ll add a field called published_on to our Post Model. This field will be nullable. When we want to publish something, we can simply indicate when it was published.

Our new Post Model will now be:

from django.db import models

class Post(models.Model):
    slug = models.SlugField(max_length=50, unique=True)
    title = models.CharField(max_length=50)
    body = models.TextField()
    published_on = models.DateTimeField(null=True, blank=True)

Next, we want to add support for accepting feedback on our posts. We want two models here: one for tracking the options we display to people, and one for tracking the actual responses:

from django.conf import settings
from django.db import models

class FeedbackOption(models.Model):
    slug = models.SlugField(max_length=50, unique=True)
    option = models.CharField(max_length=50)

class PostFeedback(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL, related_name='feedback',
        on_delete=models.CASCADE
    )
    post = models.ForeignKey(
        'Post', related_name='feedback', on_delete=models.CASCADE
    )
    option = models.ForeignKey(
        'FeedbackOption', related_name='feedback', on_delete=models.CASCADE
    )

Generate the Django database migration

With our model changes done, let’s generate our new migrations:

./manage.py makemigrations

Notice that this time, the output indicates a new migration file, blog/migrations/0002_auto_<YYYYMMDD>_<...>.py, with the following changes:

Create model FeedbackOption
Add field published_on to Post
Create model PostFeedback

These are the three changes that we introduced to our data model.

Now, if we go ahead and open the generated file, it will look something like this:

# Generated by Django 2.2 on 2019-04-21 19:31

from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion

class Migration(migrations.Migration):

    dependencies = [
        migrations.swappable_dependency(settings.AUTH_USER_MODEL),
        ('blog', '0001_initial'),
    ]

    operations = [
        migrations.CreateModel(
            name='FeedbackOption',
            fields=[
                ('id', models.AutoField(
                    auto_created=True,
                    primary_key=True,
                    serialize=False, verbose_name='ID'
                )),
                ('slug', models.SlugField(unique=True)),
                ('option', models.CharField(max_length=50)),
            ],
        ),
        migrations.AddField(
            model_name='post',
            name='published_on',
            field=models.DateTimeField(blank=True, null=True),
        ),
        migrations.CreateModel(
            name='PostFeedback',
            fields=[
                ('id', models.AutoField(
                    auto_created=True,
                    primary_key=True,
                    serialize=False,
                    verbose_name='ID'
                )),
                ('option', models.ForeignKey(
                    on_delete=django.db.models.deletion.CASCADE,
                    related_name='feedback',
                    to='blog.FeedbackOption'
                )),
                ('post', models.ForeignKey(
                    on_delete=django.db.models.deletion.CASCADE,
                    related_name='feedback',
                    to='blog.Post'
                )),
                ('user', models.ForeignKey(
                    on_delete=django.db.models.deletion.CASCADE,
                    related_name='feedback',
                    to=settings.AUTH_USER_MODEL
                )),
            ],
        ),
    ]

Similar to our first migration file, each operation maps to changes that we made to the data model. The main differences to note are the dependencies. Django has detected that our change relies on the first migration in the blog application and, since we depend on the auth user model, that is marked as a dependency as well.
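Dependencies matter because they determine the order in which migrations can run. The sketch below illustrates that idea with the standard library's graphlib module; it is not Django's own migration graph. The 0002 name is shortened to "0002_auto", and ("auth", "__first__") stands in for the swappable user-model dependency, both as assumptions for the example.

```python
from graphlib import TopologicalSorter  # available in Python 3.9+

# Each migration maps to the set of migrations it depends on.
dependencies = {
    ("blog", "0001_initial"): set(),
    ("auth", "__first__"): set(),
    ("blog", "0002_auto"): {("blog", "0001_initial"), ("auth", "__first__")},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

Whatever order the two independent entries come out in, the second blog migration always lands after both of the migrations it depends on.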

Applying the Django database migration

Now that we have our migrations generated, we can apply the migrations:

./manage.py migrate

The output tells us that the latest generated migration was applied. If we inspect our modified SQLite database, we’ll see that our new migration is recorded in the django_migrations table, the new tables are present, and the new field on the Post Model is reflected in the blog_post table.
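One way to do that inspection from Python is to list every table in the SQLite file together with its columns. This is a sketch using the standard sqlite3 module; the function name `describe_schema` is made up for the example.

```python
import sqlite3


def describe_schema(db_path="db.sqlite3"):
    """Map each table name in the SQLite file to its list of column names."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is its name.
            # The names come from sqlite_master itself, not from user input.
            info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [row[1] for row in info]
    finally:
        conn.close()
    return schema
```

After the second migrate, the result should contain blog_feedbackoption and blog_postfeedback as keys, and "published_on" should appear in the column list for blog_post.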

Now, if we were to deploy our changes to production, the application code and database would be updated, and we would be running the new version of our application.

Bonus: data migrations

In this particular example, the blog_feedbackoption table (generated by our migration) will be empty when we push our code change. If our interface has been updated to surface these options, there is a chance that we forget to populate them when we push. Even if we don’t forget, the options would be created in the database only while or after the new application code is deploying, so there is a window, however brief, in which the interface could show a blank list of options.

To help in scenarios where the required data is somewhat tied to the application code or to changes in the data model, Django provides a utility for making data migrations. These are migration operations that simply change the data in the database rather than the table structure.

Let’s say we want to have the following feedback options: Interesting, Mildly Interesting, Not Interesting and Boring. We could put our data migration in the same migration file that we generated previously, but let’s create another migration file specifically for this data migration...

... check out the code on Kite's blog! Continue with "Bonus: Data Migrations"
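The article's own code is on Kite's blog. As a rough sketch only, a data migration for the four options named above might look like the following, written with Django's RunPython operation. The slugs and function names are assumptions made for illustration, and the dependency keeps the placeholder name of the generated 0002 file, which you would replace with the real one. An empty migration file to start from can be created with ./manage.py makemigrations --empty blog.

```python
from django.db import migrations

# Labels come from the article; the slugs are assumed for illustration.
FEEDBACK_OPTIONS = [
    ("interesting", "Interesting"),
    ("mildly-interesting", "Mildly Interesting"),
    ("not-interesting", "Not Interesting"),
    ("boring", "Boring"),
]


def add_feedback_options(apps, schema_editor):
    # Use the historical model from the app registry rather than
    # importing blog.models directly.
    FeedbackOption = apps.get_model("blog", "FeedbackOption")
    for slug, option in FEEDBACK_OPTIONS:
        FeedbackOption.objects.get_or_create(slug=slug, defaults={"option": option})


def remove_feedback_options(apps, schema_editor):
    # Reverse step: delete only the rows this migration created.
    FeedbackOption = apps.get_model("blog", "FeedbackOption")
    slugs = [slug for slug, _ in FEEDBACK_OPTIONS]
    FeedbackOption.objects.filter(slug__in=slugs).delete()


class Migration(migrations.Migration):

    dependencies = [
        # Placeholder: use the actual name of your generated 0002 migration.
        ("blog", "0002_auto_<YYYYMMDD>_<...>"),
    ]

    operations = [
        migrations.RunPython(add_feedback_options, remove_feedback_options),
    ]
```

Supplying the second function to RunPython is what keeps this custom operation reversible, so rolling back past this migration removes the options again.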

Damian Hites is the CTO of Sylo, which is looking to improve Social Media Marketing by offering 3rd party trusted measurement. He has 10+ years of experience writing software and leading teams.
