Django Doctor audits code and auto fixes Django anti-patterns. We checked 666 Django projects for migrations that cannot be rolled back and were shocked that 22% of the Django codebases had the following problems:
- Data migration is missing reverse handler: migration has handler for updating the data, but not for undoing that change. Read more
- Data migration importing "hot" model: migration imports a Model from the codebase instead of doing
apps.get_model(...). Read more
These prevent disaster recovery. Picture the scene: it’s 11pm and production is down. Tired and stressed and out of options you decide to roll back to the last good release. All you need is to reverse the database migrations and then deploy and...wait what?
$ python manage.py migrate hounds 0002 Unapplying hounds.0003_auto... Traceback (most recent call last): django.db.migrations.exceptions.IrreversibleError: Operation <RunPython > in hounds.0003_auto is not reversible
This blocks rolling back to the last good release because the database state would be out of phase with the last good code: if your tried to rollback to the old code then the database state won't be compatible with the old codebase.
Django data migrations have two aspects: forwards and backwards, but by default the skeleton data migration generated by Django only shows the forwards handler, and hence one in five Django codebases have at least one migration that does not support undoing the migration:
def forwards(apps, schema_editor): ... class Migration(migrations.Migration): dependencies = [("cases", "0009_auto_20200320_1210")] operations = [migrations.RunPython(forwards)]
Unfortunately it only takes one missing
reverse to make all migrations fail when attempting to undo the migrations. Human error during code review can account for the fact so many Django codebases are unable to recover from disaster in production: if 100% of mistakes were spotted 100% of the time by 100% of code reviewers then we would not need code review in the first place because such Übermensch would not make the mistake in the first place.
Django data migrations require forwards, but backwards is technically optional. While the above is valid, below is infinitely better:
def forwards(apps, schema_editor): ... def backwards(apps, schema_editor): ... class Migration(migrations.Migration): dependencies = [("cases", "0009_auto_20200320_1210")] operations = [migrations.RunPython(forwards, backwards)]
So backwards is technically optional in the same way that wearing a parachute while skydiving is optional: given enough time you will hit the ground. Indeed, if you want the option of rolling back your production website then no, backwards is not optional. Omitting backwards means you only have the option of rolling forwards, which is a bit more daunting compared with safely rolling back to a release we know works, especially when performing disaster recovery under a stressed state. Indeed, if mistakes can be made during a calm code review then imagine the scope for mistakes when under pressure.
The fix is a super simple too: just specifying
migrations.RunPython.noop can be enough. As the name implies, Django will do nothing for the reverse, simply skip over the data migration, without throwing
class Migration(migrations.Migration): dependencies = [("cases", "0009_auto_20200320_1210")] operations = [migrations.RunPython(forwards, migrations.RunPython.noop)]
So in practice follow this advice:
🦊 You can enable Django Doctor on your GitHub repos to protect against this problem 100% of the time.
Can you see the problems with this Django migration?
from django.db import migrations from territory import models def forwards(apps, schema_editor): for item in in models.ChickenCoopLocations.objects.all(): item.has_chickens = does_have_chickens(item.pk) item.save() class Migration(migrations.Migration): dependencies = [("cases", "0005_auto_does_have_chickens.py")] operations = [migrations.RunPython(forwards)] def does_have_chickens(pk): ...
It's directly importing from models.py. Given enough time and enough code changes this will eventually break the migrations as the code diverges from the historic state of the database when the data migration was generated.
The fields in Django's
models.py must agree with the schema in the database. When Django performs database read or write operations it uses the shape of the model in
models.py to determine what fields to
models.py includes fields that are not yet in the database schema then the database will throw an error.
This is easily missed if the code is reviewed while in a rush, because when
0006_populate_has_chickens is ran this bug will not happen. Indeed, at that point in time
models.py does agree with the schema, but what happens when in one weeks time your team member adds a new field to
ChickenCoopLocations? From that point on whenever
0006_populate_has_chickens is ran,
models.py will have a have a field that the database schema does not yet have, so migrations will fail. This will happen when setting up a new database when all the migrations run from scratch such as in the CI, when a new developer joins the team, or when you replaced your bricked laptop.
0006_populate_has_chickens.py it's better to use
apps.get_model, which asks Django to construct a simplified time-traveling model whose fields will reflect the fields in the database even if
models.py is vastly out of step with the schema:
# 0006_populate_has_chickens.py from django.db import migrations def forwards(apps, schema_editor): ChickenCoopLocations = apps.get_model("territory", "ChickenCoopLocations") for item in in ChickenCoopLocations.objects.all(): item.has_chickens = does_have_chickens(item.pk) item.save() class Migration(migrations.Migration): dependencies = [("cases", "0005_auto_foo.py")] operations = [migrations.RunPython(forwards)] def does_have_chickens(pk): ...
So directly importing models in migrations is flaky and in a few migrations time will probably fail because during migrations the code in models.py is out of step with the database schema: the
models.py can have a field defined that does not yet exist in the database because the required migration has not yet ran.
Or try out Django refactoring challenges.