Md. Hussainul Islam Sajib

Posted on Jan 28, 2021 • Edited on Jan 29, 2021

Django Fixtures: seeding databases

#django #fixtures #seeding #database

When we create an application or project in Django, we need to test its features. The best way to test is with realistic data that means something to the developer. That way we can better understand how the application truly works. So, we need data in our database; but if we are building an application from scratch, the data may not be readily available and the database will be completely empty. Even if we have older databases, we might need to transfer our data to the newer one. That is when 'Fixtures' come into play.

Fixtures are collections of data that Django can read and load into its database. Fixtures can also be created from data that already exists in the database. So, in essence, fixtures are a way for Django to export data from the database and import data into it. Although there are packages that can help with this, like django-seed, I wanted to do it the manual way to actually understand how it works and to seed with data that is more relevant to my project.

Supported Format and Data structures

Django currently supports three formats:

JSON
XML
YAML

Django expects fixtures to follow a specific structure. Anything else results in an error. For JSON, the structure looks like this (the example is taken from the official documentation):

[
  {
    "model": "myapp.person",
    "pk": 1,
    "fields": {
      "first_name": "John",
      "last_name": "Lennon"
    }
  },
  {
    "model": "myapp.person",
    "pk": 2,
    "fields": {
      "first_name": "Paul",
      "last_name": "McCartney"
    }
  }
]

Let's have a look at what this means. First of all, the square brackets indicate that this is an array. The curly braces and the key-value pairs indicate that the items inside the array are JSON objects. Each object has exactly three keys: model, pk, and fields. The model key identifies the model, scoped by the app name: <app_name>.<model_name>. The pk key holds the value of the primary key. I used UUID v4 as my primary key, so I used this site to manually generate the UUIDs for me. There are many other sites that let you generate a bunch of UUIDs in one shot. Note that if you use a UUID, its value must be within quotation marks. Finally, the fields property contains the names of all the fields and their respective values. That's it!!
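Instead of generating UUIDs by hand, a few lines of Python can produce them. The following is only a minimal sketch: the helper name make_person_fixture is hypothetical, and the model label and field names are simply the ones from the example above.

```python
import json
import uuid


def make_person_fixture(people, model="myapp.person"):
    """Build fixture entries with a fresh UUID v4 as each primary key."""
    return [
        {
            "model": model,
            # UUIDs must appear as quoted strings in the JSON output
            "pk": str(uuid.uuid4()),
            "fields": {"first_name": first, "last_name": last},
        }
        for first, last in people
    ]


if __name__ == "__main__":
    entries = make_person_fixture([("John", "Lennon"), ("Paul", "McCartney")])
    # Print the fixture so it can be redirected into a .json file
    print(json.dumps(entries, indent=2))
```

Redirecting the printed output into a file inside a 'fixtures' directory gives a fixture with the same three keys described above.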

The same information can be written in YAML, which would look like the following:

- model: myapp.person
  pk: 1
  fields:
    first_name: John
    last_name: Lennon
- model: myapp.person
  pk: 2
  fields:
    first_name: Paul
    last_name: McCartney

The XML fixture is a little different from JSON and YAML. Apart from being XML, it needs some additional metadata, for example the version number on 'django-objects' and the type of value in each field. The same data that we have already seen would look like the following in XML:

<?xml version="1.0" encoding="UTF-8"?>
<django-objects version="1.0">
    <object pk="1" model="myapp.person">
        <field type="CharField" name="first_name">John</field>
        <field type="CharField" name="last_name">Lennon</field>
    </object>
    <object pk="2" model="myapp.person">
        <field type="CharField" name="first_name">Paul</field>
        <field type="CharField" name="last_name">McCartney</field>
    </object>
</django-objects>

Location of the fixtures

There are three ways that Django can find fixtures in a project:

App Scoped: By default, Django searches for a fixtures directory inside each application. This is where it looks first. For this to work, we need to create a new directory inside the app and call it 'fixtures'. Then we can store the fixtures specific to that app in that directory.
Project Scoped: We can also store all our fixtures at the project level. This requires us to create a directory called 'fixtures' at the project root. Then we need to add the FIXTURE_DIRS setting to our settings.py file to point to the locations where the fixtures are stored. Django will search the locations listed in this setting in addition to the app-scoped directories. It would look like this:

FIXTURE_DIRS = [
    'fixtures',
]

Note that the setting FIXTURE_DIRS expects a list of locations.

Command Prompt: We can also tell Django to load a particular file by passing its path to the command that loads the data. For example:

python manage.py loaddata location/to/your/data.json
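For the project-scoped option, a relative entry such as 'fixtures' appears to depend on the directory the command is run from, so an absolute path may be the safer choice. The fragment below is a sketch of that idea, assuming settings.py sits one directory below the project root (the layout startproject generates); BASE_DIR is defined here only for illustration and most projects already have it.

```python
# settings.py (fragment)
from pathlib import Path

# Project root, assuming settings.py lives in <root>/<project_name>/settings.py
BASE_DIR = Path(__file__).resolve().parent.parent

# Absolute path to the project-level fixtures directory, stored as a string
FIXTURE_DIRS = [
    str(BASE_DIR / "fixtures"),
]
```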

Commands

Thus far we have mostly talked about loading data into the database, but if we already have some data in the database we can also generate fixtures automatically (the 'export' feature mentioned earlier). So there are basically two commands for working with fixtures:

Loading data

loaddata is used to load data into the database. There are a few ways to do this:

# specify the file location, name, and extension
python manage.py loaddata location/to/the/file/data.json

# specify the file name and extension
python manage.py loaddata data.json

# specify just the file name
python manage.py loaddata data

I believe this needs a little bit of explanation. In the first command we specified a path, a file name, and an extension. Django looks for this relative path inside every fixture directory that we have defined before. So, if we have a 'fixtures' directory at the root of our project, Django will look for the directory structure (location/to/the/file/) inside it, and then for the file name with the given extension inside that structure. In the second command there is no path, so Django looks for the file name and extension in each of the fixture locations. In the third command, Django looks for any file with that name inside the fixture locations; the extension does not matter in this case. Note that if we specify the extension of the fixture, Django uses the matching serializer to deserialize the data. If we don't specify the extension, Django finds the file first and picks the serializer based on the extension of the file found.
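The lookup described above can be pictured with a small pure-Python sketch. This is a simplified illustration and not Django's actual implementation: the names candidate_paths and KNOWN_FORMATS are hypothetical, and details such as compressed fixtures are ignored.

```python
import os

# The three formats discussed in this post (an assumption for this sketch)
KNOWN_FORMATS = ("json", "xml", "yaml")


def candidate_paths(label, fixture_dirs):
    """List the file paths a fixture label could refer to."""
    dirname, basename = os.path.split(label)
    _, ext = os.path.splitext(basename)
    if ext.lstrip(".") in KNOWN_FORMATS:
        # Extension given: only that exact file name is a candidate
        names = [basename]
    else:
        # No known extension: try every supported format
        names = [f"{basename}.{fmt}" for fmt in KNOWN_FORMATS]
    # The optional directory part is looked up inside each fixture directory
    return [os.path.join(d, dirname, n) for d in fixture_dirs for n in names]


if __name__ == "__main__":
    for path in candidate_paths("data", ["fixtures", "myapp/fixtures"]):
        print(path)
```

With the label data and two fixture directories, the sketch yields six candidates, one per directory and format.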

There are options that we can add to our loaddata command:

--database <db_name>: specifies the database into which the data will be loaded. By default, this is the default database specified in the settings.py file.
--app <app_name>: restricts the search for fixtures to the given app
--format <format_name>: specifies the serialization format (json/xml/yaml) for fixtures read from stdin
--exclude <app_label or app_label.ModelName>: specifies any app or model whose fixtures should be excluded from loading

Dumping data

dumpdata is used to generate fixtures. The command would look something like this:

python manage.py dumpdata <app_label>.<model_name> <app_label>.<model_name> --format <format_name>

I guess the command is pretty self-explanatory. We call the dumpdata command with the model name scoped to the label of the app that contains the model. We would usually also specify the format in which we want the data. There are some other options that we can add to this command:

--all or -a: dumps all data using Django's base manager, so records that a custom manager would otherwise filter out are included as well.
--indent <indent_size>: specifies the size of the indent as an integer
--exclude <app_label or app_label.ModelName>: specifies any app or model that should be excluded from the dump
--database <db_name>: specifies the name of the database
--pks <list of primary keys>: specifies a comma-separated list of primary keys to dump; applicable when a single model is given.
--output <file_name>: specifies the name of the file to which the data will be written.
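To see how several of these options combine, here is a small sketch that assembles a dumpdata command line which could then be passed to subprocess.run. The helper dumpdata_argv is hypothetical, and it assumes manage.py is in the current working directory.

```python
import sys


def dumpdata_argv(models, fmt="json", indent=2, output=None, exclude=()):
    """Assemble the argument list for a dumpdata call."""
    argv = [sys.executable, "manage.py", "dumpdata", *models,
            "--format", fmt, "--indent", str(indent)]
    for label in exclude:
        # --exclude may be repeated, once per app or model label
        argv += ["--exclude", label]
    if output:
        argv += ["--output", output]
    return argv


if __name__ == "__main__":
    argv = dumpdata_argv(["myapp.person"], output="fixtures/person.json")
    # Show the command without the interpreter path
    print(" ".join(argv[1:]))
```

Writing straight to a file with --output avoids having to redirect the command's output by hand.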

Seed data for the whole project

If we need to seed data from multiple fixtures, executing the command again and again is not efficient. We can deal with this by running a single command that loads all the fixtures, using wildcard characters.

The command would look like the following, where it loads all the JSON files inside the 'fixtures' directory:

python manage.py loaddata fixtures/*.json
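When fixtures depend on each other (for example through foreign keys), the order in which files are passed can matter. One way to control it is to collect the files in Python and sort them; the sketch below assumes a hypothetical naming convention with numeric prefixes such as 01_users.json, and the helper names are made up for illustration.

```python
from pathlib import Path


def ordered_fixtures(directory, pattern="*.json"):
    """Return matching fixture paths sorted by name."""
    return sorted(str(p) for p in Path(directory).glob(pattern))


def loaddata_argv(directory):
    """Assemble a single loaddata call covering every fixture found."""
    return ["python", "manage.py", "loaddata", *ordered_fixtures(directory)]


if __name__ == "__main__":
    # Print the command that would load every JSON fixture in 'fixtures'
    print(" ".join(loaddata_argv("fixtures")))
```

The resulting list can be handed to subprocess.run, which keeps the behaviour the same regardless of how the shell expands wildcards.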

This solution is taken from this Stack Overflow comment.

For further reading, refer to Django documentation.

Cover photo by Lukas Blazek on Unsplash

