Adding validation support for JSON in Django Models

Saadullah Aleem ・1 min read

It’s pretty cool that Django supports the JSON features of PostgreSQL. A JSONField() can be assigned to a model attribute to store JSON data. However, enforcing pre-save validation on this field can be tricky. Doing it in your serializers by hand-checking Python dicts is a hassle and gets ugly pretty quickly.

Fortunately, there’s a great tool available to validate JSON data. It’s called JSON Schema and it has implementations in multiple languages, including Python (the jsonschema package). It can check data types, make sure that strings match enums, and allow additional properties which might not require any validation. You can also nest data types, for example an array of objects where each object has a set number of fields with data types of their own. Here’s an example of a schema against which we can validate our data:

{
  "title" : "work experience",
  "type" : "object",
  "additionalProperties": false,
  "properties":{
    "data": {"type": "array",
      "items": {
          "properties" : {
              "job_title": {"type": "string"},
              "speciality": {"type": "string"},
              "company": {"type": "string"},
              "address": {"type": "string"},
              "date_from": {"type": "string", "format": "date"},
              "date_to": {"type": "string", "format": "date"}
          }
      }
    }
  }
}

JSON schema for data holding an employee’s work experience

The idea here is to link a schema to your field so that you can validate your JSON against it before saving it into the database. We’ll extend the JSONField() provided by Django and add pre-save validation functionality to it.
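Before wiring the schema into a field, it can help to see what the jsonschema package does with it on its own. The following is a minimal sketch, not part of the original field: the schema is the one above written as a Python dict, and the helper name error_for and the sample payloads are made up for illustration.

```python
from jsonschema import FormatChecker, ValidationError, validate

# The work experience schema from above, as a Python dict
WORK_EXPERIENCE_SCHEMA = {
    "title": "work experience",
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "data": {
            "type": "array",
            "items": {
                "properties": {
                    "job_title": {"type": "string"},
                    "speciality": {"type": "string"},
                    "company": {"type": "string"},
                    "address": {"type": "string"},
                    "date_from": {"type": "string", "format": "date"},
                    "date_to": {"type": "string", "format": "date"},
                }
            },
        }
    },
}


def error_for(instance, **kwargs):
    """Hypothetical helper: return the error message, or None if valid."""
    try:
        validate(instance, WORK_EXPERIENCE_SCHEMA, **kwargs)
    except ValidationError as exc:
        return exc.message
    return None


# Illustrative payloads (made up for this sketch)
valid = {"data": [{"job_title": "Engineer", "date_from": "2019-01-01"}]}
wrong_type = {"data": [{"job_title": 42}]}
extra_key = {"data": [], "notes": "unknown top-level key"}
bad_date = {"data": [{"date_from": "yesterday"}]}

print(error_for(valid))       # valid payload
print(error_for(wrong_type))  # job_title must be a string
print(error_for(extra_key))   # additionalProperties is false at the top level
# "format" is only enforced when a format checker is passed explicitly
print(error_for(bad_date))
print(error_for(bad_date, format_checker=FormatChecker()))
```

One thing worth noticing: validate() does not check "format": "date" unless you pass a format_checker, so the field shown next would accept any string for date_from and date_to as written.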

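import inspect
import json
import os

from jsonschema import validate, exceptions as jsonschema_exceptions

from django.core import exceptions
# On Django 3.1+, import JSONField from django.db.models instead
from django.contrib.postgres.fields import JSONField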


class JSONSchemaField(JSONField):

    def __init__(self, *args, **kwargs):
        self.schema = kwargs.pop('schema', None)
        super().__init__(*args, **kwargs)

    @property
    def _schema_data(self):
        # Resolve the schema path relative to the file that defines the model
        model_file = inspect.getfile(self.model)
        dirname = os.path.dirname(model_file)
        p = os.path.join(dirname, self.schema)
        with open(p, 'r') as file:
            return json.load(file)

    def _validate_schema(self, value):
        # Skip validation for the historical models Django builds in migrations
        if self.model.__module__ == '__fake__':
            return
        try:
            # jsonschema's validate() returns None and raises on invalid data
            validate(value, self._schema_data)
        except jsonschema_exceptions.ValidationError as e:
            raise exceptions.ValidationError(e.message, code='invalid')

    def validate(self, value, model_instance):
        # Called by Django's field validation, e.g. through full_clean()
        super().validate(value, model_instance)
        self._validate_schema(value)

    def pre_save(self, model_instance, add):
        value = super().pre_save(model_instance, add)
        # Only non-empty values on non-nullable fields are checked here
        if value and not self.null:
            self._validate_schema(value)
        return value

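We’re using the jsonschema package here, a Python implementation of JSON Schema, to validate our data against a schema such as the one defined above. The constructor expects a schema keyword argument holding the path to a JSON file, relative to the directory of the module that defines the model. The _schema_data property resolves that path and loads the file every time it is accessed. The validate method runs the check whenever Django validates the field, for example through full_clean(), and the pre_save method makes sure that non-empty values on non-nullable fields are also validated right before the model instance is saved.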
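Because the schema file is read from disk on each validation, you may want to cache it. Here is a minimal standard-library sketch of one way to do that. The names load_schema and schema_path_for are hypothetical and not part of the field above, and the temporary directory only stands in for a real app folder.

```python
import json
import os
import tempfile
from functools import lru_cache


@lru_cache(maxsize=None)
def load_schema(path):
    """Read and parse a schema file once; repeat calls reuse the parsed dict."""
    with open(path, "r") as fh:
        return json.load(fh)


def schema_path_for(module_file, relative_path):
    """Resolve relative_path against the directory that contains module_file."""
    return os.path.join(os.path.dirname(module_file), relative_path)


# Demo: a temporary directory plays the role of an app containing models.py
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "schemas"))
    fake_models_file = os.path.join(tmp, "models.py")
    path = schema_path_for(fake_models_file, "schemas/example.json")
    with open(path, "w") as fh:
        json.dump({"type": "object"}, fh)

    first = load_schema(path)   # reads the file
    second = load_schema(path)  # served from the cache

print(first is second)
```

The cached dict is shared between callers, so treat it as read-only. During development you would also need to restart the process (or call load_schema.cache_clear()) after editing a schema file.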

Using this field is pretty straightforward. Define your schema, save it to a file, and pass the path to that file, relative to the directory containing your models module, as the schema argument:

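from django.db import models
# Hypothetical import path: adjust to wherever JSONSchemaField is defined
from .fields import JSONSchemaField


class Employee(models.Model):
    bio_short = models.CharField(max_length=256)
    bio_long = models.TextField()
    work_experience = JSONSchemaField(
        schema='schemas/jsonschema.example.json', default=dict, blank=True)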

Now if we try saving an employee instance with JSON that doesn’t conform to the schema defined above, a ValidationError will be raised.

This is a great way to automate JSON validation instead of doing it in your serializers or elsewhere. We can go further and have a configurable schema that is different for each model instance. For that, we’d have to pass the schema to the field while saving the instance and manually call our validation method.

Got questions or suggestions? Comment on this post and I’ll try my best to answer them.

Discussion (1)

HabibUllahKhanBarakzai

Impressive work, my lead. I have seen its implementation in HAH.