Django Doctor audits code and auto fixes Django anti-patterns. We checked 666 Django projects for problems hindering maintainability and found that 48% of the Django projects could simplify their models.py:
- 22% used CharField with a huge max_length when TextField would be better
- 7% used the deprecated NullBooleanField
- 40% had string fields that allowed null but not blank, or vice versa
There were some intersections, so some projects fell into more than one camp. Note there are valid use cases for null and blank differing, and we go through that in depth later.
How would you simplify models? Try our Django models.py refactor challenge.
When a user needs to enter a string that may be very long then it's quick and easy to use CharField(max_length=5001), but that has some problems:
- 5001 is big, but is it big enough? What if a user wants more? Yes, max_length can be increased, but bug fixes are a pain, as is the database migration to facilitate the change.
- Years from now a developer dusts off your code and reads the number 5001. Do they infer there's something special about 5001? Maybe. Devs in the future will be very busy with complicated future things so let's not add ambiguity when maintaining your "old" code.
TextField is better here as there's really no need for a length check, so users will not be presented with a validation error. A nice rule of thumb: if the field does not need a minimum length check then it probably does not need a maximum length check. For that reason when I see this happening I suggest using TextField instead.
So why use CharField if TextField is so great? Historically, efficient database space usage was a key consideration. But now storage is practically free. Plus, for Postgres at least, using TextField has the same performance as CharField, so database storage performance is not a key consideration.
There are valid cases for CharField with a huge length though: just like an ISBN is always 10 or 13 characters, there are some very long codes. Storing QR codes? CharField. Working with geometry and geospatial data? CharField. Django GIS has a 2048 long VARCHAR that Django represents as a CharField.
For four years the documentation for NullBooleanField was two sentences, and one of them was "yeah…don't use it". As of 3.1 the axe has fallen and the hint of future deprecation has been replaced with an actual deprecation warning. Instead of using NullBooleanField, use BooleanField(null=True); that is what I suggest whenever I see NullBooleanField.
On the face of it, the existence of NullBooleanField seems odd. Why have an entire class for something that can be achieved with a null keyword argument? We don't see NullDateField. Indeed, for fields like that Django expects us to do DateField(null=True). So what was so special about NullBooleanField, and why is it now deprecated?
NullBooleanField renders a NullBooleanSelect widget, which is a <select> containing options "Unknown" (None), "Yes" (True) and "No" (False). The implication is NullBooleanField was intended for when explicitly stating no answer is known yet. Indeed, in many contexts it would be useful to clarify "is it False because the user has set it False, or because the user has not yet answered?". To facilitate that, the database column must allow NULL (None at Python level).
Unfortunately time has shown a great deal of room for confusion: StackOverflow has many questions that are answered with "use NullBooleanField instead of BooleanField" and vice versa. If one of the reasons for separating NullBooleanField was to give clarity then instead the opposite occurred for many.
Until Django 2.1 in 2018, null was not permitted in BooleanField because (obviously) None is not a bool value. Why would we expect None to be used in a field that says it's for boolean values only? On the other hand, None is not a str either but CharField(null=True) was supported, and None is not an int but IntegerField(null=True) was also acceptable.
So in the deprecation of NullBooleanField there is an argument for consistency with how the other fields handle null. If we're aiming for consistency the choice is to either add NullDateField and so on, or to rename NullBooleanField to BooleanField and call it a day, even though NullBooleanField was a more accurate name.
With this deprecation three classes are impacted:
These three have slightly different handling of "empty" values, so for some the swap from NullBooleanField to BooleanField will need some careful testing:
    # Form NullBooleanField: an empty submission cleans to None.
    from django.forms.fields import NullBooleanField
    field = NullBooleanField()
    assert field.clean("True") is True
    assert field.clean("") is None
    assert field.clean(False) is False

    # Form BooleanField: an empty submission cleans to False.
    from django.forms.fields import BooleanField
    field = BooleanField(required=False)
    assert field.clean(True) is True
    assert field.clean("") is False
    assert field.clean(False) is False

    # Model BooleanField with null=True: an empty value cleans to None.
    from django.db.models import fields
    field = fields.BooleanField(null=True, blank=True)
    assert field.clean(True, "test") is True
    assert field.clean("", "test") is None
    assert field.clean(False, "test") is False
Expect the unexpected if null and blank are different values: null controls if the database level validation allows no value for the field, while blank controls if the application level validation allows no value for the field.
If blank=True then the field's model validation allows an empty value such as "" to be inputted by users. If blank=False then the validation will prevent empty values being inputted.
On the other hand, null informs the database if the database column for the field can be left empty, resulting in the database setting either NULL or NOT NULL on the column. If the database encounters an empty value for a NOT NULL column then it will raise an IntegrityError.
blank is used during field validation. ModelForm and ModelSerializer each trigger field level validation. For a concrete example, ModelForm calls the model instance's full_clean method during form validation, and full_clean then calls clean_fields, which in turn may raise a ValidationError.
For that reason when I see this happening I suggest the following:
So normally do we want null and blank to be the same value? When would we want to have null=False and blank=True, or even null=True and blank=False?
Setting null=False and blank=True facilitates using sensible default values for string fields: the field may have a default value like name = CharField(null=False, blank=True, default=""). This is useful if the field is optional, but we also want to prevent the database column from having inconsistent data types. Sometimes being None, sometimes being "", and other times being a non-empty string causes extra complexity in code and in the ORM. If we wanted to find all users with no name:
Foo.objects.filter(name="") | Foo.objects.filter(name__isnull=True)
Compare that with the case where the value in the database column will always be a string:
Foo.objects.filter(name="")
The null=True, blank=False scenario is more to keep the database happy. If using the django.db.backends.oracle database engine then this may be needed because Oracle forces empty strings to NULL, even if an empty string was submitted in the form, so name = CharField(null=True, blank=False) would be needed.
Zero downtime deployment strategies may require NULL on the database column, even though business requirements dictate the user must enter a value in the form. During blue/green deployments both the new codebase and the old codebase run against the same database at the same time. If the new codebase adds a new field and there is no sensible default value for it then null=True is needed to avoid the database throwing an IntegrityError while the instance of your website running the old codebase interacts with the database.
While the database column can accept null, form validation can prevent the end users inputting no value, so data type consistency is assured? No: this requires the form validation to actually run. If a developer is creating or updating via the shell then the validation will not run unless the developer calls instance.clean_fields(). This strategy is simplified if a sane default value can be used instead of setting null=True.