Hello pals,
While working with Django, we all write code that does the job, but some of it may perform excessive computations or operations that we are unaware of. These operations may be inefficient and/or counterproductive in practice.
Here, I am going to mention some anti-patterns in Django models.
A. Using len(queryset) instead of queryset.count()
QuerySets in Django are lazily evaluated, which means that records aren't read from the database until we actually interact with the data.
len(queryset) counts the records in Python, at the application level. To do so, all the records must first be fetched from the database, which is a computationally heavy operation.
queryset.count(), on the other hand, calculates the count at the database level and returns just the number.
For a model Post:
from django.db import models


class Post(models.Model):
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=200)
    content = models.TextField()
If we use len(queryset), it runs something like SELECT * FROM post, which returns every record; the Python interpreter then calculates the length of the resulting queryset, much as it would for a list. Imagine the waste of downloading many records only to check the length and throw them away at the end! But if we need the records anyway after reading the length, then len(queryset) can be a valid choice, because the evaluated queryset is cached and reused.
If we use queryset.count(), it runs something like SELECT COUNT(*) FROM post at the database level. This makes the code run faster and reduces the load on the database.
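To see the difference outside Django, here is a minimal sketch that uses Python's built-in sqlite3 module instead of the ORM. The in-memory post table and the helper names count_in_python and count_in_database are assumptions made for this illustration; the two SQL statements mirror the ones described above, and the SQL Django really emits may differ by backend and version.

```python
import sqlite3

# Hypothetical stand-in for the Post model's table, kept in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post (id INTEGER PRIMARY KEY, author TEXT, title TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO post (author, title, content) VALUES (?, ?, ?)",
    [("Arjun", f"Title {i}", "x" * 1000) for i in range(500)],
)


def count_in_python(connection):
    """Roughly the len(queryset) approach: fetch every row, then count."""
    rows = connection.execute("SELECT * FROM post").fetchall()
    # Returns (count, number of rows transferred to Python).
    return len(rows), len(rows)


def count_in_database(connection):
    """Roughly the queryset.count() approach: the database counts."""
    rows = connection.execute("SELECT COUNT(*) FROM post").fetchall()
    # Only a single row, holding the count, travels back to Python.
    return rows[0][0], len(rows)


print(count_in_python(conn))
print(count_in_database(conn))
```

Both helpers agree on the count, but the first one pulls every row (including the large content column) into Python to get there, while the second receives a single row.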
B. Using queryset.count() instead of queryset.exists()
- While we kept praising queryset.count() for checking the length of a queryset, it can be needlessly heavy if we only want to check whether the queryset contains anything. For the same model Post, when we want to check if there are any posts written by the author Arjun, we may do something like:
from django.db.models import QuerySet

posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')
if posts_by_arjun.count() > 0:
    print('Arjun writes posts here.')
else:
    print("Arjun doesn't write posts here.")
posts_by_arjun.count() runs an SQL query that has to count every matching row. But if we are only interested in whether Arjun writes posts here or not, more efficient code would be:
from django.db.models import QuerySet

posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')
if posts_by_arjun.exists():
    print('Arjun writes posts here.')
else:
    print("Arjun doesn't write posts here.")
posts_by_arjun.exists() returns a bool that tells us whether at least one result exists. It asks the database for at most a single record in the most optimized way it can (for example, by dropping ordering and any user-defined select_related() that the check doesn't need).
Also, checking the existence / truthiness of a queryset like this is inefficient:
from django.db.models import QuerySet

posts_by_arjun: QuerySet = Post.objects.filter(author__iexact='Arjun')
if posts_by_arjun:
    print('Arjun writes posts here.')
else:
    print("Arjun doesn't write posts here.")
This does a fine job of checking whether there are any posts by Arjun, but it evaluates the whole queryset and loads every matching record into memory, which is computationally heavy for a larger number of records. Hence, queryset.exists() is encouraged for checking the existence / truthiness of querysets.
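The gap between counting and checking existence can be sketched at the SQL level with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the post table and the helper names author_has_posts_by_count and author_has_posts are made up here, LOWER() stands in for the case-insensitive iexact lookup, and the exact SQL Django generates for count() and exists() depends on the backend and version.

```python
import sqlite3

# Hypothetical stand-in for the Post model's table, kept in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO post (author, title) VALUES (?, ?)",
    [("Arjun", f"Post {i}") for i in range(1000)],
)


def author_has_posts_by_count(connection, author):
    """Count every matching row, then compare with zero (the count() style)."""
    (total,) = connection.execute(
        "SELECT COUNT(*) FROM post WHERE LOWER(author) = LOWER(?)", (author,)
    ).fetchone()
    return total > 0


def author_has_posts(connection, author):
    """Stop at the first matching row (similar in spirit to exists())."""
    row = connection.execute(
        "SELECT 1 FROM post WHERE LOWER(author) = LOWER(?) LIMIT 1", (author,)
    ).fetchone()
    return row is not None


print(author_has_posts_by_count(conn, "arjun"))
print(author_has_posts(conn, "arjun"))
print(author_has_posts(conn, "nobody"))
```

Both helpers give the same answer; the second simply lets the database stop as soon as it finds one matching row instead of counting all of them.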
C. Using signals excessively
- Django signals are great for triggering jobs based on events. But they have a limited set of valid use cases and shouldn't be used excessively. Think about alternatives to signals within your codebase, and try to place the signal logic in your models themselves, if possible.
- They are not executed asynchronously. There is no background thread or worker to execute them. If you want a background worker to do the job for you, try using celery.
- As signals are spread over separate files in a larger project, they can be harder to trace for someone who has freshly joined the company, and that's not great. django-debug-toolbar does help somewhat in tracing the triggered signals, though.
Let's create a scenario where we want to keep a record of Post writings in a separate model, PostWritings.
class PostWritings(models.Model):
    author = models.CharField(max_length=100, unique=True)
    posts_written = models.PositiveIntegerField(default=0)
If we want to automatically update the PostWritings record for a user whenever a record is created on the Post model, there are ways to achieve the task with or without signals.
A. With Signals
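from django.db.models import F
from django.db.models.signals import post_save
from django.dispatch import receiver

from .models import Post, PostWritings


@receiver(post_save, sender=Post)
def post_writing_handler(sender, instance, created, **kwargs):
    # Only count newly created posts, not edits to existing ones.
    if created:
        writing, _ = PostWritings.objects.get_or_create(author=instance.author)
        # Model instances have no update(); increment through a queryset
        # so that the database performs the addition.
        PostWritings.objects.filter(pk=writing.pk).update(
            posts_written=F('posts_written') + 1
        )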
B. Without Signals
We need to override the save() method for the Post model.
from django.db import models
from django.db.models import F


class Post(models.Model):
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=200)
    content = models.TextField()

    def save(self, *args, **kwargs):
        # Overridden method.
        # A post without a primary key is about to be created, not updated.
        created = self.pk is None
        super().save(*args, **kwargs)
        if created:
            writing, _ = PostWritings.objects.get_or_create(author=self.author)
            # Model instances have no update(); go through a queryset.
            PostWritings.objects.filter(pk=writing.pk).update(
                posts_written=F('posts_written') + 1
            )
As the same job can be accomplished without signals, the code is easier to trace and avoids unnecessary event triggers.
If the save() method doesn't feel readable enough here, breaking up the code is always a good idea. Let's do that.
from django.db import models
from django.db.models import F


class Post(models.Model):
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=200)
    content = models.TextField()

    def _update_post_writing(self, created=False, author=None):
        if author is not None and created:
            writing, _ = PostWritings.objects.get_or_create(author=author)
            # Model instances have no update(); go through a queryset.
            PostWritings.objects.filter(pk=writing.pk).update(
                posts_written=F('posts_written') + 1
            )

    def save(self, *args, **kwargs):
        # Overridden method.
        author = self.author
        # Check before saving: the id is only None for a brand-new post.
        created = self.id is None
        super().save(*args, **kwargs)
        self._update_post_writing(created, author)
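The F('posts_written') + 1 expression asks the database to do the increment rather than reading the value into Python first. As a rough sketch of that idea outside Django, the snippet below uses the built-in sqlite3 module; the postwritings table and the helper names record_post_written and posts_written are assumptions for this illustration, and INSERT OR IGNORE is only a SQLite-specific stand-in for the get_or_create() step.

```python
import sqlite3

# Hypothetical stand-in for the PostWritings model's table, kept in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE postwritings ("
    "id INTEGER PRIMARY KEY, "
    "author TEXT UNIQUE, "
    "posts_written INTEGER NOT NULL DEFAULT 0)"
)


def record_post_written(connection, author):
    # Create the author's row if it is missing (the get_or_create step).
    connection.execute(
        "INSERT OR IGNORE INTO postwritings (author) VALUES (?)", (author,)
    )
    # Let the database add 1 to the stored value (the F() step).
    connection.execute(
        "UPDATE postwritings SET posts_written = posts_written + 1 WHERE author = ?",
        (author,),
    )
    connection.commit()


def posts_written(connection, author):
    row = connection.execute(
        "SELECT posts_written FROM postwritings WHERE author = ?", (author,)
    ).fetchone()
    return row[0] if row else 0


record_post_written(conn, "Arjun")
record_post_written(conn, "Arjun")
print(posts_written(conn, "Arjun"))
```

Because the addition happens inside the UPDATE statement, the new value never depends on a number that Python read earlier, which is the main reason to prefer this style for counters.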
Looks like we've learned how to mitigate some Django model anti-patterns. For now, thank you everyone for having me here. I'll be continuing with more stuff about Django very soon.
You can also find me on GitHub. Till then keep coding :)
Top comments (2)
Really interesting article!
I had a question: In the example shared describing the replacement of signals with overriding of the save() method: How do we implement a similar approach to decrement the number of articles written by an author, in the case where a post is deleted by an author?
Thanks
Edit: Fixed markdown syntax
For that we can override the delete() model method, create a _delete_post_writing helper, and decrement the count there.