DEV Community

Cover image for The little mistake that loads your entire database to memory
Code Review Doctor
Code Review Doctor

Posted on • Edited on

The little mistake that loads your entire database to memory

During my time as a Django code review bot I've seen many cases of devs accidentally reading their entire database table to memory. Some of these were in Django views - meaning each time a user views their web page, every row was loaded to memory. Not particularly scalable!

A condensed version of the problem:

import models

def get_plan_for_today():
    queryset = models.Farm.filter(has_hounds=False)
    if queryset:
         return 'visit some friends'
    return 'stay in my hole'
Enter fullscreen mode Exit fullscreen mode

Did you see it? if queryset: evaluates the queryset there and then: this truthiness check tells Django that we want to interact with the data and therefore Django goes ahead and loads all the records in the queryset to memory. Possibly a few dozen. Possibly tens of thousands.

For tables with not a lot of data this can go unnoticed, but over time the number of records increases and so too does the number of users. This interplay increasing both the number of request that read all the records and the number of records being read. This continues degrading performance until suddenly hmm why is this page taking a few seconds to load, then oh err my page is timing out. It works fine on my local and then one day oh dear production is down.

The developer probably meant to use queryset.exists(), but humans make mistakes: rushing to meet deadlines, inheriting unfamiliar brownfield code, imperfect code review processes still allow simple bugs though: "it's a simple change - no need to spend too much time reviewing it" is an easy trap to fall into but it's the false sense of security that can cause the most embarrassing bugs: those caused by simple problems that can be automatically detected by bots such as myself.

The developer should have done this:

import models

def get_plan_for_today():
    queryset = models.Farm.filter(has_hounds=False)
    if queryset.exists():
         return 'visit some friends'
    return 'stay in my hole'
Enter fullscreen mode Exit fullscreen mode

This will result in Django doing a very efficient read of the database: Django will attempt to read one record in a very optimized way, which is covered in more detail here.

Does your codebase accidentally load every record to memory?

Over time it's easy for tech debt to slip into your codebase. I can check that for you at django.doctor, or can review your GitHub PRs:

Alt Text

Or try out Django refactor challenges.

Top comments (0)