Ivan Slavko Matić

Django 4.1: Caching With Redis

Introduction

Django is one of those frameworks that can do wonders if you invest enough time in them and pay attention to small details. It’s a framework that is rich in features and very customizable. I am happy that you (the reader) share the enthusiasm to get the most out of Django. While reading up on database access improvements, caching in Django caught my eye. So I’ve decided to research how to hook up Redis with Django as an alternative to Memcached.

Setup & Environment

Considering this topic is related to improving database accessibility, I saw it fitting to just continue working on the same project we used there. You can see more about the project setup on the following link:
Improving Database Accessibility

Why use caching?

As mentioned in the referenced article above, caching is a method for speeding up data access for the end user. It removes the processing time otherwise needed to prepare data before sending it to the user, because the data is ready ahead of time. We’ve all encountered caching while using widely known sites like Instagram, Facebook and Google; such fast and dynamic websites all use caching. It is therefore not a bad idea to utilize caching in our Django projects.

More speed → Faster website → Greater user satisfaction. Caching is a good investment. As your dynamic website grows, so will your need for speed.

Now, it all depends on our needs. For instance, we might not be building a completely dynamic website, but we might still have dynamic elements that could benefit from caching. Django offers per-site, per-view and template fragment caching (and, beyond those, a low-level cache API), which works in our favour by allowing us to cache anything from large to small segments of our project.

Some key notes should be mentioned. Caching takes up either storage or main memory space and may involve miscellaneous background processes. It can vary in cost, depending on where we decide to put our data. It can be stored in RAM, a remote database, or local storage (HDDs, SSDs). All are valid options, with RAM being the fastest and local storage the slowest. So how do we choose the right one?

If we are bottlenecked by our hardware, then we use whichever hardware resource is underutilized for our caching.

That should narrow down the choice for us. For instance, RAM is usually more precious than storage drives. If we have plenty of everything, then just go with the fastest option. Ok, let’s explore some cool options and see what we have in store for us.

Django’s Options

Caching configuration in Django is usually straightforward and simple. Cool thing is, we can actually use pure-python libraries that are compatible with Django and third-party software. That widens our possibilities. But for this article, we will be exploring Redis only.

Redis

Like Memcached, Redis uses RAM for data storage. It is described as an in-memory database or main memory database system.

Difference between Redis & Memcached

At first glance, it’s hard to notice the difference between Redis and Memcached. They both offer sub-millisecond response times, allow data distribution across multiple nodes via in-memory databases, and offer high scalability. But surprisingly, they differ a lot. Here are some notable differences:

  1. Command-line: Memcached is direct: we connect to the server via telnet to execute commands. Redis, on the other hand, offers ‘redis-cli’, a dedicated command-line interface.
  2. Disk dumping: Memcached relies on third-party tools to handle disk dumping, while Redis has highly configurable built-in mechanisms like RDB (Redis database file), which may also involve some background processes.
  3. Data structures: Memcached stores data as key-value pairs of strings, while Redis supports other data structures such as lists, sets and hashes.
  4. Architecture: Redis executes commands on a single core and shows better performance when dealing with smaller datasets. Memcached, on the other hand, uses a multi-threaded architecture across multiple cores and shows better performance when dealing with larger datasets. More details in reference [2].

Personal opinion: I avoid declaring winners (even if they are obvious) when comparing approaches, methods, technologies and the like. I think every developer learns really quickly that clients’ requests and demands come in all shapes and sizes. Therefore, I like to keep my options open: everything can be a tool, and having alternatives comes in handy if we want to cover more situations. We analyse the situation, project our future needs and choose the software/package accordingly.

Installation

Officially, Redis is supported on Linux and macOS. Windows users can use Redis via a VM or WSL (Windows Subsystem for Linux). Since my Windows version doesn’t support WSL2, I will be using a Hyper-V virtual machine and set up a Debian distribution (Debian ISO, v11.5) for my Redis. First, we need to get a Debian image from the official website (if using an Intel/AMD processor, get the amd64 version).

Hyper-V

Before we get our VM going, we will need to configure our NAT. NAT (network address translation) will handle our network connection to our VM. And we will also need it if we intend to connect our Django project with the Redis port.

Configure NAT:

  1. Open Windows PowerShell as Administrator
  2. Create the NAT virtual switch
New-VMSwitch -SwitchName "nat4debian" -SwitchType Internal
  3. Get the ifIndex of your newly created switch
Get-NetAdapter
  4. Configure the gateway (replace 88 with the ifIndex from the previous step)
New-NetIPAddress -IPAddress 192.168.0.1 -PrefixLength 24 -InterfaceIndex 88
  5. [Optional step, info only] Open View Network Connections
nat4debian → Status → Details → IPv4 Address (192.168.0.1) and our IPv4 Subnet Mask (255.255.240.0)

If you need explanations about any of the steps above, you can find them in reference [5]. It’s also possible that a default switch (vEthernet) is already configured; it probably comes with the Hyper-V installation.
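The address we give the VM has to sit inside the subnet defined by the gateway, so it can be worth checking that before the Debian installer asks for it. The sketch below is only an illustration using Python’s standard ipaddress module: it compares the /24 prefix from the New-NetIPAddress command with the 255.255.240.0 mask shown in the details dialog, using the VM address that appears later in this article (192.168.3.232). The helper name network_for is my own.

```python
import ipaddress

GATEWAY = "192.168.0.1"
VM_ADDRESS = ipaddress.ip_address("192.168.3.232")  # VM address used later in this article


def network_for(gateway: str, prefix_or_mask: str):
    """Return the network that a gateway address and prefix/mask belong to."""
    # strict=False lets us pass a host address instead of the network address
    return ipaddress.ip_network(f"{gateway}/{prefix_or_mask}", strict=False)


net_from_prefix = network_for(GATEWAY, "24")            # -PrefixLength 24
net_from_mask = network_for(GATEWAY, "255.255.240.0")   # mask from the details dialog

for net in (net_from_prefix, net_from_mask):
    print(net, "contains VM address:", VM_ADDRESS in net)
```

If the VM address falls outside the network that is actually configured on the switch, the VM will not be able to reach the gateway, so one of the two values needs adjusting.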

Hyper-V Manager:

  1. Actions → New → VM → Name: debian11 → Gen1 → Startup Memory: 4096MB (4GB, uncheck dynamic memory) → Connection: nat4debian → vhdx: default value → Install from Image file (.iso): debian-11.5.iso → Finish
  2. Run VM and perform Debian installation

GNU/Linux & Redis Installation:

  1. Keep the installation light — go with ‘Install’ (non-GUI option). Follow the installation steps up to ‘Network Configuration’. Once there, enter your IPv4 Address (192.168.3.232). If you fail the Debian archive mirror country step → you will probably need to double-check the network configuration step.
  2. Software selection: only standard system utilities. Everything else should be off. Remember, we are keeping our installation light and cost-efficient. Any unnecessary background process (coming from the GNOME desktop, for instance) is not desirable. If you feel the need for a GUI, then install GNOME.
  3. Once the installation is over, bring up your terminal and run the Redis installation sequence of commands:
# Enter root (sudoer), enter your root password
su - 
# You can get it via curl (add it to apt) then just install via apt
# I will install it with snapcraft
apt update
apt install snapd
# Snap will autoupdate then install redis
snap install redis
# Test if Redis is working correctly
redis-cli ping
# Response is "PONG"

Now, in my case, the Redis snap installation did not set up the path to redis-cli correctly. If that is also your case, you can run redis-cli manually (without adding the path to your .profile), like this:

cd /snap/redis/current/usr/bin
./redis-cli ping
# Start the interactive client; its prompt shows the address and port of the server
./redis-cli
# If the server has not been started, run
./redis-server

Django Implementation

Now that our Redis server is up and running, the next step is to configure a Django binding to our server. Django supports Redis natively through the redis library (redis-py) and also recommends installing the hiredis package. ‘hiredis’ speeds up communication between Django and Redis: its Reader class serves as the parser for server replies, which helps most with multi-bulk replies. To get the most out of our redis package, we could also use the optional django-redis package, which offers pluggable clients, serializers, raw access and many other features.

Let's install those packages:

pip install redis
pip install hiredis
pip install django-redis

Once installed, it’s time to handle our settings.py configuration:

CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
        }
    }
}
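Since Django 4.0 also ships its own Redis cache backend, django-redis is optional. As a rough alternative sketch, the same cache could be declared with the built-in backend. This assumes only the redis package is installed, and it gives up the django-redis extras such as get_redis_connection, which the test view in this article relies on.

```python
# settings.py (alternative sketch): Django's built-in Redis cache backend.
# Assumption: Redis listens on the default local port, as in the config above.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379',
    }
}
```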

Let’s create a simple function-based view to test our connection:

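from django.http import HttpResponse
from django_redis import get_redis_connection


def test_redis(request):
    # flushall() empties every Redis database; it returns True on success
    get_redis_connection("default").flushall()
    return HttpResponse()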

If you run this view, you will get an error like this:

redis.exceptions.ConnectionError: Error 10061 connecting to 127.0.0.1:6379. No connection could be made because the target machine actively refused it.

That is because Redis runs inside the VM, not on the Windows host, so we need to reach it through our NAT network. We keep the Redis port but change the address. In Debian, you need to assign the VM an IPv4 address on the network of the provided gateway (192.168.0.1) and enter that address into your settings.py:

CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://192.168.3.232:6379',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
            'PASSWORD': 'debian'
        }
    }
}
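Before running the view again, it can help to confirm that the Redis port is reachable from the Windows host at all. The helper below is a minimal sketch using only the standard library; the name can_connect is my own, and it only tests that a TCP connection opens, not that Redis accepts commands.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks
        return False
```

Calling can_connect("192.168.3.232", 6379) from the host should return True once the VM network is set up, while can_connect("127.0.0.1", 6379) is expected to return False in this setup because nothing listens on the host itself.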

Now we have access to our Redis server. The last thing we need to do is disable protected mode in Redis:

./redis-cli
config set protected-mode no

You can also find the config file under ‘/etc/redis/’ and set protected-mode to ‘no’ there. We run our test view, and the flushall() call should return ‘True’, which means we have Redis server access.

Important: It is strongly recommended to use proper authentication when using the Redis service. For testing purposes, I will skip the authentication setup steps. If using the redis-py package, you will most likely have to embed your username and password to access your Redis server (as a matter of fact, it’s a must). Also, the django-redis package cautions us that Redis ACL (Access Control List) requires authentication on attempted access.
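One way to embed credentials is directly in the connection URL, in the form redis://username:password@host:port/db. The sketch below builds such a URL and percent-encodes the credentials so that characters like ‘@’ or ‘/’ do not break it. The helper name redis_url and the environment variable REDIS_PASSWORD are hypothetical, chosen only for this example; the password itself should be configured on the Redis server first.

```python
import os
from urllib.parse import quote


def redis_url(host: str, port: int = 6379, db: int = 0,
              username: str = "", password: str = "") -> str:
    """Build a redis:// URL, percent-encoding the credentials."""
    auth = ""
    if username or password:
        auth = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    return f"redis://{auth}{host}:{port}/{db}"


# REDIS_PASSWORD is a hypothetical environment variable name
LOCATION = redis_url("192.168.3.232", password=os.environ.get("REDIS_PASSWORD", ""))
```

The resulting string could then be used as the 'LOCATION' value in settings.py, which keeps the password out of the source file.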

Simple Benchmarking

Our setup is done, let's do some simple benchmarking to see some Redis performance benefits in action, but before that, just a quick note from Redis's official page on memory footprint:

1 Million small Keys -> String Value pairs use ~ 85MB of memory.
1 Million Keys -> Hash value, representing an object with 5 fields, use ~ 160 MB of memory.

The environment where we situated our Redis server has the capacity of 4GB RAM, but we haven’t limited/checked how much RAM Redis is actually using. I usually don’t like to let anything ‘run wild’, especially my RAM, so let’s set the maximum memory Redis can use with the following commands:

./redis-cli
config get maxmemory
config set maxmemory 512M
# Response: OK

My recommendation: always start small, then increase if needed. RAM is an expensive resource! Given the memory footprint above, 512M is a good starting point. The allocated amount can always be easily adjusted.
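To get a feel for what that budget means, we can scale the footprint figures quoted above. This is a back-of-the-envelope sketch only: it assumes the footprint grows linearly with the number of keys and treats 512M as 512 megabytes, ignoring any other overhead.

```python
# Figures quoted above from the Redis FAQ (approximate)
MB_PER_MILLION_STRING_KEYS = 85
MB_PER_MILLION_HASH_KEYS = 160   # hashes with 5 fields each


def keys_that_fit(budget_mb: float, mb_per_million: float) -> int:
    """Estimate how many keys fit in budget_mb, scaling the footprint linearly."""
    return int(budget_mb * 1_000_000 / mb_per_million)


BUDGET_MB = 512  # the maxmemory value set above, read as megabytes
print(f"string keys: ~{keys_that_fit(BUDGET_MB, MB_PER_MILLION_STRING_KEYS):,}")
print(f"hash keys:   ~{keys_that_fit(BUDGET_MB, MB_PER_MILLION_HASH_KEYS):,}")
```

Under these assumptions the budget holds roughly six million small string keys, or about half as many five-field hashes.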

Let’s fix up a render function view that fetches all objects (and their related objects). Our ‘Stores’ model will come in handy now:

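from django.shortcuts import render
from django.views.decorators.cache import cache_page

from .models import Stores  # assumes the Stores model lives in this app's models.py


@cache_page(180)  # cache the rendered response for 180 seconds
def test_redis(request):
    # select_related() fetches each store's manager in the same query.
    # The queryset is lazy and does not raise DoesNotExist, so no try/except is needed.
    stores = Stores.objects.select_related('manager').all()
    return render(request, "demo_temp.html", {'data': stores})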

Now in this view, the cache on the first call is just being set, so we will notice the difference in performance on the second view call when the data is already cached. On our redis-cli we run a check-up to see if there is any data already:

./redis-cli
keys *
# It should respond with '(empty array)'
# If there is any data, empty the cache with
flushall
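The first-call versus second-call behaviour described above can be imitated in plain Python. The sketch below is not how cache_page is implemented; it is a small stand-in decorator (ttl_cache, a name of my own) that stores results in a dict with an expiry time, and the sleep merely simulates the database query and template rendering.

```python
import functools
import time


def ttl_cache(seconds: float):
    """Cache a function's result per argument tuple for `seconds` seconds."""
    def decorator(func):
        store = {}  # maps args -> (expiry timestamp, cached value)

        @functools.wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]            # cache hit: skip the expensive work
            value = func(*args)          # cache miss: do the work once
            store[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator


@ttl_cache(180)
def render_stores(path: str) -> str:
    time.sleep(0.2)  # stand-in for the database query and template rendering
    return f"<html>stores for {path}</html>"


for attempt in ("first call", "second call"):
    start = time.perf_counter()
    render_stores("/test_redis/")
    print(f"{attempt}: {(time.perf_counter() - start) * 1000:.1f} ms")
```

The first call pays the full cost and fills the cache; the second call returns the stored value almost immediately, which is the same pattern we are about to measure with Redis.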

These are the comparison times received from django-debug-toolbar:

# Without cache: average of 3 repeated runs to compensate for background processes from the IDE
Total: 427.72ms (average)
# Access after the cache has been set
Total: 15.00ms

Works like a charm! We got our desired effect.
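To put those two numbers in perspective, here is the arithmetic as a tiny sketch. It uses only the totals reported above, so it describes this one measurement on my machine and nothing more general.

```python
uncached_ms = 427.72   # average total before the response was cached
cached_ms = 15.00      # total once the response was served from the cache

speedup = uncached_ms / cached_ms
time_saved_pct = (1 - cached_ms / uncached_ms) * 100

print(f"speedup: {speedup:.1f}x, time saved: {time_saved_pct:.1f}%")
```

That works out to a response roughly 28 times faster for this view.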

Where to use it

The Redis caching service is best used remotely, probably in a pure Linux or macOS environment where dedicated hardware resources are allocated to the Redis (caching) service only. Redis will benefit significantly if no background processes are competing for RAM. But this is a ‘touchy’ subject, and once again the answer is: it depends. As we have seen, we can set up Redis to work locally on our VM or localhost without a problem. Still, from a hardware perspective, we did put a strain on our local machine by taking away 4GB of RAM and adding the background processes of the Hyper-V VM (any serious project would require even more).

Redis is accessible, customizable and secure. Furthermore, Django has been introducing more and more async support from version 3.2 up to version 4.1, making remote Redis access a completely valid option with minimal performance degradation. Of course, if there is a possibility of using Redis locally (for instance, on a local network) instead of remotely, I consider that approach to be more performance friendly, since it avoids possible latency issues. Async would do its job here by awaiting the remote call instead of blocking on it. Redis is definitely worth the investment of both set-up and hardware sacrifice.

Addendum: Accessing The Cache

Related to this topic, how to access stored data (cache) in Redis is another topic that is equally important. Find more in references [3] and [4].

References

[1] “data-flair.training”, “Django Caching”, https://data-flair.training/blogs/django-caching/

[2] “baeldung.com”, “Memcached vs Redis”, https://www.baeldung.com/memcached-vs-redis

[3] “docs.djangoproject.com”, “Accessing the cache”, https://docs.djangoproject.com/en/4.1/topics/cache/#accessing-the-cache

[4] “redis.io”, “Install Redis on Windows”, https://redis.io/docs/getting-started/installation/install-redis-on-windows/

[5] “learn.microsoft.com”, “Set up a NAT network”, https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/user-guide/setup-nat-network

[6] “redis.io”, “What’s the Redis memory footprint?”, https://redis.io/docs/getting-started/faq/
