Andrew Tate for Scout APM

Posted on Oct 26, 2023

Integrating an AWS API Gateway With a Django Monolith to Offload Heavy Functions

Django is a great framework for building scalable and maintainable projects. It’s easy to learn but extremely powerful. But like any other framework, it has performance pitfalls that can hit any developer. For Django, that tends to be N+1 queries, memory bloat, and slow responses. Because of the way Django apps are architected, you have to put in a lot of effort to make them feel as fast as modern JS frameworks.

However, an option that can really help Django developers is utilizing serverless functions such as AWS Lambda to lighten the load of the core deployment. Using an API gateway to access these functions, Django developers can reduce complex code in their monolith down to just some requests calls, letting AWS do the rest.

Here, we’re going to walk through a scenario that does just that, showing how to set up an AWS API gateway along with a Lambda function to do the heavy lifting and remove code from Django. Let’s go.

The Benefits of Offloading Django Functions
Integrating AWS Lambda with a Django monolith offers specific advantages that address the challenges and characteristics of traditional monolithic architectures. We can think about these benefits as embracing different levels of engineering.

Proximal Level
This is how the frontline engineers are thinking about using an AWS API gateway. They are looking for a way to optimize their current code and make improvements that are immediately evident and directly impact their work. The two main ones are:

Improved Development Speed: Offloading heavy tasks can reduce the complexity of the main application, which in turn simplifies the development process. Developers can more quickly iterate and test changes without being bogged down by the overhead of heavy operations.
Better Scalability: Developers can scale the offloaded functions independently of the main application. For example, if a particular function is facing higher demand, it can be scaled up without scaling the entire Django application.

Underneath those, though, are plenty of other good reasons to use Lambda functions or an equivalent in development: easier debugging, better codebase structure, better testing, cleaner dependency graphs, and easier development cycles.

Structural Level
The structural level is closer to what a team lead might want out of an API gateway. They are looking for improved structure in the engineering of the monolith, so important benefits are:

Separation of Concerns: For functionalities with specific environment needs, separating them out to Lambda allows for custom configurations without affecting the main Django environment.
Performance: Some tasks might be more CPU-intensive. Instead of bogging down the Django server, these can be offloaded to Lambda, which can scale and handle them efficiently.
Reduced Surface Area for Attacks: By moving some functions to Lambda, you potentially reduce the attack surface on your monolithic Django application. Each Lambda function is an isolated execution environment, adding a layer of security.
Fault Isolation: If a particular Lambda function fails, it won't bring down your entire Django application. This ensures better availability and fault tolerance.
Parallel Development: Tasks or functionalities offloaded to Lambda can be developed, tested, and deployed in parallel without affecting the main monolithic codebase.

All these are concerned not just with individual developer experience, but with optimizing the experience for the team as a whole, allowing more, better work to be done.

Infrastructure Level
This is how a CTO might view an API gateway. How is this going to ship faster or reduce costs?

Operational Simplicity: In a traditional Django monolithic setup, you'd need to manage the server, OS, and maybe even multiple servers for background tasks (like Celery workers). With AWS, many operational aspects are abstracted away.
Decomposition: Even though you start with a monolithic architecture, offloading parts of your application to Lambda gives you the flexibility of slowly moving towards a microservices-like model without a complete rewrite.
Isolated Deployments: You can independently deploy and update the AWS functions without redeploying your entire Django application, making certain updates quicker and less risky.
Leverage AWS Ecosystem: By integrating with AWS, your Django app can more easily leverage AWS services such as DynamoDB, S3, or Step Functions, to build more complex workflows or data pipelines.

We can see a common theme across these levels: moving to an AWS API gateway is about simplifying the current deployment, which helps the individual, team, and organization alike.

Identifying Heavy Functions in Django
So what are some of the heavy functions in Django that it makes sense to offload? These are going to be operations that are computationally intensive, involve lengthy data processing, or require significant I/O operations. Some examples are:

Database Queries: These could be complex aggregations of large data sets, or “N+1 Query Problems”, where you are retrieving related objects without proper use of select_related or prefetch_related, leading to a large number of database hits.
File Operations: Handling large file uploads or performing image processing functions such as resizing, filtering, or transforming images. This can be especially CPU-intensive.
High Latency Tasks: If you are sending out bulk emails, processing incoming emails, or integrating with slower third-party or legacy systems, each of these tasks can tie up a synchronous Django worker while it waits.
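
To make the N+1 pattern from the list above concrete, here is a minimal sketch that uses the standard library's sqlite3 module rather than the Django ORM. The author and article tables are hypothetical. The first function issues one query per related row, while the second uses a single JOIN, which is similar in spirit to what select_related generates.

```python
import sqlite3

# Hypothetical schema: each article points at one author.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO article VALUES (1, 'First', 1), (2, 'Second', 2), (3, 'Third', 1);
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    # One query for the articles, then one more query per article for its author.
    rows = run("SELECT title, author_id FROM article ORDER BY id")
    return [
        (title, run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0])
        for title, author_id in rows
    ]


def titles_joined():
    # A single JOIN fetches the same data in one round trip.
    return run(
        "SELECT a.title, au.name FROM article a "
        "JOIN author au ON au.id = a.author_id ORDER BY a.id"
    )


query_count = 0
slow = titles_n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = titles_joined()
joined_queries = query_count

print(n_plus_one_queries, joined_queries)
```

Both functions return the same rows. Only the number of database hits differs, and that gap grows with the size of the table.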

You can use application performance monitoring to identify inefficiencies in Django monoliths. Let’s say we’re investigating our application and see that we have a slow query. This will be highlighted in Scout APM:

Our dashboard is showing that we have a query that is taking almost one second to run. This is too long. We can drill down into the slow RefreshArticleSelect query to see why:

It’s a big query. Potentially, we could optimize this code to pull less data from the database. But what if this wasn’t an option? If we need all that data, we’re stuck with slow processing within Django.

This is where an API gateway can start to help us. Once we’ve identified the functions we want to offload to AWS, we can start to set up AWS to do so. In this case that’s going to mean:

Creating AWS Lambda functions that will take over the computation
Setting up the actual API gateway to allow us to call those functions

Create Your AWS Lambda Function
AWS Lambda functions are serverless functions. They allow developers to run code in response to specific events without provisioning or managing servers, automatically scaling with the size of the workload.

Here, we’ll use them to compute a specific heavy function that would be causing the slow response. So, we need to:

Create an AWS Lambda function.
Write the necessary code that this function will execute.
Ensure that the Lambda function has all required permissions and resources.

Creating a Lambda function is surprisingly easy. Set up an AWS account (if you don’t already have one), then head to the Lambda service. From there, click on “Create function”:

You’ll then start creating your function. On the next page, choose “author from scratch,” choose a name for your function, and then choose a Python runtime environment (as we’re using Django):

After a few seconds, your function will be created. Then you can use the code editor to add your original function:

The ‘event’ will contain any POST data from your API. Let’s set that up so we can start testing.

Setting Up API Gateway
Setting up the API gateway is also simple. In the AWS Management Console, navigate to API Gateway and click on Create API. Then you’ll be asked to select the type of API you want to build. Here, we want a REST API:

Click “Build,” choose “New API,” then choose a name for the API and select “Create API”:

We have a couple more steps to take before we can start using the API. First we have to add a method and our Lambda function. Select “Create Method,” then choose POST, Lambda function, and then select your Lambda function from the dropdown. Then “Create method”:

Then we can deploy our API:

To use it, we need to associate it with a “stage”. Click “Create stage” and choose a stage name. Then you’ll have an API endpoint that you can start using (the Invoke URL):

It is this API endpoint that we need to use in Django. First we’ll test it quickly. Add this code to your Lambda function:

import json

def lambda_handler(event, context):
    # Parse the POST data from the body of the request
    try:
        num1 = event['num1']
        num2 = event['num2']
    except (KeyError, TypeError):
        return {
            'statusCode': 400,
            'body': json.dumps({'error': 'num1 and num2 are required in the request body.'})
        }

    # Add the numbers
    result = num1 + num2

    # Construct the response
    response = {
        'statusCode': 200,
        'body': json.dumps({'result': result})
    }

    return response

Here, we’ll take two numbers passed in the body of the POST request (which will come via the event variable) from the API gateway, add them, and then send the response back via the API gateway. If we use our Invoke URL in Postman, we can call this Lambda function:
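
The handler above reads num1 and num2 straight from event, which matches a non-proxy Lambda integration. If “Lambda proxy integration” is enabled on the method, API Gateway instead passes the request body as a JSON string under event['body']. The sketch below is one way to accept both shapes and to try the handler locally before going to Postman. The extract_payload helper is a hypothetical name, not part of any AWS library.

```python
import json


def extract_payload(event):
    """Return the request payload as a dict for either integration style."""
    body = event.get("body") if isinstance(event, dict) else None
    if isinstance(body, str):
        # Proxy integration: the JSON body arrives as a string under 'body'.
        return json.loads(body)
    # Non-proxy integration: the event is already the parsed payload.
    return event


def lambda_handler(event, context):
    try:
        payload = extract_payload(event)
        num1 = payload["num1"]
        num2 = payload["num2"]
    except (KeyError, TypeError, ValueError):
        # ValueError also covers a body that is not valid JSON.
        return {
            "statusCode": 400,
            "body": json.dumps(
                {"error": "num1 and num2 are required in the request body."}
            ),
        }
    return {"statusCode": 200, "body": json.dumps({"result": num1 + num2})}


# Local smoke test: no AWS account needed, the context argument is unused here.
direct = lambda_handler({"num1": 2, "num2": 3}, None)
proxied = lambda_handler({"body": json.dumps({"num1": 2, "num2": 3})}, None)
missing = lambda_handler({"num1": 2}, None)
print(direct, proxied, missing)
```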

Awesome! We now have a Python function that resides on AWS Lambda that we can call via the AWS API gateway. Anything can go in this Lambda function, including calls to our database and backend as we can add environment variables.
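
As a rough illustration of the environment-variable point, a Lambda function could collect its database settings as shown below. The variable names (DB_HOST and so on) and the default port of 5432 (PostgreSQL's default) are assumptions for this sketch. Set whatever names your project uses in the function's configuration.

```python
import os

# Hypothetical variable names; configure the real ones on the Lambda function.
REQUIRED = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD")


def load_db_config(env=os.environ):
    """Collect connection settings from environment variables, failing fast if any are missing."""
    missing = [name for name in REQUIRED if name not in env]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "host": env["DB_HOST"],
        "port": int(env.get("DB_PORT", "5432")),  # assumed default
        "dbname": env["DB_NAME"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

Calling load_db_config() once at import time, outside the handler, means a misconfigured function fails as soon as it is loaded rather than partway through a request.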

Incorporating Your Gateway With Django
Let’s say this is the code that was causing us all the slow queries:

import csv

import boto3
from django.conf import settings
from django.core.management.base import BaseCommand

# Article (the model) and LOCATION (the output directory) come from the project itself.


class Command(BaseCommand):

    def add_arguments(self, parser):
        parser.add_argument('--uuid', action='append', type=str)

    def handle(self, *args, **kwargs):
        unique_id = kwargs['uuid'][0]
        print(unique_id)

        # One list per column; the first element of each list is the column header
        articles = [['Title'], ['URL'], ['Jul 2021'], ['Aug 2021'], ['Sep 2021'], ['Oct 2021'], ['Nov 2021'], [
            'Dec 2021'], ['Jan 2022'], ['Feb 2022'], ['Mar 2022'], ['Apr 2022'], ['May 2022'], ['Jun 2022'], ['Traffic Lost']]
        qs = Article.objects.filter(unique_id=unique_id)

        for article in qs:
            articles[0].append(article.title)
            articles[1].append(article.url)

            # article.traffic appears to hold a bracketed, comma-separated string of monthly values
            monthly_traffic = article.traffic.split(',')
            i = 2
            for month in monthly_traffic:
                if i == 2:
                    # First value: drop the leading '['
                    month = month.split('[')
                    articles[i].append(month[1])
                elif i == 13:
                    # Last value: drop the trailing ']'
                    month = month.split(']')
                    articles[i].append(month[0])
                else:
                    articles[i].append(month)
                i += 1
            articles[14].append(article.lost_since_peak)

        # Transpose the columns into rows and write them out as a CSV file
        file_name = LOCATION + '/' + unique_id + '.csv'
        with open(file_name, 'w', newline='') as fp:
            a = csv.writer(fp, delimiter=',')
            a.writerows(zip(*articles))

        # Upload the finished file to S3
        s3 = boto3.client('s3', aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
                          aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY)
        bucket = settings.AWS_STORAGE_BUCKET_NAME
        s3.upload_file(file_name, bucket, file_name)

We’re basically pulling all the data from a database (slow), creating a long list (slow), saving to CSV (slow), and uploading to an S3 bucket.

We can put all the code from articles = [['Title'],... down into our Lambda function. Then this function becomes:

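import json

import requests
from django.core.management.base import BaseCommand

# INVOKE_URL is the Invoke URL of the API Gateway stage created earlier,
# defined elsewhere in the project.


class Command(BaseCommand):

    def add_arguments(self, parser):
        parser.add_argument('--uuid', action='append', type=str)

    def handle(self, *args, **kwargs):
        unique_id = kwargs['uuid'][0]
        print(unique_id)

        # Hand the heavy work to the Lambda function behind the API gateway
        headers = {'Content-Type': 'application/json'}
        data = {'unique_id': unique_id}
        response = requests.post(INVOKE_URL, headers=headers, data=json.dumps(data))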

That’s it. Now, when this function is called in the Django app, the heavy lifting will be done by our Lambda function, giving us better performance, better resource utilization, and better options for scaling.
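
For the Lambda side, here is a minimal sketch of how the CSV-building logic might look once it leaves Django. It is an illustration under stated assumptions, not the original function: the Django ORM is not available inside Lambda by default, so fetch_articles and upload are hypothetical callables you would supply (for example, a database query and a boto3 S3 upload), and each article is assumed to be a dict with the same four fields the management command used. Unlike the original, it also trims whitespace around each traffic value.

```python
import csv
import io
import json

MONTHS = ['Jul 2021', 'Aug 2021', 'Sep 2021', 'Oct 2021', 'Nov 2021', 'Dec 2021',
          'Jan 2022', 'Feb 2022', 'Mar 2022', 'Apr 2022', 'May 2022', 'Jun 2022']
HEADER = ['Title', 'URL', *MONTHS, 'Traffic Lost']


def build_csv(articles):
    """Render one CSV row per article; `articles` is a list of dicts (assumed shape)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=',')
    writer.writerow(HEADER)
    for article in articles:
        # traffic is assumed to be a string such as "[10,20,...]", one value per month
        monthly = [value.strip() for value in article['traffic'].strip('[]').split(',')]
        writer.writerow([article['title'], article['url'], *monthly,
                         article['lost_since_peak']])
    return buffer.getvalue()


def handle_export(unique_id, fetch_articles, upload):
    """Build the CSV in memory and pass it to the injected upload callable.

    fetch_articles(unique_id) -> list of dicts and upload(key, text) are
    hypothetical hooks standing in for the database query and the S3 upload.
    """
    key = f"{unique_id}.csv"
    upload(key, build_csv(fetch_articles(unique_id)))
    return {'statusCode': 200, 'body': json.dumps({'key': key})}
```

Building the file in memory avoids touching the local filesystem, and injecting the two callables keeps the formatting logic easy to test without AWS credentials.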

Next Steps
After this, what should we do? There are two things we need to think about.

Firstly, we want to go back to our APM and see how our performance has changed. The point of APM is to continually monitor our performance. This will have helped with our slow query issue, but has it had any ramifications? We can then also integrate telemetry via both Django and AWS into our application to get more insights.

Secondly, though both the API gateway and Lambda functions are simple to set up, there is a ton of configuration to optimize them, especially as you scale. Most important will be to:

Secure your API Gateway endpoints. Consider using API keys, AWS IAM, or Cognito for authorization.
Set up necessary CORS configurations if your Django application is making browser-based requests to the API Gateway.
Ensure the Lambda functions have the minimal set of permissions required to execute their tasks (Principle of Least Privilege).
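
If you go the API key route from the list above, API Gateway reads the key from the x-api-key request header once the method is marked as requiring a key and the key is attached to a usage plan. The sketch below shows how the Django side might build such a request and check the response. The function names are hypothetical, and where the key is stored (here, passed in as an argument) is an assumption.

```python
import json


def build_gateway_request(unique_id, api_key):
    """Return (headers, body) for a POST to the Invoke URL."""
    headers = {
        "Content-Type": "application/json",
        "x-api-key": api_key,  # header API Gateway checks for API keys
    }
    body = json.dumps({"unique_id": unique_id})
    return headers, body


def parse_gateway_response(status_code, text):
    """Raise on non-2xx responses; otherwise return the decoded JSON body."""
    if not 200 <= status_code < 300:
        raise RuntimeError(f"API Gateway returned {status_code}: {text[:200]}")
    return json.loads(text)


# Usage with requests inside the management command (not executed here):
#   headers, body = build_gateway_request(unique_id, api_key)
#   response = requests.post(INVOKE_URL, headers=headers, data=body, timeout=10)
#   result = parse_gateway_response(response.status_code, response.text)
```

Setting a timeout and checking the status code matters more once the heavy work is remote, since a slow or failing Lambda would otherwise go unnoticed by the command.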

Then you can start to optimize your costs, especially if these functions are being called consistently. Within Lambda you can adjust the memory and timeout settings of your functions according to your actual needs, and, of course, always review your AWS bills and Lambda invocations to see if there are any unexpected costs or behaviors.
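
To reason about those memory and timeout settings, it can help to estimate usage up front. Lambda bills on the number of requests and on duration measured in GB-seconds (configured memory multiplied by execution time). The sketch below computes both. The prices are deliberately left as parameters because they vary by region and architecture, so the figures in the example call are placeholders, not AWS's actual rates.

```python
def gb_seconds(memory_mb, avg_duration_ms, invocations):
    """Total GB-seconds consumed: memory in GB x duration in seconds x invocations."""
    return (memory_mb / 1024) * (avg_duration_ms / 1000) * invocations


def estimate_cost(memory_mb, avg_duration_ms, invocations,
                  price_per_gb_second, price_per_million_requests):
    """Rough monthly estimate; both prices must be supplied by the caller."""
    compute = gb_seconds(memory_mb, avg_duration_ms, invocations) * price_per_gb_second
    requests_cost = (invocations / 1_000_000) * price_per_million_requests
    return compute + requests_cost


# Placeholder prices for illustration only; check the current AWS pricing page.
usage = gb_seconds(memory_mb=512, avg_duration_ms=1000, invocations=1000)
cost = estimate_cost(512, 1000, 1000,
                     price_per_gb_second=0.00001,
                     price_per_million_requests=0.2)
print(usage, cost)
```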

Offloading heavy functions from your Django monolithic application to AWS Lambda through API Gateway is an excellent option to not just improve scalability but also reduce costs associated with traditional infrastructure. It all starts with having a good understanding of your current Django performance, and then from there you can start to use the AWS ecosystem to your advantage.
