Scaling backend applications for dummies

Like every developer, you have probably faced a situation where your application starts to slow down and perform poorly.

I know, it’s disappointing, but it happens. As the application you work on gets more users and requests, issues such as overload, unexpected system failures and service downtime start to appear. That’s completely OK; it just means you need to scale your system.

If you have enough experience, you may plan for some scenarios upfront based on previous projects. That’s a really good starting point, because it means your architecture is, at least to some extent, scalable.

If you are new to this concept, don’t worry: I have created a series of articles that introduces the concepts of scaling applications, starting from the basics.

Scale... what?

Before going further I think it's time to clarify some concepts.

When we talk about scaling in software development, we mean adding more resources (hardware) to run a particular application.

But if my program runs perfectly, why do I need to scale?

Great question! Usually when you develop an application you do it in a certain context and when this context changes your application could be affected.

Let me explain with an example. You are building an e-commerce platform like Shopify. At the beginning you didn’t pay attention to database performance, and that was completely OK: the app had a few users and response times were quite good. One day performance drops, and you realize that after a marketing campaign your application now has 100k users at the same time. Your application is the same but it doesn’t work as before, so what happened? Exactly! The context your application runs in changed.

These situations tend to happen from time to time. This is why it is so important to take care of the performance and scalability of your application from the beginning.

I’m not saying that you should create the next Google or Facebook with data centers around the world from day one, but you need a strategy and tools to scale your application. And that’s why this series of articles exists.

How can I scale my application?

Now that we are on the same page, it's time to go deeper into scalability concepts.

First of all I want to depict the typical flow when scaling an application.

The first part of scaling an application is to know what is going on under the hood. In other words, we need to find what makes performance decrease in order to apply a solution to the problem.

The second part is to apply a solution, obviously. At this point we are really lucky because there are a lot of solutions documented to help us when scaling our applications.

If you ask me for a brief summary of this process, it would be something like this:


  • Enable observability tools
  • Make a root cause analysis
  • Improve:
      • Optimize code and business logic
      • Optimize database usage (e.g. slow queries)
      • Use background processing
      • Apply advanced scaling patterns

In this article and the following ones we will cover all these aspects and phases of the process. Let’s start talking about observability.

Enable observability tools

This sounds really obvious but it is not.

Before applying any solution to your scaling problem you need to know where the problem is and what is causing it. We should apply different solutions depending on the problem. We are not going to apply the same improvements for slow queries in a database and for an overloaded server that cannot handle more requests.

So first of all you need to audit your application. In the best case scenario your application has some observability tools in place like Kibana, Grafana, Prometheus or similar. If so, you are lucky because you can check this data to spot the problem. If not, my advice is that you need to implement an observability tool right now.

Because this article is an introduction to general concepts, I’m not going to explain in depth how to install and use any of these tools. You can find hundreds of articles about that on the internet. But I want to give you some tips on what data you should gather to get good observability of your application.

The most important information you want to know about your application is:

-application id: to identify your application in case you have many of them
-request status: true/false or an error code, it’s up to you
-request time: time elapsed from start to finish of request in seconds or milliseconds
-server information: server name/cluster id or similar, helps you to spot specific problems
-profiling information: information about time to run some methods or classes of your application, call to external apis, etc.

You should gather this information for each request. If your app has a huge amount of requests consider gathering data for at least 10%-20% of requests. You can store this information as logs, but if you use an observability tool to display it you will be more productive because you get all your data at a glance.
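To make this more concrete, here is a minimal sketch in Python of how those fields could be collected for each request and written as a JSON log line. Everything in it is an assumption for illustration: the `RequestMetrics` class, the `shop-backend` application id, the 20% sample rate and the sleeps that stand in for real work. A real setup would ship these records to your observability tool instead of printing them.

```python
import json
import random
import socket
import time
from contextlib import contextmanager

APP_ID = "shop-backend"   # hypothetical application id
SAMPLE_RATE = 0.2         # record roughly 20% of requests


class RequestMetrics:
    """Collects the per-request fields listed above."""

    def __init__(self, app_id, server=None):
        self.app_id = app_id
        self.server = server or socket.gethostname()
        self.spans = {}                      # profiling: section name -> seconds
        self._start = time.perf_counter()

    @contextmanager
    def span(self, name):
        """Time one section of the request (a method, an external API call...)."""
        started = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - started
            self.spans[name] = self.spans.get(name, 0.0) + elapsed

    def finish(self, status):
        """Build the record once the request is done."""
        total = time.perf_counter() - self._start
        return {
            "application_id": self.app_id,
            "request_status": status,
            "request_time_ms": round(total * 1000, 3),
            "server": self.server,
            "profiling_ms": {k: round(v * 1000, 3) for k, v in self.spans.items()},
        }


def should_sample(rate=SAMPLE_RATE, rng=random.random):
    """Decide whether this request gets recorded."""
    return rng() < rate


def handle_request(emit=print, rng=random.random):
    """Toy request handler that emits one JSON log line when sampled."""
    metrics = RequestMetrics(APP_ID)
    with metrics.span("database"):
        time.sleep(0.01)   # stand-in for a query
    with metrics.span("external_api"):
        time.sleep(0.02)   # stand-in for a third-party call
    record = metrics.finish(status=200)
    if should_sample(rng=rng):
        emit(json.dumps(record))
    return record
```

Calling `handle_request()` a few times prints a record for about one request in five. The `span` helper is where the profiling information comes from: wrap any method or external call you suspect and its time shows up next to the total request time.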

If you want to take your observability to the next level, you can take a look at APM (application performance monitoring) libraries. APM allows you to gather low-level information like errors and stack traces from your app with a few lines of code. These libraries are available for most programming languages and frameworks and work with most observability tools.

In this section we should mention another really useful source of information for dealing with performance problems: classic server monitoring. This information is usually available through specific server monitoring applications like Nagios, Centreon, Zabbix or Pandora FMS, or through more generic ones like the observability tools mentioned above.

This information about usage of resources like CPU, RAM, network, etc. could give you a lot of information about your application performance and help you to analyze the problem. So don’t underestimate this kind of information when conducting a performance analysis of your application.

With all the insights of what is happening inside our application we can spot performance problems and start making improvements.

Make a root cause analysis

Now, you have all the information available and you need to start searching for the problems.

In my experience, the typical problems related to poor application performance are usually caused by:

-requests to external APIs that take too long
-unoptimized database queries
-non-scalable code
-lack of resources

Problems with third-party requests should be easy to spot because they tend to be consistent. Usually they are caused by a problem in the vendor’s systems and can be temporary or structural. Although they are easier to find, these problems are not so easy to fix, mainly because you cannot control the code of the API. So basically you should contact your vendor and see what they can do. If you are lucky and it’s an internal API developed by another team, just grab a coffee and talk with your colleague to see what is happening and how you can help.

If your problem is related to non-optimal queries, then the approach is completely different. You should analyze your queries with the “explain” command of your database engine (for example, EXPLAIN in PostgreSQL or MySQL) and look for improvements. We will cover this topic in the following sections.

Another typical problem is non-optimal code. This is very general, I know… But in my experience a lot of problems are caused by code that is not optimized in terms of business logic and use of resources. Some examples are: repeated queries during the same request, business logic executed inside the request that could be deferred, legacy code that is no longer useful but was never removed, and so on…

A lack of resources is sometimes easy to spot if you have proper observability in place. Sometimes, even if your code, queries, etc. are optimized, you can reach the limit of your database engine, web server, etc. In this case you will see errors with messages like “connection limit reached”. We will explore some options to solve them in the following sections.

Using the observability information you should be able to identify problems and their kind quite easily. Once you do that, the next step is to know how to start solving them.

Scaling by improving code performance

First of all I want to talk about code performance because in my opinion it is the easiest problem to solve but the last one people address.

Before optimizing database queries or making major refactors in your code, or improving your hardware you should review your implemented business logic.

But… what should you look for? Great question! I can give you a list of examples:

-repeated execution of the same code during one request: cache the result instead of computing the same thing again
-legacy code that is no longer needed but is still being executed: remove it immediately
-heavyweight loops or iterating over the same collection several times: loop over collections as few times as you can and avoid loops inside loops when possible
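As a rough illustration of the first tip, here is a minimal Python sketch of a per-request cache. The names `FakeDatabase`, `RequestContext` and `handle_request` are made up for this example, and the fake database only counts queries so you can see the effect.

```python
class FakeDatabase:
    """Stand-in for a real database; counts how many queries it receives."""

    def __init__(self):
        self.queries = 0

    def load_user(self, user_id):
        self.queries += 1
        return {"id": user_id, "name": f"user-{user_id}"}


class RequestContext:
    """Lives for one request and remembers values already computed in it."""

    def __init__(self, db):
        self.db = db
        self._cache = {}

    def current_user(self, user_id):
        key = ("user", user_id)
        if key not in self._cache:          # only the first call hits the database
            self._cache[key] = self.db.load_user(user_id)
        return self._cache[key]


def handle_request(db, user_id):
    ctx = RequestContext(db)
    # Three parts of the same request need the user record.
    greeting = f"Hello {ctx.current_user(user_id)['name']}"
    audit = f"viewed by {ctx.current_user(user_id)['id']}"
    footer = f"logged in as {ctx.current_user(user_id)['name']}"
    return greeting, audit, footer
```

With `db = FakeDatabase()`, one call to `handle_request(db, 7)` leaves `db.queries` at 1 instead of 3. The cache is created per request on purpose, so there is no stale data to invalidate between requests.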

Those are the most common pitfalls, but there are plenty of other bad practices to avoid. This is an introductory article, so treat the list as an example to guide your own root cause analysis.

If you fix these issues you can get more performance with little effort, without making a huge refactor or spending a lot of money on more powerful servers.
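The tip about loops inside loops can be sketched in the same way. This is a hypothetical Python example, assuming each customer id is unique: the first function scans every customer for every order, while the second builds a lookup table once and then makes a single pass over the orders.

```python
def attach_customers_nested(orders, customers):
    """Loop inside a loop: scans the customers again for every order."""
    result = []
    for order in orders:
        for customer in customers:
            if customer["id"] == order["customer_id"]:
                result.append((order["id"], customer["name"]))
                break
    return result


def attach_customers_indexed(orders, customers):
    """One pass to build a lookup table, one pass over the orders."""
    by_id = {customer["id"]: customer for customer in customers}
    return [
        (order["id"], by_id[order["customer_id"]]["name"])
        for order in orders
        if order["customer_id"] in by_id   # skip orders without a known customer
    ]
```

Both functions return the same pairs, but the work in the first one grows with orders × customers, while in the second it grows with orders + customers.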

Scaling by improving database performance

Second place goes to query performance optimization.

In my experience, one of the most common issues when working with databases is creating the schema and inserting data without paying attention to performance.

For example, creating the correct index could make your application 10 times faster.

So it’s crucial to ensure you have the correct indexes in place and that your queries and data structures are optimal, to get the most out of your database engine.
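Here is a small sketch of that idea using SQLite from Python’s standard library, only because it runs anywhere without a server. The `orders` table, its columns and the index name are invented for the example, and other engines have their own “explain” syntax and output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10_000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(connection, sql, params):
    """Return the steps SQLite plans to run for this query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]   # the 4th column holds the description


# Without an index the engine has to scan the whole table.
plan_before = query_plan(conn, QUERY, (42,))

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index the plan should switch to a search that uses it.
plan_after = query_plan(conn, QUERY, (42,))

print(plan_before)
print(plan_after)
```

The exact wording of the plan depends on the SQLite version, but the first plan should mention a scan of `orders` and the second one the `idx_orders_customer` index. Comparing the plan before and after a change is a cheap way to check that an index is actually used.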

I won’t dive deeper into database optimization because this is an introductory article. You can find a lot of resources online for your specific database engine.

Scaling by leveraging background processing

When your code and database queries are optimized and some part of your process still takes a long time, you have one last silver bullet… You can try background processing.

Some parts of the business logic, like logging or inserting a huge amount of data, can slow down your requests. Often these actions are not essential to the request itself and you can defer them.

A very good example is a “create user” request that sends the welcome email inside the same request. In this case the request just needs to insert the user information into the database and return OK when done. You can send the email from a background process a few seconds later.

But… how can you do it? OK, it’s easier than it seems.

You can implement an event bus pattern using any of the available message queue servers, like RabbitMQ, Kafka, AWS SQS, etc.

The idea is to send an event from the main request informing about some action, for example “user created”. Then you have a separate process, usually a daemon, running in the background and listening to these events, for example a “WelcomeEmailSenderListener” that sends the welcome email.
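The following Python sketch shows the shape of this flow. It is only an in-process simulation: a `queue.Queue` stands in for the message broker and a thread stands in for the daemon, so with RabbitMQ, Kafka or SQS you would publish and consume through their own client libraries instead. The `users` dict and the `sent_emails` list are placeholders for the database and the email provider.

```python
import queue
import threading

event_bus = queue.Queue()   # in-process stand-in for RabbitMQ, Kafka, SQS...
users = {}                  # stand-in for the users table
sent_emails = []            # stand-in for an email provider


def create_user(email):
    """Request handler: stores the user, publishes an event, returns at once."""
    users[email] = {"email": email}
    event_bus.put({"type": "user_created", "email": email})
    return "OK"


def welcome_email_sender_listener(bus):
    """Background worker: consumes events and sends the welcome email."""
    while True:
        event = bus.get()
        try:
            if event is None:              # sentinel used to stop the worker
                return
            if event["type"] == "user_created":
                sent_emails.append(f"Welcome, {event['email']}!")
        finally:
            bus.task_done()                # mark the event as processed


worker = threading.Thread(
    target=welcome_email_sender_listener, args=(event_bus,), daemon=True
)
worker.start()
```

`create_user("ana@example.com")` returns `"OK"` as soon as the event is queued, and the email is “sent” by the worker shortly after. Because the request and the listener only share the queue, you can run more listeners, or give them different resources, without touching the request code.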

Using background processing you can make your requests faster and it allows you to scale each part of your system separately allocating the resources they need.

Some advanced scaling patterns

When you have applied all the techniques above and your application still performs badly, you need to apply some advanced scaling patterns. These patterns will help you gain more performance for your applications.

Some of these patterns are:

-Implement some caching system
-Scaling servers vertically
-Scaling servers horizontally
-Implement database sharding

We will cover these patterns in depth in the following articles but, as a brief introduction, all of them are based on improving the usage of the hardware resources you have to run your application.
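Just to give a taste of one of them, this is a tiny Python sketch of the routing idea behind sharding: each key is always sent to the same database, and the keys are spread across several of them. The shard names are hypothetical and the function is a simplification, not a complete sharding strategy.

```python
import hashlib

# Hypothetical shard names; in practice these would be connection settings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(key, shards=SHARDS):
    """Map a key to a shard using a stable hash (same key, same shard)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]
```

A call like `shard_for(42)` always returns the same shard for user 42. Keep in mind that with this simple modulo approach most keys move to a different shard when the number of shards changes, which is one reason real setups need more planning.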

In the real world we mix and match to get the best of all of them so you will end up with a more complex infrastructure and applications but on the other hand you will have a more reliable system that allows you to increase the number of requests and users effortlessly.


Scaling is very important to ensure your application can deal with a huge amount of requests and a lot of users which is a good sign that business is performing well.

Enabling good observability to gather performance data from your applications is key to understanding why your application performs poorly.

It’s good practice to check your code and database performance before adding more hardware to run your applications. Some easy fixes, like removing legacy code, refactoring business logic or creating some indexes in the database, could make your app up to ten times faster.

You can improve your application performance further with more advanced scaling patterns like caching, vertical and horizontal scaling, sharding, etc., which allow you to get the most out of the hardware you use to run your application.

This is the first article of a series that will cover some aspects of scaling applications. You will find more information about how to scale your application in the next articles.
