Scalability is one of the top concerns when it comes to designing software systems. Applications are generally created with the objective that someday there will be a large number of users using them.
However, the initial application does not need to support a large number of users. It needs to expand when necessary but overall keep the cost low as long as it can. Let's analyze what is scaling and what are the two different ways to implement it when required.
"Scalability is the measure of a system’s ability to increase or decrease in performance and cost in response to changes in application and system processing demands."
Let's reiterate the keywords:
Ability - As stated earlier, does not have to be super strong at the beginning, but should be able to evolve when needed.
Increase or decrease - Upwards isn't the only direction. To optimize the cost, it needs to shrink too otherwise we will be spending more on resources and reducing our profits.
In response to changes in demand - Demand can be divided into two parts. It's not only the users that can increase, but we also want to occasionally add features. The application needs to monitor the impact of user/feature changes and scale accordingly - whether automatically or through manual intervention.
Imagine that the house(the resource) you're living in was turning out to be smaller for your needs. People(users) in the house increase as life goes on and furnitures(features) increase too as the standard of living goes up. Now you want to expand your house for comfort. There are typically two options
You decide to create another floor above your house and it solves the issue. This is called Vertical Scaling. The users and features remained at the same location but they had a bigger resource supporting it. This is typically equal to adding more physical resources(CPU/Storage/Network upgrages) to an existing server so that it can perform well against increase in demand.
However, you cannot keep creating another floor every time you need to expand. Someday, the strength of the house will become a limitation. In the same way, the limit to which physical resources can be increased is also fixed. Lets look at another way to tackle this problem.
You decide to buy/create another apartment and some of your family members and their things can shift into it. This is called Horizontal scaling. You increase the resources so that the all the load does not come to one resource. This is achieved by adding servers with similar capabilities and a load balancer which divides the user requests among servers.
Now the problem and limitations that we had with vertical scaling is removed in horizontal scaling, we can buy as many apartments as we need (assuming that the apartments are available...but not overthinking it further)
As much as horizontal scaling looks better and efficient in long term, it is not always the easier and certainly not the cheaper option between the two types of scaling.
We will compare the two options but before that let's understand what is being scaled here.
An application is generally created using multiple software components. We talk about tiered architectures. Most of the applications are created using some variation of 3-tier or N-tier architectures. If you want to learn/revise these architectures, check out this thread by Jon Jackson
Jon JacksonAn Introduction to N-tier Systems and Distributed Apps 🌐
A thread ⤵12:18 PM - 13 Oct 2021
So, from the N-tier architecture, we know that there will be multiple backend components supporting the application. They will be divided into tiers and each tier will be physically isolated from the other tiers.
Scaling can be applied to each physical tier separately. Each tier can be scaled in its own way and the entire application can have a mix and match of components with different scaling capabilities.
Note: You may have heard about Microservices architecture. For all practical purposes, a single microservice is an application in itself and has a N-tier architecture of its own. Just like a slice of Pizza is a Pizza itself and contains the same layers.
Below is a comparison of vertical and horizontal scaling based on few desirable parameters.
No Hardware Limitations
Horizontal scaling has no hardware limitations at all and it is possible to scale to as many servers as required. While vertical scaling has an upper limit.
Having one server means that deployment downtimes are likely. Deployment downtimes can be minimized in a vertical system which has a backup server only used during deployments. However, in a horizontal system, the downtimes can practically be reduced to zero. Code can be deployed to servers in batches and they can be taken offline during the deployment while other servers take the load.
It is the ability of a system to continue working (sometimes with reduced performance) when one or more of its components fail. Here too, horizontal systems perform better. If one server fails, only a part of the system's capacity is lost and it can still remain fully functional. There may be some reduction in performance till the fault is resolved. In a vertical system, it fails miserably and needs immediate repair.
For initial costs, vertical scaling looks cost-effective. You only use what you absolutely need. Moreover, you save a lot of money on other infrastructure - network, virtualization technology, load balancers and sometimes people(administrators).
Cost comparison isn't very straightforward in the vertical vs horizontal battle. If availability of a system is a concern and results in loss of revenue, the vertical systems do not turn out to be cost-effective.
Multiple servers are harder to maintain. Additional complexity is added when load balancers are added. Deployment strategies need to be created and followed.
- Design with the intent that the application has to grow eventually.
- Design your application to be easily replicable from one server to another when the need arises.
- Containerization and serverless solutions can make horizontal scaling cheaper. Start with cloud solutions unless you already own private infrastructure.
- Scaling has to be maintained at every backend tier. Think about each component - Database servers have to scale too and not just the application.
I hope this gives an insight into the need for scaling and the two different types of scaling. Stay tuned for the further articles where we will discuss about scaling in terms of:
- Monitoring applications to recognize the need to scale.
- An introduction to containerization and how it helps in deploying applications across multiple servers.
- Setting up auto-scaling with Azure App Service.
If you want to connect with me, I am available on Twitter (DMs are open on DEV as well as Twitter)