DEV Community

Igor Irianto

Scalability For Beginners

What Is Scalability?

Many of us have heard of scalability. But what does scalability really mean?

To define scalability, imagine that you created an e-learning site. It gets moderate traffic: about 1,000 people visit it over the course of a day. One day, a very famous influencer shares your site and suddenly 100,000 people visit it within an hour. If your site is not prepared to scale, it will crash.

Having a scalable site means being able to handle more or less load without sacrificing user experience.

Notice that I said "more or less". Scalability goes two ways: up and down. Most people (including me) initially think of scalability only as scaling up. But when traffic is low, why not scale down? By using fewer resources, you save your business money! After all, business is all about making and saving money.

In general, scalability can be defined as the ability to swiftly and reliably change capacity to meet client demand while staying cost-effective.
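To make "changing capacity to meet demand" concrete, here is a minimal sketch that estimates how many servers the e-learning site would need on a quiet day versus during the influencer spike. The per-server capacity of 10 requests per second, the `servers_needed` helper, and the idea that one visitor equals one request are all assumptions made up for illustration.

```python
import math


def servers_needed(requests_per_second: float,
                   capacity_per_server: float,
                   min_servers: int = 1) -> int:
    """Estimate how many servers are needed for a given load.

    capacity_per_server is a hypothetical figure: how many requests
    per second one server can handle comfortably.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never go below the minimum, even when traffic is close to zero.
    return max(min_servers, needed)


# Quiet day: about 1,000 visitors spread over 24 hours.
quiet = servers_needed(1_000 / 86_400, capacity_per_server=10)

# Influencer spike: 100,000 visitors within a single hour.
spike = servers_needed(100_000 / 3_600, capacity_per_server=10)

print(quiet, spike)  # 1 3
```

Scaling up means going from `quiet` to `spike` servers when the traffic arrives. Scaling down means going back again afterwards so you stop paying for idle machines.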

What Are We Scaling?

There are three components you can scale:

  1. Concurrency: Instead of about 50 people visiting your site within an hour, you now have 5,000 people visiting within an hour. Can your server handle all these open connections?
  2. Amount of data: As your website grows, your product offerings, content, and analytics will grow too. Will you have enough storage capacity? Can you still fetch, sort, and transfer more data at the same speed?
  3. Interaction rate (latency): If you add a new chat feature, can your website handle the increased interactivity without responding more slowly? If you used to get one request every 20 seconds, can you now handle 20 requests per second?
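The jump in the third item is bigger than it may look at first glance. A quick back-of-the-envelope calculation (the variable names are just for illustration):

```python
# Before: one request every 20 seconds.
seconds_between_requests_before = 20

# After: 20 requests every second.
requests_per_second_after = 20

# How many requests arrive now in the time one request used to take?
growth_factor = requests_per_second_after * seconds_between_requests_before

print(growth_factor)  # 400
```

Going from one request every 20 seconds to 20 requests per second is a 400-fold increase in request rate.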

How To Scale?

There are two primary ways to scale: vertically and horizontally.

To scale vertically is to upgrade your machine (increasing storage or RAM, upgrading the processor, etc.). To scale horizontally is to use multiple machines to distribute the load.

Technically, there is also a third method to help you scale: using CDNs.

Vertical Scaling

Vertical scaling is done by upgrading your machine. If you need more RAM, add more RAM to your machine. If you need more storage, get more hard drives. If you need more network I/O, upgrade your network interfaces.

However, keep in mind that price doesn't scale linearly with performance: higher-end hardware tends to cost disproportionately more. A 500 GB SSD can cost more than twice as much as a 250 GB SSD, and a 1 TB SSD can cost a lot more than twice the price of a 500 GB SSD.

At some point, it's more economical to scale horizontally.
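Here is a rough sketch of that trade-off. The prices below are hypothetical numbers picked only to show the shape of the curve, not real market prices, and the `vertical_cost` and `horizontal_cost` helpers are invented for this example.

```python
import math

# Hypothetical drive prices in dollars, keyed by capacity in GB.
# These are made-up numbers, not real market data.
HYPOTHETICAL_PRICES = {250: 50, 500: 110, 1000: 260}


def vertical_cost(capacity_gb: int) -> int:
    """Cost of one big drive with the full capacity."""
    return HYPOTHETICAL_PRICES[capacity_gb]


def horizontal_cost(capacity_gb: int, unit_gb: int = 250) -> int:
    """Cost of reaching the same capacity with several small drives."""
    units = math.ceil(capacity_gb / unit_gb)
    return units * HYPOTHETICAL_PRICES[unit_gb]


for capacity in (250, 500, 1000):
    print(capacity, vertical_cost(capacity), horizontal_cost(capacity))
# 250 50 50
# 500 110 100
# 1000 260 200
```

With these assumed prices, the gap between the two approaches widens as capacity grows. The sketch ignores the extra effort of running several machines, so treat it as an illustration of the pricing curve only.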

Horizontal Scaling

Horizontal scaling is all about quantity over quality. If your one and only server is swamped, you would just buy more servers of the same or lesser performance!

Recall that the downside of vertical scaling is that price rises faster than performance. Horizontal scaling largely avoids this limitation. Adding a second server of the same kind roughly doubles your cost, and each additional server adds about the same amount again, so cost grows roughly in line with capacity.

If you are doing horizontal scaling, you would use a load balancer (or some sort of reverse proxy server) in front of your servers to serve the incoming traffic. To add or remove servers, just add or remove those servers from the load balancer pool.
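To show what "adding or removing servers from the pool" looks like, here is a minimal in-memory sketch of a round-robin load balancer. Round-robin is only one of several possible strategies, and the `RoundRobinBalancer` class and the `app-1`, `app-2`, `app-3` server names are hypothetical. A real load balancer would also do health checks and actually forward network traffic.

```python
class RoundRobinBalancer:
    """Hands out servers from a pool in rotating order."""

    def __init__(self, servers=()):
        self._servers = list(servers)
        self._next = 0  # index of the server that gets the next request

    def add_server(self, server: str) -> None:
        """Scale up: put a new server into the pool."""
        if server not in self._servers:
            self._servers.append(server)

    def remove_server(self, server: str) -> None:
        """Scale down: take a server out of the pool."""
        self._servers.remove(server)

    def pick(self) -> str:
        """Return the server that should handle the next request."""
        if not self._servers:
            raise RuntimeError("no servers in the pool")
        # The modulo keeps the index valid after the pool shrinks.
        server = self._servers[self._next % len(self._servers)]
        self._next = (self._next + 1) % len(self._servers)
        return server


lb = RoundRobinBalancer(["app-1", "app-2"])
first_four = [lb.pick() for _ in range(4)]
print(first_four)  # ['app-1', 'app-2', 'app-1', 'app-2']

lb.add_server("app-3")      # traffic went up: add a server
after_add = [lb.pick() for _ in range(3)]
print(after_add)   # ['app-1', 'app-2', 'app-3']

lb.remove_server("app-2")   # traffic went down: remove a server
after_remove = [lb.pick() for _ in range(2)]
print(after_remove)  # ['app-1', 'app-3']
```

The clients only ever talk to the balancer, so the pool behind it can grow and shrink without them noticing.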

Once a single machine becomes expensive to upgrade, horizontal scaling is usually the more cost-effective option.


CDN

A CDN (Content Delivery Network) is another useful scalability tool. If your website is hosted on a server in Dallas, TX (United States) and your client is in Singapore, each request has to travel across the ocean and back, which takes a while. But let's say you store your website data on a server in Tokyo, Japan. That's a lot less distance to travel!

Your CDN provider usually has servers spread out across different regions of the world. In the scalability context, you can leverage a CDN to cache static files. If this client from Singapore visits your site, the CDN serves the content from the closest available server instead of your own home server in Dallas. The fewer requests your Dallas server has to serve, the more computing resources you can allocate to other tasks!
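The two ideas above (serve from the closest server, and cache static files so the origin is hit less often) can be sketched in a few lines. This is a toy model, not how any particular CDN works: real CDNs route by network conditions rather than straight-line distance, and the coordinates, the `EdgeCache` class, and the `/logo.png` path are approximations or made-up names for illustration.

```python
import math

# Approximate (latitude, longitude) pairs, for illustration only.
LOCATIONS = {
    "dallas": (32.78, -96.80),
    "tokyo": (35.68, 139.69),
    "singapore": (1.35, 103.82),
}


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2)
         * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def closest_server(client, servers):
    """Pick the server with the shortest distance to the client."""
    return min(servers,
               key=lambda s: haversine_km(LOCATIONS[client], LOCATIONS[s]))


class EdgeCache:
    """A CDN edge server that only asks the origin on a cache miss."""

    def __init__(self, fetch_from_origin):
        self._fetch_from_origin = fetch_from_origin
        self._store = {}
        self.origin_requests = 0

    def get(self, path):
        if path not in self._store:
            # Cache miss: one trip to the origin, then remember the file.
            self.origin_requests += 1
            self._store[path] = self._fetch_from_origin(path)
        return self._store[path]


nearest = closest_server("singapore", ["dallas", "tokyo"])
print(nearest)  # tokyo

edge = EdgeCache(lambda path: f"contents of {path}")
edge.get("/logo.png")
edge.get("/logo.png")
print(edge.origin_requests)  # 1
```

The second request for `/logo.png` never reaches the origin, which is exactly the load you are trying to take off the Dallas server.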
