Building a Scalable URL Shortener Service That Can Handle Billions of Requests
In this article, we’ll explore how to implement a scalable URL shortener service step-by-step. To follow along with the code, you can check out the full implementation in the scalable_url_shortener repository. This repository contains all the necessary code, including setup instructions and configurations using Docker Compose. By the end of this article, you’ll have a clear understanding of how to build a robust and scalable URL shortener service that can handle billions of requests.
URL shortening is a technique that creates shorter versions of URLs, serving as aliases for longer ones. This is particularly useful for sharing links on platforms with character limits or for improving the aesthetics of links.
Functional Requirements
URL Shortening: Our service should be able to shorten the URL provided by the user. The shortened URL code should consist of a mix of uppercase and lowercase alphabet characters (A-Z, a-z) and digits (0-9).
Redirection: When a user accesses the shortened URL, they should be redirected to the original, full-length URL.
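With 62 possible characters per position (A-Z, a-z, 0-9), even very short codes give an enormous keyspace. A quick back-of-the-envelope check:

```ruby
# Keyspace for short codes drawn from [A-Za-z0-9]
alphabet_size = 26 + 26 + 10  # => 62

(5..8).each do |len|
  puts "length #{len}: #{alphabet_size**len} possible codes"
end
# A 7-character code already yields 62**7 ≈ 3.5 trillion combinations,
# far more than the ~1.2 billion URLs we'd generate in a year.
```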
Non-Functional Requirements
Low Latency: The service must respond quickly to both URL shortening and redirection requests.
High Availability: The service should be reliably accessible at all times, ensuring users can shorten URLs and be redirected without interruption.
Strong Consistency: Each unique long URL should generate a unique short URL, and there should never be a case where two different long URLs map to the same short URL.
Assumptions
- The ratio of read (redirection) operations to write (URL shortening) operations is 100:1.
- The service is expected to generate approximately 100 million unique shortened URLs per month.
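These assumptions translate directly into throughput targets. A rough estimate (assuming a 30-day month):

```ruby
# Back-of-the-envelope QPS estimate from the assumptions above
writes_per_month = 100_000_000
seconds_per_month = 30 * 24 * 3600  # ~2.6 million seconds

write_qps = writes_per_month / seconds_per_month.to_f
read_qps  = write_qps * 100  # 100:1 read-to-write ratio

puts format('~%.0f writes/sec, ~%.0f reads/sec', write_qps, read_qps)
# Roughly 40 writes/sec and 4,000 reads/sec on average; peak traffic
# will be higher, which is why horizontal read scaling matters.
```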
Topics We Won’t Cover (But Don’t Ignore in Production)
Caching: While caching is essential for any scalable service to reduce database load and improve response times, it’s not unique to URL shorteners. Therefore, we will omit caching from this discussion.
Load Balancing: Like caching, load balancing is a general technique used to distribute traffic across multiple servers. Although critical for scalability, it is not specific to URL shorteners, so we will not dig deep into load balancing here.
Rate Limiting: Rate limiting is crucial to protect your service from abuse and to ensure fair usage across all users. By controlling the number of requests a user can make within a certain time period, you can prevent excessive load on your system and mitigate the risk of Denial of Service (DoS) attacks.
Stack We Are Going to Use
Ruby: Although the codebase introduced in this article is written in Ruby, you can use whatever language you prefer, as the key focus is on the algorithm we’re going to implement. We will use the Sinatra web framework since it’s a minimal framework, and our use case doesn’t require anything more complex.
MongoDB: We will use MongoDB as our NoSQL database, and we’ll explain why it’s a good fit for this service.
etcd: We’ll use etcd, a distributed key-value store, to reliably manage data across a cluster of machines. It will help us generate unique short URLs in a distributed environment.
API Design
To shorten a URL, send the following request:
POST /shorten
Request Body
{
"url": "http://test.com"
}
Response
{"short_url":"av32cd"}
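As a sketch, here is how a client might build this request in Ruby using the standard library's Net::HTTP (the host and port are assumptions for a locally running instance; adjust for your deployment):

```ruby
require 'net/http'
require 'json'

# Assumed local address; Sinatra listens on port 4567 by default
uri = URI('http://localhost:4567/shorten')

req = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json')
req.body = { url: 'http://test.com' }.to_json

# Uncomment to send the request against a running instance:
# res = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
# puts res.body  # a JSON document such as {"short_url":"av32cd"}
```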
Database Choice
Given that our servers will receive millions of requests, it's crucial to consider database scalability from the very beginning.
For a service like URL shortening, the amount of data we need to store is relatively small. However, due to the high volume of read-heavy traffic, our storage solution must be horizontally scalable to handle the load efficiently and maintain low-latency responses as the service grows.
Our data model is straightforward, with minimal need for complex joins beyond associating each shortened URL with a specific user. This simplicity makes NoSQL databases a better fit for our requirements. While it's possible to use an SQL database, doing so would necessitate careful planning and the implementation of multiple read replicas to achieve the desired scalability and performance.
We have chosen MongoDB as our database solution because it scales more easily compared to traditional SQL databases. MongoDB's flexible schema and built-in support for horizontal scaling make it well-suited for handling large-scale, distributed applications like our URL shortener service.
However, employing multiple read replicas, which is essential for scaling, introduces potential concurrency issues. Specifically, we must ensure that once a short URL code is generated, it is not duplicated by another request before the write operation has been propagated to all replicas. Addressing this challenge is critical to maintaining the uniqueness and integrity of our shortened URLs across a distributed system.
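The core idea is that uniqueness should be guaranteed at generation time, not checked after the fact. The following toy sketch illustrates this with a process-local atomic counter; in our actual design, etcd plays this role across many app servers (this is an illustration, not the repository's implementation):

```ruby
# Toy stand-in for a distributed counter: a process-local atomic counter.
# Because every caller receives a distinct number, no two requests can
# ever produce the same short code, regardless of replica lag.
class LocalCounter
  def initialize
    @mutex = Mutex.new
    @value = 0
  end

  def next
    @mutex.synchronize { @value += 1 }
  end
end

counter = LocalCounter.new
ids = Queue.new

# Simulate 10 concurrent "shorten" requests issuing 100 IDs each
threads = 10.times.map do
  Thread.new { 100.times { ids << counter.next } }
end
threads.each(&:join)

all = []
all << ids.pop until ids.empty?
puts all.uniq.size  # 1000 — every request received a distinct ID
```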
Implementation of Endpoints in Ruby
Here is the implementation of the two primary endpoints for our application (you can find the actual code in this GitHub repository):
require 'sinatra'
require 'json'

post '/shorten' do
  # The API design above specifies a JSON body ({"url": "..."}),
  # so we parse the request body rather than relying on form params.
  payload = JSON.parse(request.body.read) rescue {}
  url = payload['url']
  halt 400, 'URL is required' unless url

  counter = CounterService.get_next_counter # globally unique, monotonically increasing ID
  short_url = Base62.encode(counter)        # encode the ID into a compact [A-Za-z0-9] code
  Url.create(short_url, url)

  content_type :json
  { short_url: short_url }.to_json
end

get '/:short_url' do
  short_url = params[:short_url]
  url_doc = Url.find_by_short_url(short_url)
  halt 404, 'URL not found' unless url_doc

  redirect url_doc[:original_url]
end
As you can see, to generate a unique short URL, we use CounterService and Base62. I will discuss these components in detail in the next article to keep this one concise.
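To give a flavor of what Base62 does, here is a minimal encoding sketch (illustrative only; the repository's implementation may differ in detail, e.g. in alphabet ordering):

```ruby
# Minimal Base62 encoder: repeatedly take the remainder modulo 62
# and map it to a character, building the code from right to left.
module Base62
  ALPHABET = ('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a

  def self.encode(number)
    return ALPHABET[0] if number.zero?

    result = ''
    while number > 0
      number, remainder = number.divmod(62)
      result.prepend(ALPHABET[remainder])
    end
    result
  end
end

puts Base62.encode(125)  # => "21" (since 125 = 2 * 62 + 1)
```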
I hope you enjoyed this first article, and I look forward to seeing you in the next part of the series.