DEV Community

Ashis Chakraborty
Ashis Chakraborty

Posted on • Originally published at on

Designing a Rate Limiter

How do you design a rate limiter service?

Rate limiting is a simple concept to grab, but it’s vital for large-scale systems. It is used for security purposes as well as performance improvement. For example, a service can serve a limited number of requests per second. However, if a service is receiving a huge number of requests at a time, it might be too much for the server to handle. In that case, even the server might be shut down. To handle this type of problem, we would need some kind of limiting mechanism that would allow only a certain number of requests to the service.

Why do Need Rate Limiting?

Rate limiting is needed to protect a system from being brought down by hackers. To know the importance of rate-limiting, we have to know about the DOS attack.
The denial of service attack, which is mostly known as a DOS attack, is when hackers try to flood a system with many requests within a short period of time to shut the system down. Because a server has a limit of how many requests it can serve within a certain period of time. So, the system can’t handle the floods of requests properly; it can not handle them properly.
Rate limiting can prevent that from happening as it protects the system from being flooded with intentional unnecessary requests. After a threshold value, the system returns an error.

Sliding Window Algorithm:

If we keep track of each request per user in a time frame, we may store the timestamp of each request in a Sorted Set in our ‘value’ field of hash-table. But, it would take a lot of memory. So, we can use a sliding window with the counters. Now, consider this if we keep track of request counts for each user using multiple fixed time windows.
For example, if we have an hourly window, when we receive a new request to calculate the limit, we can count requests each minute and calculate the sum of all counters in the past hour. This would reduce the memory we need for the set.
Suppose we rate-limit at 500 requests per hour with an additional limit of 10 requests per minute. This means that the client has exceeded the rate limit when the sum of the counters in the past hour exceeds the limit request(500).
For a smaller time frame, the client can’t send more than ten requests per minute. This would be a hybrid and reasonable consideration as we may think that none of the real users would send such frequent requests. Even if they do, they will get success with retries since their limits get reset every minute.
Continue reading on Towards Data Science »

Discussion (0)