HTTP Rate Limit

#python #http #webdev #networking

Draft

The story starts with a link checker someone shared, which mentions the HTTP rate limit headers from an IETF draft.

Ideally, we expect something like this in the HTTP response headers:

   RateLimit-Limit: 10
   RateLimit-Remaining: 1
   RateLimit-Reset: 7

RateLimit-Reset gives the number of seconds remaining in the current time window. Do not treat it as a fixed value: it counts down and can differ from one response to the next.

The response may also contain a Retry-After header, usually together with a 429 (Too Many Requests) status code.
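As a rough sketch of how a client could act on these headers, the helper below decides how long to pause before the next request. The function name `seconds_to_wait` is made up for illustration, and it assumes the values are plain integers in seconds (it ignores the HTTP-date form of Retry-After).

```python
from typing import Mapping, Optional


def seconds_to_wait(status_code: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return how long to pause before the next request, or None for no pause.

    Hypothetical helper: only the delay-in-seconds form of Retry-After is handled.
    """
    # Header names are case-insensitive, so normalize the keys first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}

    # A 429 with Retry-After is the most explicit signal from the server.
    retry_after = lowered.get("retry-after")
    if status_code == 429 and retry_after is not None and retry_after.isdigit():
        return float(retry_after)

    # Otherwise, pause only when the quota for the current window is used up.
    remaining = lowered.get("ratelimit-remaining")
    reset = lowered.get("ratelimit-reset")
    if remaining is not None and reset is not None:
        if remaining.isdigit() and reset.isdigit() and int(remaining) == 0:
            return float(reset)
    return None


# One request is still left in the window, so there is no need to wait yet.
print(seconds_to_wait(200, {"RateLimit-Limit": "10", "RateLimit-Remaining": "1", "RateLimit-Reset": "7"}))
# The quota is exhausted, so wait for the window to reset.
print(seconds_to_wait(200, {"RateLimit-Remaining": "0", "RateLimit-Reset": "7"}))
```

With the example headers above this prints `None` and then `7.0`.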

ratelimit-headers has a test implementation of this draft.

Sadly, some HTTP APIs do not strictly implement this draft, and others do not send these headers at all. You can find different names like X-RateLimit-Reset, X-RateLimit-Requests-Reset, X-RateLimit-Reset-After, etc. Some official SDKs may already account for these variants.
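One way to cope with the naming variants is to check a list of candidate names in order. This is only a sketch: `RESET_HEADER_NAMES` and `find_reset_header` are hypothetical names, and the list covers just the variants mentioned here. The meaning of the value can also differ between APIs (for example seconds remaining versus a Unix timestamp), so the helper returns the raw string and leaves interpretation to the caller.

```python
from typing import Mapping, Optional, Tuple

# Candidate header names, checked in order. The first one is from the draft,
# the others are non-standard variants. Extend this for the APIs you call.
RESET_HEADER_NAMES = (
    "ratelimit-reset",
    "x-ratelimit-reset",
    "x-ratelimit-requests-reset",
    "x-ratelimit-reset-after",
)


def find_reset_header(headers: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (lowercased name, raw value) of the first known reset header, or None."""
    # Header names are case-insensitive, so compare them in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in RESET_HEADER_NAMES:
        if name in lowered:
            return name, lowered[name]
    return None


print(find_reset_header({"X-RateLimit-Reset-After": "12"}))
print(find_reset_header({"Content-Type": "application/json"}))
```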

Python `httpx` with rate limit

There are already some rate limiting implementations for Python HTTP clients. One of them is aiometer, but it is not suitable for my use case. Since httpx already has an internal connection pool, it would be better to reuse that design.

BTW, my use case is a web crawler client. I want to query a URL directly in the code (with a rate limit), instead of gathering lots of URLs and passing them to a map function.

Here is a simple implementation:

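import asyncio
from typing import Any, Self  # Self needs Python 3.11+; use typing_extensions on older versions

import httpx


class RateLimitTransport(httpx.AsyncHTTPTransport):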
    def __init__(self, max_per_second: float = 5, **kwargs) -> None:
        """
        Async HTTP transport with rate limit.

        Args:
            max_per_second: Maximum number of requests per second.

        Other args are passed to httpx.AsyncHTTPTransport.
        """
        self.interval = 1 / max_per_second
        self.next_start_time = 0
        super().__init__(**kwargs)

    async def notify_task_start(self):
        """
        Wait until the next request is allowed to start, then reserve its slot.

        Adapted from aiometer:
        https://github.com/florimondmanca/aiometer/blob/358976e0b60bce29b9fe8c59807fafbad3e62cbc/src/aiometer/_impl/meters.py#L57
        """
        loop = asyncio.get_running_loop()
        while True:
            now = loop.time()
            next_start_time = max(self.next_start_time, now)
            until_now = next_start_time - now
            if until_now <= self.interval:
                break
            await asyncio.sleep(max(0, until_now - self.interval))
        self.next_start_time = max(self.next_start_time, now) + self.interval

    async def handle_async_request(self, request: httpx.Request) -> httpx.Response:
        await self.notify_task_start()
        return await super().handle_async_request(request)

    async def __aenter__(self) -> Self:
        # Opening the transport sends no request, so it does not reserve a slot.
        return await super().__aenter__()

    async def __aexit__(self, *args: Any) -> None:
        await super().__aexit__(*args)

You can specify the rate limit when you initialize your HTTP client, like this:

client = httpx.AsyncClient(
    transport=RateLimitTransport(max_per_second=20),
)
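To get a feel for what the pacing logic in `notify_task_start` does, here is a simplified replay of it on a virtual clock. It is a sketch, not part of the transport: `simulate_start_times` is a hypothetical helper, and it assumes all requests arrive at time 0 and that every sleep wakes up exactly on time.

```python
from typing import List


def simulate_start_times(num_requests: int, max_per_second: float) -> List[float]:
    """Replay the pacing logic for requests that all arrive at time 0.

    Uses a virtual clock instead of asyncio.sleep, so it returns instantly.
    """
    interval = 1 / max_per_second
    next_start_time = 0.0
    now = 0.0
    start_times = []
    for _ in range(num_requests):
        # Same check as notify_task_start: wait only while the reserved slot
        # is more than one interval in the future.
        until_now = max(next_start_time, now) - now
        if until_now > interval:
            now += until_now - interval  # stands in for asyncio.sleep
        start_times.append(now)
        # Reserve the slot for this request.
        next_start_time = max(next_start_time, now) + interval
    return start_times


print(simulate_start_times(5, max_per_second=2))
```

Under these assumptions the output is `[0.0, 0.0, 0.5, 1.0, 1.5]`: the first two requests start immediately, and later ones are spaced one interval apart. If an API does not tolerate that initial burst, a slightly lower `max_per_second` leaves some headroom.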
