Deon Pillsbury


How to rate limit FastAPI with Redis 📈

Rate limiting is crucial for protecting an API against Distributed Denial of Service outages caused by malicious actors, as well as against API consumers who unintentionally call an endpoint inefficiently. The following script shows how to add per-user rate limiting to a FastAPI middleware using Redis.

Note: This example is available on GitHub


Make sure you have a Redis Server running and the Python dependencies installed. We are using Docker Compose to run Redis and Poetry to manage the dependencies in this example.

$ docker-compose up redis -d
$ poetry install

Set up the imports, settings, and clients, then bootstrap the app. Note that the rate limit is set to 3 requests per minute in the settings. A .env file is used locally so that the Redis password is not exposed in the source code. The file is required because the redis_password setting has no default value.

import hashlib
from datetime import datetime, timedelta
from typing import Annotated, Any, Callable, TypeVar

import uvicorn
from fastapi import FastAPI, Header, Request, Response
from fastapi.responses import JSONResponse
from pydantic_settings import BaseSettings, SettingsConfigDict
from redis.asyncio import Redis

F = TypeVar("F", bound=Callable[..., Any])

class Settings(BaseSettings):
    redis_password: str
    redis_host: str = ""
    redis_port: int = 6379
    user_rate_limit_per_minute: int = 3
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

settings = Settings()

redis_client = Redis(
    host=settings.redis_host,
    port=settings.redis_port,
    password=settings.redis_password,
)

app = FastAPI(
    title="FastAPI Rate Limiting",
    description="Rate limiting users using Redis middleware",
)

Main Logic

This is the main rate limiter logic. The implementation enforces a per-user, per-minute rate limit. The user's name is hashed and combined with the datetime string for the current minute to form a Redis key. That key is then incremented in Redis, which creates it with a value of 1 if it does not exist yet. If the returned value is 1, the key was just created, so we set a 1 minute expiration on it to make sure old keys are cleaned up. If the current count exceeds the rate limit, we return an HTTP 429 Too Many Requests error with an X-Rate-Limit header that tells the user what the rate limit is, and a Retry-After header that tells them how long to wait before calling again.

async def rate_limit_user(user: str, rate_limit: int) -> JSONResponse | None:
    """
    Apply rate limiting per user, per minute
    """
    # Increment our most recent redis key
    username_hash = hashlib.sha256(bytes(user, "utf-8")).hexdigest()
    now = datetime.utcnow()
    current_minute = now.strftime("%Y-%m-%dT%H:%M")

    redis_key = f"rate_limit_{username_hash}_{current_minute}"
    current_count = await redis_client.incr(redis_key)

    # If we just created a new key (count is 1) set an expiration.
    # Use a relative TTL: expireat() with a naive UTC datetime would be
    # interpreted as local time and skew the expiry on non-UTC hosts.
    if current_count == 1:
        await redis_client.expire(name=redis_key, time=timedelta(minutes=1))

    # Check rate limit
    if current_count > rate_limit:
        return JSONResponse(
            status_code=429,
            content={"detail": "User Rate Limit Exceeded"},
            headers={
                "Retry-After": f"{60 - now.second}",
                "X-Rate-Limit": f"{rate_limit}",
            },
        )

    return None
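
To make the windowing concrete, here is a minimal standalone sketch of the key construction and the Retry-After arithmetic described above. It needs no Redis server. The helper names `window_key` and `retry_after_seconds` are hypothetical and only used for illustration; the timestamps are sample values.

```python
import hashlib
from datetime import datetime, timezone


def window_key(user: str, now: datetime) -> str:
    """Build a per-user, per-minute key in the same shape as the article's."""
    username_hash = hashlib.sha256(user.encode("utf-8")).hexdigest()
    return f"rate_limit_{username_hash}_{now.strftime('%Y-%m-%dT%H:%M')}"


def retry_after_seconds(now: datetime) -> int:
    """Seconds left until the next minute window opens."""
    return 60 - now.second


# Two requests in the same minute share a key; the next minute gets a new one.
first = datetime(2023, 9, 9, 20, 42, 5, tzinfo=timezone.utc)
second = datetime(2023, 9, 9, 20, 42, 27, tzinfo=timezone.utc)
third = datetime(2023, 9, 9, 20, 43, 1, tzinfo=timezone.utc)

same_window = window_key("joe", first) == window_key("joe", second)
new_window = window_key("joe", second) != window_key("joe", third)
wait = retry_after_seconds(second)
print(same_window, new_window, wait)  # True True 33
```

Because the counter is keyed by calendar minute, this is a fixed-window limiter: a user could make 3 calls at the end of one minute and 3 more at the start of the next.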

Plug this into the API middleware and call our rate_limit_user function. If a response is returned, the rate limit has been exceeded and we return the HTTP 429 response right away. Otherwise the request continues as normal.

@app.middleware("http")
async def rate_limit_middleware(request: Request, call_next: F) -> Response:
    """
    Rate limit requests per user
    """
    user = request.headers.get("x-user")
    if user:
        rate_limit_exceeded_response = await rate_limit_user(
            user=user, rate_limit=settings.user_rate_limit_per_minute
        )
        if rate_limit_exceeded_response:
            return rate_limit_exceeded_response

    return await call_next(request)

Add an endpoint to test with.

Note: In my production environments, an OpenResty NGINX reverse proxy handles authentication and passes the X-User header to the API. For this small test script, we allow the X-User header to be passed manually to emulate that setup, but this should not be done in production.

Example Endpoint

@app.get("/user", response_model=str | None)
def get_user(x_user: Annotated[str | None, Header()] = None):
    """
    **NOTE**: the `X-User` header should be passed by a reverse proxy,
    we are manually adding it to this endpoint so you can test
    this example locally
    """
    return x_user

Test It Out

$ python3 -m app.main

INFO:     Uvicorn running on (Press CTRL+C to quit)
INFO:     Started reloader process [16315] using WatchFiles
INFO:     Started server process [16317]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO: - "GET /user HTTP/1.1" 200 OK
INFO: - "GET /user HTTP/1.1" 200 OK
INFO: - "GET /user HTTP/1.1" 200 OK
INFO: - "GET /user HTTP/1.1" 429 Too Many Requests

Make 4 calls in quick succession. Since our rate limit is set to 3, the 4th call should fail, which is also reflected in the API logs. The response headers show that the rate limit is 3 and that we can retry after 33 seconds.

$ curl -v http://localhost:8000/user -H 'x-user: joe'
< HTTP/1.1 200 OK

$ curl -v http://localhost:8000/user -H 'x-user: joe'
< HTTP/1.1 200 OK

$ curl -v http://localhost:8000/user -H 'x-user: joe'
< HTTP/1.1 200 OK

$ curl -v http://localhost:8000/user -H 'x-user: joe'
*   Trying
* Connected to localhost ( port 8000 (#0)
> GET /user HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/8.1.2
> Accept: */*
> x-user: joe
< HTTP/1.1 429 Too Many Requests
< date: Sat, 09 Sep 2023 20:42:27 GMT
< server: uvicorn
< retry-after: 33
< x-rate-limit: 3
< content-length: 37
< content-type: application/json

{"detail":"User Rate Limit Exceeded"}
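
The same sequence of status codes can be reproduced without a running Redis server. The sketch below is only an approximation: `FakeCounterStore`, `status_for_request`, and `simulate` are hypothetical names, the store is an in-memory stand-in for the Redis INCR call, and key expiry is not modelled.

```python
import asyncio
from collections import defaultdict


class FakeCounterStore:
    """Hypothetical in-memory stand-in for the Redis INCR call (no expiry)."""

    def __init__(self) -> None:
        self._counts: dict[str, int] = defaultdict(int)

    async def incr(self, key: str) -> int:
        self._counts[key] += 1
        return self._counts[key]


async def status_for_request(store: FakeCounterStore, key: str, rate_limit: int) -> int:
    """Return 429 once the counter for this window exceeds the limit, else 200."""
    current_count = await store.incr(key)
    return 429 if current_count > rate_limit else 200


async def simulate(calls: int, rate_limit: int) -> list[int]:
    store = FakeCounterStore()
    # All calls use one fixed key, as if they landed in the same minute window.
    key = "rate_limit_joe_window"
    return [await status_for_request(store, key, rate_limit) for _ in range(calls)]


statuses = asyncio.run(simulate(calls=4, rate_limit=3))
print(statuses)  # [200, 200, 200, 429]
```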

This API is now protected from any single user overloading it. Because the check runs in middleware, the rate limiting is also applied automatically to any new endpoints.
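
On the consumer side, a client can use the Retry-After header to decide how long to pause before calling again. The following is a minimal sketch, assuming the headers are available as a plain dict. The function name `seconds_to_wait` and the 60 second fallback are assumptions for illustration, not part of the example project.

```python
def seconds_to_wait(status_code: int, headers: dict[str, str], default: int = 60) -> int:
    """Return how long a client should pause before retrying; 0 if not rate limited."""
    if status_code != 429:
        return 0
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get("retry-after", "")
    # Fall back to the default when the header is missing or not a whole number.
    return int(value) if value.isdigit() else default


pause = seconds_to_wait(429, {"retry-after": "33", "x-rate-limit": "3"})
print(pause)  # 33
```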
