AWS Lambda throttling is the mechanism by which AWS limits the number of concurrent executions, and therefore the rate of requests, that a Lambda function can handle. Throttling prevents excessive resource consumption and protects the overall stability and performance of the Lambda service.
When a Lambda function receives a high number of requests or exceeds its allocated concurrency limit, AWS enforces throttling by limiting the rate at which new invocations are accepted. Throttling can occur at several levels:
Account Level Throttling: AWS imposes a Region-wide limit on the total number of concurrent executions across all Lambda functions in an AWS account (1,000 by default). If the account limit is reached, new invocations are throttled until the load decreases or the quota is raised through a Service Quotas increase request.
Function Level Throttling: Each individual Lambda function can have its own concurrency limit, known as reserved concurrency, which restricts the maximum number of simultaneous executions for that function. When the function’s concurrency limit is reached, additional invocations are throttled until existing executions complete or the limit is increased.
Provisioned Concurrency Throttling: If you have provisioned concurrency enabled for your Lambda function, there is a separate limit on the number of pre-initialized instances. When all provisioned instances are busy, additional requests spill over to on-demand concurrency, and they are throttled only once the function’s overall concurrency limit is also exhausted.
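As a concrete illustration, both the function-level (reserved) and provisioned concurrency limits can be declared in an AWS SAM template. This is a sketch: the logical name, handler, runtime, and the two numbers are placeholders to adapt to your own workload.

```yaml
OrdersFunction:                       # placeholder logical name
  Type: AWS::Serverless::Function
  Properties:
    Handler: app.handler              # placeholder handler
    Runtime: python3.12
    # Function-level limit: cap this function at 50 concurrent executions
    ReservedConcurrentExecutions: 50
    # Provisioned concurrency is configured on a published alias
    AutoPublishAlias: live
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 10
```

Note that provisioned concurrency counts against the function’s reserved concurrency, so the provisioned value must be lower than the reserved one.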
When a Lambda function is throttled, AWS responds with an HTTP 429 status and a “TooManyRequestsException” error, indicating that the function is currently unable to process the request. Throttled synchronous invocations are not retried automatically, so they need to be retried or otherwise handled in your application logic; throttled asynchronous invocations are retried by Lambda itself for up to six hours.
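When you invoke a function synchronously through an AWS SDK, a throttled call surfaces as an error carrying that status and code (boto3, for example, exposes both on the exception’s response). A minimal sketch of classifying such errors, assuming the caller has already extracted the error code and HTTP status; the set of codes checked here is an assumption to verify against your SDK version:

```python
# Error codes commonly used by AWS services to signal throttling.
# This set is an assumption, not an exhaustive SDK-defined list.
THROTTLE_CODES = {
    "TooManyRequestsException",  # Lambda's concurrency-throttle error
    "ThrottlingException",
    "Throttling",
}

def is_throttling_error(error_code: str, http_status: int) -> bool:
    """Return True if the error indicates the request was throttled."""
    return http_status == 429 or error_code in THROTTLE_CODES
```

Classifying the error first lets you retry only throttling failures while surfacing genuine errors (bad input, missing function) immediately.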
To mitigate throttling, you can take the following actions:
Monitor your function’s concurrency usage (for example, the CloudWatch ConcurrentExecutions and Throttles metrics) and adjust the concurrency limit based on your workload requirements.
Implement proper error handling and retry mechanisms in your application code to handle throttling errors and resubmit requests if necessary.
Consider using provisioned concurrency to pre-warm your function and ensure a specific number of concurrent instances are always available to handle requests.
Optimize your code and configuration to reduce the time and resources required for each function execution, allowing more efficient use of the available concurrency.
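The retry advice above is usually implemented as exponential backoff with jitter. The sketch below is illustrative rather than an AWS SDK API: ThrottledError is a hypothetical stand-in for the SDK’s throttling exception, and the attempt and delay parameters are arbitrary defaults.

```python
import random
import time

class ThrottledError(Exception):
    """Stand-in for a throttling error (e.g. boto3's TooManyRequestsException)."""

def invoke_with_backoff(call, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry `call` on throttling errors using exponential backoff with jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the error to the caller
            # Full jitter: wait a random amount up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
```

The randomized delay spreads retries from many concurrent clients over time, so they do not all hit the function again in the same instant and re-trigger the throttle.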
It’s important to note that AWS Lambda provides high scalability and performance, but it’s crucial to design your applications to handle and recover from throttling conditions to ensure smooth operation under varying workloads.
For a deeper dive, do read about concurrency limits in AWS Lambda.