In the era of AWS serverless computing, two services are, I would argue, used together very frequently: AWS Lambda and Amazon API Gateway.
Combining these two services allows developers to create APIs that serve millions of customers daily, scaling elastically depending on the workload. It is not all sunshine and rainbows, though.
In your AWS serverless journey, you will, at some point, end up faced with inbound payload size limits, most likely when interacting with AWS Lambda and Amazon API Gateway.
In this article, I will lay out different alternatives you might want to consider when you end up in a situation where Amazon API Gateway and AWS Lambda inbound payload size limits constrain your synchronous workflow.
The typical AWS Lambda Amazon API Gateway synchronous architecture
Let us examine what, I would argue, is the most typical synchronous architecture that combines the AWS Lambda and Amazon API Gateway services.
There is nothing wrong with it. I would argue that the architecture is just fine for the vast majority of use-cases, especially for creating API routes in a serverless way.
The payload size difference problem
It is crucial to notice one critical detail: the difference between the maximum payload sizes of Amazon API Gateway (10 MB) and AWS Lambda (6 MB for synchronous invocations). This delta might be frustrating and confusing. If you send a payload bigger than 6 MB but still within the Amazon API Gateway limit, Amazon API Gateway will happily accept it and then fail when invoking the AWS Lambda function.
I'm not aware of any solution to this problem other than restricting the payload size in userland, that is, in the client that sends the request. Amazon API Gateway can validate the request via JSON Schema, but, to the best of my knowledge, you cannot validate the size of the payload.
One might also look into AWS WAF, where request body validation is possible, but only for payloads of size up to 8192 bytes.
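As a rough illustration of restricting the payload size in userland, here is a minimal client-side sketch in Python. The names `encode_payload`, `PayloadTooLargeError`, and the 64 KB headroom are my own assumptions and not part of any AWS SDK. The headroom exists because the event AWS Lambda receives also carries headers and request metadata, so the body alone has to stay somewhat below 6 MB.

```python
import json

# The 6 MB synchronous AWS Lambda limit discussed above.
LAMBDA_SYNC_LIMIT_BYTES = 6 * 1024 * 1024
# Assumed safety margin (hypothetical value): the event also contains headers
# and request context, so the body cannot use the whole limit.
HEADROOM_BYTES = 64 * 1024


class PayloadTooLargeError(ValueError):
    """Raised on the client before any request is sent."""


def encode_payload(data, limit=LAMBDA_SYNC_LIMIT_BYTES, headroom=HEADROOM_BYTES):
    """Serialize data to compact JSON bytes, refusing bodies that cannot fit."""
    body = json.dumps(data, separators=(",", ":")).encode("utf-8")
    allowed = limit - headroom
    if len(body) > allowed:
        raise PayloadTooLargeError(
            f"payload is {len(body)} bytes, allowed maximum is {allowed}"
        )
    return body
```

Failing early on the client gives the caller a clear error instead of an opaque failure from Amazon API Gateway after the upload has already happened.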
I need to process bigger payloads synchronously
If you cannot afford or do not want to re-architect your solution to be asynchronous (which would, in theory, enable you to process potentially unbounded payload sizes), there is one quick-win type of change you can make that could fit your use case. I'm referring to compressing the payload before sending it to Amazon API Gateway.
Notice that the Amazon API Gateway is NOT decompressing the payload. While the service in question has this capability, it is the opposite of what we want to do. If we were to decompress at the Amazon API Gateway level, the service would try to send a 15 MB payload to AWS Lambda, resulting in an error.
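The compression flow could look roughly like the sketch below. It assumes the client gzips a JSON body and that Amazon API Gateway is configured to pass the body through as binary, in which case the function receives it base64-encoded with `isBase64Encoded` set to true. The function names and the `items` key are hypothetical and only serve the illustration.

```python
import base64
import gzip
import json
import zlib


def compress_payload(data) -> bytes:
    """Client side: serialize to JSON and gzip it before sending."""
    return gzip.compress(json.dumps(data).encode("utf-8"))


def handler(event, context):
    """Lambda side: decompress the body that was passed through untouched."""
    try:
        body = event.get("body") or ""
        if event.get("isBase64Encoded"):
            compressed = base64.b64decode(body)
        else:
            compressed = body.encode("utf-8")
        payload = json.loads(gzip.decompress(compressed))
    except (OSError, EOFError, ValueError, zlib.error):
        # Not valid base64, gzip, or JSON.
        return {"statusCode": 400, "body": json.dumps({"message": "invalid body"})}

    # "items" is a hypothetical field used only for this example.
    return {
        "statusCode": 200,
        "body": json.dumps({"itemCount": len(payload["items"])}),
    }
```

One caveat worth keeping in mind: base64 encoding inflates the data by roughly a third, so the compressed body has to stay at around 4.5 MB or less for the encoded event to fit within the 6 MB limit. The decompressed data also has to fit in the function's memory.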
I need more time to process my synchronous request
By compressing the payload in userland, you can, to some extent, get around the 6 MB payload size limit. What about the 30-second response timeout?
If your payload is large, you might need more time to process the request synchronously. Luckily, with some changes to our infrastructure, it is possible to bump the response timeout to 15 minutes (the AWS Lambda timeout).
AWS Lambda function URL
A very recent addition to the AWS Lambda family of features, the AWS Lambda function URL might be just what you need. We ditch Amazon API Gateway in this architecture since it was the primary response timeout bottleneck.
Keep in mind that ditching Amazon API Gateway could have enormous consequences depending on the use case. Amazon API Gateway is rich in features like validation, caching, API keys and usage plans, etc. By utilizing the AWS Lambda function URL feature, we gain the longer synchronous request timeout but lose all the niceties of Amazon API Gateway. Keep this in mind when reaching for this feature.
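Since request validation is one of the features you give up, the function itself has to do that work. Below is a minimal sketch of a handler behind a function URL. It relies on function URL events following the API Gateway payload format version 2.0, where the HTTP method lives under `requestContext.http.method`. The `orderId` field and the error messages are hypothetical.

```python
import base64
import json


def respond(status_code, message):
    """Build a small JSON response."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": message}),
    }


def handler(event, context):
    # Validation that Amazon API Gateway would otherwise have done for us.
    method = event["requestContext"]["http"]["method"]
    if method != "POST":
        return respond(405, "only POST is supported")

    body = event.get("body") or ""
    try:
        if event.get("isBase64Encoded"):
            body = base64.b64decode(body).decode("utf-8")
        payload = json.loads(body)
    except ValueError:
        # Covers invalid base64, invalid UTF-8, and invalid JSON.
        return respond(400, "body must be valid JSON")

    # "orderId" is a hypothetical required field for this example.
    if not isinstance(payload, dict) or "orderId" not in payload:
        return respond(400, "orderId is required")

    return respond(200, f"accepted order {payload['orderId']}")
```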
AWS Lambda fronted with Application Load Balancer
Instead of using Amazon API Gateway for fronting the AWS Lambda function, we can use the Application Load Balancer. Like in the case of the AWS Lambda function URL feature, you will lose many of the Amazon API Gateway features but gain a much longer response timeout.
In some architectures, with high-enough traffic and small sizes of the requests, fronting AWS Lambda with ALB is much more cost-effective than using Amazon API Gateway. You can read more about it in this article.
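If you go the ALB route, note that the function has to return a response in the shape the load balancer expects, which differs slightly from what Amazon API Gateway accepts. The sketch below shows a minimal handler returning the status code, status description, base64 flag, headers, and body. The `name` query parameter and the greeting are hypothetical.

```python
import json


def handler(event, context):
    # ALB passes query string parameters as a plain dict (it may be absent).
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical parameter

    return {
        "statusCode": 200,
        "statusDescription": "200 OK",
        "isBase64Encoded": False,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello {name}"}),
    }
```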
A note about the "a lot" you see for Maximum payload size and Response timeout for ALB: I could not find quotas referring to these variables in the AWS documentation. If you know the limits for those, please let me know!
The ultimate solution
If the 6 MB payload size is not enough, even with compressed payloads, and the 15 minutes of response timeout is too short, consider re-architecting for asynchronous communication. In the end, waiting for responses is expensive, and reaching for tools like the AWS Lambda function URL or fronting AWS Lambda with ALB to get around the limits might be a sign that asynchronous processing would be the right fit for your use case.
AWS exposes various services to facilitate such workloads, ranging from AWS Step Functions to AWS Batch or AWS Fargate and storing data on Amazon S3. This article would be too long if I were to describe all the possible asynchronous ways to get around the 6 MB AWS Lambda payload size limit – do not worry, that article is next on my list!
Until then, I will leave you with this great article written by Yan Cui as a sneak peek of what I will be covering.
Closing words
I hope you find the approaches described in this blog post helpful. Remember that all limits have a reason, be it related to technical or hardware limitations. Some might seem off-putting, like the 6 MB AWS Lambda payload size limit. Still, I do wholeheartedly believe that some of them, most likely by accident, push your architecture more towards asynchronous communication, which in most cases is a good thing!
Consider following me on Twitter – @wm_matuszewski so that you do not miss the follow-up article (and more AWS serverless-related content).
Thank you for your valuable time.
Top comments (5)
I might suggest that it would probably be worth mentioning the strategy of "creation of a signed s3 url" for the inverted scenario of trying to get more than 6MB of data in a response. I'd also be interested in hearing how an asynchronous response would be architected.
good point!
Excellent article, thank you. Gotta start thinking about asynchronous architecture from the very start... it's challenging if the front end or consuming apps need a synchronous response even when the providing app is writing its response to an EB...
great summary and AWS diagram icons well appreciated :-)
Great article! I think you may be missing that Lambda also has a concurrency quota, which you might reach if you have a popular application/endpoint