This post originally appeared here
A short while ago, I pointed out some Lambda anti-patterns. Following up on that post, I thought we’d also point out some tips and tricks for overcoming the limitations of AWS’ Function-as-a-Service offering, Lambda. Lambda as a service is, in all honesty, pretty awesome. Self-scaling, modular code execution is an incredibly useful tool. However, we do see the classic trend of every problem looking like a nail, with Lambda as the golden hammer. Lambda is designed as a discrete, small event handler. When you start using it for other things (or even in normal use), you’ll start bumping up against a few critical limitations.
This is the big one that gets most people into trouble. A quick hint - unless you have a reason not to, set the timeout for all your functions to 5 minutes. You are only charged for actual usage, so rather than be surprised that something took a few seconds longer than you had configured, play it safe. Lambda is targeted at small, bite-sized processing tasks, and sometimes what seems to be a great use case for Lambda will bump up against this limit and cause you much heartburn. Here are a few approaches you can take to mitigate it.
A pretty common use case for Lambda is processing a file that drops on S3. If possible, instead of processing the whole file in a single Lambda invocation, split the work up by calling another Lambda to process a set number of rows that you are confident will complete under the limit. Basically, pass the S3 location, the start line number, and the end line number, and loop until you’ve kicked off each set. Alternatively, streaming the records into another solution such as SNS or Kinesis may suffice. Be aware that if the size of the data payload is out of your control, you’ll still want to put some kind of stopgap measure in place.
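This fan-out can be sketched roughly as follows - a minimal sketch, not an official pattern; the worker function name and the event field names are assumptions you’d adapt to your own setup:

```python
import json


def line_ranges(total_lines, chunk_size):
    """Yield inclusive (start, end) line ranges covering the whole file."""
    for start in range(0, total_lines, chunk_size):
        yield start, min(start + chunk_size - 1, total_lines - 1)


def dispatch_chunks(bucket, key, total_lines, chunk_size, worker_name):
    """Fan out one asynchronous worker invocation per chunk of the S3 file."""
    import boto3  # imported lazily so the pure helper above stays testable
    client = boto3.client("lambda")
    for start, end in line_ranges(total_lines, chunk_size):
        client.invoke(
            FunctionName=worker_name,          # hypothetical worker Lambda
            InvocationType="Event",            # fire-and-forget, async
            Payload=json.dumps({"bucket": bucket, "key": key,
                                "start_line": start, "end_line": end}),
        )
```

Each worker then reads only its slice of the file from S3, so no single invocation risks the timeout.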
While this may cause other issues (see below), caching some data in the /tmp partition provided to every function container can decrease execution time after the first call. You’ll have to balance whether you can build the cached data, write it to /tmp, and still perform the necessary work; you pay this price on each cold start of your function, but on warm execution it may save a LOT of processing time. To keep a container warm, you can use a CloudWatch scheduled event to ‘ping’ your function with a no-op payload, which either keeps the container alive or starts one up so real invocations hit a pre-warmed container. You’ll need to find the right balance here to ensure you’re not spinning up extra copies of your function unnecessarily, but it’s something you can tune over time.
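The /tmp caching idea looks something like this - a minimal sketch assuming your cached data is JSON-serializable; the cache path and build function are placeholders:

```python
import json
import os

# /tmp survives across warm invocations of the same container (hypothetical path)
CACHE_PATH = "/tmp/lookup-cache.json"


def load_lookup(build_fn, cache_path=CACHE_PATH):
    """Return the cached lookup data, building it only on a cold start."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    data = build_fn()              # the expensive build, paid once per container
    with open(cache_path, "w") as f:
        json.dump(data, f)
    return data
```

On a warm invocation the build step is skipped entirely, which is where the savings come from.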
If, in your use case, splitting up the processing of the data file doesn’t work - say the file needs to be processed sequentially - then you can leverage the context object to get the time remaining, and once you’re approaching the limit, call the same Lambda asynchronously with a file offset. The subsequent invocation skips the previously processed lines and continues on. As with any recursion, be wary of putting yourself in an infinite loop - you never want to be in a position of explaining how you performed a ‘Denial of Wallet’ attack on yourself.
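A sketch of that self-continuation pattern, using the real `context.get_remaining_time_in_millis()` API - the event shape, safety margin, and `process` stub are assumptions for illustration:

```python
import json

SAFETY_MARGIN_MS = 30_000  # bail out with 30 seconds left; tune for your workload


def process(line):
    pass  # placeholder for the real per-line work


def handler(event, context):
    """Process lines sequentially; re-invoke ourselves before the timeout hits."""
    lines = event["lines"]              # hypothetical event shape
    offset = event.get("offset", 0)
    while offset < len(lines):
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            import boto3
            boto3.client("lambda").invoke(
                FunctionName=context.function_name,
                InvocationType="Event",  # asynchronous self-invocation
                Payload=json.dumps({**event, "offset": offset}),
            )
            return {"resumed_at": offset}
        process(lines[offset])
        offset += 1
    return {"done": len(lines)}
```

The generous safety margin is what keeps you out of the infinite-loop scenario: the handler always either finishes or hands off cleanly before the clock runs out.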
For some cases, increasing the memory allocated may reduce processing time - watch the CloudWatch logs output for your function’s invocation memory usage. If peak usage is at or near the top of the allocation, upping the number may help. This is likely a stopgap measure, but an incredibly easy one to implement. As a bonus - if you see your max memory used is constantly well under the memory allocated - reducing it will lower your costs.
Different languages have different strengths - building your Lambda in a compiled language such as .NET Core, Go, or even Java may perform better for your particular use case than an interpreted language like Node.js or Python. It might not be an option, but it’s something to keep in mind.
Let’s face it - there will be some scenarios and events that will just take more than 5 minutes to process. In this age of endless data, data files are getting larger, and you’re going to have to deal with it. As much as I’m a fan of Lambda, it’s not always going to cut it. There are two major escalation points to choose from. The most straightforward is to move your processing from Lambda to a Fargate task - serverless container execution gives you many of the benefits of Lambda without many of the limitations. That comes with more of a preparation cost, but done strategically, Fargate containers can dovetail very nicely into your existing serverless product architecture. The second approach is to leverage EMR, Glue, or another service to do the heavy lifting, and use Lambda only as the triggering mechanism to ensure the processing flow is started.
The next most likely limit to get caught on is Lambda’s payload size - 6MB for synchronous execution, but only 128K for asynchronous calls. Truth be told, if you’re passing large payloads around an event framework, you’re doing it wrong :). You should be checking your payload size before calling a Lambda programmatically - and because sometimes you’re not in control of your message size, you should also know some workarounds.
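A pre-flight size check is a one-liner worth wrapping - a minimal sketch using the limits quoted above (AWS revises these limits over time, so verify them against the current documentation):

```python
import json

SYNC_LIMIT = 6 * 1024 * 1024   # synchronous (RequestResponse) payload limit
ASYNC_LIMIT = 128 * 1024       # asynchronous (Event) payload limit per the text above


def payload_fits(payload, asynchronous=True):
    """Return True if the JSON-serialized payload is under the invoke limit."""
    size = len(json.dumps(payload).encode("utf-8"))
    return size <= (ASYNC_LIMIT if asynchronous else SYNC_LIMIT)
```

Call this before every programmatic invoke, and fall back to one of the workarounds below when it returns False.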
As with the advice for processing time above, if possible, split your payload to be processed by separate invocations of your function. Lambda scales automatically, so passing part of the payload at a time will not only avoid the payload limit but, as above, run faster in parallel.
The limit applies only to the invocation payload, not the data processed, so you can send an S3 or database location instead of the data itself. I was recently processing customer event data and had reorganized it into a map to allow efficient lookup. To avoid re-building this map in subsequent calls, I attempted to pass it along in the payload to child Lambda calls. As you can guess, the lookup map got too big over time and blew up the Lambda invocation. I ended up using DynamoDB as the scratch space; the required throughput was so low that the cost was negligible, and it performed fantastically! Note that DynamoDB has an item size limit of 400KB, so keep that in mind. I could have also used ElastiCache, but I simply went with a resource I was already using in the application. Splitting the data and writing it out to S3 is an even better way to go, as you can use the dropping of the file on S3 as the mechanism for triggering the subsequent Lambda. Think about your control mechanisms as events rather than flow, and these usage patterns will develop before your very eyes.
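The pass-a-pointer pattern boils down to a tiny event plus a scratch-space write - a sketch with hypothetical field names and a hypothetical DynamoDB table; mind the 400KB item limit mentioned above:

```python
import json


def to_pointer_event(bucket, key):
    """Build a tiny event that points at the data rather than embedding it."""
    return {"s3_bucket": bucket, "s3_key": key}


def store_scratch(table_name, job_id, lookup_map):
    """Stash an intermediate lookup map in DynamoDB instead of the invoke
    payload. Items are capped at 400KB, so size the map accordingly."""
    import boto3
    table = boto3.resource("dynamodb").Table(table_name)
    table.put_item(Item={"job_id": job_id, "lookup": json.dumps(lookup_map)})
```

The child Lambda receives only the pointer event and fetches the heavy data itself, keeping every invoke payload trivially small.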
Okay - this isn’t a limitation per se, but there are some related limitations. Running Lambdas inside a VPC poses a few restrictions. First, each instance of your function will run inside a container, and that container is issued an ENI (elastic network interface) on instantiation - which also adds significantly to function cold start time - and your account has limits on ENIs. Second, you can only run as many instances of your function as there are IP addresses available in your subnets. This is a fundamental issue because you typically can’t control how many instances of your function are running. For this reason, only run functions inside your VPC that actually need to be there, and for those that do, design them to minimize the likelihood of massive concurrent execution. You can also now set a limit on the maximum number of concurrent invocations of your function. Hitting it will put you into an AWS retry scenario and will, by nature, throttle your function - you may be trading one set of error messages for another, but it’s a lever you can use.
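Sizing that concurrency lever against your subnets can be sketched like this - the IP-counting helper is a rough heuristic of my own, while `put_function_concurrency` is the real boto3 call:

```python
def max_vpc_concurrency(free_ips_per_subnet):
    """Rough ceiling: each concurrent VPC execution consumes one ENI/IP,
    so concurrency is bounded by free IPs across the function's subnets."""
    return sum(free_ips_per_subnet)


def cap_concurrency(function_name, max_concurrent):
    """Reserve a concurrency ceiling so the function can't exhaust subnet IPs."""
    import boto3
    boto3.client("lambda").put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=max_concurrent,
    )
```

Setting the cap a comfortable margin below the IP ceiling leaves headroom for other resources sharing those subnets.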
Now, disk limits may sound odd in a discussion of serverless technologies, but utilizing the /tmp drive space is a pretty common technique to cache data (as mentioned above), which can minimize execution time on warm starts. However, it’s limited to 512MB, so trying to cache too much will cause your container to fail. Use the space sparingly, but use it where it can help.
Memory limits are another factor - depending on your code, hitting the function memory limit may cause slowness, or may even cause the code to crash. As you are charged by GB-seconds, you don’t want to overprovision your function, particularly one called often, but you don’t want to hit that limit either. Do a periodic analysis of your function’s CloudWatch logs output; the final REPORT line of each invocation lists the provisioned memory and the peak memory used. Start high, then tune down - aim for peak memory usage at around the 80% mark in case you have some unexpected behavior, but take into account the volatility of your memory usage to find the right, but not oversized, setting.
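That periodic analysis is easy to script against exported CloudWatch logs - a sketch that parses the memory fields from the standard REPORT line format:

```python
import re

# Matches the memory fields of a Lambda REPORT log line, e.g.:
# REPORT RequestId: ... Duration: 102.25 ms ... Memory Size: 512 MB  Max Memory Used: 128 MB
REPORT_RE = re.compile(r"Memory Size: (\d+) MB\s+Max Memory Used: (\d+) MB")


def memory_utilization(report_line):
    """Return the peak/provisioned memory ratio, or None if the line doesn't match."""
    m = REPORT_RE.search(report_line)
    if not m:
        return None
    size, used = map(int, m.groups())
    return used / size
```

Run it over a batch of recent REPORT lines and look at the distribution, not just the average, before lowering the allocation.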
So, in closing, there are a lot of great things about Lambda and how it fits into the serverless ecosystem (and yes, they are different), but making the most of it is dependent upon knowing its strengths and its limitations. At 3Pillar Global, we are excited about the promise of serverless computing in all the forms it takes, from Lambda to Fargate, serverless databases, and beyond. If you really love serverless, then stand by it through the limitations - because after all, it’s just a Lambda.