Robin Smith for Click Travel Engineering

Trialing AWS Lambda performance? Are you being fair?

I was helping out a colleague this week who had created a simple serverless-style setup on AWS: API Gateway -> Lambda -> DynamoDB.

There was nothing unusual about the setup, and the function worked as expected the first time. The problem came when they decided to benchmark its performance against the traditional 24/7 deployed server the function was designed to replace. The results left them massively disappointed, verging on concerned.
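For context, the function itself was nothing exotic. A minimal sketch of that kind of handler is below; the table name, key schema, and response shapes are assumptions for illustration, not the colleague's actual code:

```python
# Sketch of an API Gateway -> Lambda -> DynamoDB read handler.
# Table name, key schema, and responses are illustrative assumptions.
import json
import os

def get_table():
    # boto3 ships with the Lambda Python runtime; imported lazily here so
    # the handler can be unit-tested on a machine without AWS libraries.
    import boto3
    return boto3.resource("dynamodb").Table(os.environ.get("TABLE_NAME", "items"))

def handler(event, context, table=None):
    # `table` is injectable purely to keep the function testable offline.
    table = table or get_table()
    item_id = (event.get("pathParameters") or {}).get("id")
    if not item_id:
        return {"statusCode": 400, "body": json.dumps({"error": "missing id"})}
    result = table.get_item(Key={"id": item_id})
    item = result.get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(item)}
```

Nothing in there should be slow: a single key lookup against DynamoDB is typically single-digit milliseconds once the function is warm.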

Why?

The easiest place to start is the existing service. It is effectively fully scaled out at all times: it sits quietly in your data centre burning cash while waiting for requests, and whether it receives a single request or 50 requests per second (RPS), it's ready to respond with its full provisioned capacity from the first millisecond.

But this isn't how Lambda works: if there are no requests, there is no capacity.

When that single request arrives, the architecture rapidly provisions resources to deal with it, and it keeps those resources available for a short time in case additional requests come through. This means that if you suddenly load an idle function with 50RPS, you will be forced to wait while the architecture reacts by spinning up a fleet of execution environments to process the sudden influx of requests.

Lambda doesn't scale linearly to requests

Receiving 50 requests at once doesn't mean you will get 50 concurrent executions. There are complex calculations performed behind the scenes, trading off things like:

  • the time it takes to start a new function
  • the expected remaining time on the current execution
  • historical data about how the function is being utilised
  • individual account limits

All of these things play a part in how your function is auto-scaled, and as a result, in some circumstances Lambda may choose to queue some of your invocations behind each other instead of running them concurrently.

What this means is that if you decide to perform an unrealistic load test then you are very likely to see much poorer scalability and performance than you would expect to see under real production load.

The point here is that Lambda can easily handle load, but you have to understand how it works in order to evaluate it fairly. If your function is expected to handle 50RPS constantly as a baseline, then it's unrealistic to benchmark Lambda from 0 to 50RPS within a 60-second window: you need to give it time to scale up if you want a realistic view of how it will perform at that level.
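In practice, a fairer benchmark ramps up gradually and only measures once the function has had time to scale. Here's a rough sketch of that shape; the ramp steps, durations, and pacing are illustrative assumptions to adapt to your own baseline, and `invoke` stands in for whatever actually calls your endpoint:

```python
# Sketch of a load test with a warm-up ramp, so Lambda has time to scale
# before the measured phase begins. Ramp profile is an assumption.
import time
from concurrent.futures import ThreadPoolExecutor

def run_phase(invoke, rps, seconds, record=None):
    """Fire invoke() at roughly `rps` requests/sec for `seconds` seconds,
    appending each call's latency (seconds) to `record` if given."""
    interval = 1.0 / rps

    def timed_call():
        start = time.monotonic()
        invoke()
        if record is not None:
            record.append(time.monotonic() - start)

    with ThreadPoolExecutor(max_workers=rps * 2) as pool:
        for _ in range(rps * seconds):
            pool.submit(timed_call)
            time.sleep(interval)  # pace submissions instead of bursting

def benchmark(invoke):
    run_phase(invoke, rps=10, seconds=60)   # warm-up step, results discarded
    run_phase(invoke, rps=25, seconds=60)   # second ramp step, discarded
    latencies = []
    run_phase(invoke, rps=50, seconds=300, record=latencies)  # measured
    return latencies
```

Purpose-built tools (Gatling, Locust, etc.) do this better, but the principle is the same: discard the ramp, measure the steady state.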

What you'll find is that if you run your function at 50RPS for a longer period, the architecture will work out exactly how much resource is required to best service that load, and your function will perform fantastically without any need to scale further. If you do suddenly get a huge spike to 1000RPS, Lambda will still deal with it, but not immediately at the same performance level as your 50RPS baseline; it needs to react to the additional load with extra resource, which will hurt performance at the beginning of the spike.

Lambda is fantastic, but it's not magic: it takes time to deal with unexpected traffic spikes.

If you want to understand how your service will behave within Lambda under normal load then you need to run that test over a sensible timeframe. This will allow Lambda to understand that this is your normal load, and this will enable you to see the reliable performance you expect.
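One way to keep the two views separate is to summarise your results with and without the warm-up window. A small sketch, where the 60-second cutoff is an assumption to match however long your ramp actually takes:

```python
# Sketch: compare latency percentiles for the whole run versus the
# steady-state portion only, so cold-start noise from the warm-up window
# isn't mistaken for baseline performance. 60s cutoff is an assumption.
import statistics

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(0, int(round(pct / 100 * len(ordered))) - 1)
    return ordered[rank]

def summarise(samples, warmup_seconds=60):
    """`samples` is a list of (seconds_since_test_start, latency_seconds)."""
    all_lat = [lat for _, lat in samples]
    steady = [lat for t, lat in samples if t >= warmup_seconds]
    return {
        "overall_p95": percentile(all_lat, 95),
        "steady_p95": percentile(steady, 95),
        "steady_mean": statistics.mean(steady),
    }
```

If the overall p95 is much worse than the steady-state p95, that gap is your cold-start cost, not your baseline performance.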

Of course, running the same test over a short period has value too: it shows you the worst-case scenario for how your function will behave should it get bombarded with unexpected spikes.

Both are important metrics, but they shouldn't be treated as interchangeable. Don't write Lambda off because of unrealistic expectations; if you do, you'll be missing out.
