Pierre Milliotte for Serverless By Theodo

Functionless: Can we do (better) without Lambda

As of September 2021, Step Functions natively integrates with the AWS SDK, expanding the number of supported AWS Services from 17 to over 200 and AWS API Actions from 46 to over 9,000. As a result, Step Functions can interact with any service Lambda integrates with (through the SDK). This opens up a whole new ecosystem of use cases where Step Functions can substitute Lambda.

Is it worth substituting Step Functions for Lambda?

The answer depends on the criterion used to compare the two services:

  • 🚀 Performance: which has the lowest response latency?
  • 💰 Cost: which is the most cost-effective?
  • 💻 Developer experience: which is the easiest / fastest / most pleasant to set up and maintain?
  • 🔒 Security: which is the most secure?

TL;DR This article focuses on performance and shows that, for one specific use case, Lambda responds on average 3x faster than Step Functions. However, the fastest response time observed overall is the same for both services.

To get these results, I created a lambda and a state machine performing the same tasks (querying items from and storing items to DynamoDB), and invoked each of them 1000 times through API Gateway, synchronously and one request after another.

NB: I'm using Step Functions in express mode, whose pricing, like Lambda's, is based on the number of requests and their duration, so comparing durations also gives an indication of the services' relative cost.

1. Setting up the architecture

Choosing a recurring lambda template to substitute

For this experiment to be meaningful, I wanted to replicate a realistic lambda template, one that I actually use in my projects. In my CQRS experience, I have repeatedly used Lambda to react to HTTP requests coming through API Gateway and store new events in DynamoDB. As an example, in a bike rental app, I would have used a lambda to react to a "/rent-bike" POST request, in order to query the bike's last event, check that it exists, check that the bike is not already rented, and save a new event:

[Figure: Lambda workflow]

The code would have looked like this:

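// Assumption: createHttpError comes from the "http-errors" package.
import createHttpError from 'http-errors';
// EventStore and RentEvent are the project's DynamoDB table/entity helpers,
// defined elsewhere. `body` is assumed to be already parsed (e.g. by a middleware).

const handler = async ({ body }) => {
  const { id } = body;

  // QUERY EVENTS: fetch only the most recent event for this bike
  const { Items: [lastEvent] } = await EventStore.query(id, {
    consistent: true,
    reverse: true,
    limit: 1,
  });

  // CHECK IF EVENTS EXIST
  if (!lastEvent) {
    throw new createHttpError.NotFound();
  }

  // CHECK FOR CONFLICTS: the bike is already rented
  if (lastEvent.type === 'Rented') {
    throw new createHttpError.Conflict();
  }

  // SAVE A NEW EVENT
  await RentEvent.put({
    id,
    version: lastEvent.version + 1,
  });
};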

Replicating the lambda behaviour in a state machine

Before the release of the AWS SDK integration, Step Functions could only act on a single DynamoDB item at a time (getItem, putItem, deleteItem, updateItem). Since the release, it can query multiple items through the AWS SDK. I was therefore able to replicate the "rentBike" lambda behaviour, this time for returning a bike, in a "returnBike" state machine:

[Figure: Return bike step function workflow]
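To make the workflow more concrete, here is a minimal sketch of what such a state machine definition could look like, written as a JavaScript object in Amazon States Language. It is not the definition used in this experiment: the table name, the key schema (partition key `id`), the `Returned` event type and the state names are all assumptions. The version increment relies on the `States.MathAdd` intrinsic function, which as far as I know was added to Step Functions after the SDK integrations, so the original may well have handled it differently. The small `danglingTargets` helper is also hypothetical and only checks that every transition points to an existing state.

```javascript
// Hypothetical table name; the key schema and event types are assumptions too.
const TABLE_NAME = 'bike-events';

const returnBikeDefinition = {
  StartAt: 'QueryLastEvent',
  States: {
    QueryLastEvent: {
      Type: 'Task',
      // AWS SDK integration ARN format: arn:aws:states:::aws-sdk:<service>:<action>
      Resource: 'arn:aws:states:::aws-sdk:dynamodb:query',
      Parameters: {
        TableName: TABLE_NAME,
        KeyConditionExpression: '#id = :id',
        ExpressionAttributeNames: { '#id': 'id' },
        ExpressionAttributeValues: { ':id': { 'S.$': '$.id' } },
        ConsistentRead: true,
        ScanIndexForward: false, // newest event first
        Limit: 1,
      },
      ResultPath: '$.lastEvents', // keep the original input ($.id) alongside the result
      Next: 'CheckLastEvent',
    },
    CheckLastEvent: {
      Type: 'Choice',
      Choices: [
        { Variable: '$.lastEvents.Items[0]', IsPresent: false, Next: 'NotFound' },
        { Variable: '$.lastEvents.Items[0].type.S', StringEquals: 'Returned', Next: 'Conflict' },
      ],
      Default: 'SaveReturnedEvent',
    },
    NotFound: { Type: 'Fail', Error: 'NotFound', Cause: 'No event found for this bike' },
    Conflict: { Type: 'Fail', Error: 'Conflict', Cause: 'The bike is already returned' },
    SaveReturnedEvent: {
      Type: 'Task',
      Resource: 'arn:aws:states:::aws-sdk:dynamodb:putItem',
      Parameters: {
        TableName: TABLE_NAME,
        Item: {
          id: { 'S.$': '$.id' },
          type: { S: 'Returned' },
          // DynamoDB numbers travel as strings, hence the parse / add / format round trip.
          version: {
            'N.$': "States.Format('{}', States.MathAdd(States.StringToJson($.lastEvents.Items[0].version.N), 1))",
          },
        },
      },
      End: true,
    },
  },
};

// Returns every transition target that does not match a declared state name.
function danglingTargets(definition) {
  const names = new Set(Object.keys(definition.States));
  const targets = [definition.StartAt];
  for (const state of Object.values(definition.States)) {
    if (state.Next) targets.push(state.Next);
    if (state.Default) targets.push(state.Default);
    for (const choice of state.Choices ?? []) targets.push(choice.Next);
  }
  return targets.filter((target) => !names.has(target));
}
```

`JSON.stringify(returnBikeDefinition)` gives the JSON document to deploy, and `danglingTargets(returnBikeDefinition)` should return an empty array before you do so.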

Once the bike rental architecture was set up, it was time to compare the performance of the lambda and the state machine.

[Figure: Minimalist bike rental architecture]

2. Comparing services performance

Querying both resources

I used a Postman collection to call the "/rent-bike" (Lambda) and "/return-bike" (Step Functions) endpoints 1000 times each, synchronously and one request after another.
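For readers who would rather script this than use Postman, the same kind of sequential run can be sketched in a few lines of JavaScript. This is only an illustration: the endpoint URL and payload are hypothetical, and durations measured on the client include the round trip to API Gateway, so they are not the same thing as the integration latency reported by CloudWatch.

```javascript
// Calls `send` `count` times, one after another, and records each duration in ms.
async function measureSequentially(send, count) {
  const durations = [];
  for (let i = 0; i < count; i += 1) {
    const start = performance.now();
    await send(i); // wait for the response before sending the next request
    durations.push(performance.now() - start);
  }
  return durations;
}

// Hypothetical endpoint and payload, shown only to illustrate usage (needs Node 18+ for fetch).
const rentBike = () =>
  fetch('https://example.execute-api.eu-west-1.amazonaws.com/dev/rent-bike', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ id: 'bike-1' }),
  });

// Usage: measureSequentially(rentBike, 1000).then((durations) => console.log(durations.length));
```

Passing the request function as a parameter keeps the timing loop independent of the HTTP client, so the same loop can be pointed at "/return-bike" or exercised with a stub.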

The following CloudWatch widget displays the average integration latency per minute (~65 requests per minute on each resource) of the "rentBike" lambda (in blue) and the "returnBike" state machine (in orange).

[Figure: Average integration latency per minute]

On average, API Gateway gets a response from Lambda 3x faster than from Step Functions. 🤯

Splitting the integration latency

To get a better understanding of the performance, I split the integration latency as follows:

Integration latency = API Gateway integration & network latency + Execution duration
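As a small worked example, the split can be applied request by request: subtracting the execution duration from the integration latency leaves the API Gateway integration & network part. The function name and the sample values below are hypothetical and are not measurements from this experiment.

```javascript
// Splits one request's integration latency (in ms) into the two terms of the formula above.
function splitIntegrationLatency(integrationLatencyMs, executionDurationMs) {
  if (executionDurationMs > integrationLatencyMs) {
    throw new RangeError('execution duration cannot exceed integration latency');
  }
  return {
    gatewayAndNetworkMs: integrationLatencyMs - executionDurationMs,
    executionDurationMs,
  };
}

// Hypothetical sample values, for illustration only.
const sample = splitIntegrationLatency(40, 25);
console.log(sample); // { gatewayAndNetworkMs: 15, executionDurationMs: 25 }
```

The identity holds for each request and for averages, but not necessarily for percentiles: the p95 of a sum is generally not the sum of the p95s, which is why percentile columns may not add up exactly.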

1) API Gateway integration & network latency

[Figure: Average API Gateway integration & network latency per minute]

There is a 10 ms difference between Lambda and Step Functions, which might be due to the difference between the API Gateway Lambda proxy integration and the AWS integration.

One way to close the gap could therefore be to use an HTTP API, in which API Gateway integrates with Lambda and Step Functions in the same way, rather than a REST API.

2) Execution duration

[Figure: Average execution duration per minute]

The bulk of the performance difference between Lambda and Step Functions (40 ms) lies in the resources' execution time.

On the one hand, it could be explained by the memory allocated to each service: each lambda is allocated 1 GB of RAM by default, whereas a state machine is allocated 64 MB.

On the other hand, I used X-Ray to dig deeper into execution and response times. The following graphs show the execution duration distributions of the 1000 requests made to each service (⚠️ the x-axis scales are different).

[Figure: Execution duration distribution of 1000 requests]

Apart from the single request that underwent a cold start, the Lambda distribution shows that executions behave homogeneously overall: 80% of Lambda executions are within 20 ms. In contrast, the several peaks in the Step Functions distribution suggest that the state machine benefits from multiple optimisations that kick in more sporadically.

However, when all of these optimisations apply at once, Step Functions performs as well as Lambda: the best response time overall is 24 ms for both services!
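Figures such as "80% of executions are within 20 ms" or a p95 can be recomputed from a list of raw durations. The sketch below uses the nearest-rank definition of a percentile, which is an assumption on my part: X-Ray and CloudWatch may compute percentiles differently. The function names and the sample durations are hypothetical.

```javascript
// Nearest-rank percentile: the smallest value such that at least p% of values are <= it.
function percentile(values, p) {
  if (values.length === 0) throw new Error('percentile of an empty list is undefined');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Share of values (between 0 and 1) that are at or below the threshold.
function shareWithin(values, thresholdMs) {
  return values.filter((value) => value <= thresholdMs).length / values.length;
}

// Hypothetical durations in ms, with one slow outlier standing in for a cold start.
const durationsMs = [10, 12, 14, 16, 18, 20, 22, 24, 26, 200];
console.log(percentile(durationsMs, 50)); // 18
console.log(percentile(durationsMs, 95)); // 200
console.log(shareWithin(durationsMs, 20)); // 0.6
```

With only ten samples the p95 lands on the outlier, which illustrates why a single cold start can dominate tail percentiles on small runs.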

Conclusion

Service        | Fastest execution | Slowest execution                 | Integration latency p95 | API Gateway integration + network latency p95 | Execution duration p95
Lambda         | 24.0 ms           | 745 ms (second slowest is 183 ms) | 38.8 ms                 | 15.0 ms                                       | 23.7 ms
Step Functions | 24.0 ms           | 252 ms                            | 129.5 ms                | 43.1 ms                                       | 86.3 ms

Performance-wise, Lambda remains a better solution for this use case because, from my understanding:

  • It benefits from better internal optimisations,
  • It has a better integration with API Gateway (on a REST API),
  • More resources are allocated to it.

However, Step Functions is promising:

  • Its fastest invocation matches Lambda's,
  • It also benefits from internal optimisations, although not comparable to Lambda's.

Finally, 1000 successive invocations are not representative of how a real application behaves. I'm curious what Step Functions optimisations a more realistic load test would reveal...
