Steve Bjorg for LambdaSharp

Baseline Performance for .NET on AWS Lambda

I always like to understand what the lower bound looks like. What is the absolute fastest performance we can hope for? I find it insightful as it sets a baseline for everything else.

A necessary warning: there is a risk of extrapolating too much from such a trivial sample. We need to take the data for what it is: a baseline. It is not representative of real-world business logic. Simply adding some I/O operations would greatly increase the processing time, as I/O is usually 1,000x to 1,000,000x slower than pure computation.

Minimal Lambda Function

The Minimal project defines a Lambda function that takes a stream and returns an empty response. It has no business logic and only includes required libraries. There is also no deserialization of a payload. This is the Lambda function with the least amount of overhead.

```csharp
using System.IO;
using System.Threading.Tasks;

namespace Benchmark.Minimal {

    public sealed class Function {

        //--- Methods ---
        public async Task<Stream> ProcessAsync(Stream request)
            => Stream.Null;
    }
}
```
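For context, .NET Lambda functions identify their entry point with a handler string in the form `Assembly::Namespace.Type::Method`. A hypothetical `aws-lambda-tools-defaults.json` wiring for this function might look as follows (the assembly name and memory size are assumptions, not taken from the benchmark project):

```json
{
  "function-runtime": "dotnet6",
  "function-memory-size": 128,
  "function-handler": "Benchmark.Minimal::Benchmark.Minimal.Function::ProcessAsync"
}
```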

Benchmark Data for .NET 6 on x86-64

The data neatly shows that the INIT phase is approximately the same for all memory configurations under the 3,008 MB threshold. As mentioned in the Anatomy of the AWS Lambda Lifecycle post, the INIT phase always runs at full speed.

The cold INVOKE phase is about 10x slower at 128 MB than at 1,024 MB. However, the sum of all warm INVOKE phases is only ~3x slower. Yet the 1,024 MB configuration costs less than 5% more for that improved performance.

Surprisingly, even with such a trivial example, we can already appreciate the delicate balance between performance and cost.

| Memory Size | Init (ms) | Cold Used (ms) | Total Cold Start (ms) | Total Warm Used (100) (ms) | Cost (µ$) |
|------------:|----------:|---------------:|----------------------:|---------------------------:|----------:|
| 128 MB | 235.615 | 620.921 | 856.536 | 365.519 | 22.25509 |
| 256 MB | 238.296 | 315.731 | 554.027 | 150.124 | 22.14107 |
| 512 MB | 241.193 | 136.89 | 378.083 | 124.686 | 22.37980 |
| 1024 MB | 239.972 | 60.804 | 300.776 | 115.53 | 23.13891 |
| 1769 MB | 241.005 | 37.623 | 278.628 | 116.322 | 24.63246 |
| 5120 MB | 218.112 | 37.009 | 255.121 | 119.559 | 33.24730 |
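The Cost column can be reproduced from the published AWS Lambda on-demand prices: a per-request charge plus a GB-second charge on the billed INVOKE durations (the INIT phase is not billed for managed runtimes). A minimal sketch of this cost model, assuming us-east-1 x86-64 pricing and 1 cold plus 100 warm invocations:

```csharp
using System;

// Sketch of the lifetime cost model behind the "Cost (µ$)" column,
// assuming us-east-1 on-demand pricing for x86-64. The ARM64 unit
// price is 20% lower.
static double LifetimeCostMicroDollars(double memoryMb, double coldUsedMs, double totalWarmUsedMs) {

    // $0.20 per 1M requests => 0.2 µ$ per request
    const double RequestChargeMicroDollars = 0.2;

    // $0.0000166667 per GB-second => 16.6667 µ$ per GB-second (x86-64)
    const double X86PricePerGbSecond = 16.6667;

    // 1 cold invocation + 100 warm invocations
    const int InvocationCount = 101;

    double gb = memoryMb / 1024.0;
    double billedSeconds = (coldUsedMs + totalWarmUsedMs) / 1000.0;
    return InvocationCount * RequestChargeMicroDollars
        + gb * billedSeconds * X86PricePerGbSecond;
}

// 128 MB row: ≈ 22.255 µ$, matching the table's 22.25509
Console.WriteLine(LifetimeCostMicroDollars(128, 620.921, 365.519));
```

The per-request charge (20.2 µ$ for 101 invocations) dominates at these tiny durations, which is why the Cost column barely moves until the memory configuration gets large.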

Cold Start Duration


Lifetime Execution Cost and Total Warm Execution Time


Minimum Cold Start Duration for .NET 6

Not surprisingly, the lowest cold start duration was achieved using the highest memory configuration. Tiered Compilation also helped lower the number. However, ReadyToRun did not make much of an impact, which is expected since our minimal project has almost no code.
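The Tiered Compilation and ReadyToRun columns in the tables below correspond to standard MSBuild properties in the project file. A sketch of what disabling Tiered Compilation and enabling ReadyToRun might look like (the property names are standard .NET SDK settings, but the exact values are per-benchmark-run assumptions; PreJIT is not an MSBuild switch and typically means exercising code paths during the INIT phase instead):

```xml
<PropertyGroup>
  <!-- Disable tiered compilation so methods are fully JITted on first use -->
  <TieredCompilation>false</TieredCompilation>
  <!-- Compile assemblies ahead-of-time to native code at publish time -->
  <PublishReadyToRun>true</PublishReadyToRun>
</PropertyGroup>
```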

More notable is that, for comparable memory configurations, the ARM64 architecture was slower than the x86-64 architecture.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
|--------------|------------:|--------|-----------|--------|----------:|---------------:|----------------------:|
| arm64 | 5120 MB | yes | no | no | 211.006 | 30.165 | 241.171 |
| x86_64 | 1024 MB | yes | no | no | 213.085 | 33.173 | 246.258 |
| x86_64 | 1769 MB | yes | no | no | 215.754 | 24.164 | 239.918 |
| x86_64 | 5120 MB | yes | no | no | 198.771 | 24.094 | 222.865 |

Minimum Cold Start Duration for .NET 6


Minimum Execution Cost for .NET 6

Another unsurprising result is that the ARM64 architecture yields the lowest execution cost since its unit price is 20% lower. Similarly, the memory configuration is towards the bottom end at only 256 MB.

More interesting is that Tiered Compilation is always more expensive to operate. This makes intuitive sense, since it requires additional processing time to re-JIT code. After that, it's a bit of a tossup between the ReadyToRun and PreJIT settings.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (100) (ms) | Cost (µ$) |
|--------------|------------:|--------|-----------|--------|----------:|---------------:|---------------------------:|----------:|
| arm64 | 256 MB | no | no | no | 266.026 | 378.676 | 158.064 | 21.98914228 |
| arm64 | 256 MB | no | no | yes | 288.025 | 371.274 | 161.529 | 21.97601788 |
| arm64 | 256 MB | no | yes | no | 264.304 | 361.657 | 164.619 | 21.95426344 |
| arm64 | 256 MB | no | yes | yes | 287.762 | 361.285 | 160.248 | 21.93844936 |

Lifetime Execution Cost and Total Warm Execution Time for .NET 6


What about .NET Core 3.1?

I debated whether I should mention this, since .NET Core 3.1 reaches end-of-life in December 2022, but the performance delta for the baseline case is just staggering.

A Lambda function using .NET Core 3.1 with 512 MB is 40% faster on cold start than one using .NET 6 with 5,120 MB!

I'm just flabbergasted by this outcome. All I can do is remind myself that this baseline test is not representative of real-world code.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Cold Start (ms) |
|--------------|------------:|--------|-----------|--------|----------:|---------------:|----------------------:|
| x86_64 | 512 MB | yes | no | no | 150.129 | 6.903 | 157.032 |
| x86_64 | 1024 MB | yes | no | no | 148.376 | 6.081 | 154.457 |
| x86_64 | 1769 MB | yes | no | no | 148.338 | 5.972 | 154.31 |

Minimum Cold Start Duration for .NET Core 3.1


Similarly, execution cost is lower with .NET Core 3.1, but not as dramatically. Still, for .NET 6 there were just 4 configurations that achieved a cost under 22 µ$. For .NET Core 3.1, there are 39 configurations under 21 µ$!

Interestingly, the 4 lowest-cost configurations follow a similar pattern: ARM64, 128 MB, no Tiered Compilation, and a tossup between ReadyToRun and PreJIT.

| Architecture | Memory Size | Tiered | Ready2Run | PreJIT | Init (ms) | Cold Used (ms) | Total Warm Used (100) (ms) | Cost (µ$) |
|--------------|------------:|--------|-----------|--------|----------:|---------------:|---------------------------:|----------:|
| arm64 | 128 MB | no | no | no | 162.366 | 102.693 | 110.096 | 20.55465044 |
| arm64 | 128 MB | no | no | yes | 186.627 | 98.641 | 112.327 | 20.55161642 |
| arm64 | 128 MB | no | yes | no | 161.989 | 88.677 | 110.391 | 20.53178133 |
| arm64 | 128 MB | no | yes | yes | 185.923 | 85.289 | 117.811 | 20.53850086 |

Lifetime Execution Cost and Total Warm Execution Time for .NET Core 3.1


Conclusion

Based on the benchmarks, we can establish these lower bounds.

For .NET 6:

  • Cold start duration: 223ms
  • Execution cost: 21.94 µ$

For .NET Core 3.1:

  • Cold start duration: 154ms
  • Execution cost: 20.53 µ$

Unless anything fundamental changes, we should not expect to do better than these baseline values.

What's Next

In the next post, I'm going to benchmark JSON serializers. Specifically, the popular Newtonsoft JSON.NET library, the built-in System.Text.Json namespace, and the new .NET 6 JSON source generators.
