Updated with benchmarks for cdk 1.77.0.
I wrote some articles about the AWS Cloud Development Kit (CDK) earlier this year. I was attracted to CDK immediately upon hearing of it due to the ability to write infrastructure as code in TypeScript. I really like writing code in TypeScript and CDK seemed almost too good to be true.
Table of Contents
- A Missed Opportunity?
- Lambda in CDK
- aws-lambda-nodejs
- Bundling
- Benchmarking #1
- Benchmarking #2
- Benchmarking #3
- Next Steps
A Missed Opportunity?
CDK is a new technology and that means that it doesn't necessarily cover every use case yet. What I found as I worked through official examples was that somebody had written CDK code in TypeScript but the accompanying Lambda code was written in JavaScript! This struck me as a missed opportunity. It turns out it wasn't a missed opportunity but one that just hadn't landed yet.
Lambda in CDK
To explain a bit better for those who aren't really in the transpilation game, TypeScript code is usually transpiled into JavaScript before being shipped into a runtime, be that runtime a web server, NodeJS or Lambda. That's because (leaving deno aside for now), there's no TypeScript execution environment. I say usually because there is actually a pretty cool project called ts-node that lets you execute TypeScript code in NodeJS without transpiling the code ahead of time. ts-node is a great tool to save developers a step in development flows. It's debatable whether you should use it in production or not (I don't). That said, it's totally appropriate to use ts-node with CDK. This lets you shorten the code=>build=>deploy cycle to code=>deploy. That's great!
But this doesn't work with Lambda Functions. CDK turns my TypeScript infrastructure constructs into CloudFormation. It doesn't do anything special with my Lambda code - or at least it didn't until the aws-lambda-nodejs module landed in CDK.
aws-lambda-nodejs
The aws-lambda-nodejs module is an extension of aws-lambda. Really the only thing it adds is an automatic transpilation step using esbuild. Whenever you run a cdk deploy or cdk synth, this module will bundle your Lambda functions and output them to your cdk.out directory. Then the deploy process will stage the bundles in S3 and provide them to Lambda - all with no extra config required. It's quite impressive!
By default aws-lambda-nodejs will try to run your build in Docker. This could be advantageous if you are running your build in an environment with a different NodeJS version than you want the build to happen with, such as in a build pipeline. For most use cases, it'll be better to install esbuild in your application for local bundling.
Bundling
You may have experience with bundlers such as webpack, parcel or rollup. esbuild is a newer bundler written in Go, and its published benchmarks are truly impressive.
Bundlers were originally used in UI applications. They allowed us to write a modular application that is still delivered as a single .js file to a web browser. Bundlers often will uglify the source code, or make it as small as possible (sometimes replacing variable names with shorter names) so it can be processed as fast as possible by a machine.
It turns out it's a good choice to use a bundler with Lambda as well as it can improve cold start times by bundling all dependencies together. Some dependencies, such as mysql client libraries, will break when bundled, but aws-lambda-nodejs has got you covered there as well.
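To make the uglify step described above a little more concrete, here is a hand-written sketch of the idea. It is not actual output from esbuild or any other bundler, and the function and variable names are made up for illustration. The second version keeps its type annotations only so that it still compiles as TypeScript; a real minifier works on the transpiled JavaScript.

```typescript
// Readable source, as a developer might write it.
function totalPriceWithTax(itemPrices: number[], taxRate: number): number {
  const subtotal = itemPrices.reduce((sum, price) => sum + price, 0);
  return subtotal * (1 + taxRate);
}

// Roughly the shape a minifier aims for: same logic, single-letter names,
// no intermediate variable, no optional whitespace. (Hand-written example.)
const t=(a:number[],b:number)=>a.reduce((c,d)=>c+d,0)*(1+b);

// Both versions compute the same value; only the source size differs.
console.log(totalPriceWithTax([10, 20], 0.5) === t([10, 20], 0.5));
```

The behavior is unchanged, which is why minification is usually safe to turn on; the trade-off is that stack traces and breakpoints point into code that is much harder to read.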
Benchmarking #1
When I first wrote this article, the current CDK version was 1.41.0. I'm going to run npx bump-cdk -e 1.41.0 && npm i to go back in time and see how that version ran. This was one of the first versions of cdk that included aws-lambda-nodejs. When I run time npm run synth, I get this output:
npm run synth 6.26s user 1.59s system 37% cpu 21.096 total
21 seconds isn't bad at all, but I left a comment complaining about poor performance with this repo. When I bump up to 1.56.0, I get a different result.
npm run synth 6.38s user 1.46s system 6% cpu 2:08.74 total
That's a pretty serious performance hit over the versions! I think one of the reasons for this was switching to using amazon/aws-sam-cli-build-image-nodejs12.x as the base image for the build while the prior version used node-alpine:
% docker images
REPOSITORY                                  TAG         IMAGE ID       CREATED          SIZE
<none>                                      <none>      48af1ba4388a   9 minutes ago    2.01GB
parcel-bundler                              latest      101256132170   22 minutes ago   243MB
node                                        12-alpine   844f8bb6a3f8   2 days ago       89.7MB
amazon/aws-sam-cli-build-image-nodejs12.x   latest      c8d3afc3e603   2 weeks ago      1.87GB
Anyway, this stuff doesn't matter that much for two reasons:
- In my day job I use a Docker-based build system (codefresh.io) and spawning Docker containers from Docker containers (DinD) isn't a thing the industry has figured out how to do well yet.
- In cdk 1.60.0, the team introduced the ability to install the bundler and run locally anyway, so let's try bumping up to 1.60.0 and installing parcel@next and see how that improves my build performance.
npm run synth 33.67s user 8.68s system 202% cpu 20.863 total
Okay, that's decent again. The last check is to go up to cdk 1.75.0, the latest version, and use esbuild. Let's see if it's as fast as it says on the box. First let's run it without installing esbuild.
npm run synth 6.87s user 1.46s system 55% cpu 15.065 total
Not too bad. Now after installing esbuild.
npm run synth 5.51s user 0.71s system 137% cpu 4.515 total
That is a serious improvement! These functions are tiny, so let's try some bigger ones.
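To put the runs above side by side, here is a small sketch that pulls the wall-clock total out of each timing line and compares it to the slowest run. It assumes the lines are in the zsh-style time format shown above; the parseTotalSeconds helper and the labels are my own names for illustration, not part of any tool.

```typescript
// Parse the wall-clock "total" field from a zsh-style `time` summary line.
// Accepts plain seconds ("21.096 total") and minutes:seconds ("2:08.74 total").
function parseTotalSeconds(timeLine: string): number {
  const match = timeLine.match(/(?:(\d+):)?(\d+(?:\.\d+)?)\s+total\s*$/);
  if (!match) {
    throw new Error("No total found in: " + timeLine);
  }
  const minutes = match[1] ? parseInt(match[1], 10) : 0;
  const seconds = parseFloat(match[2] || "0");
  return minutes * 60 + seconds;
}

// Timing lines copied from the runs above; the labels are shorthand.
const runs: Array<{ label: string; line: string }> = [
  { label: "cdk 1.41.0", line: "npm run synth 6.26s user 1.59s system 37% cpu 21.096 total" },
  { label: "cdk 1.56.0", line: "npm run synth 6.38s user 1.46s system 6% cpu 2:08.74 total" },
  { label: "cdk 1.60.0 + local parcel", line: "npm run synth 33.67s user 8.68s system 202% cpu 20.863 total" },
  { label: "cdk 1.75.0, esbuild not installed", line: "npm run synth 6.87s user 1.46s system 55% cpu 15.065 total" },
  { label: "cdk 1.75.0 + local esbuild", line: "npm run synth 5.51s user 0.71s system 137% cpu 4.515 total" },
];

// Use the slowest run (cdk 1.56.0) as the baseline for comparison.
const baseline = parseTotalSeconds("npm run synth 6.38s user 1.46s system 6% cpu 2:08.74 total");

for (const run of runs) {
  const total = parseTotalSeconds(run.line);
  const speedup = baseline / total;
  console.log(run.label + ": " + total.toFixed(1) + "s total, " + speedup.toFixed(1) + "x vs the 1.56.0 run");
}
```

These are single runs on one machine, so the ratios are a rough guide rather than a rigorous benchmark.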
Benchmarking #2
My Dynamo Lambda Loader project has just one function in it, but that function has aws-sdk and faker as dependencies. Let's see how that one goes. Last time I bumped the dependencies, the cdk version was 1.56.0. Build time is almost 42 seconds.
npm run synth 5.50s user 0.90s system 15% cpu 41.815 total
And the bundle that is output is 1.9 MB. Size matters now. That's actually a bit hefty as this version of the build automatically excluded aws-sdk and is relying on the one that exists natively in Lambda. Using cdk 1.60.0 + Docker there's no real difference in build time or size. However the build time does improve when I install parcel.
npm run synth 34.19s user 10.16s system 353% cpu 12.539 total
Bumping up to 1.75.0, I get improved build times in Docker, but it no longer outputs the size.
npm run synth 5.57s user 0.98s system 62% cpu 10.434 total
My bundle is a bit larger at 2.3 MB. Finally let's test with esbuild installed locally.
npm run synth 5.44s user 0.87s system 145% cpu 4.343 total
It's super fast again, but still 2.3 MB which seems kind of big. We can shrink the bundle by adding the minify: true option to NodejsFunction. That drops the size to 1.6 MB. Nearly all of that bundle is from the faker library, which apparently doesn't tree-shake very well.
Just to further the experiment a little, let's see what happens when we remove aws-sdk from the external modules (it is excluded by default). There are a couple of reasons to not want to rely on the native aws-sdk when using Lambda. One is that we can't be sure of what version will be available. Most of the time this won't matter, but that one time it does, we'll find ourselves filled with regret. There is also some evidence that you'll get faster cold starts when installing and bundling just a part of the aws-sdk. I update my function declaration to look like this:
const initDBLambda = new NodejsFunction(this, 'initDBFunction', {
  bundling: {
    externalModules: [],
    minify: true,
  },
  entry: `${lambdaPath}/init-db.ts`,
  handler: 'handler',
  memorySize: 3000,
  runtime: Runtime.NODEJS_12_X,
  timeout: Duration.minutes(15),
});
My build time remains good but now my bundle is 6.5 MB. I can improve that by changing the import in my function to only import DocumentClient from DynamoDB instead of the entire aws-sdk.
import { DocumentClient } from 'aws-sdk/clients/dynamodb';
const db = new DocumentClient();
My build time was a little under 4 seconds and the bundle is now 1.9 MB, which is pretty good as I know the bulk of that is still from faker. To put it all in perspective, 1.9 MB is probably a fine size for a Lambda function. If I really need performance, I could try adding faker to nodeModules:
const initDBLambda = new NodejsFunction(this, 'initDBFunction', {
  bundling: {
    externalModules: [],
    minify: true,
    nodeModules: ['faker'],
  },
  entry: `${lambdaPath}/init-db.ts`,
  handler: 'handler',
  memorySize: 3000,
  runtime: Runtime.NODEJS_12_X,
  timeout: Duration.minutes(15),
});
This will make faker external and include it as an install to node_modules. Doing this drops my bundle size to just over 300kb. Does it actually help with performance? We'd have to run much more extensive tests than I have here. The nodeModules prop is also useful for libraries that don't minify well, such as the previously referenced mysql client libs (mysql and mysql2 both have this issue).
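Since externalModules and nodeModules are easy to mix up, here is a toy model of where a dependency ends up under the settings used in this section. This is only a sketch of the behavior as described in this post, not CDK's implementation; placementOf, Placement and BundlingChoices are hypothetical names that do not exist in the CDK API.

```typescript
// Hypothetical model of the three outcomes described above. These names are
// illustrative only and are not part of aws-lambda-nodejs.
type Placement = "bundled" | "external" | "installed";

interface BundlingChoices {
  externalModules?: string[]; // left out of the bundle, expected to exist at runtime
  nodeModules?: string[];     // installed into node_modules next to the bundle
}

function placementOf(moduleName: string, choices: BundlingChoices = {}): Placement {
  // Assumption taken from this post: aws-sdk is external unless overridden.
  const externalModules = choices.externalModules || ["aws-sdk"];
  const nodeModules = choices.nodeModules || [];
  if (nodeModules.indexOf(moduleName) !== -1) return "installed";
  if (externalModules.indexOf(moduleName) !== -1) return "external";
  return "bundled";
}

// Defaults: aws-sdk is provided by the Lambda runtime, faker is bundled.
console.log(placementOf("aws-sdk"), placementOf("faker"));

// The settings from the last sample: nothing external, faker installed separately.
const lastSample: BundlingChoices = { externalModules: [], nodeModules: ["faker"] };
console.log(placementOf("aws-sdk", lastSample), placementOf("faker", lastSample));
```

The point of the model is that an external module must already exist in the runtime, while a nodeModules entry is still shipped with the function but kept out of the bundled file.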
Benchmarking #3
Okay time for something real! I've applied these lessons to my day job where I've got a cdk stack that delivers eleven Lambda Functions. We make heavy use of aws-sdk but import just the client libs. Due to the Docker-in-Docker and other performance issues, we elected not to use aws-lambda-nodejs for our stack but instead bundle separately with webpack. This has worked fairly well for us:
npm run build 146.97s user 6.78s system 534% cpu 28.781 total
11 functions build in less than 30 seconds. The sizes of the bundled functions range from 127kb to 530kb, and those bundles include aws-sdk client libraries. This is a pretty good outcome and we've been happy with it. We use new AssetCode() to point to the build output.
Despite the good results, I have wanted to return to aws-lambda-nodejs and save ourselves from having to maintain the separate webpack build. Now with esbuild, I think we might have a good opportunity. Updating to cdk 1.75.0 and switching to aws-lambda-nodejs with esbuild installed immediately gives me this result:
npm run synth 17.44s user 2.42s system 138% cpu 14.287 total
Not only am I getting this done in half the time, this is doing a little extra work in that it's synthing my entire stack instead of just bundling the functions like webpack was doing. That's quite an improvement. My bundle sizes are quite small due to externalizing aws-sdk by default. Sizes now range from just 108k to 176k. If I change the setting on externals so aws-sdk client libraries are bundled with my functions, sizes are back up to 135k to 532k, so slightly larger than what I had with webpack but not too bad. It would be interesting to know why my bundles are a few kb larger, but overall this does look like an improvement. Not only is my process simpler and cleaner, but it's also faster.
Note the benchmarks above are from cdk 1.75.0, but the API changed slightly for cdk 1.77.0. I've updated the code samples to conform with the new API but the benchmarks haven't moved significantly.
Next Steps
This module is still marked experimental and in fact the API has changed a bit, most significantly in the shift from using parcel as the bundler to esbuild. This means that you probably shouldn't use aws-lambda-nodejs if you aren't willing to accept a few minor API changes in the future. To me, it's a small price to pay to stay current. I'm really pleased with the improvements and attention to the community that the CDK team has shown in building and improving this module. Looking forward to what's next!
Cover: The Beagle Laid Ashore drawn by Conrad Martens (1834) and engraved by Thomas Landseer (1838)
Top comments (6)
Hey @elthrasher ,
Great post!
aws-lambda-nodejs now uses Parcel v2. It supports external modules (by default aws-sdk is considered as external and not bundled) and node modules (a list of modules that should be installed in the node_modules folder instead of being bundled by Parcel).
Hi Jonathan, thanks for reading and great work on that PR! I just gave it a try and it looks quite good. I'll update this post some time this week.
Hey, I managed to change to NodejsFunction and my function went from ~50MB to ~400KB. My only problem is debugging. I did manage to debug my lambda using Function, but with NodejsFunction I can't make it work: the function does run, but debugging doesn't stop on a breakpoint. Did you manage to debug it? There is a sourceMaps: true property for NodejsFunction. I tried playing with launch.json configurations and did something like "localRoot": "${workspaceRoot}/.cdk.staging/asset-bundle-G0UqOf" (where index.map.js is), but nothing happens (when localRoot can't find index.map.js it shows a message in the "debug" tab).
UPDATE
I managed to debug, but I need to put a breakpoint inside /.cdk.staging/asset-bundle-G0UqOf/index.js, which is a HUGE LONG file... not so user friendly.
Yeah, I wouldn't expect it to be very easy to put break points in bundled code. Are you running your function with SAM or something like that or just running it locally with node (not as Lambda)? If the latter, I suggest you just use something like ts-node and don't bundle it.
That's also a great use case for TDD where you write a test that you think will interact with your problematic code and shake it out that way. Sometimes TDD = test-driven debugging ;)
Yeah, with SAM:
sam local invoke lambdaHandler36FF0F39 -d 9999 -e events/1.json
Well, it's a CDK project I play with. I expect that as we move forward, cdk-serverless will be easier to develop and debug... isn't that the whole idea? :)
Thanks for writing this! I like the step by step and easy to grasp approach to performance. Kudos!