Hi folks, let's talk about cold starts in cloud services. I'm focusing on AWS cold starts, covering:
- The background details of cold starts
- How cold starts begin
- What are the effects of cold starts
- How I tried to prevent cold starts
- Further improvements
I'll try my best to provide you with up-to-date details in this article. So, let's begin 😎
The background details of cold starts
If you work in a software company that uses cloud services in production, you've probably heard developers talk about cold starts. If not, now is the time to learn everything about them. Managing cold starts in the cloud reduces service costs and the time it takes to execute HTTP requests.
The term cold start is tied to Function-as-a-Service offerings in cloud services, and this article provides a pretty good comparison and analysis of cold starts in serverless functions across AWS (Lambda), Azure (Functions), and GCP (Functions).
How cold starts begin
A cold start comes down to how the Lambda service architecture is implemented in AWS. Normally, when we send an API request to a lambda, the Lambda service works in the following order:
- Is there a free execution environment? (Container with runtime)
- If not, create an execution environment
- Download the lambda code
- Initialize (run the code outside the handler function)
- Run the handler function code
Steps 2, 3, and 4 cause cold starts.
Extracted from AWS re:Invent 2023 - Demystifying and mitigating AWS Lambda cold starts (COM305)
Then what's a lambda warm start
These execution environments (lambda instances) remain alive for only about 10 to 15 minutes after the first run (the cold start) of the handler function. If another request arrives during that time, it uses the already-built execution environment. This is the warm start. BUT if another request arrives while the existing environment is busy with a request, the Lambda service has to create another execution environment (look at the steps of the Lambda service working order above and you can understand this process). The lambda's concurrency then increases by 1 (total concurrency = 1 + 1 = 2).
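To see this in code, here's a minimal sketch of a NodeJS lambda handler (the names are mine, not from any real project). Code at module scope runs once per execution environment, during the Initialize step of a cold start, while the handler body runs on every invocation, so a module-scope counter reveals warm starts:

import { APIGatewayProxyHandler } from "aws-lambda";

// Module scope: runs once per execution environment (during the cold start).
// Expensive setup (SDK clients, DB connections) belongs here so that warm
// starts can skip it.
const initializedAt = new Date().toISOString();
let invocationCount = 0;

export const handler: APIGatewayProxyHandler = async () => {
  // Handler scope: runs on every invocation, cold or warm.
  invocationCount += 1;
  return {
    statusCode: 200,
    // invocationCount > 1 means this environment was reused (a warm start).
    body: JSON.stringify({ initializedAt, invocationCount }),
  };
};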
PS: When you're working with lambda functions, you'll probably also work with AWS CloudWatch, where log groups and log streams can sometimes be confusing. One log group is created per lambda function, and one log stream is created per lambda instance in that function (source), in other words, whenever a cold start begins. So one CloudWatch stream includes all the details of that lambda instance: cold start details, warm starts, cached requests, logs, errors, etc. AWS X-Ray works on the logs output by the CloudWatch service.
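To see that mapping yourself, here's a hedged sketch using the AWS SDK v3 for JavaScript; the function name, the region, and the /aws/lambda/<function-name> log group naming convention are the only assumptions:

import {
  CloudWatchLogsClient,
  DescribeLogStreamsCommand,
} from "@aws-sdk/client-cloudwatch-logs";

// One log stream corresponds to one lambda instance, so listing the streams
// of a function's log group shows how many instances (cold starts) existed.
// "my-function" and the region are placeholders.
async function listLambdaInstances(): Promise<void> {
  const client = new CloudWatchLogsClient({ region: "us-east-1" });
  const { logStreams } = await client.send(
    new DescribeLogStreamsCommand({
      logGroupName: "/aws/lambda/my-function",
      orderBy: "LastEventTime",
      descending: true,
    })
  );
  for (const stream of logStreams ?? []) {
    console.log(stream.logStreamName, stream.lastEventTimestamp);
  }
}

listLambdaInstances();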
What are the effects of cold starts
You might wonder why we are focusing on lambda cold starts. Because a cold start means the relevant lambda instance is starting up, the HTTP request must wait until the instance is warmed up, which may result in a few seconds of latency in the frontend website / mobile app. This leads to a poor user experience. Moreover, if your application is a response-critical one, such as a banking app, an ecommerce website, or a stock market application, this could cause a significant problem.
How I tried to prevent cold starts
I used webpack to reduce the cold start time of AWS lambda functions. Normally, we use webpack for bundling in single-page applications, typically with frontend frameworks such as Angular CLI, React, and more. Here, we use webpack's bundling, minifying, and tree-shaking features to bundle AWS NodeJS lambda functions.
My approach to applying webpack in microservices:
- Install webpack and the other relevant dependencies
npm install --save-dev webpack webpack-cli ts-loader webpack-node-externals glob
- Configure webpack.config.ts, with index.ts as the entry point for webpack:
import * as path from "path";
import { Configuration } from "webpack";
import nodeExternals from "webpack-node-externals";
const config: Configuration = {
entry: "./index.ts",
target: "node",
mode: "production",
module: {
rules: [
{
test: /\.ts$/,
use: "ts-loader",
exclude: /node_modules/,
},
],
},
resolve: {
extensions: [ ".ts", ".js" ],
},
output: {
filename: "[name].js",
path: path.resolve(__dirname, "lib"),
libraryTarget: "commonjs2",
},
externalsPresets: { node: true }, // Use externalsPresets to specify Node.js environment
externals: [nodeExternals()], // Use the function to exclude node_modules
};
export default config;
- Modify package.json as:
"scripts": {
"build": "webpack --config webpack.config.ts"
}
- Update tsconfig.json: add webpack.config.ts to the include array to ensure TypeScript includes the webpack config file when compiling.
{
"compilerOptions": {
"module": "commonjs",
"moduleResolution": "node",
"esModuleInterop": true,
"pretty": true,
"sourceMap": true,
"allowJs": true,
"target": "es6",
"outDir": "./lib",
"baseUrl": "./",
"types": ["chai", "node"],
},
"include": [
"./**/*",
"test/.mocharc",
"webpack.config.js"
],
"exclude": [
"node_modules", "lib"
]
}
After making these changes you can build your code, and you'll clearly see how your JS files are minified and their dependencies tree-shaken. Here, I reduced the output folder's package size by nearly 90% by applying webpack to my lambda functions. Keep in mind that this might vary according to your architecture and other factors, but as a result of bundling the code with webpack, the final output folder package size should drop. Next, I went to check webpack's impact on lambda execution time.
How I conducted my tests:
I used 2 API calls to check the impact of webpack.
- GET ALL - Get All Users
- GET - Get User by ID
After sending a couple of API requests to the AWS lambda functions, I went to CloudWatch to see the results. Later I used AWS X-Ray, because it clearly shows only the relevant results for a set of API calls (for this, you should first enable X-Ray on your lambda function/s).
This image shows a set of API requests I sent to an instance of my lambda function. As I mentioned earlier in this article, every instance starts with a cold start, and all the later invocations are warm starts. You can figure this out from the response time. Meanwhile, requests might be cached for a couple of reasons.
Caching
When we send the same request a couple of times, the response becomes cached. Caching may occur for a variety of reasons, including:
Lambda Container Reuse: AWS Lambda may reuse the same container for multiple invocations, which can lead to data persistence across invocations if our code uses global variables or similar constructs.
API Gateway Caching: If we’re invoking our Lambda through API Gateway, it might be caching responses.
Client-side Caching: Tools like Postman might cache responses based on headers.
So we obtain responses without even a warm start, and caching saves Lambda service resources by returning the prior result without activating the lambda instance. You can clearly see how the response time drops once responses are cached.
Since our goal is to check lambda warm starts, we can prevent caching by sending requests with a 2 to 3 minute delay between two API requests. Otherwise, to test the lambda functions without interference from a cache, we can try the following approaches:
Disable API Gateway Caching: Ensure that caching is disabled in API Gateway settings.
Unique Query Parameters: When testing with Postman or similar tools, you can add a unique query parameter to each request. This approach can prevent client-side and intermediate caching. For example, append a timestamp or a random number as a query parameter (see the sketch after this list).
Avoiding Caching in Postman: If you suspect Postman might be caching responses, you can disable caching in Postman settings or use a different tool for testing, like curl in the command line.
Scheduled Invocations:
- To invoke Lambda functions automatically without API calls, you can use AWS CloudWatch Events (or EventBridge).
- Set up a rule to trigger your Lambda function at regular intervals.
- This approach can be useful for simulating traffic and understanding Lambda behavior over time.
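Here's a minimal sketch of the unique-query-parameter idea from the list above (the endpoint URL is a placeholder); a fresh timestamp per request defeats client-side and intermediate caching, so every request actually reaches the lambda. A scheduled-invocation sketch appears later, under Solution 01.

// Placeholder endpoint; replace with your own API Gateway URL.
const BASE_URL = "https://example.execute-api.us-east-1.amazonaws.com/prod/users";

// Appending a unique "t" query parameter per call prevents the client and
// any intermediate layer from serving the response out of a cache.
async function fetchWithoutCache(): Promise<unknown> {
  const response = await fetch(`${BASE_URL}?t=${Date.now()}`, {
    headers: { "Cache-Control": "no-cache" },
  });
  return response.json();
}

fetchWithoutCache().then((users) => console.log(users));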
When checking X-Ray logs for lambda functions, I should specifically mention that there are two nodes, called function and context.
Lambda Context and Function - X-RAY
Function Node: the actual execution of the lambda function's code, in other words, the execution time of the function logic itself. This includes the time taken by your code and any libraries it uses.
Context (or Initialization) Node: related to the "initialization" or "bootstrap" phase of the lambda function, i.e., the time AWS Lambda spends initializing the execution environment.
A few cold start timings that I obtained via X-Ray are included in the table below. Ultimately, although my output folder package size was reduced, neither the lambda execution time nor the cold start time improved 😑. Since I may have made quite a few mistakes, please leave a comment if you find anything wrong with my process. 🫡
CS - Cold Start
Optional: For analyzing AWS X-Ray logs and monitoring the performance impact of applying Webpack to AWS microservices, especially in terms of Lambda cold starts and warm starts, there are several third-party tools and services that might be useful. These tools offer more advanced analytics and visualization capabilities than what's available directly in AWS X-Ray or CloudWatch.
If you're on the GitHub Student Developer Pack, you can get free credits for some of these services.
Further improvements
Later on, I discovered that lambda memory size may also affect cold starts, as well as the AWS monthly bill. We can find the optimal memory size for each of our lambdas using AWS Lambda Power Tuning, and this article explains well how to test our lambda functions with the power tuning tool. Although webpack might not have much impact at your current lambda memory size, it might show a significant impact at a different memory size after it's applied.
Furthermore, we can follow a couple of solutions. I took the images from this video for solutions 1 and 2.
Solution 01 - Using CloudWatch Event Rule
Here we can ping a lambda on a schedule (e.g., every 10 or 15 minutes) to keep an execution environment alive. Lambda might still remove this execution environment after 1 to 1.5 hours, but until then, requests can use the execution environment kept warm by the scheduled event. In this way we can reduce the number of cold starts.
We can use serverless-plugin-warmup for this purpose.
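If you're not using the Serverless Framework, here's a minimal sketch of the same ping idea with AWS CDK v2; the function name, handler path, and the 10-minute rate are assumptions, and serverless-plugin-warmup does roughly this for you:

import * as cdk from "aws-cdk-lib";
import * as events from "aws-cdk-lib/aws-events";
import * as targets from "aws-cdk-lib/aws-events-targets";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { Construct } from "constructs";

export class WarmupStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The lambda we want to keep warm (handler path and asset dir are assumptions).
    const usersFn = new lambda.Function(this, "UsersFn", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lib"),
    });

    // EventBridge rule that pings the lambda every 10 minutes, so a warm
    // execution environment is usually available for real requests.
    new events.Rule(this, "WarmupRule", {
      schedule: events.Schedule.rate(cdk.Duration.minutes(10)),
      targets: [new targets.LambdaFunction(usersFn)],
    });
  }
}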
Solution 02 - Lambda Provisioned Concurrency
For each AWS account, we get 1,000 units of concurrency. So, when we apply provisioned concurrency, it is deducted from our account's quota.
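As a sketch (inside the same stack constructor as the Solution 01 example; the alias name and the count of 5 are my assumptions), provisioned concurrency is configured on a version or alias rather than on the bare function:

// Keep 5 execution environments initialized ahead of time for a "live"
// alias; these 5 are deducted from the account's concurrency quota.
new lambda.Alias(this, "LiveAlias", {
  aliasName: "live",
  version: usersFn.currentVersion,
  provisionedConcurrentExecutions: 5,
});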
Cost comparison between solution 1 and solution 2
This basically explains the cost comparison for a lambda function with 100 instances. With lambda pinging, AWS charges the normal rate for the Lambda service. But with provisioned concurrency, AWS charges an additional cost for the explicitly warm environments, and it keeps charging until we disable provisioned concurrency.
Best Practices
- Do NOT apply provisioned concurrency to all your lambdas. Apply it only to frequently invoked lambdas.
- Use provisioned concurrency based on a schedule (Scheduled scaling for Application Auto Scaling)
- You can apply lambda pinging for less frequently used lambdas. Lambda pinging can further be configured with a CRON expression to minimize cost even more, for example, pinging the lambda from Monday to Friday, 8AM to 5PM (see the sketch below).
- Identify the best memory allocation for your lambda: use tools like Lambda Power Tuning.
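As a sketch of that CRON idea (extending the Solution 01 CDK example; the exact hours are just the example above, and EventBridge cron runs in UTC):

// Ping every 10 minutes, but only Monday to Friday between 8AM and 5PM.
new events.Rule(this, "BusinessHoursWarmup", {
  schedule: events.Schedule.cron({
    minute: "0/10",
    hour: "8-17",
    weekDay: "MON-FRI",
  }),
  targets: [new targets.LambdaFunction(usersFn)],
});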