I was using another great library called Puck, which is basically a very customisable editor that can create sites/newsletters/PDFs (something visual), and I came to the next step: turning the editor's output into a PDF.
The best option out there for HTML-based PDF generation seemed to be Puppeteer. I didn't have a server running anywhere; so far I had mostly been creating small Lambdas for any server-side functionality in my app, so I wanted to leverage Lambda for this use case as well.
It turned out to involve a bit more trial and error than expected, since Puppeteer relies on Chromium and that doesn't work great out of the box in serverless environments, but here is how I got it working.
First, install the required dependencies in a new Node.js project (initialised using npm init -y):
npm install puppeteer-core@23.9.0
npm install @sparticuz/chromium@131.0.1
Since we're going to be deploying this into a Lambda environment, the regular puppeteer package will not work (at least it didn't for me), so we install the core library together with a build of Chromium that works better in Lambda.
The versions are quite important here, so I'm locking them in.
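As an aside (this is my own convention, not something the packages require): since @sparticuz/chromium only ships Linux binaries, a common pattern is to branch on the AWS_LAMBDA_FUNCTION_NAME environment variable, which Lambda sets automatically, and fall back to a locally installed browser during development. The paths below are hypothetical placeholders:

```javascript
// Sketch: choose Puppeteer launch options depending on where the code runs.
// AWS_LAMBDA_FUNCTION_NAME is set automatically inside the Lambda runtime.
function launchOptions(chromiumArgs, lambdaExecutablePath, localChromePath) {
  const isLambda = Boolean(process.env.AWS_LAMBDA_FUNCTION_NAME);
  return isLambda
    ? { args: chromiumArgs, executablePath: lambdaExecutablePath, headless: true }
    : { args: [], executablePath: localChromePath, headless: true };
}

// Outside Lambda, this resolves to the local browser path:
console.log(launchOptions([], "/tmp/chromium", "/usr/bin/chromium").executablePath);
```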
Create an index.js file with the following contents:
import puppeteer from "puppeteer-core";
const chromium = require("@sparticuz/chromium"); // mixing 'import' and 'require' looks odd, but I was using TypeScript so everything compiled down to CommonJS anyway; @sparticuz/chromium didn't support ESM
chromium.setHeadlessMode = true;
chromium.setGraphicsMode = false;
// Lambda invokes this exported handler (it matches CMD ["index.handler"] in the Dockerfile below)
export const handler = async () => {
const browser = await puppeteer.launch({
args: chromium.args,
defaultViewport: chromium.defaultViewport,
executablePath: await chromium.executablePath(),
headless: chromium.headless,
});
const page = await browser.newPage();
await page.setContent("<html><p>Hello world</p></html>");
const pdf = await page.pdf({ format: "A4" });
await browser.close();
return pdf;
};
Running this code locally should yield a positive result. To put it into AWS Lambda, we need to create a Dockerfile, since we'll ship the Lambda as a container image, which should make things simpler. One benefit is that we don't need to put the Chromium binary in a Lambda layer, but other than that it's really up to you as the developer whether you prefer a vanilla Lambda or the container-image approach.
Here's our simple Dockerfile:
# Use the official Amazon Linux image for Lambda
FROM public.ecr.aws/lambda/nodejs:18
# Set working directory
WORKDIR ${LAMBDA_TASK_ROOT}
COPY . .
RUN npm install
RUN npm run build # I was using TypeScript, so here I run the build command 'tsc'; you can omit this if you're writing plain JS
# Your CMD or ENTRYPOINT command
CMD ["index.handler"]
With the Dockerfile and index.js file at the ready, we can build our Docker image using the command below (make sure Docker is installed):
docker build --platform linux/amd64 -t <name> .
Note that we are targeting linux/amd64, which is the default Lambda environment.
I'm on an M1 MacBook Pro locally, so for me this flag is a requirement; if you're already on an x86 Linux machine you may not strictly need it, but it doesn't hurt to be explicit.
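If you want to double-check what you built (this is just a generic Docker command, nothing Lambda-specific), you can inspect the image's architecture:

```shell
# Should print "amd64" if the --platform flag took effect
docker inspect --format '{{.Architecture}}' <name>
```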
The next steps require you to have an AWS Lambda setup in place and some knowledge of configuring ECR and Lambda; covering those in full would make this guide lengthy, so I'll leave you with a few words of advice on the next steps:
- After building the docker image, tag docker image using command:
docker tag <name>:latest <aws_account>.dkr.ecr.<aws_region>.amazonaws.com/<ecr_repo>:latest
- Push docker image
docker push <aws_account>.dkr.ecr.<aws_region>.amazonaws.com/<ecr_repo>:latest
(These commands are also visible in AWS console when creating your ECR repo)
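One step the console snippets include that's easy to miss: before docker push can succeed, you need to authenticate Docker against your ECR registry, along these lines (same placeholder conventions as above):

```shell
aws ecr get-login-password --region <aws_region> \
  | docker login --username AWS --password-stdin <aws_account>.dkr.ecr.<aws_region>.amazonaws.com
```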
- Update lambda with image using following AWS CLI command (assumes the AWS CLI is installed on your machine). You can also manually update in the console if you prefer:
aws lambda update-function-code \
--function-name <lambda_name> \
--image-uri <aws_account>.dkr.ecr.<aws_region>.amazonaws.com/<ecr_repo>:latest \
--profile default \
--region <aws_region>
- Ensure the Lambda has at least 1024 MB of memory and a timeout of at least 30 seconds.
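If you'd rather script that last bit too, memory and timeout can be set via the AWS CLI (same placeholder conventions as above):

```shell
aws lambda update-function-configuration \
  --function-name <lambda_name> \
  --memory-size 1024 \
  --timeout 30 \
  --region <aws_region>
```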
Hope this was useful!