With a recent update to Azure Functions, it is now possible to run headless Chromium in the Linux Consumption plan. This enables some serverless browser automation scenarios using popular frameworks such as Puppeteer and Playwright.
Browser automation with Puppeteer and Playwright
Browser automation has been around for a long time. Selenium WebDriver was a pioneer in this space. More recently, Puppeteer and Playwright have gained in popularity. The two frameworks are very similar. Google maintains Puppeteer and Microsoft maintains Playwright. It's interesting to note that some of the folks who worked on Puppeteer are now working on Playwright.
Puppeteer and Playwright each support a different set of browsers. Both of them can automate Chromium. They automatically install Chromium and can use it without extra configuration.
Azure Functions support for headless Chromium
It's been a challenge to run headless Chromium on Azure Functions, especially in the Consumption (serverless) plan. Until now, the only way to run it has been by using a custom Docker image on the Premium plan.
Very recently, the necessary dependencies to run headless Chromium were added to the Azure Functions Linux Consumption environment. This means that we can simply npm install Puppeteer or Playwright in a Node.js function app to start using one of those frameworks to interact with Chromium.
Use Puppeteer and Playwright in Azure Functions
It's straightforward to run either Puppeteer or Playwright in Azure Functions. We install the framework with npm. Because the package is needed at run-time, we install it as a production dependency (under dependencies in package.json, not devDependencies). In the examples below, we use Puppeteer/Playwright with headless Chromium in an HTTP triggered function to open a web page and return a screenshot.
Puppeteer
# also installs and uses Chromium by default
npm install puppeteer

const puppeteer = require("puppeteer");

module.exports = async function (context, req) {
    const url = req.query.url || "https://google.com/";
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(url);
    const screenshotBuffer =
        await page.screenshot({ fullPage: true });
    await browser.close();

    context.res = {
        body: screenshotBuffer,
        headers: {
            "content-type": "image/png"
        }
    };
};
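One caveat with a handler shaped like this: if page.goto or page.screenshot throws, browser.close() is never reached and the Chromium process may linger in the worker. A minimal sketch of one way to guard against that is below. The helper name withBrowser is hypothetical (it is not part of Puppeteer or Playwright); it only assumes that the launch function resolves to an object with an async close() method, which both frameworks provide.

```javascript
// Hypothetical helper, not part of Puppeteer or Playwright.
// `launch` is any function returning a promise of a browser-like object
// with an async close() method, e.g. () => puppeteer.launch().
async function withBrowser(launch, work) {
  const browser = await launch();
  try {
    // Run the caller's logic and pass its result through.
    return await work(browser);
  } finally {
    // Runs whether `work` resolved or threw, so the browser is always closed.
    await browser.close();
  }
}
```

In the function above, the screenshot could then be taken with something like withBrowser(() => puppeteer.launch(), async (browser) => { ... }), returning the buffer from the callback.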
Playwright
Note: Playwright 1.4.0 requires some dependencies that are not installed in the Linux Consumption plan. Use 1.3.0 until this issue is resolved.
# the default playwright package installs Chromium, Firefox, and WebKit
# use playwright-chromium if we only need Chromium
npm install playwright-chromium@1.3.0

const { chromium } = require("playwright-chromium");

module.exports = async function (context, req) {
    const url = req.query.url || "https://google.com/";
    const browser = await chromium.launch();
    const page = await browser.newPage();
    await page.goto(url);
    const screenshotBuffer =
        await page.screenshot({ fullPage: true });
    await browser.close();

    context.res = {
        body: screenshotBuffer,
        headers: {
            "content-type": "image/png"
        }
    };
};
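Both handlers pass the url query parameter straight to page.goto. For anything beyond a demo, it may be worth checking that value first. The sketch below is one possible approach, not something from the original examples: the helper name parseTargetUrl is hypothetical, and restricting navigation to http and https is an assumption about what the function should allow.

```javascript
// Hypothetical helper: returns a normalized http(s) URL string, or null
// when the input cannot be parsed or uses another scheme (file:, data:, ...).
function parseTargetUrl(raw, fallback = "https://google.com/") {
  let parsed;
  try {
    // The WHATWG URL constructor throws a TypeError on invalid input.
    parsed = new URL(raw || fallback);
  } catch {
    return null;
  }
  const allowed = parsed.protocol === "http:" || parsed.protocol === "https:";
  return allowed ? parsed.href : null;
}
```

A handler could call parseTargetUrl(req.query.url) and set context.res to a 400 response when it returns null, before launching the browser at all.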
For the full source, take a look at this repo. When we run the function app locally and visit http://localhost:7071/api/screenshot?url=https://bing.com/, we get back a screenshot of the page.
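Passing the target address unencoded works for simple URLs like the one above, but a target that carries its own query string would be split at the first &. A small sketch of building the request URL with proper encoding follows; the helper name buildScreenshotUrl is hypothetical, and the /api/screenshot route is taken from the local URL shown above.

```javascript
// Hypothetical helper: builds the request URL for the screenshot function,
// percent-encoding the target so its own query string survives intact.
function buildScreenshotUrl(functionBase, targetUrl) {
  const endpoint = new URL("/api/screenshot", functionBase);
  endpoint.searchParams.set("url", targetUrl);
  return endpoint.toString();
}

// Example with the local Functions host address used in this article.
console.log(buildScreenshotUrl("http://localhost:7071", "https://bing.com/?q=a&b=c"));
```

On the function side, req.query.url should then contain the full target, including everything after its ?.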
Deploy to Azure
Since we're deploying to a Linux environment, we have to make sure we run npm install in Linux so it downloads a version of Chromium that matches the deployment target. Thankfully, Azure Functions supports remote build so that the app is built in the correct Linux environment during deployment, even though we might be developing locally in macOS or Windows.
Configuring VS Code for remote build
If we are deploying using Azure Functions Core Tools, we can skip this step.
By default, the Azure Functions VS Code extension will deploy the app using local build, which means it'll run npm install locally and deploy the app package. For remote build, we update the app's .vscode/settings.json to enable scmDoBuildDuringDeployment.
{
    "azureFunctions.deploySubpath": ".",
    "azureFunctions.projectLanguage": "JavaScript",
    "azureFunctions.projectRuntime": "~3",
    "debug.internalConsoleOptions": "neverOpen",
    "azureFunctions.scmDoBuildDuringDeployment": true
}
We can also remove the postDeployTask and preDeployTask settings that run npm commands before and after the deployment; they're not needed because we're running the build remotely.
And because we're running npm install remotely, we can add node_modules to .funcignore. This excludes the node_modules folder from the deployment package to make the upload as small as possible.
Creating a Linux Consumption function app
We can use any tool, such as the Azure Portal or VS Code, to create a Node.js 12 Linux Consumption function app in Azure that we'll deploy the app to.
Configuring Chromium download location (Playwright only)
By default, Playwright downloads Chromium to a location outside the function app's folder. In order to include Chromium in the build artifacts, we need to instruct Playwright to install Chromium in the app's node_modules folder. To do this, create an app setting named PLAYWRIGHT_BROWSERS_PATH with a value of 0 in the function app in Azure. This setting is also used by Playwright at run-time to locate Chromium in node_modules.
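If the setting is missing, the problem tends to surface only later as a browser launch failure. As a rough sketch, a function app could log a warning at startup when the value is not what this article expects. The helper name browsersInNodeModules is hypothetical and is not part of Playwright; it relies only on app settings being exposed to Node.js as environment variables.

```javascript
// Hypothetical startup check, not part of Playwright's API.
// Returns true when the app setting described above is present with value 0.
function browsersInNodeModules(env = process.env) {
  return env.PLAYWRIGHT_BROWSERS_PATH === "0";
}

if (!browsersInNodeModules()) {
  // A warning only; Playwright itself decides where it looks for browsers.
  console.warn("PLAYWRIGHT_BROWSERS_PATH is not set to 0; Chromium may not be found in node_modules.");
}
```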
Publishing the app
If using VS Code, we can use the Azure Functions: Deploy to Function App... command to publish the app. It'll recognize the settings we configured earlier and use remote build.
If using Azure Functions Core Tools, we need to run the command with the --build remote flag:
func azure functionapp publish $appName --build remote
And that's it! We've deployed a consumption Azure Functions app that uses Puppeteer or Playwright to interact with Chromium!
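To confirm that the deployed function really returns an image rather than an error page, one option is to check the first bytes of the response body against the PNG signature. The sketch below assumes the body has already been fetched into a Node.js Buffer by whatever HTTP client is at hand; the name looksLikePng is hypothetical.

```javascript
// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: true when `buffer` begins with the PNG signature.
function looksLikePng(buffer) {
  return (
    Buffer.isBuffer(buffer) &&
    buffer.length >= PNG_SIGNATURE.length &&
    buffer.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE)
  );
}
```

This only checks the file header, not that the screenshot shows the expected page, so it is best treated as a quick smoke test.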
Top comments (16)
Thanks for publishing this article it really helped me.
I went through the steps, except that I upload a zip from CI, since my function references files outside the functions root dir, and I get:
I wonder if I'm missing a step to configure the user access.
Hi @bennypowers , did you find fix for this?
IIRC you need to build it on the remote
I think you may also need to set an ENV var which specifies that chromium should install in the repo working dir, or something like that
Hi,
Thank you for a very helpful article! With your help I've been almost able to make playwright to work on linux consumption plan. The problem that still exists seems to be that the function does not have access to start the browser process:
Error: browserType.launch: Failed to launch: Error: spawn /home/site/wwwroot/node_modules/playwright-chromium/.local-browsers/chromium-907428/chrome-linux/chrome EACCES)
Googling shows that this problem might arise only when deploying through azure devops pipeline, which I'm doing.
Do you or anyone else have any ideas on what could be the solution for this problem?
Thousand thanks! 🙇‍♂️
Hi @ronkot , Did you get any solution for this? Please help me out. I am also facing this issue. Please have a look on the below error.
An unexpected error : { browserType.launch: Failed to launch: Error: spawn /home/site/wwwroot/node_modules/playwright-core/.local-browsers/chromium-930007/chrome-linux/chrome EACCES
Unfortunately I couldn't fix the issue. Finally I just gave up using playwright. I'd liked to use Playwright to print html page to pdf, but I refactored my system to use a distinct js-only mechanism to generate PDFs using pdfmake.
Hi Anthony,
I'm unable to get either the puppeteer/playwright functions to run successfully in Azure - they run fine locally, but once deployed, they fail with the below error:
Result: Failure
Exception: Worker was unable to load function GeneratePDF: 'Error: Cannot find module 'playwright-chromium'
Require stack:
- /home/site/wwwroot/GeneratePDF/index.js
- /azure-functions-host/workers/node/worker-bundle.js
- /azure-functions-host/workers/node/dist/src/nodejsWorker.js'
Stack: Error: Cannot find module 'playwright-chromium'
Require stack:
- /home/site/wwwroot/GeneratePDF/index.js
- /azure-functions-host/workers/node/worker-bundle.js
- /azure-functions-host/workers/node/dist/src/nodejsWorker.js
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:965:15)
    at Function.Module._load (internal/modules/cjs/loader.js:841:27)
    at Module.require (internal/modules/cjs/loader.js:1025:19)
    at require (internal/modules/cjs/helpers.js:72:18)
    at Object.<anonymous> (/home/site/wwwroot/GeneratePDF/index.js:1:22)
    at Module._compile (internal/modules/cjs/loader.js:1137:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1157:10)
    at Module.load (internal/modules/cjs/loader.js:985:32)
    at Function.Module._load (internal/modules/cjs/loader.js:878:14)
    at Module.require (internal/modules/cjs/loader.js:1025:19)
Any ideas?
@Darshan - did you ever manage to resolve this? I'm getting the same thing.
Edit: I solved it using some of the advice here (stackoverflow.com/questions/639499...), and by making sure I deployed by right-clicking on my project in VS Code and choosing "Deploy to Function App..." rather than doing it at the command-line.
@tom , no I didn't in the end - but glad you got it to work in the end. When I get a chance I will try and get mine to work as well!
When deploying from VSCode, "azureFunctions.scmDoBuildDuringDeployment": true conflicts with
Error: Run-From-Zip is set to a remote URL using WEBSITE_RUN_FROM_PACKAGE or WEBSITE_USE_ZIP app setting. Deployment is not supported in this configuration.
Error: Linux consumption plans only support zip deploy. See here for more information.
I'm not sure if I missed a step.
Is this a new Linux Consumption plan that you're deploying to? I didn't run into this with a new app. This is likely because one of those settings was previously added to the app. You should be able to delete that app setting and try again. Let me know if it works.
Hi Anthony,
Is there a similar way to get this working using selenium webdriver? Things work fine locally but once deployed, I keep running into 'Exception: WebDriverError: unknown error: cannot find Chrome binary'
Any idea whether we can do remote build from Azure Devops Release Pipeline?
Is this working same way for v4 functions and node 16 lts?
The article was amazing.
Can I use this solution with a function running in a kubernetes cluster with Keda?
Hi Anthony. Is there any ETA for adding the ability to run headless Chromium from App Service Plans?