Puppeteer may currently be the best-known headless browser automation library out there. It provides a high-level Node.js API that lets you spin up a Chromium or Chrome browser instance and send commands to it. It has proven itself to be easy to install, simple to use and performant.
Some Backstory 📖
Puppeteer works as a thin layer on top of the DevTools Protocol.
The DevTools Protocol is what gives you the power to do all the cool stuff in the actual "Inspect Element" toolbar in your browser. It is the same protocol that powers the developer tools of Blink-based browsers (Chrome, Chromium etc.), providing DOM inspection, network profiling, debugging and all the other cool capabilities we have access to.
With Puppeteer you can do almost anything you can do in the actual browser, no hacks required.
Puppeteer lives under the Google Chrome umbrella and is maintained by the Chrome DevTools team. That fact alone should give you some confidence about the long-term sustainability of the project. It also keeps the library up to date with the latest features shipped in the Chromium/Chrome browsers, so you will not usually have to wait for a feature to be ported to the library.
So let's get to it!👷
Get The Library
First, make sure you are on a machine with Node.js >=v10.18.1 installed so we can go with the latest Puppeteer version.
Make a new project folder called puppeteer-example so we can start going through the process.
mkdir puppeteer-example
cd puppeteer-example
Now we can go ahead and bootstrap the required Node.js setup.
npm init -y
With this you are ready to install your favorite libraries like left-pad or browser-redirect but you can skip it for now 😂. Back to our target:
npm install puppeteer@4
While installing the library, you probably came across a message on your console stating Downloading Chromium xxx. That message is there to let you know that with the Puppeteer library, a specific version of Chromium for your operating system is also downloaded (inside node_modules) to be used by your installation of Puppeteer. The reason for that is every Puppeteer version is only guaranteed to work with a specific Chromium version it comes bundled with.
Special Hint: If you are a bit disk-space constrained, delete the node_modules directory from your test or unused Puppeteer projects after you are done.
First Encounter🤞
We got through the installation and now we can start writing some code. You will probably be surprised by how much you can do with a few lines of code.
For our first task, we will try to explore the official Puppeteer website https://pptr.dev/.
Create a test file index.js with the following contents:
const puppeteer = require("puppeteer");
(async function () {
  const browser = await puppeteer.launch({ headless: false }); // We use this option to go into non-headless mode
  const page = await browser.newPage(); // Create a new page instance
  await page.goto("https://pptr.dev"); // Navigate to the pptr.dev website
  await page.waitFor(5000); // Wait for 5 seconds to see the beautiful site
  await browser.close(); // Close the browser
})();
Now by running this code using node index.js you will witness a Chromium instance launching, navigating to the pptr.dev website and staying there for 5 seconds before closing down.
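One thing the script above does not handle is failure: if page.goto throws, browser.close() is never reached and the Chromium process may linger. Here is a minimal sketch of one way to guard against that, assuming a hypothetical helper called withBrowser (my own name, not part of Puppeteer) that receives the puppeteer object as an argument:

```javascript
// Hypothetical helper, not part of the Puppeteer API.
// `launcher` is anything with a launch() method that resolves to an object
// exposing close(); the `puppeteer` module fits that shape.
async function withBrowser(launcher, launchOptions, task) {
  const browser = await launcher.launch(launchOptions);
  try {
    // Hand the browser to the caller and pass through whatever the task returns
    return await task(browser);
  } finally {
    // Runs whether the task succeeded or threw, so the browser is always closed
    await browser.close();
  }
}

// Usage with the real library (commented out so this sketch has no dependencies):
// const puppeteer = require("puppeteer");
// withBrowser(puppeteer, { headless: false }, async (browser) => {
//   const page = await browser.newPage();
//   await page.goto("https://pptr.dev");
// });
```

Passing the launcher in as a parameter is just a convenience here: it lets you try the helper with a stub object before pointing it at a real browser.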
I am sure that this now feels like a comfortable place for web automation enthusiasts. The only things missing are the scenarios you need to run and a feel for the intuitive and simple API that Puppeteer advertises.
Why not take a look?
Exploring a Simple Scenario 🕵
Skipping the pleasantries, our aim will be to explore the autocomplete search functionality that the pptr.dev website offers for our convenience.
Thinking Out Loud
So let us describe what an actual user needs to do for this autocomplete feature to achieve its purpose.
We expect the user to:
1. Open the page
2. Try to find the autocomplete search
3. Type his query for the API method he is looking for
4. Click the most relevant result on the list
5. Expect to see the section with the item he selected
To test out if the Puppeteer API is as intuitive as it claims to be, we can go ahead and translate this thinking to Puppeteer commands.
/* Somewhere else... */
const assert = require("assert");
const Homepage = {
  autocompleteSearchInput: "input[type='search']",
};
const apiSearchTerm = "metrics"; // The API method we are looking for
/* ... */
await page.goto("https://pptr.dev");
await page.waitForSelector(Homepage.autocompleteSearchInput);
await page.type(Homepage.autocompleteSearchInput, apiSearchTerm);
await page.click("search-item");
// Find the API name using XPath
// page.$x() resolves to an array, so await it first and then take the first match
const [$apiMethod] = await page.$x(
  "//api-method-name[text()='" + apiSearchTerm + "']"
);
// Check if this method name section is actually visible on the viewport
const isApiMethodVisible = await $apiMethod.isIntersectingViewport();
assert.equal(isApiMethodVisible, true);
Well that was it! 🎉
The code above, housekeeping included, seems pretty straightforward to me given the thinking process we laid out. I do not think I even need to explain what most of the commands contribute. The API translates to clear language without relying on other external abstractions.
One point worth dwelling on is the combination of commands used to detect whether the API method we were looking for is actually inside the browser viewport. People with experience in the field know that to assert this you would either create your own custom command (doing viewport dimension calculations) or rely on a framework command that has already been implemented for you.
The differentiating factor here is that the command we get directly from Puppeteer could be considered the most reliable, just from the fact that it is provided by the platform itself.
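For comparison, the custom-command route mentioned above usually boils down to a rectangle overlap check. Here is a rough sketch, assuming you already have the element's box in viewport coordinates (the shape getBoundingClientRect() gives you inside the page) and the viewport size. The function name isBoxInViewport is made up for illustration, and this is not a description of how Puppeteer implements isIntersectingViewport() internally:

```javascript
// Illustrative helper, not part of Puppeteer.
// `box` is { x, y, width, height } in viewport coordinates,
// `viewport` is { width, height }.
function isBoxInViewport(box, viewport) {
  if (!box || box.width <= 0 || box.height <= 0) return false; // nothing rendered
  const overlapsHorizontally = box.x < viewport.width && box.x + box.width > 0;
  const overlapsVertically = box.y < viewport.height && box.y + box.height > 0;
  return overlapsHorizontally && overlapsVertically;
}

const viewport = { width: 800, height: 600 };
console.log(isBoxInViewport({ x: 10, y: 20, width: 100, height: 30 }, viewport)); // true
console.log(isBoxInViewport({ x: 10, y: 900, width: 100, height: 30 }, viewport)); // false, below the fold
```

Even this small version has to make decisions (partial overlap counts as visible, zero-sized boxes do not), which is exactly the kind of detail it is nice to leave to the platform.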
One or Two Things Missing 🙈
Now that we all agree that the API is rather intuitive and simple to use, we can go over a couple of things that might seem to be "missing" and would make our development experience a tad better.
1) Filling your code with the async keyword
As you have definitely observed, there is this async keyword you have to sprinkle all around your code, and it feels a bit noisy for me at least. This keyword is required because of the event-driven nature of the browser APIs. The way to code around asynchronous and event-driven platforms in JavaScript is by using Promises to model your operations, and Puppeteer has done just that.
To make handling those asynchronous operations a bit less painful, JavaScript added some new keywords to the language syntax. These are the async and await keywords that you see in our code. Because Puppeteer's API is built on Promises, the best way to write our code is to use this async/await syntax for most commands.
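To make the trade-off concrete, here is a small dependency-free sketch that runs the same three steps twice, once with raw Promise chaining and once with async/await. The fake page object is a stand-in invented for this example; it only mimics the Promise-returning shape of Puppeteer's page methods and does not drive a browser.

```javascript
// Stand-in for a Puppeteer page: each method just records the call and
// returns a Promise, like the real goto/type/click do.
function createFakePage() {
  const calls = [];
  const record = (name) => async (...args) => {
    calls.push([name, ...args].join(" "));
  };
  return { calls, goto: record("goto"), type: record("type"), click: record("click") };
}

// Promise chaining: every step lives in its own callback
function searchWithThen(page) {
  return page
    .goto("https://pptr.dev")
    .then(() => page.type("input[type='search']", "metrics"))
    .then(() => page.click("search-item"));
}

// async/await: the same steps read top to bottom like synchronous code
async function searchWithAwait(page) {
  await page.goto("https://pptr.dev");
  await page.type("input[type='search']", "metrics");
  await page.click("search-item");
}

const pageA = createFakePage();
const pageB = createFakePage();
Promise.all([searchWithThen(pageA), searchWithAwait(pageB)]).then(() => {
  // Both styles issue the same commands in the same order
  console.log(pageA.calls.join(" | ") === pageB.calls.join(" | ")); // true
});
```

Both functions do the same work; the second one is simply closer to how we described the user's steps earlier.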
2) No chaining available yet
Due to some design decisions and the Promise-based nature of the library mentioned in the point above, there is currently no support for what we can call method chaining. With this capability our code could become so much more fluent to read and follow. Picture something like:
await page.$("input[type='search']").click().type("metrics").submit();
I cannot vouch for any of them, but I think there are some third-party library solutions you can try. If you want to dig into the current state of things and the possible external solutions, you can start by taking a look at one relevant GitHub issue.
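To give a feel for what such a wrapper could look like, here is a minimal sketch of a chainable call queue built on plain Promises and a Proxy. Everything in it (the chain helper, the fakeInput object) is hypothetical and written for this post. It is not a Puppeteer feature, and I am not claiming it is how any specific third-party library does it.

```javascript
// Hypothetical helper: wraps an object (or a Promise of one) whose methods
// return Promises and lets you queue calls fluently.
// Each queued call waits for the previous one to finish.
function chain(target) {
  let queue = Promise.resolve(target); // resolves to the wrapped object
  const proxy = new Proxy({}, {
    get(_, name) {
      // Make the chain awaitable: awaiting it waits for every queued call
      if (name === "then") {
        return (resolve, reject) => queue.then(() => undefined).then(resolve, reject);
      }
      // Any other property is treated as a method to call on the wrapped object
      return (...args) => {
        queue = queue.then(async (obj) => {
          await obj[name](...args);
          return obj;
        });
        return proxy;
      };
    },
  });
  return proxy;
}

// Stand-in for an element handle, so the sketch runs without a browser
const log = [];
const fakeInput = {
  click: async () => { log.push("click"); },
  type: async (text) => { log.push("type " + text); },
};

chain(fakeInput).click().type("metrics").then(() => console.log(log.join(", ")));
// With Puppeteer the call might look like (untested against a real page):
// await chain(page.$("input[type='search']")).click().type("metrics");
```

The sketch ignores return values and error reporting per step, which a real solution would need to think about.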
Closing
You just got through a super fast introduction on how to set up Puppeteer and code a simple scenario for an autocomplete search. From here on out you are on your own, except for all the recipes that will come on The Home of Web Automation.
My suggestion would be to start experimenting on your own use case and as a bedtime story, go over the detailed API documentation on GitHub. It is almost certain you will find a couple of surprising things you did not expect to do using the native commands.
Cross posted from The Home of Web Automation
Top comments (5)
Thank you for this excellent article. How does this compare to Cypress?
Thank you for your kind words! :)
Lots of things to say here but at least as a low-level comparison, I would refer you to a recent video youtube.com/watch?v=emWHeODwcQY
On a higher level Puppeteer is a library (the core if you may) of a web automation program that could evolve in a framework,
while Cypress is a full featured out of the box framework (with high customization capacity). At least in my eyes
I like the library concept. Cypress, while lighter than Selenium, is heavy with Electron, in my opinion. I love the ease of use of Cypress, but I don't like the jQuery architecture. This is why this article was interesting to me. I looked at Puppeteer early on and didn't see the value over Selenium at the time. However, I'm thinking now that it's a Cypress killer due to its library-like way of doing things.
One last point, the Google chrome team uses Puppeteer to validate Chrome releases!
BTW Selenium just released a major version with lots of changes. It'll be interesting to see if Cypress and Puppeteer have set the standards. If they don't put in things like intercepting outbound and inbound HTTP requests they aren't thinking right.
Thanks for sharing your thoughts!
I am gonna cover the new version of Selenium in another post soon and also try to address why some "features" that we have in Puppeteer might be hard to get in Selenium with the current architecture.
Since you are interested in the library-style of tools, I would suggest you also take a look at a similar article introducing Playwright
thehomeofwebautomation.com/getting...
Also I hope you find some more interesting stuff there!
Yes I saw PlayWright earlier this week. Wow, things are moving fast for sure....