So you've written a Playwright automation, and now you want to make it fast.
One of the best ways to do this is to block the many requests the browser makes after the initial page load. Depending on your needs, you may not need the browser to download all those images, fonts, or styles. You may also find it worthwhile to keep the page from loading trackers like Google Tag Manager, which adds up to real savings over the course of thousands of requests.
This article will go through five fast and effective strategies for blocking network traffic in Playwright. If you like what you see, check out my huge deep dive on blocking resources. It has more ideas for blocking network traffic as well as blocking inline assets from loading, embedding an adblocker in your headless browser, and optimizing the browser itself using CLI args and feature flags.
Without further ado...
1. Block Requests on a Single Page
Take control of network traffic with Playwright's page.route() method, specifying patterns of resources that should not be loaded:
// block all CSS files from loading on this page
await page.route('**/*.css', (route) => {
  route.abort();
});
You can block any type of file, from fonts to GIFs, by changing the pattern. You can also adjust which resources are blocked or allowed while the script runs, switching strategies mid-script to suit your needs.
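As a rough sketch of switching strategies mid-script: a handler registered with page.route() can later be removed with page.unroute(). The helper name `withBlockedStyles` is hypothetical, not part of Playwright, and it assumes `page` is a Playwright Page.

```javascript
// Hypothetical helper: block CSS only while `task` runs, then restore normal loading.
// `page` is assumed to be a Playwright Page (anything with route/unroute works).
async function withBlockedStyles(page, task) {
  const pattern = '**/*.css';
  const handler = (route) => route.abort();

  await page.route(pattern, handler);
  try {
    return await task();
  } finally {
    // Passing the same pattern and handler removes only this route
    await page.unroute(pattern, handler);
  }
}
```

You might call it as `await withBlockedStyles(page, () => page.goto(url))`, so that later navigations on the same page load their styles again.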
2. Block Requests Across Multiple Pages and Contexts
For scenarios where you're dealing with multiple tabs or pages, Playwright lets you set route handlers on the context object. The blocking rules then apply to every page created from that context:
// block all JS from loading across _all_ pages in this context
await context.route('**/*.js', (route) => {
  route.abort();
});
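To show where the context comes from, here is a minimal sketch that creates a context, registers the rule once, and opens several pages from it. The helper name `openPagesWithoutScripts` is hypothetical, and `browser` is assumed to be a Playwright Browser (for example, the result of `chromium.launch()`).

```javascript
// Hypothetical helper: register one blocking rule on a context, then open pages.
// `browser` is assumed to be a Playwright Browser, e.g. from chromium.launch().
async function openPagesWithoutScripts(browser, count) {
  const context = await browser.newContext();

  // Registered once on the context, so it covers every page created from it
  await context.route('**/*.js', (route) => route.abort());

  const pages = [];
  for (let i = 0; i < count; i++) {
    pages.push(await context.newPage());
  }
  return { context, pages };
}
```

Keep in mind that when a request matches both a page.route() handler and a context.route() handler, Playwright gives the page-level handler precedence, so a single page can still override the context-wide rule.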
3. Fine-Grained Control Using Regular Expressions
Sometimes you need the precision of regular expressions to capture complex patterns or conditions for blocked resources. Playwright accepts a RegExp in place of a glob pattern, giving you finer control over what gets loaded:
// block any request whose URL contains "items", "groups", or "widgets"
await page.route(/(items|groups|widgets)/u, (route) => {
  route.abort();
});
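One thing worth checking before you ship a regex: Playwright tests it against the full request URL, so the unanchored pattern above also matches hostnames and query strings. The sketch below compares it with a stricter alternative. The `strict` pattern and the example URLs are assumptions made up for illustration.

```javascript
// The pattern from the example above: matches anywhere in the URL.
const loose = /(items|groups|widgets)/u;

// Hypothetical stricter variant: only matches a path segment under /api/.
const strict = /\/api\/(items|groups|widgets)(\/|\?|$)/u;

// Made-up URLs, used only to illustrate what each pattern catches.
const urls = [
  'https://example.com/api/items?page=2',
  'https://widgets.example.com/about',
  'https://example.com/about',
];

const report = urls.map((url) => ({
  url,
  loose: loose.test(url),
  strict: strict.test(url),
}));

console.table(report);
```

Both patterns catch the first URL and neither catches the third. The second URL is blocked only by the loose pattern, because "widgets" appears in its hostname.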
4. Block Requests by Content Type
To save you from the hassle of maintaining exhaustive blocklists, Playwright lets you block resources based on their type as the browser perceives it (image, font, stylesheet, and so on), rather than on the file extension:
// block images, regardless of extension
await page.route('**/*', (route) => {
  if (route.request().resourceType() === 'image') {
    route.abort();
  } else {
    route.continue();
  }
});
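If you want to skip several types at once, one possible approach is to keep them in a Set and share a single handler. The strings below are values that request.resourceType() can return. Which ones to block is an assumption you should adjust, and the names `BLOCKED_TYPES` and `blockByType` are hypothetical.

```javascript
// Resource types to skip. The selection is an assumption; adjust it to your needs.
const BLOCKED_TYPES = new Set(['image', 'font', 'media', 'stylesheet']);

// Route handler: abort blocked types, let everything else through.
function blockByType(route) {
  if (BLOCKED_TYPES.has(route.request().resourceType())) {
    return route.abort();
  }
  return route.continue();
}

// Usage with a Playwright page or context (not executed here):
// await page.route('**/*', blockByType);
```

Because the handler is a named function, you can register the same one on a page or on a context, and remove it later by passing it to unroute().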
5. Block Requests Using Custom Conditions
The real power comes with the ability to use arbitrary logic to determine what gets blocked. Access all aspects of the request via the Request object and build your own conditions for an ultra-specific blocking strategy:
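await page.route('**/*', (route) => {
  const req = route.request();
  // block by method
  if (req.method() === 'DELETE') {
    return route.abort();
  }
  // block by header (headers() is synchronous and returns lower-cased names;
  // allHeaders() returns a Promise and would need to be awaited)
  if (req.headers()['x-source']?.includes('dangerous')) {
    return route.abort();
  }
  // block by body, e.g. a JSON array with three or more items
  if (req.postDataJSON()?.length >= 3) {
    return route.abort();
  }
  // let everything else through
  return route.continue();
});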
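Another custom condition worth sketching is blocking by hostname, which is one way to keep trackers like Google Tag Manager from loading. The blocklist contents and the names `BLOCKED_HOSTS`, `isBlockedHost`, and `blockTrackers` are assumptions for illustration only.

```javascript
// Hypothetical blocklist of tracker hosts; adjust to your own needs.
const BLOCKED_HOSTS = ['googletagmanager.com', 'google-analytics.com'];

// True when the URL's host is a blocked domain or one of its subdomains.
function isBlockedHost(url) {
  const { hostname } = new URL(url);
  return BLOCKED_HOSTS.some(
    (host) => hostname === host || hostname.endsWith(`.${host}`)
  );
}

// Route handler: abort requests to blocked hosts, let everything else through.
function blockTrackers(route) {
  return isBlockedHost(route.request().url())
    ? route.abort()
    : route.continue();
}

// Usage with a Playwright page or context (not executed here):
// await page.route('**/*', blockTrackers);
```

Parsing the URL and comparing hostnames avoids false positives that a plain substring check would produce, such as a blocked domain appearing in another site's query string.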
Next Steps...
These strategies are just a starting point for making your Playwright scripts more efficient. But it doesn't end here.
Remember to check out my deep dive on blocking resources. It covers blocking inline assets, leveraging adblock blacklists, and optimizing the browser itself using CLI args and feature flags. In fact, my BrowserCat blog articles all have plenty more advice on working with Playwright.