Speed is a key consideration in web automation and testing. Network requests tend to get most of the attention when it comes to performance, but inline resources can be just as costly, and they're often overlooked. Inline stylesheets and scripts can slow down your Playwright scripts as easily as external ones.
In this article, we'll explore practical techniques for removing inline styles and scripts that don’t serve your automation’s purpose, which can make your Playwright scripts noticeably faster.
For a comprehensive guide to managing resources that goes well beyond inline resources, be sure to visit my Playwright resource guide.
Block Inline Scripts Before Load
Playwright’s .addInitScript() method lets you run custom JavaScript after a document is created but before any of the page's own scripts run, on every navigation. You can register it at either the Page or Context level:
// Remove every <script> element on this page once the DOM has been parsed.
// Scripts that already executed during parsing are not undone.
await page.addInitScript(() => {
  document.addEventListener('DOMContentLoaded', () => {
    document.querySelectorAll('script')
      .forEach((el) => el.remove());
  });
});
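Because this pattern works for any selector, it can be wrapped in a small helper. The sketch below is hypothetical: `removeOnDomReady` is a name invented for illustration, and `target` is assumed to be a Playwright Page or BrowserContext. It relies on the optional second argument of `addInitScript()`, which Playwright serializes and hands to the function inside the page.

```javascript
// Hypothetical helper (not part of Playwright): registers an init script
// that removes every element matching `selector` once the DOM is parsed.
// `target` is assumed to be a Playwright Page or BrowserContext.
function removeOnDomReady(target, selector) {
  // The callback runs inside the browser, so it cannot close over Node
  // variables; the selector travels via addInitScript's second argument.
  return target.addInitScript((sel) => {
    document.addEventListener('DOMContentLoaded', () => {
      document.querySelectorAll(sel).forEach((el) => el.remove());
    });
  }, selector);
}
```

With this in place, `await removeOnDomReady(page, 'script')` would cover a single page, and `await removeOnDomReady(context, 'script')` every page opened in that context.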
Block Inline Styles Before Load
Often overlooked, inline styles can be just as cumbersome, especially when dealing with large DOM trees or complex CSS animations. Removing them means the browser stops spending cycles on styles and animations that are irrelevant to your script's goals:
// Remove every <style> element on _all_ pages in this context
// once the DOM has been parsed.
await context.addInitScript(() => {
  document.addEventListener('DOMContentLoaded', () => {
    document.querySelectorAll('style')
      .forEach((el) => el.remove());
  });
});
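Note that `<style>` elements are only one form of inline styling; `style="..."` attributes on individual elements are left alone by the snippet above. Here is a minimal sketch that strips both, assuming your script doesn't depend on those attributes (removing them can, for example, reveal elements that were hidden with `display: none`). The name `stripInlineStyles` is made up for this example, and `target` is assumed to be a Playwright Page or BrowserContext.

```javascript
// Hypothetical helper: `target` is assumed to be a Playwright Page or
// BrowserContext. Removes <style> elements and style="" attributes
// once the DOM has been parsed.
function stripInlineStyles(target) {
  return target.addInitScript(() => {
    document.addEventListener('DOMContentLoaded', () => {
      // Drop embedded stylesheets.
      document.querySelectorAll('style')
        .forEach((el) => el.remove());
      // Drop per-element inline styles.
      document.querySelectorAll('[style]')
        .forEach((el) => el.removeAttribute('style'));
    });
  });
}
```

Whether this helps or hurts depends on the page, so it is worth comparing run times with and without it.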
Block Other Inline Resources Before Load
This strategy isn't limited to obvious bottlenecks like CSS and JS. You can use the same method to strip out parts of the DOM that consume resources without contributing to your task. This is especially useful if you're only scraping data rather than taking screenshots or performing other visual actions.
// Remove large chunks of the DOM once it has been parsed.
await page.addInitScript(() => {
  document.addEventListener('DOMContentLoaded', () => {
    document.querySelectorAll('.sidebar, .footer')
      .forEach((el) => el.remove());
  });
});
While the idea of aggressively pruning the DOM for speed gains is appealing, it comes with caveats. Broadly eliminating parts of a page can lead to unexpected results, especially if you're removing elements that may serve as dependencies for others. On some websites, it may also trigger anti-bot measures. As always, test your work!
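One way to act on that advice is to check, after navigation, that the pruned elements are gone and the data you actually need is still there. The following is a rough sketch, not a complete test strategy: `checkPruning` and both selector parameters are hypothetical, and it uses Playwright's `page.locator(...).count()` to count matches.

```javascript
// Hypothetical sanity check: `page` is assumed to be a Playwright Page
// that has already navigated to the target URL.
async function checkPruning(page, removedSelector, requiredSelector) {
  // Elements that should have been removed by the init script.
  const leftover = await page.locator(removedSelector).count();
  // Elements the rest of the automation still depends on.
  const required = await page.locator(requiredSelector).count();
  return { leftover, required, ok: leftover === 0 && required > 0 };
}
```

For instance, `await checkPruning(page, '.sidebar, .footer', 'article')` would report whether the sidebar and footer are gone while at least one `article` element (a placeholder for whatever you scrape) survives.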
Block Dynamically-Loaded Resources
While the DOMContentLoaded event handles resources present at load time, dynamic pages often inject scripts and styles after the initial load. You can handle these with the MutationObserver API, which lets you react to nodes as they are added to the DOM:
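await page.addInitScript(() => {
  const observer = new MutationObserver((mutations) => {
    for (const mutation of mutations) {
      mutation.addedNodes.forEach((node) => {
        // Text and comment nodes have no tagName and are skipped.
        if (['SCRIPT', 'STYLE'].includes(node.tagName)) {
          node.remove();
        }
      });
    }
  });
  // Observe the document itself, which works even if the <html>
  // element has not been created yet when the init script runs.
  observer.observe(document, {
    childList: true,
    subtree: true,
  });
});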
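The tag list can also be passed in from Node rather than hard-coded. This is a minimal sketch under a few assumptions: `target` is a Playwright Page or BrowserContext, `blockAddedTags` is an invented name, and the script observes `document` rather than `document.documentElement` so that it does not depend on the `<html>` element existing when the init script runs.

```javascript
// Hypothetical helper: removes newly added elements whose tag is in
// `tagNames`. `target` is assumed to be a Playwright Page or BrowserContext.
function blockAddedTags(target, tagNames) {
  // tagName is upper-case for HTML elements, so normalize the input.
  const tags = tagNames.map((name) => name.toUpperCase());
  return target.addInitScript((blocked) => {
    const observer = new MutationObserver((mutations) => {
      for (const mutation of mutations) {
        mutation.addedNodes.forEach((node) => {
          // Text and comment nodes have no tagName and are skipped.
          if (blocked.includes(node.tagName)) {
            node.remove();
          }
        });
      }
    });
    observer.observe(document, { childList: true, subtree: true });
  }, tags);
}
```

A call such as `await blockAddedTags(page, ['script', 'style', 'iframe'])` would extend the same idea to injected iframes; as with the other techniques, check that the page still yields the data you need.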
Next Steps...
The approaches outlined here are just one facet of what Playwright can do. Applying them gets you leaner, faster scripts that avoid unnecessary rendering and computation.
Still, optimizing web automation doesn’t stop with inline resources. Blocking network traffic, embedding an adblocker, and tweaking the browser runtime can all have a huge positive impact on your scripts' performance. For more info, read my Playwright resource guide. It's very thorough!