DEV Community

Apify for Apify

Originally published at blog.apify.com

How to use Playwright selectors

What's Playwright and why use it?

Playwright is an open-source automation library designed especially for end-to-end testing, but it can also be used to perform other tasks, such as web scraping and user interface testing.

Playwright has features that can come in very handy for testing, scraping, and automation:

  • Playwright supports both headed and headless browser modes and provides better performance with its fast and easy-to-use API.

  • Playwright can run multiple tasks in parallel, for example in separate pages or browser contexts.

  • Playwright supports multiple browsers: Chromium (which covers Chrome and Microsoft Edge), WebKit (the engine behind Safari), and Mozilla Firefox.
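The parallel and multi-browser points above can be sketched as follows. This is a minimal illustration, not an official pattern: titlesAcrossBrowsers is a hypothetical helper, and it assumes you pass it browser types such as chromium, firefox, and webkit imported from the playwright package.

```javascript
// Hypothetical helper: open the same URL in several browsers at once.
// `browserTypes` is assumed to be an object such as
// { chromium, firefox, webkit } imported from the 'playwright' package.
async function titlesAcrossBrowsers(browserTypes, url) {
  const entries = Object.entries(browserTypes);
  // Promise.all starts every browser run without waiting for the previous one.
  const results = await Promise.all(
    entries.map(async ([name, browserType]) => {
      const browser = await browserType.launch({ headless: true });
      try {
        const page = await browser.newPage();
        await page.goto(url);
        return [name, await page.title()];
      } finally {
        // Close the browser even if navigation fails.
        await browser.close();
      }
    })
  );
  // Turn [[name, title], ...] into { name: title, ... }.
  return Object.fromEntries(results);
}
```

With Playwright installed, a call such as titlesAcrossBrowsers({ chromium, firefox, webkit }, 'https://blog.apify.com') should resolve to an object that maps each browser name to the page title it saw.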

Here are a few examples of things that you can use Playwright for:

1. Form filling

Imagine you're automating a login process. You can use Playwright selectors to locate the username and password fields, then input your credentials and submit the form.

// Find the username input field.
const unameInput = page.locator('#username');
// Fill in the username field with "johndoe".
await unameInput.fill('johndoe');
// Find the password input field.
const pwdInput = page.locator('#password');
// Fill in the password field with "password123".
await pwdInput.fill('password123');

2. Web scraping

You can use Playwright for web scraping by utilizing Playwright selectors to extract data from web pages. For example, you could use a CSS selector to identify all of the <a> elements on a page and then use the textContent property to get the text of each link.

// Get the text of all of the `<a>` elements on the page.
const links = await page.$$eval('a', (links) => links.map((link) => link.textContent));
// Iterate over the links and print their text.
for (const linkText of links) {
  console.log(linkText);
}
await page.close();

3. End-to-end testing

You can also use Playwright selectors to automate end-to-end tests. For example, you could use a CSS selector to identify the submit button on a form and then use the click() method to click the button.

// `expect` is assumed to come from the '@playwright/test' package.
// Find the username input field.
const unameInput = page.locator('#username');
// Fill in the username field with "johndoe".
await unameInput.fill('johndoe');
// Find the password input field.
const pwdInput = page.locator('#password');
// Fill in the password field with "password123".
await pwdInput.fill('password123');
// Click the login button.
await page.locator('#loginButton').click();
// Assert that the user is logged in.
const logoutButton = page.locator('#logoutButton');
await expect(logoutButton).toBeVisible();
await page.close();

In the remaining part of the article, you'll learn more about selectors and how to use them.

Getting started with Playwright

Assuming you've never used Playwright before, you'll need to install it. Here's a quick step-by-step guide, assuming you have Node.js installed already:

  1. Create a folder in your desired location and name it playwright-sample.

  2. Open this folder in your terminal or command line interface and run npm init -y to initialize a Node.js project in it.

  3. The -y flag accepts the default answers, so a package.json file is created without any prompts. Because the sample below uses import and top-level await, add "type": "module" to package.json (or name the file index.mjs instead).

  4. Run npm i playwright or yarn add playwright to install Playwright. If the browser binaries weren't downloaded during installation, run npx playwright install.

You can test the installation by creating an index.js file in the project root directory and writing the following code into the index.js file.

import { chromium, firefox, webkit } from 'playwright';

// You can also use `firefox` or `webkit` instead of `chromium`.
const browser = await chromium.launch({ headless: false });
const page = await browser.newPage();
await page.goto('https://blog.apify.com');
// Wait for the browser to shut down before the script exits.
await browser.close();

The explanation of the above code sample is as follows:

  1. A named import on the first line brings in the different browser types available in Playwright.

  2. A specific browser was selected to run the sample code, which is chromium.

  3. A new page was created on the browser and https://blog.apify.com was opened in it.

  4. The browser was closed.

Understanding selectors in Playwright

A selector is an identifier that allows you to locate and interact with an element on a web page. Understanding the different types of selectors and how you can use them is essential for effective automation.

Types of Playwright selectors

Playwright provides a range of selector types, including:

1. CSS selectors

These types of selectors target elements based on their CSS attributes, properties, and relationships with other elements. For example:

// Locate the button using a CSS selector.
const submitButton = page.locator('#submit-btn');

This locates the element with the id submit-btn on the webpage. A CSS selector can target an id, a class, an element type, an attribute, or a pseudo-class.
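To make the different kinds of CSS selectors concrete, here is a small sketch. Apart from #submit-btn, the class and attribute names are invented for illustration, and countMatches is a hypothetical helper that expects a Playwright page to be passed in.

```javascript
// One example per kind of CSS selector. Apart from #submit-btn, the
// class and attribute names are made up for illustration.
const cssSelectorExamples = {
  id: '#submit-btn',
  class: '.btn-primary',
  element: 'button',
  attribute: 'input[type="email"]',
  pseudoClass: 'input:checked',
};

// Hypothetical helper: count the elements each selector matches.
// `page` is assumed to be a Playwright Page.
async function countMatches(page, selectors) {
  const counts = {};
  for (const [kind, selector] of Object.entries(selectors)) {
    // locator() only describes the query; count() runs it against the page.
    counts[kind] = await page.locator(selector).count();
  }
  return counts;
}
```

Calling await countMatches(page, cssSelectorExamples) on a loaded page gives a quick view of which selectors match anything at all.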

2. Text selectors

These selectors locate elements based on the visible text on the webpage. The example below combines a CSS element selector with Playwright's :has-text() pseudo-class; a plain text selector such as text=Login also works.

// Locate the button using a text selector.
const button = page.locator('button:has-text("Login")');

3. XPath selectors

XPath selectors can be used to identify elements based on their position in the DOM tree, their relationships to other elements, and other criteria.

// Locate the second <li> element using an XPath selector.
const listItem = page.locator('//ul/li[2]');

In Playwright, any selector string that starts with // or .. is assumed to be an XPath selector.
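The detection rule can be illustrated with a few lines of plain JavaScript. This is only a sketch of the rule as stated above, not Playwright's actual parser, and looksLikeXPath is a hypothetical name.

```javascript
// Illustration of the rule above, not Playwright's actual parser.
function looksLikeXPath(selector) {
  return (
    selector.startsWith('xpath=') || // explicit engine prefix
    selector.startsWith('//') ||     // path starting from the document root
    selector.startsWith('..')        // path starting from the parent element
  );
}

console.log(looksLikeXPath('//ul/li[2]'));  // true
console.log(looksLikeXPath('#submit-btn')); // false
```

When a selector is ambiguous, writing the xpath= or css= prefix explicitly removes the guesswork.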

Combining selectors

In more complex scenarios, you can combine multiple selectors to zero in on specific elements. This technique is especially handy when elements lack distinct identifiers.

🖇 Combining CSS selectors

You can combine CSS selectors using the >, +, and ~ combinators. The > combinator selects elements that are direct children of the element matched by the first selector. The + combinator selects the element that immediately follows the element matched by the first selector. The ~ combinator selects all later siblings of the element matched by the first selector.

For example, the following selector selects all of the <a> elements that are children of the element with the container id.

css=#container > a
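The three combinators can be compared side by side with a short sketch. The h2 and p selectors are illustrative assumptions about the page's markup, and textsByCombinator is a hypothetical helper that takes a Playwright page.

```javascript
// One selector per combinator. Only '#container > a' comes from the text
// above; the h2/p selectors are assumptions about the page's markup.
const combinatorExamples = {
  child: '#container > a',  // <a> elements directly inside #container
  adjacentSibling: 'h2 + p', // the <p> right after an <h2>
  generalSibling: 'h2 ~ p',  // every <p> sibling that comes after an <h2>
};

// Hypothetical helper: collect the text of every match, per combinator.
// `page` is assumed to be a Playwright Page.
async function textsByCombinator(page) {
  const result = {};
  for (const [name, selector] of Object.entries(combinatorExamples)) {
    result[name] = await page.locator(selector).allTextContents();
  }
  return result;
}
```

Comparing the adjacentSibling and generalSibling results on a real page shows the difference quickly: the first returns at most one paragraph per heading, the second returns every paragraph that follows it.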

🖇 Combining XPath selectors

You can also combine XPath conditions using the and and or operators and the not() function.

and matches elements that satisfy both conditions. or matches elements that satisfy either condition. not() matches elements that do not satisfy the condition.

For example, the following selector selects all of the <a> elements that are children of the element with the id container or the element with the id sidebar:

xpath=//*[@id="container" or @id="sidebar"]/a
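As a sketch of how an or condition can be assembled in code, the hypothetical helper below builds an XPath selector for the children of several containers. It assumes the ids contain no double quotes.

```javascript
// Hypothetical helper: build an XPath selector for `tag` elements that are
// children of any element whose id is listed in `ids`.
// Assumes the ids contain no double quotes.
function childrenOfAnyId(ids, tag = 'a') {
  if (ids.length === 0) {
    throw new Error('At least one id is required');
  }
  // Join one @id test per id with the XPath `or` operator.
  const predicate = ids.map((id) => `@id="${id}"`).join(' or ');
  return `xpath=//*[${predicate}]/${tag}`;
}

console.log(childrenOfAnyId(['container', 'sidebar']));
// xpath=//*[@id="container" or @id="sidebar"]/a

// `and` and not() combine in the same way inside a predicate:
const enabledEmailInputs = '//input[@type="email" and not(@disabled)]';
```

The returned string can be passed straight to page.locator().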

In the next section, I'll briefly explain the usage of Playwright selectors.

How to use selectors in Playwright

To use a selector in Playwright, you can utilize the locator() method. The locator() method takes a selector string as its argument and returns a Locator object. The Locator object can then be used to interact with the element that the selector identifies.

For example, the following code uses a CSS selector to identify the input element with the id username:

const input = page.locator('#username');

Once you have a Locator object, you can use it to perform actions on the element, such as clicking it, filling it in, or getting its value. For example, this clicks the submitButton locator created earlier:

await submitButton.click();
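Filling a field and reading its value back could look like the sketch below. fillAndConfirm is a hypothetical helper; it relies on the fill() and inputValue() methods of a Playwright locator and expects a page to be passed in.

```javascript
// Hypothetical helper: fill an input and check that the value was accepted.
// `page` is assumed to be a Playwright Page.
async function fillAndConfirm(page, selector, value) {
  const input = page.locator(selector);
  // fill() waits for the element to be editable, then sets its value.
  await input.fill(value);
  // inputValue() reads the current value back from the element.
  const current = await input.inputValue();
  return current === value;
}
```

For example, await fillAndConfirm(page, '#username', 'johndoe') resolves to true when the field ends up holding the expected text.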

Playwright selectors vs. locators

Selectors and locators are easy to confuse. The table below explains the difference.

| Selectors | Locators |
| --- | --- |
| A selector is a string that identifies an element on a web page. | A locator is an object that represents an element on a web page. |
| You can use a selector to create a locator. | You can use a locator to interact with an element. |
| Selectors are flexible and can find elements in a variety of ways, including by their HTML elements, class names, and other attributes. | A locator can be created once and reused; it looks the element up again each time an action runs. |

And here's an example of how selectors and locators work together.

import { chromium } from 'playwright';

const browser = await chromium.launch({ headless: false });
const page = await browser.newPage();
await page.goto('https://www.apify.com');

// The selector is a plain string that describes the link.
const ButtonSelector = 'a:has-text("Learn more about Apify Proxy")';
// Create a locator from the selector. No element lookup happens yet.
const ButtonLocator = page.locator(ButtonSelector);
// Click the link, waiting up to 10 seconds for it to become clickable.
await ButtonLocator.click({ timeout: 10000 });

await browser.close();

In this example, the selector a:has-text("Learn more about Apify Proxy") identifies the link on the page. The locator ButtonLocator is created using the locator() method. The ButtonLocator object can then be used to interact with the link, such as clicking on it.

The timeout option is passed to click() rather than to locator(): creating a locator doesn't search the page, so there is nothing to wait for until an action runs. In this case, click() waits up to 10 seconds for the link to be found and become clickable before throwing an error.
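Here is a minimal sketch of handling an element that never appears. The hypothetical tryClick helper passes a timeout to click() and reports failure instead of throwing. It assumes that Playwright's timeout errors carry the name TimeoutError.

```javascript
// Hypothetical helper: click an element, but give up quietly on timeout.
// `page` is assumed to be a Playwright Page.
async function tryClick(page, selector, timeoutMs = 10000) {
  try {
    // The timeout applies to this click only, not to the locator itself.
    await page.locator(selector).click({ timeout: timeoutMs });
    return true;
  } catch (error) {
    // Assumption: Playwright's timeout errors are named 'TimeoutError'.
    if (error.name === 'TimeoutError') {
      return false;
    }
    // Anything else is unexpected, so let it propagate.
    throw error;
  }
}
```

A caller can then branch on the result, for example to skip an optional cookie banner that only shows up for some visitors.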

Playwright selectors: best practices

To maximize the effectiveness of your automation efforts, keep these tips in mind:

  • Use unique identifiers. Whenever possible, rely on unique identifiers like IDs or data attributes to locate elements.

  • Avoid unstable selectors. Selectors that are prone to change (like those based solely on positional relationships) should be avoided.

  • Update regularly. Periodically review and update your selectors to account for any changes in the website's structure.
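The first two tips can be sketched as follows, assuming the page's markup carries data-testid attributes (the attribute that page.getByTestId() looks for by default in recent Playwright versions). clickByTestId and the login-submit id are hypothetical.

```javascript
// Brittle: depends on the exact position of the button in the DOM.
const positionalSelector = 'form > div:nth-child(3) > button';

// Hypothetical helper: click an element through a dedicated test id instead.
// `page` is assumed to be a Playwright Page, and the markup is assumed to
// contain something like <button data-testid="login-submit">.
async function clickByTestId(page, testId) {
  await page.getByTestId(testId).click();
}
```

A call such as await clickByTestId(page, 'login-submit') keeps working when the form layout changes, as long as the test id stays on the button.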

Advanced Playwright selectors

Playwright also supports a number of advanced selectors, such as React selectors, Accessibility selectors, CSS pseudo-classes, and XPath functions. A few words about these:

1. React selectors

React selectors can be used to locate elements in React applications by component name and props rather than by DOM structure. They're experimental, carry an underscore prefix, and may not be available in every Playwright version. For example, a component with the hypothetical name UsernameInput can be found with this selector:

_react=UsernameInput

2. Accessibility selectors

These selectors locate elements the way assistive technology sees them: by ARIA role and accessible name. For example, a <button> element with the aria-label attribute set to "Submit form" can be located with this selector:

role=button[name="Submit form"]
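In recent Playwright versions, the same idea is available through page.getByRole(), which finds elements by ARIA role and accessible name. A hedged sketch, reusing the "Submit form" label from above (submitFormButton is a hypothetical helper):

```javascript
// Hypothetical helper: locate the submit button by role and accessible name.
// `page` is assumed to be a Playwright Page. An aria-label of "Submit form"
// gives the button the accessible name "Submit form".
function submitFormButton(page) {
  return page.getByRole('button', { name: 'Submit form' });
}
```

The returned locator behaves like any other, so await submitFormButton(page).click() clicks the button.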

3. CSS pseudo-classes

CSS pseudo-classes can be used to select elements based on their state, such as whether they're focused or disabled. For example, the following selector finds all of the <input> elements that are focused:

css=input:focus

4. XPath functions

XPath functions can be used to select elements based on their attributes, text content, and other properties. For example, the following selector uses the contains() function to find all of the <a> elements whose href attribute contains the value "https://www.blog.apify.com":

xpath=//a[contains(@href, 'https://www.blog.apify.com')]

Start writing web automation scripts with Playwright

Playwright selectors are a powerful tool for automating web browsers. Now that you know the different types of selectors and how to use them, you can write more efficient and reliable web automation scripts.

To learn more about Playwright selectors, check out the Playwright documentation.

FAQs

What is Playwright, and why should I use it for web automation?

Playwright is an open-source automation library used for end-to-end testing, web scraping, and user interface testing. Playwright supports both headed and headless browser modes and provides better performance with its fast and easy-to-use API.

How do I get started with Playwright?

To get started with Playwright, initialize a Node.js project, then install Playwright as a dependency using npm or yarn and begin writing automation scripts.

What are the types of Playwright selectors available for use?

Playwright provides CSS selectors, text selectors, and XPath selectors. You can use these to locate elements on web pages, interact with them, and handle dynamic content.

What is the difference between Playwright selectors and locators?

A selector is a string, such as a CSS or XPath expression, that describes how to find an element. A locator is the object that the locator() method returns for a selector; you use it to interact with the matching element.

What are some best practices for using Playwright selectors effectively?

To use Playwright selectors effectively, it's advisable to rely on unique identifiers, avoid unstable selectors, and regularly maintain your selectors to cover changes in the website's structure.

How does Playwright compare to other web automation tools like Selenium and Puppeteer?

Playwright offers advantages such as multi-browser support, improved performance, and a more intuitive API compared to Selenium. It also has an edge over Puppeteer due to multi-browser support and its ability to handle complex scenarios effectively. Check out these posts on Playwright vs. Puppeteer and Playwright vs. Selenium.
