orliesaurus


Tests with Playwright + AI: Superpowers!

This post is about a new plugin that we released to help Playwright users leverage AI Vision models to achieve more.

Fast forward

If you're in a rush you can skip the whole article and just jump to the code examples here: https://github.com/dashcamio/goodlooks?tab=readme-ov-file#examples


End-to-end tests (E2E tests) simulate how a real user interacts with a web application, from opening the browser to clicking buttons and filling out forms.

They aim to ensure the entire user journey functions as expected.

However, sometimes tests pass, and sometimes they fail for seemingly random reasons.

These are called flaky tests.
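One rough way to tell a flaky test from a genuinely broken one is to rerun it and look at the spread of outcomes. The sketch below is plain Node.js with no test framework involved; `classifyTest` and its labels are hypothetical names made up for illustration.

```javascript
// Runs `testFn` several times and classifies the outcomes.
// A run "passes" when testFn returns (or resolves) without throwing.
async function classifyTest(testFn, runs = 5) {
  let passed = 0;
  for (let i = 0; i < runs; i++) {
    try {
      await testFn();
      passed++;
    } catch {
      // A thrown error (or rejected promise) counts as a failed run.
    }
  }
  if (passed === runs) return "stable-pass";
  if (passed === 0) return "stable-fail";
  return "flaky"; // passed on some runs but not on others
}

// A deterministic stand-in for a flaky test: it fails on every other run.
let calls = 0;
const everyOtherRunFails = async () => {
  calls++;
  if (calls % 2 === 0) throw new Error("timed out waiting for element");
};

classifyTest(everyOtherRunFails).then((label) => console.log(label)); // prints "flaky"
```

Playwright has a similar idea built in: `npx playwright test --repeat-each=5` reruns every test, and with retries enabled it reports tests that only pass on a retry as flaky.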


Flaky tests suck, but there are many other things that are really hard to test. Here are some examples:

  • Application State: If you click play on a video, has the video started playing?
  • Image Contents: What's this image depicting? Does it make sense in the context of the title of the page?
  • Responsiveness: Given a mobile design, does it match the version rendered in a mobile viewport?
  • Accessibility: Is this specific UI element accessible according to best practices?

Flaky tests

Developers love things that either work or don't, because broken things can be fixed. But flaky tests…frustrating. They introduce uncertainty into the testing process. Tests should either pass or fail; they can't possibly pass only sometimes!

And the cherry on top: you have to maintain a lot of tests through a framework…sometimes there are hundreds of tests, and some of them are flaky…

The worst flaky tests are the E2E ones (end-to-end tests).

Example: Imagine testing a car. A unit test might focus on ensuring the engine starts, while an E2E test would simulate a whole driving experience. The E2E test provides a broader view but is more complex and prone to disruptions from external factors like traffic lights. What happens if the traffic light breaks? What happens if there's a massive hole in the road where you're driving the car?

Ah, the good old testing dilemma

Managing tests: enter testing frameworks

Developers hate flaky tests, so to make things better, they invented tools to make their lives worse...I mean better.

These tools manage the execution of tests. They will tell you whether a test has passed or failed.

Playwright is one of the many testing frameworks out there, and it has exploded in popularity.

It is designed specifically for web applications, and it lets developers write tests in JavaScript, TypeScript, Python, Java, or .NET.

Playwright: why it makes sense and how we can make it better

Playwright is the new cool thing

Playwright is one of the newer testing frameworks for web apps, with decent built-in reporting capabilities.

It is a reliable and performant framework with support for multiple browser engines (Chromium, Firefox, and WebKit): write a test once and let it run in parallel across different browsers with almost no adjustments.
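As a rough illustration of "write once, run in several browsers", a Playwright config can list one project per browser engine. This is a minimal sketch and is not taken from the Goodlooks repo; the file name `playwright.config.js` and the `./tests` directory are assumptions about your project layout.

```javascript
// playwright.config.js (assumed file name, read by `npx playwright test`)
const config = {
  testDir: "./tests", // assumption: where your spec files live
  fullyParallel: true, // also run tests inside a single file in parallel
  projects: [
    // Each project runs the same test files against a different browser engine.
    { name: "chromium", use: { browserName: "chromium" } },
    { name: "firefox", use: { browserName: "firefox" } },
    { name: "webkit", use: { browserName: "webkit" } },
  ],
};

module.exports = config;
```

With a config like this, `npx playwright test` runs every test in all three engines, and `npx playwright test --project=firefox` narrows the run to one of them.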

As a tester, you know you must write and maintain tests regardless of the framework. A few rules apply either way.

Test executions need to start from a clean slate

You can't start tests from a "dirty" environment, because leftover state might change how the test runs. Starting clean lets you set up the test environment as a user would experience it, whether for the first time or the N-th time around.

Tests need to be updated

Tests are often susceptible to changes in the application. Even minor UI tweaks, like button placement or text changes, can break the tests because they rely on specific interactions with CSS selectors. This means you need to update the tests frequently to keep them working.

Tests simulate real user journeys

Tests can involve many steps and interact with various parts of the application. This complexity makes them harder to understand, debug, and modify than more focused unit tests…and sometimes on a user journey you encounter things that a computer might not be able to simulate.

It…

gets…

complicated…

👉 And we know that user journeys are complex. How do you test that a video autoplays, or doesn't autoplay, on page load?

👉 How do you ensure that the brand guidelines are respected?
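For the autoplay question, a selector-based attempt might sample the `<video>` element twice and check whether playback advanced. This is only a sketch: `isVideoPlaying` is a hypothetical helper, and it assumes the page exposes a plain `<video>` element that Playwright can reach, which is not guaranteed for custom players or for embeds inside iframes.

```javascript
// Hypothetical helper: reads the first <video> element twice and reports
// whether playback moved forward in between. `page` is a Playwright Page.
async function isVideoPlaying(page, waitMs = 500) {
  const video = page.locator("video").first();
  // Runs inside the browser and returns plain, serializable data.
  const read = () =>
    video.evaluate((v) => ({ paused: v.paused, time: v.currentTime }));

  const before = await read();
  await page.waitForTimeout(waitMs); // give playback time to advance
  const after = await read();

  return !after.paused && after.time > before.time;
}

// Usage inside a Playwright test (not executed here):
//   expect(await isVideoPlaying(page)).toBe(false);
```

Even when this works, it encodes assumptions about the player's markup and timing, which is exactly the kind of detail that tends to break later.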

Before and after AI

Extending Playwright with AI

The more time is spent maintaining tests, the less time is left for testing new features and new user paths, and for ensuring the release of good-quality software…

Thus, flaky tests are time suckers for developers: it takes time to investigate a flaky test and understand whether it is a false positive or a real issue. Once flaky tests are spotted, they might affect a whole set of tests that depend on them.

Validate Web Pages with AI + Natural Language

Unreliable UI tests that break with every code change are really the worst. There are also some things Playwright can't do well, such as tests that require visual consistency. This can add a lot of extra work for QA and test developers.

There are workarounds, but they can lead you into a horrible, time-consuming nightmare.

Introducing our Playwright plugin: Goodlooks


GoodLooks offers a revolutionary solution: visual validation with natural language.

Here's how simple it can be:

//import playwright
const { test, expect } = require("@playwright/test");

//import goodlooks
const goodlooks = require("goodlooks");

//use this API key or register for your own
goodlooks.configure("zpka_c0d0539ada014283bc974f0fd55835ea_2b745cbf");

expect.extend(goodlooks);

// write your first test that goes to a rick roll video on YouTube
test("rickroll", async ({ page }) => {
  await page.goto("https://www.youtube.com/watch?v=dQw4w9WgXcQ");
  // describe the expected state of the page in natural language
  await expect(page).goodlooks("video is not playing");
});

The outcome should be a message similar to the following:

✅ PASS. The page shows a video with the play button available and a timeline that is not progressing, indicating that the video is currently not playing.
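To make a message like this usable in a test runner, the model's reply has to be reduced to a boolean plus an explanation. The parser below is a guess at how that could work, not the plugin's actual code; the `FAIL` form of the reply is an assumption, since only a `PASS` message is shown above.

```javascript
// Hypothetical parser: turns "✅ PASS. <explanation>" or "❌ FAIL. <explanation>"
// into { pass, explanation }.
function parseVerdict(reply) {
  // Skip any leading emoji or whitespace, then expect PASS or FAIL as the first word.
  const match = /^\W*(PASS|FAIL)\b[.:]?\s*([\s\S]*)$/.exec(reply);
  if (!match) {
    // Unparseable replies count as failures, so a confused model cannot pass a test.
    return { pass: false, explanation: `Unrecognized reply: ${reply}` };
  }
  return { pass: match[1] === "PASS", explanation: match[2].trim() };
}

const verdict = parseVerdict("✅ PASS. The video is currently not playing.");
console.log(verdict.pass, "-", verdict.explanation);
// prints: true - The video is currently not playing.
```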


The Problem with Static Tests

Traditional UI testing relies on static selectors. But what happens when things change?

  • Is that button really missing, or did the ID just change?
  • Does the layout adapt seamlessly for mobile devices?
  • Are the correct images displayed?

Validating these visual aspects with selectors alone is impossible. Manual testing is an option, but it's inefficient and prone to human error.
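The first question in the list can be shown with a toy model. The snippet below uses plain objects instead of a real DOM, and the element data is invented for illustration: an ID-based lookup breaks after a rename even though the user still sees the same button.

```javascript
// Two snapshots of the "same" page, before and after a refactor renamed the button's id.
const beforeRefactor = [{ tag: "button", id: "signup-btn", text: "Sign up" }];
const afterRefactor = [{ tag: "button", id: "cta-primary", text: "Sign up" }];

// Selector-style lookup: depends on an implementation detail (the id).
const findById = (elements, id) => elements.find((el) => el.id === id);

// User-facing lookup: depends on what is actually shown on screen.
const findByText = (elements, text) => elements.find((el) => el.text === text);

console.log(Boolean(findById(beforeRefactor, "signup-btn"))); // true
console.log(Boolean(findById(afterRefactor, "signup-btn"))); // false: the test breaks
console.log(Boolean(findByText(afterRefactor, "Sign up"))); // true: the button is still there
```

Playwright's user-facing locators, such as `page.getByRole("button", { name: "Sign up" })`, reduce this particular problem, but they still cannot say whether the page looks right.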

Natural Language to the Rescue

With an LLM, and more generally, AI, we can enable visual validation with natural language prompts.

Instead of writing code to target specific elements, you describe what you want to see on the page.
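One way such a check could plug into Playwright is through a custom matcher passed to `expect.extend`. The sketch below is not how goodlooks is implemented (this post does not show its internals); `createVisualMatcher`, `toMatchDescription`, and `askVisionModel` are hypothetical names, and the vision model call is left as a function you would supply yourself.

```javascript
// `askVisionModel` is a hypothetical async function you provide:
//   (screenshotBuffer, prompt) => Promise<{ pass: boolean, explanation: string }>
function createVisualMatcher(askVisionModel) {
  return {
    // Matchers receive the value passed to expect(), followed by the matcher's own arguments.
    async toMatchDescription(page, prompt) {
      const screenshot = await page.screenshot(); // Playwright resolves this to a Buffer
      const { pass, explanation } = await askVisionModel(screenshot, prompt);
      return {
        pass,
        // The runner calls this to build the report when the assertion fails.
        message: () => `${pass ? "PASS" : "FAIL"}. ${explanation}`,
      };
    },
  };
}

// Usage inside a Playwright test file (not executed here):
//   expect.extend(createVisualMatcher(myModelCall));
//   await expect(page).toMatchDescription("video is not playing");
```

Keeping the model call behind a plain function makes the matcher easy to exercise with a stub, without a browser or an API key.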

This approach offers several advantages:

  • Resilience to Code Changes: Tests remain functional even if the underlying code structure evolves.
  • Focus on Visual Quality: Validate the actual appearance of your webpage, ensuring a consistent user experience.
  • Simplified Testing: Write clear and concise prompts in plain English, eliminating the need for complex code.

...and, even more importantly, the ability to test things that are hard to express in code:

  • Subjective qualities: balance, alignment, and adherence to brand guidelines are difficult to capture with code.

With goodlooks, it gets as simple as this:

const { test, expect, devices } = require("@playwright/test");

const goodlooks = require("goodlooks");
goodlooks.configure("zpka_c0d0539ada014283bc974f0fd55835ea_2b745cbf");

expect.extend(goodlooks);

test("ycombinator", async ({ page }) => {
  await page.goto("https://news.ycombinator.com");
  await expect(page).goodlooks(
    "there is an orange strip at the top of the page"
  );
});
  • Image content: Playwright can easily verify that an image exists, but not necessarily that it's the correct one. With this approach you can recognize specific images and confirm they're displayed in the right places. This is particularly useful for e-commerce sites or applications with dynamic content.

But with goodlooks you can simply do:

const { test, expect, devices } = require("@playwright/test");

const goodlooks = require("goodlooks");
goodlooks.configure("zpka_c0d0539ada014283bc974f0fd55835ea_2b745cbf");

expect.extend(goodlooks);

test("correct image appears", async ({ page }) => {
  await page.goto("https://eloquentjavascript.net/");
  await expect(page).goodlooks("there is a bird on this page");
});
  • Accessibility testing: Playwright doesn't offer built-in features for accessibility testing, which ensures interfaces are usable by people with disabilities. You might need specialized accessibility testing tools.

In conclusion: the future of UI testing has already started

Tests that flake are bad, but tests that are hard to maintain are worse.

Worse still is being unable to test certain things at all, because that leaves you relying on time-consuming manual testing.

Obtaining feedback is crucial to ensure that users comprehend the content, can easily accomplish their objectives and tasks, and find your webpage appealing.

Because of the costs and delays involved, running frequent manual tests can be a challenge. AI can now help with this, whereas a year or so ago it was much harder to implement.

👉 Take a look at Goodlooks on GitHub:

dashcamio / goodlooks

Visually Validate Playwright Tests Without Flaky Selectors

Static selectors break with code changes and can't prove that a site "looks good". Is that button really missing or was the id changed? Is the site responsive on mobile? Is the correct image showing? These kinds of tests are impossible to validate with selectors alone and take a lot of time to test manually. GoodLooks.ai lets you visually validate your web pages with natural language prompts instead of selectors.

Check out our other products: TestDriver.ai and Dashcam.io.

Quickstart

  1. git clone git@github.com:dashcamio/goodlooks.git
  2. npm install
  3. npx playwright test

Note that these examples use a demo key that gets rotated weekly; you'll want to create your own API key.

Examples

Element Visibility

Validate that a cookie banner shows up.

Input: Framer.com Cookie Banner (image)

Code:

const { test, expect } = require("@playwright/test");
const goodlooks = require("goodlooks");
goodlooks.