Jonas Menesklou for AskUI

Getting Started With askui

Tip: Watch our getting-started video here

Test automation is still considered a bottleneck in modern software development, and UI automation in particular can be quite challenging. Despite countless frameworks and tools, it remains very difficult to automate applications across operating systems, even though cross-platform workflows play an increasingly important role. For example, testing the two-factor authentication of a web application via a smartphone is almost impossible to automate as an end-to-end process.

askui tries to close this gap by rethinking UI automation.

About askui

Basically, askui is a platform-independent, selector-free UI automation framework. Instead of controlling elements via selectors such as XPath or CSS selectors and thus being limited to web applications, askui controls the UI at the operating system level.

To make this possible, a neural network was trained on the appearance of UI elements so that it can localize them in screenshots. The detected elements are then matched against an instruction that uniquely describes the target element. Finally, the instruction is executed via mouse and keyboard control at the operating system level. Such an instruction could be, for example:

aui.click().button().below().text().withText("Password").exec();

If we take a look at the corresponding UI, it quickly becomes clear which action is to be executed:

The instruction describes a click on a button below a certain text, in this case Password. Visually, there is only one button here, so when the action is executed, a click on the blue Login button is performed.
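
To make the matching step more concrete, here is a toy sketch in TypeScript. It is not how askui is implemented and it does not use the askui API: the `DetectedElement` type, the coordinates, and the `findBelowText` and `clickPoint` helpers are all made up for illustration. It assumes a model has already returned a list of labelled bounding boxes and shows how "a button below the text Password" could be resolved to a click position.

```typescript
// Toy model only: NOT the askui implementation or API.
type Kind = "button" | "text" | "textfield";

interface DetectedElement {
  kind: Kind;
  text: string;   // visible label, empty if none
  x: number;      // left edge in screen pixels
  y: number;      // top edge in screen pixels
  width: number;
  height: number;
}

// Hypothetical detections for a simple login form.
const detected: DetectedElement[] = [
  { kind: "text", text: "Username", x: 100, y: 100, width: 80, height: 20 },
  { kind: "text", text: "Password", x: 100, y: 160, width: 80, height: 20 },
  { kind: "button", text: "Login", x: 100, y: 230, width: 120, height: 40 },
];

// Return all elements of `kind` that start below the bottom edge of the
// text element labelled `label`, nearest first.
function findBelowText(
  elements: DetectedElement[],
  kind: Kind,
  label: string,
): DetectedElement[] {
  const anchor = elements.find((e) => e.kind === "text" && e.text === label);
  if (anchor === undefined) return [];
  const anchorBottom = anchor.y + anchor.height;
  return elements
    .filter((e) => e.kind === kind && e.y >= anchorBottom)
    .sort((a, b) => a.y - b.y);
}

// The center of the bounding box is where a click would be sent.
function clickPoint(e: DetectedElement): { x: number; y: number } {
  return { x: e.x + e.width / 2, y: e.y + e.height / 2 };
}

const matches = findBelowText(detected, "button", "Password");
const target = matches[0];
console.log(target.text, clickPoint(target));
```

Because the lookup relies only on element types, labels, and relative position, it does not depend on selectors in the application's code.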

Why askui

In principle, one can of course ask whether another test automation tool is needed after Selenium and the like. If you look at the development of the last few years, you will notice that websites and apps use more and more technologies, e.g. iframes or Shadow DOM, and third-party systems that make them harder to automate. Instead of developing a separate automation solution for each new technology, askui aims to stay independent of these developments and serve as a future-proof framework for UI automation on any operating system. This approach has the following advantages:

  • Test instructions are written in UI language and are thus easy to understand.
  • No access to code selectors is needed. The tests are thus independent of changes in the program code and run very stably.
  • Elements are found visually by an AI and not by pixel-matching. This makes the automation completely independent of the resolution and design of the UI elements.

In the following, we will take a look at the first steps in askui.

Setting up the IDE

askui is an open source framework with public documentation. You can install askui in an IDE of your choice; we will use Visual Studio Code in the following. First, open the folder where you want to install askui. In my example this folder is called "Installation".

Next, we need to set up the npm project. For this we open the terminal and type the following command:

npm init -y

This will create a package.json file with descriptions and dependencies of the project. The next step is the installation of askui itself.

Installing askui

The following command is used for installing askui:

npm i -D askui

The askui library provides everything needed to automate the operating system. It does not, however, provide everything you need to write and execute a test. You also need:

  • a way to write the actual test,
  • a way to write assertions that check whether an expectation holds true and, last but not least,
  • a way to execute the tests, i.e., a test runner.

One framework that provides all of this out of the box is Jest. We are going to use it in this example because it is well known and quite easy to get started with, but feel free to use another test framework, such as Jasmine or Mocha. How you use the askui library should be pretty much the same across these frameworks. To install Jest, type the following:

npm i -D jest

Furthermore, we are going to use TypeScript for writing the test instead of plain JavaScript. Run the following command to install TypeScript, ts-node for running TypeScript with Node.js, ts-jest for running TypeScript tests with Jest, and the type definitions for Jest.

npm i -D @types/jest ts-jest ts-node typescript

Your package.json should now list these packages under devDependencies.

Now, we are ready to write our first test.

Running Your First Test

To create your first test suite, type the following in the terminal:

npx askui init

This will create a few files; read here if you want to learn more about them.

We will just focus on test/my-first-askui-test-suite.test.ts for now. This file contains your first example test.

When executed, this test clicks on some text on your screen. Before we run it, we want to make sure it runs on the right screen, so we check the file test/helper/jest.setup.ts. There you will find a display setting, which is set to 0 by default. This means the test will be executed on your main display. If you don't have an external screen, you can ignore this setting. If you do, set it to the screen you want to automate, e.g. 1 for the first external monitor, and so on.
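
If you regularly switch between setups with and without an external monitor, you may not want to hard-code the display index. The following is a minimal sketch of one way to validate such a setting before handing it to the setup file. The `resolveDisplay` helper and its `displayCount` parameter are hypothetical and not part of askui; where the raw value comes from (for example an environment variable) is left open.

```typescript
// Hypothetical helper, not part of askui: turn an optional string setting
// into a valid display index, falling back to the main display (0).
function resolveDisplay(raw: string | undefined, displayCount: number): number {
  if (raw === undefined || raw.trim() === "") return 0;
  const index = Number(raw);
  const valid = Number.isInteger(index) && index >= 0 && index < displayCount;
  if (!valid) {
    throw new Error(`Invalid display index "${raw}" for ${displayCount} display(s)`);
  }
  return index;
}

const mainDisplay = resolveDisplay(undefined, 2); // no setting: main display
const externalDisplay = resolveDisplay("1", 2);   // first external monitor
console.log({ mainDisplay, externalDisplay });
```

Failing early on an out-of-range index avoids starting a test run against a screen that does not exist.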

Now we are ready to execute our first test. To do so, type the following command in the terminal:

npx jest test/my-first-askui-test-suite.test.ts --config ./test/jest.config.ts

You should now see the test suite being executed in the shell, and your cursor should move to some text shown on your screen and click on it. 🎉 Congratulations! You just executed your first test suite using askui.

Tip: If you want to see what the AI model detects on your screen, use the annotateInteractively (see here) command. It will create an overlay where you can check all detected UI elements and their classification.
