
pauliedoherty for Beanworks


The Painful Parts of End-to-End Test Automation for your Windows Application


This post is aimed at test engineers who do not want to add C# and all the wonderful Visual Studio tools that come with it to their test automation stack. You’re thinking: since our web and mobile apps’ e2e tests are already implemented with NodeJS, why should testing our Windows app be any different?

Here are the weapons we will be wielding to achieve this:

  • A minimum of Windows 10 or Windows Server 2016 desktop
  • An automation tool to drive your application under test (AUT) - WinAppDriver & Appium
  • A compatible automation client - WebdriverIO
  • A test runner - Mocha
  • A tool to run tests without manual user logon - paexec
  • An Automation Server of sorts (details of this are out of scope of this post)

Other handy extras:

...Note: Many parts in this post are only vaguely covered because there are already so many great resources out there. The parts that are described in detail were difficult to find good resources on, hence this post...

Pain Point 1 - Preparing Test Environment for Automation (lil' painful)

WinAppDriver & Appium:

WinAppDriver is Microsoft's in-house developed tool to automate Universal Windows Platform (UWP), Windows Forms (WinForms), Windows Presentation Foundation (WPF), and Classic Windows (Win32) applications and is developed with Appium integration in mind. WinAppDriver is what does the heavy lifting behind the scenes. It carries out our mouse clicks and keyboard presses as we request. The image below shows the communication flow of our solution:

Alt Text

Appium is a RESTful server that acts as a wrapper between WinAppDriver and your automation client. Appium accepts commands from WebdriverIO and forwards them to WinAppDriver. In response to a request, it returns a status code and logs to the automation client.

To run WinAppDriver & Appium on our test environment we need to:

  1. Download and install WinAppDriver
  2. Enable Developer mode on Windows
  3. Install NodeJS

With these prerequisites installed you are ready to automate using WebdriverIO’s test runner - wdio.
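
A quick sanity check can save a confusing first run. The sketch below is not part of the original setup: `checkPrerequisites` is a hypothetical helper, it assumes WinAppDriver was installed to its default location, and the minimum Node.js major version of 12 is an arbitrary assumption you should adjust for your wdio version. It does not check Developer mode.

```javascript
// check-prereqs.js - hypothetical helper, not part of WebdriverIO or Appium
const fs = require('fs');

// Assumed default install location of WinAppDriver; adjust if you installed it elsewhere.
const DEFAULT_WINAPPDRIVER_PATH =
    'C:\\Program Files (x86)\\Windows Application Driver\\WinAppDriver.exe';

function checkPrerequisites({
    winAppDriverPath = DEFAULT_WINAPPDRIVER_PATH,
    nodeVersion = process.versions.node,
    minNodeMajor = 12,             // assumption: set this to what your wdio version needs
    fileExists = fs.existsSync     // injectable so the check can be tried on any machine
} = {}) {
    const problems = [];
    if (!fileExists(winAppDriverPath)) {
        problems.push(`WinAppDriver not found at ${winAppDriverPath}`);
    }
    const nodeMajor = parseInt(nodeVersion.split('.')[0], 10);
    if (nodeMajor < minNodeMajor) {
        problems.push(`Node.js ${nodeVersion} is older than major version ${minNodeMajor}`);
    }
    return problems;
}

// Report what was found; an empty list means both checks passed.
const problems = checkPrerequisites();
console.log(problems.length ? problems.join('\n') : 'Prerequisites look OK');
```

Running it with `node check-prereqs.js` on the test environment prints either the list of problems or a short confirmation.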

Pain Point 2 - Configuring WebdriverIO and running your first test (somewhat painful)


WebdriverIO (wdio) is a Selenium-like client service that sends requests to Appium via W3C WebDriver Protocol. This is the framework we will use to write our test cases and specify how they will run.

Before proceeding, please initialize a Node.js project in your user's home directory. If you're unsure how to do this, Phil Nash's guide is excellent. I have creatively decided to call my project my-project.

Taking from WebdriverIO's guide, let's set up the test runner by opening the Command Prompt and running:

mkdir my-project & cd my-project
npm install @wdio/cli
npx wdio config

This will launch a helper utility where you choose what services and project structure you wish to use when running wdio. To get things started please mirror these selections:

Alt Text

Your project file tree should look like:

Alt Text

wdio.conf.js is the file that holds the configuration properties of the entire communication flow diagram in the first section. It is now loaded with the properties needed to allow communications between our webdriver client and Appium server. However, we still need to define our capabilities. Capabilities define which environment we are running Appium on, which protocol Appium will use, and what AUT it will be automating. Let’s uphold tradition and use the classic calculator example. Below are the capabilities required to connect Appium to the calculator application on Windows Server 2019 along with other properties relevant to this post:

// wdio.conf.js snippet (other generated properties omitted)
    runner: 'local',
    path: '/wd/hub',
    port: 4723,
    baseUrl: 'http://localhost',

    capabilities: [{
        platformName: 'windows',
        automationName: 'windows',
        deviceName: 'WindowsPC',
        app: 'C:\\Windows\\System32\\win32calc.exe'
    }],

    services: [
        ['appium', {
            logPath: './'     // logs are friends
        }]
    ],

    framework: 'mocha',

    specs: [
        // paths to your spec files
    ],

A few things to note here:

  1. Defining appium in services instructs wdio to start an Appium server
  2. The runner, path, port and baseUrl properties are used to connect the wdio client to the host Appium server
  3. Defining capabilities tells Appium what application to automate and what automation tool we want it to configure; the windows platformName together with the WindowsPC deviceName maps Appium to WinAppDriver
  4. We define mocha as the test framework
  5. We tell said framework where the spec files are located
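
To make the connection properties a little more concrete, here is a rough sketch of how they combine into the address of a "new session" request. `sessionEndpoint` and `newSessionBody` are hypothetical helper names, the `hostname` default of localhost is an assumption, and the body only shows the general W3C WebDriver shape; the payload WebdriverIO actually sends may contain more fields.

```javascript
// Hypothetical helpers that mirror the connection values from wdio.conf.js.
function sessionEndpoint({ hostname = 'localhost', port = 4723, path = '/wd/hub' } = {}) {
    // A WebDriver session is created by POSTing to <server>/session
    return `http://${hostname}:${port}${path}/session`;
}

function newSessionBody(capabilities) {
    // General W3C "New Session" body shape; real clients may add further fields
    return { capabilities: { alwaysMatch: capabilities, firstMatch: [{}] } };
}

// The capabilities from the snippet above
const capabilities = {
    platformName: 'windows',
    automationName: 'windows',
    deviceName: 'WindowsPC',
    app: 'C:\\Windows\\System32\\win32calc.exe'
};

console.log('POST', sessionEndpoint());
console.log(JSON.stringify(newSessionBody(capabilities), null, 2));
```

Seeing the request spelled out like this can help when reading the Appium logs, since a failed session creation is usually reported against this endpoint.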

More details on Appium/WinAppDriver can be found here.

Side Note:

I highly recommend using Appium-Desktop as a development tool to troubleshoot any Appium connectivity issues. You can download it here.

Start Appium Desktop from Start Menu, open File -> New Session Window… and enter the capabilities above in the 'Desired Capabilities' tab:

Alt Text

Click Start Session to connect Appium to calculator app:

Alt Text

If you cannot connect to your AUT with capabilities in Appium-Desktop, you will also not be able to connect through wdio, so make sure to get things working here first. Here's a contrived error on Appium-Desktop to give an example of error logs:

Alt Text

Now with our capabilities sorted we can start writing test cases for the wdio client.

Pain Point 3 - Controlling app elements (pretty painful)

Shifting attention to our test spec file ./test/spec/example.e2e.js, we can define code to send requests to the Appium server. One of the beauties of webdriver is that the client object is accessible anywhere in wdio’s scope so we can directly call its properties. Let’s do some quick math to verify our calculator's sum() functionality => 2 + 2 = 4

// ./test/spec/example.e2e.js
describe('Quick math', () => {
    it('Two plus two is four', () => {
        $('~132').click();               // click 2
        $('~93').click();                // click +
        $('~132').click();               // click 2
        $('~121').click();               // click equals
        const result = $('~150');        // get result
        expect(result).toHaveText('4 '); // verify it equals 4 (note the trailing space)
    });
});

A few things to note here:

  1. Test spec files are using mochajs
  2. Webdriver uses expect for assertions
  3. The selector method uses ~ to identify the AutomationId. There’s not much clear-cut documentation on this but you can work back from here to figure it out
  4. Yes, I did append a space on the 4

We are finally ready to run our tests! In the Command Prompt run

npx wdio

Here’s the result from npx wdio without appending a space:

Alt Text

And here it is with appending a space:

Alt Text

OK, I cheated. I should have parsed the data before asserting. But I refuse to apologize.
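
For the less stubborn, here is a minimal sketch of what parsing before asserting could look like. `parseDisplayText` is a hypothetical helper name, and it assumes the display only ever contains a plain number with surrounding whitespace.

```javascript
// Hypothetical helper: normalise the text read from the calculator display.
function parseDisplayText(rawText) {
    // Strip leading/trailing whitespace, including the trailing space seen above
    const trimmed = String(rawText).trim();
    // Convert to a number so the assertion compares values, not formatting
    return Number(trimmed);
}

console.log(parseDisplayText('4 ')); // prints 4
```

In the spec file, written in the same style as the test above, the assertion could then become something like `expect(parseDisplayText($('~150').getText())).toBe(4)`.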

Side Note:

Use inspect.exe to find the AutomationId of your elements. Open inspect.exe and hover the mouse over the element you wish to inspect. Here’s an example of the ‘2’ button on the calculator app:

Alt Text

Thus concludes setting up automation of a Windows desktop application.

On a side note, here's a great example of a scalable, maintainable wdio project using the Page Object Model pattern.
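
As a taste of that pattern, here is a minimal page object sketch for the calculator. It is not taken from the linked project: `CalculatorPage` and `addTwoAndTwo` are hypothetical names, the AutomationIds are the ones used earlier in this post, and the calls are written without await to match the test above (newer async-only wdio versions would need await on each command). The element lookup is passed in so the class can be tried outside a test session; inside a spec you would pass wdio's global `$`.

```javascript
// calculator.page.js - hypothetical page object for the calculator example
class CalculatorPage {
    // `find` is the element lookup function; in a wdio spec pass the global `$`
    constructor(find) {
        this.find = find;
    }

    get two() { return this.find('~132'); }
    get plus() { return this.find('~93'); }
    get equals() { return this.find('~121'); }
    get display() { return this.find('~150'); }

    // Clicks 2 + 2 = and returns the trimmed display text
    addTwoAndTwo() {
        this.two.click();
        this.plus.click();
        this.two.click();
        this.equals.click();
        return this.display.getText().trim();
    }
}

// Stand-in for `$` so the sketch can run without Appium; it records clicks
// and always reports '4 ' as the element text.
const clicked = [];
const fakeFind = (selector) => ({
    click: () => clicked.push(selector),
    getText: () => '4 '
});

console.log(new CalculatorPage(fakeFind).addTwoAndTwo()); // prints 4
```

The benefit is that a changed AutomationId only has to be updated in one place, and spec files read as user actions rather than selector soup.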

Pain Point 4 - Running WDIO in a CI/CD pipeline without manual access to the GUI (painful)

What we have achieved so far is great for development and presentation purposes. But to provide real value and execute this in a CI/CD pipeline we will need some extra setup in our test environment. Keeping cost saving in mind, we will not want our test environment running when tests are not being executed. It also becomes clear that Remote Desktop (RDP) is not very useful when it comes to automated pipelines. This now presents an interesting challenge - how do we run our tests with no user interaction or manual login step? One way is to send the test command over ssh. This is what we will explore.

Aside: configuring communication to the test environment is out of scope of this post; there are lots of great articles on setting up ssh servers out there.

This part assumes we are running our test environment in a cloud based CI/CD pipeline, therefore the test environment is not being accessed through the GUI via RDP nor is it sitting beside you on your desk. With all that in mind, let’s remotely run our test cases.

ssh ${USER}@${IP_ADDRESS} 'cd my-project & npx wdio'

Hmmmm, we are met with a pleasant 500 server error.

Alt Text

Let’s see what’s going on by checking our processes in parallel while wdio is attempting to run:

ssh ${USER}@${IP_ADDRESS} 'tasklist /fi "imagename eq node.exe" & tasklist /fi "imagename eq win32calc.exe"'

This gives us:

Alt Text

Hmmmm, node.exe and win32calc.exe are running in Session# 0. Session# 0 is reserved for services and other user-agnostic processes, which do not run with a GUI. Appium/WinAppDriver therefore has no desktop in which to find the calculator application, which itself has no GUI output. These are sad times for people hoping to test the GUI.

Let’s take a step back to figure out what's going on. Let's open an RDP connection to our test environment. Here, we can directly run our tests, and in parallel check our running processes.
Using the Command Prompt (cmd) on the test environment let's run

cd my-project & npx wdio

And at the same time in a separate terminal run:

tasklist /fi "imagename eq node.exe" & tasklist /fi "imagename eq win32calc.exe"

We see the tests have run successfully:

Alt Text

And our processes are showing an RDP Session which does use the GUI. All actual logged in users get assigned Session# > 0.

Alt Text

To solve the issue in our CI/CD pipeline we must configure our test environment to run our code as a logged-in user. This will allow node.exe and win32calc.exe to run in a GUI session.

To achieve this, we must do 2 things:

  1. Enable Autologon &
  2. Use paexec to run processes as the System account user

This solution is stated by hepivax on this thread.

Enable AutoLogon:

Please use guide here.


PAExec is an open-source equivalent of PsExec

PAExec allows you to run interactive command-prompts on local and remote servers as the System user account (it can do a lot more too!).

Let’s install it on our test server in a folder that is on the system PATH so we can run it from anywhere with cmd:

  1. Download paexec.exe
  2. Copy and paste to C:\Windows\paexec.exe (If you’re unsure of this step refer to this)
  3. Test cmd has access to paexec.exe by running
where paexec

It should return C:\Windows\paexec.exe

You can run paexec in command prompt for more details on its use cases.

Now that PAExec is installed and accessible, we can open a Command Prompt with PAExec so that cmd is running under System user. Let's do this and verify where we are in the file system by running:

paexec -s cmd /C "cd"

Breaking this command down:

  • paexec -> Executes paexec.exe
  • -s -> Tells PAExec to run the requested process under the System user
  • cmd -> Starts the Command Prompt
  • /C -> Tells the Command Prompt to run the command string that follows and then exit
  • "cd" -> The command string; cd with no arguments prints the current directory path

This outputs our System account user path, verifying we are running cmd as System user:

Alt Text

Now any command we choose to execute in the Command Prompt with PAExec will run under the System user.

...piecing it all together
Over in our test automation server (or whatever you are using to remotely trigger your test suite), run the following:

ssh ${USER}@${IP_ADDRESS} 'paexec -s cmd /C "cd C:\Users\${USER}\my-project & npx wdio"'

Note: The full path to the project is required here because the cmd window opened by PAExec belongs to the System user account, so it does not start in your user's home directory.
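
If your automation server triggers the run from a Node.js script rather than a shell step, the command can be assembled in code. This is only a sketch: `buildRemoteTestCommand` is a hypothetical helper, the user, host and project directory are placeholder values, and it uses nothing beyond the paexec -s and cmd /C options already shown above.

```javascript
// Hypothetical helper that assembles the ssh + paexec invocation shown above.
function buildRemoteTestCommand({ user, host, projectDir }) {
    // cmd /C runs the quoted string and then exits; `&` chains the two commands
    const remote = `paexec -s cmd /C "cd ${projectDir} & npx wdio"`;
    // Returned as a program plus argument list, suitable for child_process.spawn
    return ['ssh', [`${user}@${host}`, remote]];
}

const [program, args] = buildRemoteTestCommand({
    user: 'tester',                              // placeholder values
    host: '203.0.113.10',
    projectDir: 'C:\\Users\\tester\\my-project'
});

console.log(program, args);
```

Passing the pieces as an argument list, for example to `child_process.spawn(program, args, { stdio: 'inherit' })`, avoids a layer of local shell quoting around the already quote-heavy remote command.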

...And in parallel check that node.exe and win32calc.exe are running as a logged on user with GUI access:

ssh ${USER}@${IP_ADDRESS} 'tasklist /fi "imagename eq node.exe" & tasklist /fi "imagename eq win32calc.exe"'

Let's look at our processes:

Alt Text

We see they are running under a Session# > 0 and Voila! Test executed successfully.
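
Eyeballing the Session# column works, but a pipeline might want to check it automatically. The sketch below assumes the process list was produced with tasklist's CSV output (`/fo csv /nh`) instead of the default table; `parseTasklistCsv` and `runsInGuiSession` are hypothetical helper names, and the sample text is illustrative rather than captured from a real machine.

```javascript
// Hypothetical helper: parse `tasklist /fo csv /nh` output into objects.
function parseTasklistCsv(output) {
    return output
        .split(/\r?\n/)
        // keep only CSV rows; this skips blank lines and "INFO: ..." messages
        .filter((line) => line.trim().startsWith('"'))
        .map((line) => {
            // Columns: "Image Name","PID","Session Name","Session#","Mem Usage"
            const fields = line.trim().slice(1, -1).split('","');
            return {
                imageName: fields[0],
                pid: Number(fields[1]),
                sessionName: fields[2],
                sessionId: Number(fields[3])
            };
        });
}

// True only if the process was found and every instance is in a session above 0
function runsInGuiSession(processes, imageName) {
    const matches = processes.filter(
        (p) => p.imageName.toLowerCase() === imageName.toLowerCase()
    );
    return matches.length > 0 && matches.every((p) => p.sessionId > 0);
}

// Illustrative sample only, not captured from a real machine
const sample = [
    '"node.exe","4120","Console","1","45,120 K"',
    '"win32calc.exe","5332","Console","1","12,004 K"'
].join('\r\n');

const processes = parseTasklistCsv(sample);
console.log(runsInGuiSession(processes, 'node.exe'));
```

On the test environment, the input could come from running something like `tasklist /fo csv /nh /fi "imagename eq node.exe"` and passing its output to `parseTasklistCsv`.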

Top comments (3)

kvnaveen

Great Post . But when i tried to run i get an error The main Appium script does not exist at 'C:\Windows\system32\config\systemprofile\AppData\Roaming\npm\node_modules\appium\build\lib\main.js
It is because the appium is not installed as Local System user and it is installed as Administrator user . I tried hardcoding the Appium Path using the AppiumService Options but still no luck. Any suggestions please.

Scott Condron

Great post! Thanks for sharing :)

pauliedoherty

Thanks for reading :)