DEV Community

Kevan Y
Kevan Y

Posted on

Hard bug fixed by a simple fix

Opened my Pull request on adding best practices for the SSO services. The e2e test was failing in the pipeline, despised testing locally.
On the CI, the test case was only failing on Chromium.

FAIL browser: chromium src/api/sso/test/e2e/auth-flows.test.js (28.142 s)
  Authentication Flows
    ✕ Login flow preserves state param (25222 ms)
Enter fullscreen mode Exit fullscreen mode

First, some of my reviewers are using Mac M1 (ARM architecture), they were not able to build the image properly since the node-alpine version because it was missing some build-in package to build the image. I had to use a bigger node version.
After pushing those changes, @humphd helped me to debug, this issue wasn't able to be producible locally.

I started to run on different environment, and this is the result showed below:
Windows 11 pro 21H2 - test case pass. (by @Kevan-Y )
Windows 11 pro 21H2 on WSL2 Ubuntu-20.04 - test case pass (by @Kevan-Y )
Amazon Linux 2 AMI (HVM) - Kernel 5.10 (ami-0c02fb55956c7d316 (64-bit x86)) - t2.large - test case pass (by @Kevan-Y )
MacOS M1 - test case pass (by @humphd)

It wasn't looking good. This makes debugging hard since it's only happening on CI.

I started to increase the timeout of the test case to 30000ms

it('Login flow preserves state param', async () => {
    const { state } = await login(page, 'user1', 'user1pass');
    // Expect the state to match what we sent originally (see index.html <a>)
    expect(state).toEqual('abc123');
  }, 30000);
Enter fullscreen mode Exit fullscreen mode

Despite increasing timeout it was failing.

Possibility due to some other container such as elastic search, it crashed, since that container had some issue with the memory not enough allocated. I increased the memory in the docker-compose file.

  elasticsearch:
    environment:
      # Limit the initial heap size. By default it will use 1/4 of available RAM
      - 'ES_JAVA_OPTS=-Xms512m -Xmx512m'
Enter fullscreen mode Exit fullscreen mode

After pushing these changes, the CI pipeline was still failing.

I decided to skip all the e2e test cases and only run Login flow preserves state param of src/api/sso/test/e2e/auth-flows.test.js for the purpose of debugging and also turn on LOG_LEVEL to debug.
I pushed those changes, but it wasn't helping. They were 0 logs that helped to identify which place it when wrong. So I switch to the basic way of debugging. "Put console.log on each line"

In src\api\sso\test\e2e\auth-flows.test.js

 it('Login flow preserves state param', async () => {
    console.log('Start test case by await login');
    const { state } = await login(page, 'user1', 'user1pass');
    console.log('Done await login', state);

    // Expect the state to match what we sent originally (see index.html <a>)
    expect(state).toEqual('abc123');
  }, 30000);
Enter fullscreen mode Exit fullscreen mode

In src\api\sso\test\e2e\browser-util.js

const login = async (page, username, password) => {
  console.log('Start login and wait for page');
  // Click login button and then wait for the login page to load
  await Promise.all([
    page.waitForNavigation({
      url: /simplesaml\/module\.php\/core\/loginuserpass\.php/,
      waitUtil: 'networkidle',
    }),
    page.click('#login'),
  ]);
  console.log('after click login');

  console.log('click input[name="username"]');
  // Fill the login form, star with username
  await page.click('input[name="username"]');
  console.log('after click input[name="username"]');

  console.log('fill input[name="username"]');
  await page.fill('input[name="username"]', username);
  console.log('after fill input[name="username"]');

  // Now enter the password
  console.log('click input[name="password"]');
  await page.click('input[name="password"]');
  console.log('after click input[name="password"]');

  console.log('fill input[name="password"]');
  await page.fill('input[name="password"]', password);
  console.log('after fill input[name="password"]');

  console.log('wait for the new page to load');
  // Click login button and then wait for the new page to load
  await Promise.all([
    page.waitForNavigation({
      url: /^http:\/\/localhost:\d+\/auth\.html\?access_token=[^&]+&state=/,
      waitUtil: 'load',
    }),
    page.click('text=/.*Login.*/'),
  ]);

  // The token and state will get returned on the query string
  return getTokenAndState(page);
};
Enter fullscreen mode Exit fullscreen mode

After pushing this change, I finally got some good debug result. The test case was timing out in

await Promise.all([
    page.waitForNavigation({
      url: /simplesaml\/module\.php\/core\/loginuserpass\.php/,
      waitUtil: 'networkidle',
    }),
    page.click('#login'),
  ]);
Enter fullscreen mode Exit fullscreen mode

One possible issue mentioned by @Humphd, the event was sitting on that page for some reason and never navigates.
Instead of waitUtil: 'networkidle', we change to waitUtil: 'load'. see reference
Even with that, it didn't work. But the ultimate fix was to try to update from version 1.19.2 to 1.20.2 of playwright which might potentially fix it.
It was the right solution, just an update of the dependencies, and everything works fine.

Spend a couple of days trying to debug this, and the fix was just a dependency update. It was a good experience, in the end. Sometimes we have to wait for the upstream to fix for us.

Discussion (1)

Collapse
amasianalbandian profile image
AmasiaNalbandian

and the fix was just a dependency update.

Yikes. Got to hate it.