We all know it:
Bugs. Are. Nasty.
I use Endtest to quickly create Automated Tests and execute them on the cross-browser cloud.
You should check out the docs.
Here are the nastiest bugs I've missed and the lessons they taught me:
A few years ago, I was working for this online retail company.
Most of the flows there were automated.
When a user placed an order on the site, the products were picked up from the warehouse and the invoice was generated and printed automatically.
The bug crawled out from under a change made by a colleague, in a completely different section of the platform.
We only tested around the change and then we released it.
They called us from the warehouse to tell us that all the printed invoices had $0.00 on them.
Surprisingly, no one got fired.
We didn't have automated functional tests back then; we only had unit tests.
I wish I had a tool like Endtest back then.
Testing around the change is only useful to find out if your change works as expected.
No one can predict how a change might affect other areas of the software.
Never release anything without running the regression test suites.
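To make that concrete, here is a minimal sketch of the kind of regression check that would have caught our $0.00 invoices. Everything in it is hypothetical: the `invoice_total` function, the order format and the prices are made up for illustration, and it is not the code from that platform.

```python
from decimal import Decimal

# Hypothetical order format: a list of (unit_price, quantity) pairs.
def invoice_total(lines):
    """Sum unit_price * quantity for every line of an order."""
    return sum((Decimal(price) * qty for price, qty in lines), Decimal("0"))

def check_invoice_regression():
    """Regression check: a known order must keep producing its known total."""
    order = [("19.99", 2), ("5.00", 1)]
    total = invoice_total(order)
    # The exact value guards the calculation itself.
    assert total == Decimal("44.98"), f"unexpected total: {total}"
    # The sanity check guards against the $0.00 invoice scenario.
    assert total > 0, "a non-empty order must never produce a $0.00 invoice"
    return total

check_invoice_regression()
```

The point is not the arithmetic. A check like this runs on every release, including the ones where nobody thinks the invoices were touched.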
At another company, we were working on a feature that executed scheduled jobs for users, pretty much cron-as-a-service.
Users could pick a time for their scheduled job. We had a cron on our side that ran every minute, picked up the jobs scheduled for that specific minute and executed them.
Before the release, we decided to spin up a production clone to test it there.
After the release, users were complaining that their scheduled jobs were being executed twice.
We spent a few days looking through the entire codebase and added patches with extra checks.
We even modified values from the server configuration!
Nothing fixed it.
I looked for similar issues on Stack Overflow, until I found an answer from someone saying that maybe it’s coming from a different instance.
“Yeah…right, wish it was that easy!”
And then it hit me: we forgot to disable the cron on the production clone…and we also forgot to take down the production clone.
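If you want to see how little it takes, here is a small sketch of that failure. It assumes both instances see the same scheduled jobs (through a shared or a copied database; the job table, instance names and `cron_tick` function are all made up for illustration).

```python
from datetime import datetime

# Hypothetical job table that every instance can see.
jobs = [
    {"id": 1, "run_at": datetime(2020, 1, 1, 9, 0)},
    {"id": 2, "run_at": datetime(2020, 1, 1, 9, 1)},
]

executions = []  # (instance name, job id) for every run

def cron_tick(instance, now):
    """What each instance's cron does every minute: run the jobs due at `now`."""
    for job in jobs:
        if job["run_at"] == now:
            executions.append((instance, job["id"]))

now = datetime(2020, 1, 1, 9, 0)
cron_tick("production", now)
cron_tick("production-clone", now)  # the clone we forgot to switch off

print(executions)  # [('production', 1), ('production-clone', 1)]
```

Both instances run perfectly correct code, which is exactly why staring at the code for days didn't help.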
Sometimes, the trees prevent you from seeing the forest.
That’s why you need a structured approach for testing.
Make a checklist, write some test cases, but make sure to do it before you actually start testing.
This other company that I worked for had contracts with all sorts of government agencies, contracts worth billions of dollars.
Each delivery phase, obviously, had a deadline.
The penalties for not delivering the software on time were significant.
The company tried to keep things as cheap as possible, with as few employees as possible; this led to patches being added hours before the UAT sessions.
In case you’re not familiar with the term, User Acceptance Testing (UAT), also known as beta or end-user testing, is defined as testing the software by the user or client to determine whether it can be accepted or not.
These sessions were critical, consisting of guiding a lot of government big shots through the process of checking if the system works.
You had to book these weeks in advance, to find a window in their busy schedule.
If a significant issue was found, they wouldn’t sign the documents and the penalties would be applied.
Of course, we were smart enough to have automated tests and I was the one writing them.
As it was a web application, I just used Selenium and executed the tests in the browsers on my own Windows machine, Internet Explorer included.
We also had a DevOps guy there, who insisted that the tests should be executed from our Jenkins server.
I didn’t even know that was possible, and he mumbled something about a headless Chrome browser on a Linux machine.
A patch was added, one day before the UAT.
We were confident, because the automated tests covered that area and the full regression didn't reveal any issues.
First day of the UAT: everyone noticed an issue where a form couldn’t be submitted, due to some broken input.
“Just clear your browser cache.”
“Still not working.”
“Let me check.”
“Oh, this is bad.”
That penalty was close to one million dollars.
The good news is that the company was prepared for that possibility from the start, so it wasn’t a tragedy.
Always test in real conditions. Always.
If you’re going to test on a headless browser on some Linux machine, you are taking a risk.
If you are using some cloud solution providing a Chrome browser without telling you which OS it runs on, it’s probably a headless browser on some Linux machine.
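One rough way to find out what you are really testing on is to look at the user agent the test browser reports. The sketch below is only a heuristic: the `describe_test_browser` function is hypothetical, the sample user agent strings are illustrative, and the `HeadlessChrome` token is what headless Chrome has typically reported, which may differ between Chrome versions and configurations.

```python
def describe_test_browser(user_agent):
    """Guess the OS and headless mode from a user agent string (rough heuristic)."""
    if "Windows NT" in user_agent:
        os_name = "Windows"
    elif "Mac OS X" in user_agent:
        os_name = "macOS"
    elif "Linux" in user_agent:
        os_name = "Linux"
    else:
        os_name = "unknown"
    return {"os": os_name, "headless": "HeadlessChrome" in user_agent}

# Illustrative user agent strings, not captured from any specific provider.
ci_ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
         "(KHTML, like Gecko) HeadlessChrome/90.0.4430.93 Safari/537.36")
desktop_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36")

print(describe_test_browser(ci_ua))       # {'os': 'Linux', 'headless': True}
print(describe_test_browser(desktop_ua))  # {'os': 'Windows', 'headless': False}
```

With Selenium, you can read the string from a running session with `driver.execute_script("return navigator.userAgent")` and compare the result with what your users actually run.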
At the company I currently work for, we use Endtest.
It provides real browsers on Windows and macOS machines.
Honestly, it does help me sleep at night. I wouldn’t know if it’s the right tool for you, but for us it did wonders.
What are the ugliest bugs YOU missed?
I’m looking forward to hearing other stories.