DEV Community

Discussion on: Tell me a bug story

Phil Nash

I used to work on a site that you could log into with your Instagram account, pick a bunch of pictures and buy them as fridge magnets.

This mostly worked well, except for the occasional pack where some or all of the images would fail to download. We spent ages working on the code that downloaded the images, trying to find where the bug was. (I feel more confident with downloading images in Ruby now!)

Ultimately, we decided the code that was downloading the images wasn't the issue, so perhaps it was the Instagram API? Or a flaky connection from our server?

More investigation led to the discovery that, on occasion, a user would simply delete their picture from Instagram, which caused our download to fail.
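To make the difference between "flaky connection" and "the picture is gone" concrete, here is a minimal Ruby sketch. It is not the code we actually ran: the `classify_response` and `download_image` names and the outcome labels are made up for illustration, and it assumes that a deleted picture's URL answers with HTTP 404 or 410, which I have not verified for Instagram.

```ruby
require "net/http"
require "uri"

# Map an HTTP status code to a hypothetical outcome label.
def classify_response(code)
  case code.to_i
  when 200..299 then :ok
  when 404, 410 then :gone    # assumption: a deleted picture answers 404/410
  when 500..599 then :retry   # probably a transient server-side problem
  else :failed                # redirects and other codes are not handled in this sketch
  end
end

# `fetch` is injectable so the logic can be exercised without the network.
def download_image(url, fetch: ->(u) { Net::HTTP.get_response(URI(u)) })
  response = fetch.call(url)
  status = classify_response(response.code)
  { status: status, body: (status == :ok ? response.body : nil) }
rescue Net::OpenTimeout, Net::ReadTimeout, SocketError, SystemCallError
  # Network-level trouble: worth retrying, unlike a picture that no longer exists.
  { status: :retry, body: nil }
end
```

With something like this, a `:retry` result can go back on the queue, while a `:gone` result tells you that no amount of retrying will help.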

So, we moved the downloading from on demand, when the print job was run, to a background worker that ran as soon as the user made their purchase. Jobs would still occasionally fail.

We moved the job to before the user even completed their purchase. This helped, but jobs would still occasionally fail.
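As a rough illustration of the "download early" idea, the sketch below grabs a local copy the moment a picture is selected and keeps track of anything that could not be fetched. The `Pack` class, its method names and the `fetch` callable are all hypothetical, and the download runs synchronously here for brevity where we used a background worker.

```ruby
# Hypothetical model of a magnet pack that keeps local copies of its images.
class Pack
  attr_reader :images, :missing

  # fetch: any callable taking a URL and returning the image bytes,
  # or nil when the picture is unavailable (assumed interface).
  def initialize(fetch:)
    @fetch = fetch
    @images = {}
    @missing = []
  end

  # Called when the user picks a picture, long before the print job runs.
  # Returns true if a local copy was stored.
  def select(url)
    bytes = @fetch.call(url)
    if bytes
      @images[url] = bytes
      true
    else
      @missing << url
      false
    end
  end

  # The print job only needs the local copies, not Instagram.
  def printable?
    @missing.empty? && !@images.empty?
  end
end
```

The earlier the copy is taken, the smaller the window in which a deletion can hurt, but as the story shows, the window never quite closes.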

I'm not even sure you could call it a bug in the end. Some users were uploading pictures to Instagram just to get them printed and then deleting them immediately. It didn't matter how many workers we ran against the job queue; there was always a user who was faster at deleting their images.

The eventual fix was to loosen the Instagram restriction and let users pick Facebook photos or upload photos from their computer or phone as well. Once users no longer had to go through Instagram to get their images onto our site, things got better. This was more work for us: Instagram photos were always square at the time, which fit the magnets, so opening up to non-square photos meant we needed an image cropper and a lot more UI. But it was better for the user.

Am I calling users bugs here? Of course not! But understanding the ways that user actions can affect the way your site works is just as important as an esoteric language exception. And if something is failing, there are more ways to fix it than just inspecting the code.