DEV Community

Ben Halpern

What's the longest you've ever spent debugging a single bug?

Oldest comments (60)

Ashlee (she/her) • Edited

I spent several weeks trying to figure out why images from Windows Snipping Tool could not paste into Quill WYSIWYG and then a couple more weeks trying to fix it and work with other kinds of text and image pastes. I even wrote an issue for it that's still open! I've changed jobs 3 times since I wrote this and now I'm back to using it again in my current work project.

Cannot paste images from Snipping Tool #2539

A paste event is detected, but the images never show when you try to copy and paste images from things like Windows Snipping Tool. Copying and pasting images from Google, for example, has no issues.

It seems like there is a timing issue for reading files with a base64. I have not been able to reproduce a "fix" I discovered in CodePen, but in the actual project I'm using Quill for, extending the Clipboard module and lengthening the timeout duration at the end of the default onPaste function makes pasting from Snipping Tool work. The bigger the image that needs to be pasted, the larger the duration needs to be.

Again, I am not able to reproduce a bug caused by my "fix", but in my project, lengthening the timeout duration causes two "regular" images to be pasted. I'm throwing this part out there in case it comes up for anyone else. It may be something in my project.

Steps for Reproduction

  1. Visit my CodePen
  2. Capture an image with Windows Snipping Tool
  3. Copy the image and try to paste it in the editor
  4. No image is pasted

Expected behavior: All image pasting should behave consistently.

Actual behavior: Cannot paste images from snipping tools.

Platforms: Windows 10 (I have not tested this on others yet) Chrome 72

Version: My project uses 1.3.4, but the issue persists in 1.3.6. The CodePen is using 1.3.4.

Henry Boisdequin • Edited

I've been debugging this PR's formatting bugs and Git errors for around a week 😅

Calin Baenen

Well, when I was working on the original version of Janky in Python, I was working for god remembers how long trying to figure out how to get properties to work.

And when making the RuntDeale prototype in Py, I had a bit of an ordeal with how Tkinter wanted to render things.

G.L Solaria

I spent a week trying to figure out if I was doing something wrong or if I had found a genuine bug in WCF. My gosh it almost broke my spirit. I don't think it has been resolved yet.
github.com/microsoft/dotnet/issues...

Phil Ashby

Ewww, nasty. I too have spent waaay too long reading the source for WCF when things do not behave as documented/expected! Probably the longest was when investigating session leakage while using the WS-SecureConversation protocol. It seems absolutely nobody else in the world made that decision, and we probably shouldn't have either, but customers were now using it (30k+ of them) so we had no choice but to find & fix the leaks.. all told a rotating team of 3-4 people spent ~1 year (over a period of 6 months) finding all the ways customers could break stuff and patching up the server side...

Just before I retired, we had a plan to emulate the session aspects of the protocol, and I had a POC working that avoided actual server-side sessions; it employed JWTs to carry the security session data back and forth instead. This would have fixed a lot of problems with state management and scalability. I have no idea if it got implemented!

David Taitingfong

Probably a whole 9-hr work day and some change.

I was tasked with creating some Ansible configs for these build agents. The machines being spun up from them were identical, but spread across 3 different networks: A, B, and C. The big difference was one zip. A and B got it from shared drives, but C pulled it from our Artifactory. I was told that the one in Artifactory was the same as the one on A and B.

A and B were fine but machines on C were failing. I figured it was the zip, and it was...but it took the whole day and two 30-minute Zoom meetings with different folks.

The problem? Well all 3 zips had the same name: Dir_X.Y.Z_14.0 but

  • The zips on A and B unzipped to C:\Path\To\Dir_X.Y.Z_14.0
  • The zip on C unzipped to C:\Path\To\Dir_X.Y.Z-14.0

A single-character typo brought me to my knees lol. Someone renamed the directory to have a hyphen, but the zip they created still had an underscore, lol. Ahh good times.
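
A mismatch like this can be caught up front by comparing the folder inside each archive with the archive's own name. A minimal Python sketch, assuming the zip is expected to unpack into a single folder named after the file; `top_level_dirs`, `matches_archive_name` and `make_zip` are hypothetical helpers and not part of the Ansible setup described above:

```python
import io
import zipfile
from pathlib import PurePosixPath


def top_level_dirs(zip_file):
    """Return the set of top-level entry names inside a zip archive."""
    with zipfile.ZipFile(zip_file) as zf:
        return {PurePosixPath(name).parts[0] for name in zf.namelist() if name}


def matches_archive_name(zip_file, archive_name):
    """True if the archive unpacks into exactly one folder named like the zip."""
    if archive_name.endswith(".zip"):
        expected = archive_name[: -len(".zip")]
    else:
        expected = archive_name
    return top_level_dirs(zip_file) == {expected}


def make_zip(inner_dir):
    """Build a tiny in-memory zip for demonstration (made-up contents)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(f"{inner_dir}/readme.txt", "demo")
    buf.seek(0)
    return buf


# Same file name, different inner folder: underscore vs. hyphen.
good = matches_archive_name(make_zip("Dir_X.Y.Z_14.0"), "Dir_X.Y.Z_14.0.zip")
bad = matches_archive_name(make_zip("Dir_X.Y.Z-14.0"), "Dir_X.Y.Z_14.0.zip")
```

Here `good` is true and `bad` is false, so a check like this run against the Artifactory copy would have flagged the hyphen before any machine on C was provisioned.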

Stephen Belovarich

Recently 5 days, off and on between meetings. No stack trace, just a build that kept slowly moving along taking almost 1 hour until I tracked down the culprit: Emotion 10 and how it handles type definitions can slow TypeScript compilation to a crawl. I figured it out by looking for similarities between packages that were slow in a monorepo, then commented out code until I found what caused the slowness and got the build down from 45 minutes to less than 1 minute.

Zack DeRose

If I ever get to 3 hours staring at the same bug, I generally get up and go for a walk or get some other eyes on it, or try to tackle a different task and come back to the bug later.

Maybe not related, but I've definitely had long stretches where a certain bug is 'fixed' only to pop up again a week down the line...

Lucia Cerchie

^this

Der Sascha

It was about one month. We used a third-party BPM engine. After a month we identified a memory leak. That was at least possible with WinDbg, which let us see what was consuming the memory. We found that Dispose didn't actually dispose of the internal resources...

Nicolas Bailly

The most I've worked uninterrupted on the same bug is probably around a week. It was one of the worst bugs I'd faced too: some of our clients' data would get randomly deleted for no reason and no one had any idea what was happening. I spent days trying to debug every single API trying to determine what could do that...

I eventually ended up parsing the MySQL binlog searching for every delete statement on that table, searching where it came from in our codebase, and rerunning them one by one...

Turns out someone had forgotten some parentheses in an 'OR' condition months before.
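
In SQL, as in most programming languages, AND binds more tightly than OR, so an OR without parentheses can widen a condition far beyond what was intended. Python's `and`/`or` follow the same precedence, which allows a small sketch of the effect; the rows and field meanings below are made up for illustration and are not taken from the codebase in question:

```python
# Hypothetical rows: (client_id, status, expired)
rows = [
    (1, "draft", False),
    (1, "sent", True),
    (2, "draft", False),
    (2, "sent", True),
]

target_client = 1

# Intended: only client 1's rows that are drafts or expired.
intended = [
    r for r in rows
    if r[0] == target_client and (r[1] == "draft" or r[2])
]

# Without the parentheses this parses as
#   (client_id == 1 and status == "draft") or expired
# so expired rows belonging to every client match as well.
buggy = [
    r for r in rows
    if r[0] == target_client and r[1] == "draft" or r[2]
]

# Rows matched only because of the missing parentheses.
collateral = [r for r in buggy if r not in intended]
```

With this data, `collateral` holds client 2's expired row, which is the kind of row a delete scoped to client 1 should never touch.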

Sergey Kislyakov • Edited

~2 months of trying to find a race condition in a bunch of goroutines. I had to create a Docker image with the debugging bits included (dlv) to connect to it remotely.

Brad

3 months, not non-stop obviously, but I continuously went back and tried multiple things multiple times. I even did a 100% full-on re-install of the operating system.

The issue? Bad vim-airline fonts on my Raspberry Pi.

The solution? Run a command to update the firmware of the Raspberry Pi.

Paweł Ludwiczak

At some point I just decide it's a feature and not a bug anymore.

Or I remove the buggy feature and pretend like it never existed :).

𝐍𝐚𝐭𝐚𝐥𝐢𝐞 𝐝𝐞 𝐖𝐞𝐞𝐫𝐝

Couple of days........

It was a horrible bug and the code was mixed between frontend & backend, with several scripts essentially running the same thing / interfering with each other and causing the issue.

It was an inherited OpenCart site.

Arpan

Longest I have spent is 3 days..

We were modernizing our reporting solution from Crystal Reports to SSRS, and there was a formula written to calculate adjusted hours (CST/IST time difference, weekends, holidays, etc.). This formula was written in COBOL with no documentation. I was trying to convert it to C#. The target was to get the same values from both formulas. The bug was in calculating weekend hours, and it took me the whole 3 days to figure that out.

Matt Ellen

I used to work on some data processing software for a particular measuring device (is that vague enough for you? Sorry, I'm just being overly protective). I would get reports that the software was slowing down if it was used for days at a time. I would occasionally look into the issue, but it stuck around for months until one of our in-house users was able to show me the problem in vivo.

The problem itself was a memory leak, but for some reason I couldn't get that "ah ha!" moment until I saw it in context.

I guess I didn't spend all that long overall debugging the issue, but it played on my mind for all that time.