Imagine how much time developers all over the world spend dealing with bugs and issues that arise in code. In fact, it might feel like your abilities in this area define how valuable you are as a professional developer.
Experienced developers usually end up helping juniors out in difficult code fixes. So, if you’re a junior developer and you feel like you’re always getting stuck and begging your seniors for help, you might be wondering - just how on earth do they do it?
The answer here isn’t totally straightforward. You might think that the reason your senior colleagues can help with all your queries is that they have more experience or better debugging skills. But I would say that’s only half the story. The other 50% is all about psychological advantage, for the following reasons:
- You asked ‘em! You’ve shown that you believe in their professional abilities and this positive reinforcement is powerful. This can actually make a person smarter - that vote of confidence gives them a real life intelligence boost.
- It’s not their problem, so they don’t have any fear of failure. Without this anxiety, it’s much easier to focus 100% of their brain capacity on finding the actual solution to the problem.
- Even though they have nothing to lose, your colleague knows that if they solve the problem, it will give them an added psychological bonus, as you will be in awe of their superiority and high levels of professional skill.
You see, there’s no stick for them, just a big bunch of carrots.
Let’s think about some ways that you can grow your own skill set. I’ve tried to pull together here some useful thoughts and ideas that work for me. I believe that reading this list and applying the advice will push your skills forward, until you’re a fine match for those gurus who you usually turn to for help.
This is an incredibly important point that will really help your development. You should ask for help only after you’ve methodically worked through all of the points explained below and you’re still well and truly stuck - and I mean banging your head against the wall kind of stuck. Some of the best growth I’ve ever experienced is when I’ve been in situations where nobody could really help me with my problem. I had no choice but to figure it out by myself.
Getting emotional is never going to be helpful in resolving bugs. You shouldn’t have bugs you hate. It only makes them more difficult to solve. Every bug is a great opportunity to hone your skills, so it’s much better to try to learn to love them all.
I believe that if you’re really stuck on a debugging problem for a long time, at least half of the reason is likely to be psychological, and maybe even more.
A voice in your head says, “It can’t be like this! It must be black magic!” You feel as if the world has gone mad, and it can’t possibly be your fault. You feel like all you can do is run away from the problem.
Don’t listen to this voice! Everything in code always happens for a reason! There’s no black magic breaking your code, and I’m sorry to break it to you, dude, but there’s no magic wand to fix it either! You just need to find the specific line of code that’s working in the wrong way, and fix it. Simple.
Please don’t just copy and paste the error into Google or Stack Overflow straight away. You should take the opportunity to learn something from every bug and develop your skills - this is particularly important when you’re stuck with a difficult bug to fix.
It’s best to carefully read the error and try to understand the process. Often the reason for the error is right there in front of you - you just need to look at it properly.
Don’t panic. Usually, the code is already telling you what is wrong. What’s the worst thing that can happen? You waste a bit of time reviewing the code and you don’t find the answer.
But actually, that time won’t be wasted, because you will know more about the project. Until you are 100% familiar with it, there’s always work to be done. Plus, it’s always helpful to practice your code reading skills.
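To show what I mean by reading the error properly, here is a minimal Python sketch. The function `parse_price` and its input are hypothetical, made up just for this illustration. The point is that the exception type, the message and the innermost frame already tell you what went wrong and where.

```python
import traceback


def parse_price(raw):
    # Hypothetical function under investigation: it expects a "price" key.
    return float(raw["price"])


def summarize_error(exc):
    """Pull out the parts of an exception worth reading first."""
    frames = traceback.extract_tb(exc.__traceback__)
    innermost = frames[-1]  # the frame where the exception was actually raised
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "function": innermost.name,
        "line": innermost.lineno,
    }


try:
    parse_price({"cost": "9.99"})  # wrong key on purpose
except Exception as exc:
    summary = summarize_error(exc)
    print(summary)
```

Here the summary says a `KeyError` about `price` was raised inside `parse_price`, which is enough to see that the input has the wrong key, without searching for anything online.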
This simple rule was told to me by an air defense officer during military training. When you service a complex piece of machinery and something goes wrong, or is not functioning in the same way as it used to, you should focus your attention on the modules that you worked on in the latest fix or check.
This is 100% applicable to programming as well! So you need to look at any piece of code that was updated or fixed recently.
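If the project is under version control, the commit history (for example `git log`) is usually the first place to look for recent changes. As a rough sketch for when that is not available, the hypothetical helper below simply lists the most recently modified source files under a directory. The name `recently_modified` and its parameters are my own invention for this example.

```python
from pathlib import Path


def recently_modified(root, limit=5, pattern="*.py"):
    """Return up to `limit` files under `root`, newest modification first."""
    files = [p for p in Path(root).rglob(pattern) if p.is_file()]
    # st_mtime is the last modification time reported by the file system.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```

Calling `recently_modified("src", limit=10)` on a project gives you a short list of suspects to read first. Modification times can be misleading after a fresh checkout, so treat the result as a hint only.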
When you realize that the issue is hard and you’re going to need to do a lot of tests, you should pause for a while in order to improve the testing process. In a perfect world, your bugfix checking process would be totally automated.
If you can figure out a way to recreate the error more quickly, then you can test multiple solutions much more efficiently. This will help you to concentrate on the problem and find a rapid fix. You can even temporarily simplify existing code logic to make every test much faster.
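As a small sketch of what a fast, automated reproduction can look like: the function `normalize_name` and its bug are hypothetical, invented for this example. The idea is to shrink the failing case to one tiny input and wrap it in a check you can rerun in a fraction of a second after every attempted fix.

```python
def normalize_name(name):
    # Hypothetical function with a bug: it only collapses the FIRST double space.
    return name.strip().replace("  ", " ", 1).title()


def bug_reproduces():
    """Return True while the bug is still present."""
    # Smallest input found so far that triggers the problem: three spaces.
    return normalize_name("ada   lovelace") != "Ada Lovelace"


print("bug still present:", bug_reproduces())
```

Once `bug_reproduces()` starts returning `False`, the same check can be kept as a regression test.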
I found out about this approach from a school physics tournament called SYPT. This method is used in scientific research and can also be used to resolve hard programming issues. Here is the scientific approach:
1) Inspect the process and observe what is happening.
2) Try to formulate a theoretical explanation for what’s going on. Define all the possible factors that could be causing the problem.
3) Then you should check these theories one by one, starting with the one that seems to be the most plausible. Make sure you are asking the right questions. Every check should encapsulate the essence of the problem. Do not test several theories at once. This will interfere with the full picture and won’t lead you to the right conclusions.
Every reason or theory should be either proved or rejected by the test results. These same principles can be applied to solving complex bug problems.
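The steps above can be sketched in a few lines of Python. Everything here is made up for illustration: the `orders` data, the three theories and the helper `check_one_at_a_time`. The theories are listed in the order I assume to be most plausible first, and each one gets its own isolated check that either confirms or rejects it.

```python
# Observation: the report shows a total of 80, but we expected 50.
orders = [
    {"id": 1, "amount": 20},
    {"id": 2, "amount": 30},
    {"id": 2, "amount": 30},
]

# One theory per entry, each with a check that tests only that theory.
hypotheses = [
    ("amounts are stored as strings",
     lambda: any(isinstance(o["amount"], str) for o in orders)),
    ("an order is counted twice",
     lambda: len({o["id"] for o in orders}) < len(orders)),
    ("a negative refund is included",
     lambda: any(o["amount"] < 0 for o in orders)),
]


def check_one_at_a_time(hypotheses):
    """Test theories in order and stop at the first one that is confirmed."""
    verdicts = []
    for name, check in hypotheses:
        confirmed = check()
        verdicts.append((name, "confirmed" if confirmed else "rejected"))
        if confirmed:
            break
    return verdicts


verdicts = check_one_at_a_time(hypotheses)
for name, verdict in verdicts:
    print(f"{verdict}: {name}")
```

Stopping at the first confirmed theory is a design choice for this sketch. In a real investigation you might keep going, because a bug can have more than one cause.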
This is a famous method, based on the story of a programmer who used to carry a rubber duck around and force himself to explain the coding problem to the duck. This would help him to unlock the problem.
Sometimes I find myself telling somebody about a bug, even though I know that they won’t be able to help me at all! But it’s useful, just the same. Often I discover the right solution while I’m talking. When you’re trying to talk someone through the issue, your mind is organizing it and putting it into the right order to be explained. Sometimes this can be just what you need to unlock the next logical steps that you need to take.
I know that this seems to conflict with the previous advice, Do Not Call for Help, but it doesn’t really. The Rubber Duck method makes you work harder to solve the problem yourself. It’s definitely not a sign of madness to talk through the problem with people, animals, plants or even inanimate objects like the famous rubber duck.
This is the best rule! It works even if you don’t completely understand what is happening in a huge piece of code. I guess you might have got the point from the title already, but let me explain it in more detail anyway. Let’s say, to start with, that the code is failing, but the reason is unclear.
You can bisect the code by commenting out a section of it, then repeating the check. If the error has disappeared, then you know that it was in the part of the code that you commented out. If not, then you repeat the process with other sections of code.
The problem usually comes down to one specific line or a small block, so keep repeating the test until you have isolated the piece of code where the problem is.
Of course, you have to inspect the whole code to fully understand it and make decisions to prevent the error, but it’s much faster if you don’t have to read all of the suspected code when you’re looking for the error itself.
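The same halving idea works on data as well as on code. Below is a minimal sketch that bisects a batch of input records instead of commenting out code. The helper names `bisect_culprit` and `fails` and the sample records are hypothetical, and the sketch assumes exactly one bad record that makes any batch containing it fail.

```python
def bisect_culprit(items, fails):
    """Find the single item that makes `fails` return True, by halving."""
    if not fails(items):
        raise ValueError("the full input does not fail, nothing to bisect")
    candidates = list(items)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Keep whichever half still reproduces the failure.
        candidates = first_half if fails(first_half) else candidates[mid:]
    return candidates[0]


def fails(batch):
    """Hypothetical check: the import breaks when a value is not a number."""
    try:
        sum(int(value) for value in batch)
    except ValueError:
        return True
    return False


records = ["10", "20", "3O", "40", "50"]  # "3O" contains the letter O
culprit = bisect_culprit(records, fails)
print("culprit:", culprit)
```

Each round halves the suspects, so even a batch of thousands of records needs only a dozen or so checks.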
Ok, first things first - don’t panic! Panicking won’t help to resolve the problem quickly. Here’s what you can do instead to handle production errors - ask yourself this question before doing anything else:
Can I reproduce the error on my local machine?
- YES? All good, let’s debug it and understand the reason for the problem. Fix, deploy, go home, kiss your wife (your own wife, please!)
- NO? You know, people tend to freak out when something is OK locally but fails on production. I would say, let’s try to see both the negatives and the positives in this situation.
This challenge already gives us some clues as to how to find a solution. What is the difference between the local and production environments? Possibly we have a different configuration or different web server settings, or some other issue that could be applicable to the problem. It’s mostly a matter of logic.
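One quick way to answer the "what is different?" question is to compare the two configurations key by key. This is only a sketch: the function `diff_settings` and both settings dictionaries are invented for the example, and in a real project the values would come from your config files or environment variables.

```python
MISSING = "<missing>"


def diff_settings(local, production):
    """Return {key: (local_value, production_value)} for every difference."""
    differences = {}
    for key in sorted(local.keys() | production.keys()):
        local_value = local.get(key, MISSING)
        production_value = production.get(key, MISSING)
        if local_value != production_value:
            differences[key] = (local_value, production_value)
    return differences


# Hypothetical settings, for illustration only.
local_settings = {"DEBUG": True, "CACHE": "memory", "TIMEZONE": "UTC"}
production_settings = {"DEBUG": False, "CACHE": "redis", "TIMEZONE": "UTC", "WORKERS": 8}

differences = diff_settings(local_settings, production_settings)
for key, (local_value, production_value) in differences.items():
    print(f"{key}: local={local_value!r} production={production_value!r}")
```

Every line it prints is a candidate explanation for why the error shows up on production only.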
Even if you need to fix something ASAP, it does not mean you should test your fix straight away on production! I can’t tell you how many times I’ve seen double or even triple deployments having to take place, all because of some stupid typo during the hotfixing. Run your tests locally and keep the code quality high. One careful fix with one deploy is better than three hotfixes with three deployments.
Imagine that you’re working on a particularly painful bug. The company is losing money right now because of this problem. You’ve discovered the reason for the bug, and realized that the fix would be a huge piece of work and would obviously require extra unit testing and so on. What should we do in situations like these?
Start by shipping a temporary hotfix. It’s ok if it’s not perfect, but it needs to be fast and to prevent the company from losing more money while you’re working on the definitive solution.
If some new feature that we’re trialling is buggy, we can roll it back first to protect the main process. So we can just apply the stable version of the code while we take our time in figuring out how to fix the bug properly.
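One way to make that kind of rollback cheap is a feature flag. This is my own assumption about how you might set it up, not the only way to roll back. In the sketch below, the flag name `NEW_CHECKOUT`, both checkout functions and the shipping cost are hypothetical. Switching the flag off sends everyone back to the stable code path without a new release.

```python
import os

SHIPPING = 5  # hypothetical flat shipping cost


def stable_checkout(total):
    # The old, proven code path.
    return total + SHIPPING


def new_checkout(total):
    # The new feature under trial; hypothetical bug: it forgets shipping.
    return total


def flag_enabled(name, env=os.environ):
    """A flag counts as on only when its variable is set to "1"."""
    return env.get(name, "0") == "1"


def checkout(total, env=os.environ):
    if flag_enabled("NEW_CHECKOUT", env):
        return new_checkout(total)
    return stable_checkout(total)


print(checkout(100, env={"NEW_CHECKOUT": "1"}))  # trial path, shows the bug
print(checkout(100, env={}))                     # flag off: stable path
```

Passing `env` explicitly is just to make the sketch easy to try. In a real service the flag would more likely live in your configuration system.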
Sometimes your CI shows failing tests for no apparent reason, even though they aren’t failing on your local machine. These kinds of test failures are a separate theme for a separate article… or even a book.
The reason is very often the order in which your tests are running: one test leaves some state behind and a later test trips over it. But even knowing that doesn’t hand you the answer, because with thousands of tests running it’s hard to identify which exact test is causing the failure.
Some testing tools (such as RSpec for Ruby) have a bisect feature that narrows a failing run down to the smallest set of tests that still reproduces the failure. However, this can be a really slow process and will not necessarily give a definitive result.
So it’s better to read the failure and work out which shared resources were touched before the affected test ran. Then you can predict which tests are causing the failure. After that, all you need to do is run those target tests in the right order to prove your theory.
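Here is a tiny sketch of an order-dependent failure and of proving the theory by running the two suspects in a chosen order. The shared `CACHE`, the two test cases and the runner `run_in_order` are all hypothetical. A real suite would use its own test runner, but the idea is the same.

```python
CACHE = {}  # shared resource that both test cases touch


def case_writes_cache():
    CACHE["user"] = "admin"  # leaves state behind and never cleans up
    assert CACHE["user"] == "admin"


def case_expects_empty_cache():
    assert CACHE == {}  # silently depends on running first


def run_in_order(cases):
    """Run the cases in the given order and return the names that failed."""
    CACHE.clear()  # start every experiment from a clean state
    failed = []
    for case in cases:
        try:
            case()
        except AssertionError:
            failed.append(case.__name__)
    return failed


passing_order = run_in_order([case_expects_empty_cache, case_writes_cache])
failing_order = run_in_order([case_writes_cache, case_expects_empty_cache])
print("order A failures:", passing_order)
print("order B failures:", failing_order)
```

The second order fails only because the first case pollutes the shared cache. The structural fix is to make each case set up and clean up its own state.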
Yeah, I know this sounds like a hard process, but the fixing of such tests always leads to an improvement in their structure.
I don't want to brag, but these days my code has a lot fewer bugs than it used to have. I learned to be more attentive and compose correct code structures with good architecture that will force bugs to surface. You most likely won’t face so many or such serious bugs if you understand the process at a very high level and in a lot of detail.
With practice, your own brain can run code the same way an interpreter does, stepping through it line by line. On top of that, I try to use TDD and end-to-end testing if there is enough time. It’s impossible to avoid 100% of bugs, but you can decrease the number of bugs significantly by working in this way.
Try to enjoy your bug fixing! Maybe to start with, it feels less satisfying than writing new code, but it has its own advantages.
For instance, some hard bugs will turn you into a true detective and you will really enjoy the thrill of the chase when you’re on the right path.
So, the harder the bug, the more pleasure you will get when you win! Thank you for reading! And good luck beating those bugs!
PS: Big thanks to my friends from Pixel Point for the illustrations.