Lucca Mabel

Posted on Apr 12, 2023 • Edited on Oct 18, 2023

Best practices for refactoring legacy code to make it more maintainable and easier to work with

#cleancode #refactoring #productivity #testing

It is quite common for us developers, especially those newer to the field, to assume that when we join a new company or a new project we will find clean, well-organized, well-structured code that is easy to navigate and maintain (or maybe that was just me in my naivety haha).

With the emphasis on best practices in the tech market, the agile development processes adopted by many companies, and the numerous materials available on how to write quality code, it is common to think that the scenario in the previous paragraph is the reality. However, it's not. It's much more common to find outdated code, with outdated dependencies, an outdated stack, difficult-to-read code, and confusing architecture (maybe not so chaotic haha but you get it...).

So, considering all this, what do we do in the face of this chaos? How do we work with legacy code? How do we know where to make a change? How do we know the system won't break?

These are great questions, and I hope to address them well in this article by bringing points from my personal experiences and lessons learned from the book Working Effectively With Legacy Code.

What is 'Legacy Code'?

Let's start by identifying our problem, understanding what legacy code is, and some reasons why it is generated.

We can think of legacy code as code that is messy, unreadable, difficult to change, and hard to find your way around. But this type of code takes time to become what we call legacy, right?

It starts off beautiful, clean, and harmonious, until someone, for example, starts implementing duplicated methods in several classes, uses abstract names for variables, implements giant methods that do many things, forgets to implement a test, doesn't perform periodic refactorings, and submits a pull request for all of this, which is approved without scrutiny for reasons X, Y, or Z. Afterwards, the person who implemented all of this leaves the team or changes projects/companies without leaving any documentation behind.

However, despite all of this being true, the book "Working Effectively With Legacy Code" brings the perspective that legacy code is basically code without tests. So, no. It doesn't take a lot of time for code to become legacy; you can write legacy code right now.

But why are tests so important in software? The importance of implementing unit tests.

I think now the path we are following is a little clearer... so let's go.

Why refactor?

To understand the importance of refactoring, it's also important to understand the importance and cost of code rewriting.

I understand the problems that arise from working with legacy code. It's not a productive flow, it's not easy to find anything, you never know whether changing one thing will break the entire system, requirements are hard to understand, and support is often limited.

So, for these reasons, I know we think it will be easier to simply rewrite the entire system to make it clean and harmonious again. However, it's often more costly and something the company isn't willing to do at the moment. Therefore, it's important to know when to rewrite and when to refactor.

Things you should never do, Part I

Getting straight to the point...

So, why refactor?

It will take less time and, therefore, cost less. The idea in this scenario is not an extensive refactoring; it's more like "adjusting as you go".

If you're going to implement something in a method, it already helps a lot to take a little extra time to refactor that method and make the necessary adjustments. But that doesn't mean you need to allocate more time to refactor the whole class, for example. Do you understand what I mean?
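As a minimal sketch of what "adjusting as you go" can look like, suppose a change request touches a hypothetical `invoice_total_legacy` function. All names and the discount rule here are made up for illustration; the point is only that the cleanup stays inside the method being changed.

```python
# Before: everything is inlined in one function (hypothetical legacy code).
def invoice_total_legacy(items, customer_type):
    t = 0
    for i in items:
        t = t + i["price_cents"] * i["qty"]
    if customer_type == "vip":
        t = t - t * 10 // 100
    return t


# After: same behavior, but the one method we had to touch anyway is split
# into small, named steps. Nothing else in the module is rewritten.
VIP_DISCOUNT_PERCENT = 10


def subtotal(items):
    """Sum price * quantity over all line items (amounts in cents)."""
    return sum(item["price_cents"] * item["qty"] for item in items)


def apply_discount(amount, customer_type):
    """Apply the VIP discount; other customers pay the full amount."""
    if customer_type == "vip":
        return amount - amount * VIP_DISCOUNT_PERCENT // 100
    return amount


def invoice_total(items, customer_type):
    return apply_discount(subtotal(items), customer_type)
```

Keeping the old and new versions side by side for a moment makes it easy to compare their results on a few inputs before deleting the old one.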

Nevertheless, it's important to have refactoring as a constant practice and do it whenever possible.

When to refactor?

Before starting the refactoring process, it's important to have in mind the current behavior and the expected behavior of the feature that you're going to implement/modify.

This can be easily verified with tests. But sometimes they don't exist within the system we're working on. Therefore, in order to evaluate the risks we will face when performing the refactoring process, it's important to be very clear about:

The changes that will be made
How it will be verified that they were made correctly
How it will be verified that, by making the changes, nothing will break

After evaluating these cases, it's noticeable that refactoring is not always simple, and that's why many people say "if it's working, don't touch it."

But, as tempting as it may be to avoid touching anything, software problems will only increase over time and come back worse. And without practice, it becomes increasingly difficult to make necessary changes.

Software Evolution and the Importance of Refactoring

How to refactor?

The book presents two approaches to making changes in the system: Edit and Pray and Cover and Modify.

Edit and Pray

This practice is the most common. You try to understand the code that will be changed, make the modifications carefully, and when finished, run the system to check that the change is visible, then poke around a bit more to check that nothing broke.

It's modifying with hope and faith that everything will work out and using extra time to confirm that everything really did work out.

Cover and Modify

On the other hand, this practice brings more security when making modifications. The idea is to have confidence in the changes and to know for sure that nothing will break.

For this, it is important to check whether what you are looking to modify is covered by tests, so that it is possible to guarantee that the modification works as expected and does not have any other effect. With tests, it is much easier to make changes carefully and with confidence.
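When the code has no tests yet, one way to get this coverage is what the book calls a characterization test: a test that records what the code does today, quirks included, rather than what it is supposed to do. Below is a minimal sketch; `legacy_shipping_fee` and its rules are hypothetical.

```python
def legacy_shipping_fee(weight_kg, express):
    # Hypothetical legacy function we need to change but do not fully understand.
    fee = 5
    if weight_kg > 10:
        fee = fee + (weight_kg - 10) * 2
    if express:
        fee = fee * 2
    if fee > 50:
        fee = 50  # undocumented cap, discovered only by running the code
    return fee


# Expected values were obtained by running the current code, not from a spec.
CHARACTERIZATION_CASES = [
    ((1, False), 5),
    ((10, False), 5),
    ((12, False), 9),
    ((12, True), 18),
    ((40, True), 50),
]


def check_current_behavior(fn):
    """Raise AssertionError if fn no longer matches the recorded behavior."""
    for args, expected in CHARACTERIZATION_CASES:
        actual = fn(*args)
        assert actual == expected, f"{fn.__name__}{args}: {actual} != {expected}"


# Run this before and after every modification.
check_current_behavior(legacy_shipping_fee)
```

If a later change makes any recorded case fail, that is a signal to stop and decide whether the behavior change was intended.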

How to test?

The book presents The Legacy Code Change Algorithm, which provides a structure for making functional changes that add value through testing.

Identify the points of change

Locate the areas where you need to implement changes.

Find the test points

Identify where you need to implement tests for the changes you will make.

Separate dependencies

Perhaps this part is where you will encounter the most difficulty. There are some techniques to perform this separation, but I won't go into them at the moment. However, I believe that this material can provide a good starting point.
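Just to give an idea of what separating a dependency can mean, the sketch below passes a hard-wired dependency (the system clock) in as a parameter so a test can replace it. The function names and the rule are hypothetical, and this is only one of several possible techniques.

```python
from datetime import datetime


# Before: the system clock is called inside the function, so a test
# cannot control which branch runs.
def greeting_legacy():
    hour = datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


# After: the clock is a parameter with a default, so production callers
# stay unchanged while tests can pass in a fixed time.
def greeting(now=datetime.now):
    hour = now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


def fixed_clock(hour):
    """Return a fake clock that always reports the given hour."""
    return lambda: datetime(2023, 4, 12, hour, 0)


assert greeting(fixed_clock(9)) == "Good morning"
assert greeting(fixed_clock(15)) == "Good afternoon"
```

The default argument keeps existing call sites working, which matters in legacy code where you may not be able to update every caller at once.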

Write tests

Separate the test cases and write them to cover as many scenarios as possible.
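A sketch of how the test cases might be separated, using Python's standard `unittest` module. `parse_quantity` is a hypothetical function standing in for the code under change; the cases are grouped into typical input, an edge case, and invalid input.

```python
import unittest


def parse_quantity(text):
    """Hypothetical function under change: parse a positive integer quantity."""
    value = int(text.strip())
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


class ParseQuantityTest(unittest.TestCase):
    def test_typical_values(self):
        for text, expected in [("1", 1), ("42", 42)]:
            with self.subTest(text=text):
                self.assertEqual(parse_quantity(text), expected)

    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(parse_quantity("  7\n"), 7)

    def test_invalid_input_raises(self):
        for text in ["0", "-3", "abc", ""]:
            with self.subTest(text=text):
                with self.assertRaises(ValueError):
                    parse_quantity(text)
```

The class can be run with `python -m unittest` followed by the name of the module it is saved in. Using `subTest` means one failing input is reported on its own instead of hiding the remaining inputs in the same loop.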

Make modifications and refactor

The book recommends using TDD for making changes in legacy code. It is a way of validating your code as you go along with the implementation. A good recommendation on how to start this practice is How I managed to start practicing TDD.
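To make the rhythm concrete, here is a tiny sketch of one TDD cycle: the test is written first, then the simplest code that satisfies it. The `initials` function and its requirement are hypothetical examples, not something taken from the book.

```python
# Step 1 (red): write the test for the behavior we want before it exists.
# Calling this test at this point would fail, because initials() is not defined yet.
def test_initials_handles_extra_spaces():
    assert initials("ada lovelace") == "AL"
    assert initials("  grace   brewster  hopper ") == "GBH"


# Step 2 (green): write the simplest code that makes the test pass.
def initials(full_name):
    return "".join(word[0].upper() for word in full_name.split())


# Step 3 (refactor): with the test passing, clean up the code and rerun it.
test_initials_handles_extra_spaces()
```

In legacy code, the same cycle applies once the area is covered: add a failing test for the new behavior, make it pass, then refactor while the existing tests keep the old behavior in check.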

Conclusion

In summary, legacy code is a combination of situations that happen in a developer's daily life, but it's basically code without tests.

In this scenario, it is important to know when to refactor and when to rewrite code. And when refactoring, be aware of the current state of the functionality and the expected state, as well as prepare for verification scenarios to ensure that the changes were made correctly.

It's also important to know when to perform refactorings; they do not need to be extensive, but they need to be constant. When refactoring is left aside, the problems in the code will only increase over time, and you will lose the practice of solving them.

Additionally, it is important to verify whether the code you want to modify has test coverage before starting a refactoring process. Only then is it possible to ensure that nothing beyond the expected will be changed and that the system will work as expected.

If there isn't test coverage in the system or in the piece of code you want to modify, it's possible to add it by following The Legacy Code Change Algorithm.