Keeping code healthy with refactoring


Making the healthcare experience more human is our motto here at DocPlanner. As developers, we contribute to it by implementing useful features for doctors and patients, but we are also responsible for the health of our own code.

Photo by Markus Spiske on Unsplash

In this first article we would like to talk about our main tool to keep code healthy, clean, and in good shape to meet business expectations: refactoring. So, yes, this is an introductory piece for a series about what we like to call everyday refactoring.

Code health is much like our own health: you need to take care of it every day, developing good habits and avoiding obvious risks. When those habits lapse, software entropy starts to grow.

Software entropy is a term usually attributed to Ivar Jacobson, and it refers to a measure of the disorder of a software system. Software entropy increases every time a developer adds or modifies features on top of the existing ones. In large, monolithic applications, this can turn the code into a big ball of mud, where everything is tightly coupled in all sorts of unexpected and magical ways. Messy code is difficult to maintain, and it becomes harder and harder to add new features at the pace the business wants.

This happens for several reasons. We would like to stress two of them:

Code is not ready to be extended with ease, so we end up with tangled and confusing code each time we add a feature or modify an existing one.
We don’t continuously tidy up and fix those problems before and after working on our features.

This continuous code tidying is what we refer to as everyday refactoring.

What is refactoring

Refactoring is both the action and the effect of applying limited, focused changes to code without altering its behavior, while improving its structure and making it more readable, expressive and sustainable in the long term. You should use tests to guarantee that refactoring doesn’t break anything in the software, but you can also rely quite safely on some provable or safe refactors, especially when they are performed with the help of automated tools.

Learning from those who know

Refactoring is the opposite of redesigning or rewriting. Rewriting happens when we act on large chunks of code, even entire modules, in order to modify their design. This is, of course, considerably riskier, because we will have to change tests and a lot of files. Not only can it be unsafe, but it is also problematic from the business perspective, because we could end up breaking things or blocking the delivery of new features.

Why refactoring

The goal of refactoring is to progressively improve the way the code reflects domain knowledge. The better the code expresses the domain, the easier it is to add new features or improve the existing ones. So, the main reason to refactor continuously is economic.

Let’s refactor this mess

For example, imagine that you are browsing a file in a project, looking for the best place to introduce a change. Along the way you find a variable that represents a concept in the domain, but its poorly crafted name is imprecise or even misleading, and you don’t understand what it is doing there. After several minutes of reading and thinking, you figure out the meaning of that variable and its role in the code.

If you do nothing, another developer, who could be your future self, will need to spend a good amount of time deciphering that line again. However, if you rename the variable to something that expresses its intent more clearly, that future developer will have a better experience with that part of the code and will be in a better position to act on it.
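A minimal sketch of that kind of rename could look like this. The function, its parameters and the idea of filtering slot lengths are hypothetical, invented only for illustration; what matters is that the behavior stays identical while the names start telling the domain story.

```python
# Hypothetical example: every name here is an assumption made for illustration.

# Before: the reader has to guess what "a" and "d" stand for.
def calc(a, d):
    return [x for x in a if x >= d]


# After: same behavior, but the names express the domain concept.
def slots_long_enough_for(slot_lengths_in_minutes, visit_duration_in_minutes):
    return [
        length
        for length in slot_lengths_in_minutes
        if length >= visit_duration_in_minutes
    ]
```

Both functions return the same result for the same input, which is exactly what makes the change a refactoring and not a new feature.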

This kind of refactoring is an application of the Boy Scout Rule (also known as the campground rule), which says that you should leave the code in a better state than you found it.

When all developers in the team do this, the code evolves toward a better standard of quality at a faster and more sustainable pace. Also, the parts of the codebase that are touched most often, which tend to be the most business-critical ones, will receive the most care, while the parts that are less central to the business will receive less. You can find more about this approach in the outstanding book *The Nature of Software Development*, by Ron Jeffries.

We need to keep an eye on two general goals:

Reduce coupling. When refactoring, one of our objectives is to loosen coupling. So, a first step will be to isolate dependencies and minimize calls to them and, in the mid to long term, to invert and inject them.
Increase cohesion. This means keeping together the things that belong together and separating the things that do not. In other words, make each unit of software handle only one responsibility and break complex units into smaller ones.
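As a rough illustration of inverting and injecting a dependency, consider the sketch below. The classes `Notifier`, `AppointmentReminder` and `InMemoryNotifier` are hypothetical names made up for this example, not part of any real codebase. The reminder only knows about a small interface, so the concrete delivery mechanism can change without touching it, and each class keeps a single responsibility.

```python
from typing import Protocol


class Notifier(Protocol):
    """Hypothetical abstraction: anything able to deliver a message."""

    def send(self, recipient: str, message: str) -> None: ...


class AppointmentReminder:
    """Builds reminders; delivery is delegated to the injected notifier."""

    def __init__(self, notifier: Notifier) -> None:
        # The dependency is injected instead of being created here,
        # so this class is not coupled to email, SMS or any other channel.
        self._notifier = notifier

    def remind(self, patient_email: str, doctor_name: str) -> None:
        self._notifier.send(patient_email, f"Reminder: visit with {doctor_name}")


class InMemoryNotifier:
    """Test double that only records what would have been sent."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))
```

A side effect of this design is that the reminder becomes easy to test: we can pass the in-memory notifier and inspect what it recorded.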

When to refactor

To identify which parts of the code need refactoring, we can learn to detect code smells. Code smells are certain patterns in the software that make it difficult to understand or to modify. You can find catalogs of smells and how to identify them in places like Refactoring Guru or in the canonical book Refactoring, by Martin Fowler.

Surfing refactoring.guru

In general, you can identify refactoring opportunities in every place of the code where you feel that some idea is badly expressed. As a rule of thumb: if you need to read some block of code several times in order to understand what’s going on, you have found a refactoring candidate.

Being more systematic, there are four main moments for refactoring, according to Martin Fowler:

Litter-picking: we refactor when we see the need to clean or reduce litter in some piece of code: bad naming, unnecessary nested conditionals, complex expressions, lying comments, etc.
Comprehension: we refactor to gain understanding of a piece of code. We could change the code, or we could just add comments to document the new knowledge until we have a better opportunity to make changes. That’s what Eric Evans calls refactoring toward deeper insight.
Preparatory: we refactor before we start working on a feature, in order to ease its development or to improve its extensibility.
Planned refactoring: this matches the idea of redesigning, so it should be avoided unless we have a real need for it.
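To make the litter-picking case more concrete, here is a small sketch of one common cleanup: replacing unnecessary nested conditionals with guard clauses. The booking rule and the dictionary keys are assumptions invented for the example.

```python
# Hypothetical example: the booking rule and data shapes are assumptions.

# Before: three levels of nesting hide a very simple rule.
def can_book_nested(patient, slot):
    if patient is not None:
        if patient["active"]:
            if slot["free"]:
                return True
            else:
                return False
        else:
            return False
    else:
        return False


# After: guard clauses handle the negative cases first,
# leaving the main condition visible at the end.
def can_book(patient, slot):
    if patient is None:
        return False
    if not patient["active"]:
        return False
    return slot["free"]
```

This sketch assumes that `slot["free"]` always holds a boolean; under that assumption both versions answer the same for every input.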

How to refactor

The ideal situation is having tests that verify the behavior of the piece of code we want to refactor. This way, we can be confident that the structural changes we want to apply do not affect that behavior, and that they can be happily merged and deployed to production.

Eventually, you will need this

If we don’t have those tests, we should try to write some that describe the current behavior. There are several techniques to achieve this, and we will describe them in future articles. The general idea is to feed the code a batch of example inputs, record its responses as a baseline, and then assert against that baseline. Once we have this safety net, we can proceed with the refactoring.
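The baseline idea can be sketched roughly as follows. The pricing function, its rules and the sample inputs are all hypothetical; in a real project the baseline would usually be recorded once and stored (for example in a file) instead of being computed in the same run.

```python
# Hypothetical legacy code: the pricing rules are invented for this sketch.
def legacy_visit_price(duration_minutes, is_first_visit):
    price = duration_minutes * 2
    if is_first_visit:
        price = price + 15
    if price > 100:
        price = 100
    return price


# A batch of example inputs, chosen to exercise the different branches.
SAMPLE_INPUTS = [(15, False), (15, True), (30, False), (60, True)]


def record_baseline(function, inputs):
    """Run the current code and keep its answers as the expected behavior."""
    return {args: function(*args) for args in inputs}


BASELINE = record_baseline(legacy_visit_price, SAMPLE_INPUTS)

PRICE_PER_MINUTE = 2
FIRST_VISIT_SURCHARGE = 15
MAX_PRICE = 100


# The refactored version: same behavior, clearer structure.
def refactored_visit_price(duration_minutes, is_first_visit):
    base_price = duration_minutes * PRICE_PER_MINUTE
    surcharge = FIRST_VISIT_SURCHARGE if is_first_visit else 0
    return min(base_price + surcharge, MAX_PRICE)


def matches_baseline(function, baseline):
    """Check that a function reproduces every recorded answer."""
    return all(function(*args) == expected for args, expected in baseline.items())
```

If `matches_baseline(refactored_visit_price, BASELINE)` is true, the refactoring preserved the behavior for those examples. It proves nothing about inputs that are not in the sample, so the choice of examples matters.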

The less-than-ideal situation is when we cannot write any test for that piece of code, something that will surely happen in projects with badly managed dependencies, highly coupled components and low-cohesion code.

In this case, we can use several safe refactoring techniques, also called provable refactors because it can be proved that they won’t affect behavior. Many of them can be performed easily with the help of automated tools, so they are fast and easy to apply. To perform others, we need to follow a series of steps to make changes with safety.

What/where to refactor

We apply refactors to the pieces of code that we need to touch in order to develop a User Story or fix a bug. Refactoring should be part of the normal development process; it doesn’t need dedicated time. It is one of the tools we use in our work, along with testing.

OK. Let’s do it!

That refactor can be preparatory, improving things before we start working on the new feature. Or, after we introduce the new code, we can improve its quality and its extensibility for the future. But don’t confuse that with developing unsolicited features.

Some places are safer than others. For example, we can refactor the body of public functions and methods as long as we don’t change their signature. We can touch variable names and private methods, inline or extract variables from expressions, extract blocks of code to other functions or private methods, etc.
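As a sketch of these safe moves, the example below extracts variables and private helpers from the body of a public function. All names and the data shape are hypothetical. In a real refactoring the public function would keep its name and signature; here the refactored version has a different name only so that both can be shown side by side.

```python
# Hypothetical example: function names and the "visit" structure are assumptions.

# Before: one dense expression does everything.
def visit_summary(visit):
    return (
        visit["doctor"]["title"] + " " + visit["doctor"]["last_name"]
        + " at " + str(visit["hour"]).zfill(2) + ":" + str(visit["minute"]).zfill(2)
    )


# After: the body is split with extracted variables and private helpers.
# The parameters and the returned value are exactly the same as before.
def visit_summary_refactored(visit):
    doctor_name = _doctor_display_name(visit["doctor"])
    start_time = _format_time(visit["hour"], visit["minute"])
    return f"{doctor_name} at {start_time}"


def _doctor_display_name(doctor):
    return f"{doctor['title']} {doctor['last_name']}"


def _format_time(hour, minute):
    # Zero-padded HH:MM, matching what the original expression produced.
    return f"{hour:02d}:{minute:02d}"
```

Callers cannot notice the difference, because nothing they depend on has changed.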

When we need to change public interfaces or signatures, we can apply migration techniques to avoid backwards compatibility problems. One of them is to move the body of the method to a new method with the new signature, replace the old method’s body with a call to the new one, and then progressively replace calls to the old method with calls to the new one. This way, both methods can coexist during the migration.
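The migration described above could be sketched like this. The `book` and `book_visit` functions and the `BookingRequest` class are hypothetical, chosen only to show the shape of the technique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookingRequest:
    """Hypothetical parameter object used by the new signature."""

    doctor_id: int
    patient_id: int
    day: str
    hour: int


# New function with the new signature: the original body now lives here.
def book_visit(request: BookingRequest) -> str:
    return (
        f"doctor {request.doctor_id} / patient {request.patient_id} / "
        f"{request.day} {request.hour:02d}:00"
    )


# Old function: same signature as always, but it only delegates.
# It stays in place until every caller has moved to book_visit.
def book(doctor_id, patient_id, day, hour):
    return book_visit(BookingRequest(doctor_id, patient_id, day, hour))
```

Existing callers keep working while they are migrated one by one, and the old function can be deleted once nothing references it.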

Coming soon…

This article is a general introduction to refactoring, but it would be incomplete without some examples. Stay tuned for future instalments in which we will talk about some frequent refactoring opportunities and the techniques to work with them.