The refactoring guideline

#codequality #refactoring #softwaredevelopment

The software life cycle

Almost every piece of software starts with a nice plan and high hopes. Whoever started it thought that it would fulfil all future requirements. Then the first version gets implemented, and usually it already deviates from the original plan. Over time new needs come up: new features have to be added, some behaviour needs to be changed, some bugs must be fixed. These activities are usually done with an as-fast-as-possible attitude: no thinking about the future, the past or the architecture, just solving the problem in the fastest way. After a while you end up with code which basically does what it should, but its quality is horrible. It often manages memory in a strange way, which causes tricky bugs. The runtime is not optimal either. There’s a lot of code duplication and dead code, and it is really difficult to figure out what should be changed when new needs arrive. So even the implementation of a small feature can take quite long. And one more important thing: most developers hate to work with such bad quality code!

What is the meaning of refactoring?

Refactoring means changing the code in a way that improves its quality while its behaviour stays the same. So, for example, fixing bugs or making the behaviour more user friendly is not part of refactoring. Changes which increase the readability or modularity of the code, or which improve its runtime or memory usage, are part of refactoring.

What are the benefits of refactoring?

During refactoring you are always increasing the quality of your code; in other words, you are making your code better. Code quality is quite a complex topic, but I think it is mainly about the following points: stability, readability, modularity, maintainability, reliability, efficiency, security, testability and size. So during refactoring you address one or more of these points and improve your code with respect to the selected goals. Pay attention here: the different goals can contradict each other. For example, quite often you can reach a better runtime by using more memory.
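The runtime versus memory trade-off can be sketched with a small example. This is only an illustration: the Fibonacci functions are hypothetical and not taken from any real project. The second version is faster because it stores every result it has already computed, so it pays for the speed with memory.

```python
from functools import lru_cache


def fib_slow(n: int) -> int:
    """Plain recursion: stores no results, but the runtime grows exponentially."""
    return n if n < 2 else fib_slow(n - 1) + fib_slow(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n: int) -> int:
    """Same result, but every computed value is kept in an in-memory cache."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


# The behaviour is identical, only the resource usage differs.
assert fib_slow(20) == fib_cached(20) == 6765
```

Whether such a trade is worth it depends on your situation: a cache which grows without limit can itself become a problem.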

That is the theory. But let’s now look at the practice. As I mentioned in the introduction, the code base of a piece of software usually gets more and more complex over time, and meanwhile its quality gets worse and worse. Usually it becomes less readable, less maintainable and later on less reliable. That means bugs, bugs everywhere. Since the readability is bad, it takes long to find the root cause of the bugs, and since the maintainability is bad, it takes long to fix them. Due to the poor structure of the code it takes long to implement new features as well. So the project reaches a phase where a lot of people are working on it without big results. It looks as if they are doing nothing but keeping the code alive. It’s a waste of time and a waste of money. This is usually the time to refactor the code base and reach better readability, maintainability and stability. All other points depend on the situation: for example, if the program is running too slowly, runtime will of course be a focus as well.

The low level benefit is better code quality; the high level benefit is nothing else than money.

How to start with refactoring?

As I said, refactoring means changing the code without changing its behaviour. Before changing anything you need to know the current behaviour pretty well, and it should be documented in some way. The easiest solution is to write it down in a document. But a much more effective solution is to implement automated tests for your code wherever possible. In fact it is possible in almost every case, except for some really special ones. You need test cases which cover the whole functionality, are independent of the implementation details and pass on the current code base. After every change you can run them to check that they are still green.
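Such tests could look like the sketch below. The shipping_cost function is a made-up piece of legacy code, and its prices are assumptions chosen only for the example. The important part is that the tests call only the public function and check only its results, so they stay valid no matter how the inside is rewritten later.

```python
import unittest


def shipping_cost(weight_kg: float, express: bool) -> float:
    """Hypothetical legacy function which we would like to refactor later."""
    c = 0
    if weight_kg <= 1:
        c = 5
    else:
        if weight_kg <= 10:
            c = 5 + (weight_kg - 1) * 2
        else:
            c = 5 + 9 * 2 + (weight_kg - 10) * 1.5
    if express == True:  # deliberately clumsy, typical legacy style
        c = c * 2
    return c


class ShippingCostBehaviourTest(unittest.TestCase):
    """Pins down the current behaviour through the public function only."""

    def test_light_parcel_has_flat_price(self):
        self.assertEqual(shipping_cost(0.5, express=False), 5)

    def test_medium_parcel_is_charged_per_kilogram(self):
        self.assertEqual(shipping_cost(3, express=False), 9)

    def test_heavy_parcel_uses_reduced_rate(self):
        self.assertEqual(shipping_cost(12, express=False), 26)

    def test_express_doubles_the_price(self):
        self.assertEqual(shipping_cost(3, express=True), 18)
```

If this is saved in a file, the tests can be run with python -m unittest followed by the file name. They should be green before the first refactoring step and after every later one.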

Once this is done, it’s good to go down to the component level. Even if components are not present in the code yet, you should think in terms of components: what are the main functionalities of your program, how could these functionalities be categorised, what are the different levels of your architecture? All these points need to be considered. The components should have a clear responsibility and well-defined input and output interfaces.

Once you have figured out the main components, try to separate your existing code base into them and use the predefined interfaces between them. This can be challenging. You may need to split classes or functions as well. Always make sure that your code compiles and your automated tests pass.

Once your code is organised into components, you can write some component level automated tests using their interfaces. So later on, if a test fails, you will know which component is the problematic one.
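A minimal sketch of two components with a defined interface between them, plus a component level check, could look like this. All names here (OrderStorage, InMemoryOrderStorage, ReportGenerator, check_storage_component) are hypothetical and invented for the illustration.

```python
from abc import ABC, abstractmethod


class OrderStorage(ABC):
    """Interface of a hypothetical storage component."""

    @abstractmethod
    def save(self, order_id: str, amount: float) -> None:
        """Store the amount of an order, replacing any earlier value."""

    @abstractmethod
    def total(self) -> float:
        """Return the sum of all stored amounts."""


class InMemoryOrderStorage(OrderStorage):
    """One possible implementation of the storage component."""

    def __init__(self) -> None:
        self._orders = {}

    def save(self, order_id: str, amount: float) -> None:
        self._orders[order_id] = amount

    def total(self) -> float:
        return sum(self._orders.values())


class ReportGenerator:
    """A second component which depends only on the OrderStorage interface."""

    def __init__(self, storage: OrderStorage) -> None:
        self._storage = storage

    def summary(self) -> str:
        return f"Total revenue: {self._storage.total():.2f}"


def check_storage_component(storage: OrderStorage) -> None:
    """Component level test which uses nothing but the interface."""
    storage.save("A-1", 10.0)
    storage.save("A-2", 5.5)
    storage.save("A-1", 12.0)  # saving the same id again overwrites it
    assert storage.total() == 17.5


check_storage_component(InMemoryOrderStorage())
```

Because the check uses only the interface, the same function can be reused for any other storage implementation, and a failure points directly at the storage component.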

What should be refactored?

Refactoring can be done on different levels and with different purposes. Let’s start with the case when the main goal is better readability and maintainability.

I would suggest starting from the highest level. Check what your current components are. Do they have a clear single responsibility? Are their interfaces clear enough? Are there duplicated interfaces? Is the communication workflow as simple as possible? If the answer to any of these questions is no, you should start on this level. Split the overly complex components into multiple ones, remove the interfaces which are not needed and simplify the communication between your components.

As the next step, go through each component and check its classes: do they have a clear and simple responsibility? Do they have clear public interfaces? Are they connected to each other in an optimal way? If any of the answers is no, you should make changes on this level. Here you should always follow the so-called SOLID principles:

Single responsibility principle

Each of your classes should have a simple, clear responsibility.

Open/closed principle

Your code should be open for extension, but closed for modification.

Liskov substitution principle

This one is maybe a bit more complicated. It means that an object of a parent class can be replaced with an object of any of its child classes without breaking the functionality.

Interface segregation principle

Your interfaces should be small and clear, each with one well-specified purpose. So you should avoid broad interfaces which bundle many unrelated functions or return a lot of data that the caller does not need.

Dependency inversion principle

You should design your classes in a way that the classes they depend on can be set through setter functions or constructor parameters. That way you can change them to any subtype later on. For example, if you have a Logger class which is used to create log files, and you can change it through a setter function, you can switch your Logger between an XMLLogger, a JSONLogger or a SimpleLogger, as long as all these classes are derived from the same Logger base class. This is also helpful in unit testing, when you need to mock your dependencies.
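The Logger example could be sketched as follows. The class names Logger, XMLLogger, JSONLogger and SimpleLogger come from the text above; everything else (the OrderService class, the method names, and keeping the log lines in memory instead of writing a log file) is an assumption made to keep the sketch short and runnable.

```python
import json
from abc import ABC, abstractmethod
from xml.sax.saxutils import escape


class Logger(ABC):
    """Common base class. Assumption: lines are kept in memory, not in a file."""

    def __init__(self) -> None:
        self.lines = []

    def log(self, message: str) -> None:
        self.lines.append(self._format(message))

    @abstractmethod
    def _format(self, message: str) -> str:
        """Subclasses decide only how a message is rendered."""


class SimpleLogger(Logger):
    def _format(self, message: str) -> str:
        return message


class JSONLogger(Logger):
    def _format(self, message: str) -> str:
        return json.dumps({"message": message})


class XMLLogger(Logger):
    def _format(self, message: str) -> str:
        return f"<log>{escape(message)}</log>"


class OrderService:
    """Hypothetical class which knows only the Logger base class."""

    def __init__(self, logger: Logger) -> None:
        self._logger = logger  # dependency set through the constructor

    def set_logger(self, logger: Logger) -> None:
        self._logger = logger  # dependency can be swapped through a setter

    def place_order(self, item: str) -> None:
        self._logger.log(f"order placed: {item}")


service = OrderService(SimpleLogger())
service.place_order("book")

json_logger = JSONLogger()
service.set_logger(json_logger)  # no change needed inside OrderService
service.place_order("pen")
```

In a unit test the same setter or constructor parameter can be used to pass in a test double which is also derived from Logger.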

Once you are done with the refactoring of your classes, go one level down to the level of functions. At this step try to eliminate functions which have too many parameters. Try to split long functions into multiple ones. Try to eliminate code duplication. Make a clear distinction between functions which return a value without changing the state of the object and methods whose purpose is to change the state of the object. At this point make sure that your code is covered by unit tests. If not, cover it. Check whether your function names and variable names are clear enough, and rename them if needed. Define constants instead of magic numbers. This topic is too complex to address every point in one article.

Once you are done with the refactoring on this level, go one level back up to your classes. Is there any class now which should be split into multiple ones? If yes, do so. As the next step do the same on the component level. At this point you may see that it is a never ending story. There’s no perfect code. There’s worse code and better code. Your code can always be better, but never perfect. You need to find the right balance between refactoring and the implementation of new features.
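Some of the function level steps described above can be sketched with a small before and after example. The functions and all the numbers (prices in cents, a 20 percent tax, a shipping fee of 499) are hypothetical. The refactored version replaces the magic numbers with named constants, removes the duplicated calculation and splits the work into smaller functions, while returning exactly the same results.

```python
def total_before(prices_cents):
    """Before: magic numbers and a duplicated tax calculation."""
    t = 0
    for p in prices_cents:
        t = t + p
    if t >= 5000:
        return t * 120 // 100
    return t * 120 // 100 + 499


# After: named constants instead of magic numbers (values are assumptions).
FREE_SHIPPING_THRESHOLD_CENTS = 5000
SHIPPING_FEE_CENTS = 499
VAT_PERCENT = 20


def with_vat(amount_cents: int) -> int:
    """Add the tax, rounding down to whole cents as the old code did."""
    return amount_cents * (100 + VAT_PERCENT) // 100


def shipping_fee(amount_cents: int) -> int:
    """Shipping is free from the threshold upwards."""
    if amount_cents >= FREE_SHIPPING_THRESHOLD_CENTS:
        return 0
    return SHIPPING_FEE_CENTS


def total_after(prices_cents) -> int:
    """After: short function which only combines the smaller ones."""
    amount = sum(prices_cents)
    return with_vat(amount) + shipping_fee(amount)


# The behaviour must not change, so both versions have to agree.
for basket in ([], [1000, 2000], [4999], [5000], [2500, 2500, 1]):
    assert total_before(basket) == total_after(basket)
```

The loop at the end plays the role of the automated tests: it is what tells you that the change was really a refactoring and not a change of behaviour.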

If the goal of the refactoring is better performance, you need to analyse your code from another perspective. You should measure your runtime and identify which components and functions are the most critical from the runtime point of view. After finding them, there are two things you can do: find ways to optimise their algorithms and the way they communicate, or move some parts to multiple threads. Multithreading is again a complex topic, so be careful here and pay special attention to shared memory. And one more thing: creating a new thread does not always result in better performance!
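To illustrate the warning about shared memory, here is a minimal sketch in which several threads update the same counter. The SharedCounter class and the count_in_parallel function are hypothetical names. The lock makes sure that only one thread at a time changes the shared value; without it, updates from different threads could get lost.

```python
import threading


class SharedCounter:
    """A value shared between threads, protected by a lock."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Read, add and write back as one protected step.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def count_in_parallel(num_threads: int, increments_per_thread: int) -> int:
    """Let several threads increment one counter and return the final value."""
    counter = SharedCounter()

    def worker() -> None:
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every thread has finished
    return counter.value


assert count_in_parallel(4, 1000) == 4000
```

This sketch is only about correctness, not about speed. In the standard Python interpreter, threads doing pure computation like this usually do not run faster than a single thread, which fits the point that a new thread does not automatically mean better performance.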

Summary

As already mentioned, refactoring is a really complex topic, but it is needed almost everywhere. I tried to give you a rough overview, and I hope it helped you.

http://howtosurviveasaprogrammer.blogspot.com/