Definition of legacy: something transmitted by or received from an ancestor or predecessor or from the past
— Merriam Webster
The law of uphill analysis and downhill invention suggests that it always feels easier to create something from scratch than to build upon the work of others. In software, however, the real world consists of brownfield applications: old systems to which new parts need to be added. So understanding Legacy Code (defined here as code whose original authors are no longer available for questions) is a fundamental part of a software developer's life. This blog post is a collection of hints for coping with this inherently difficult task.
Have a positive attitude
Be patient and tough; someday this pain will be useful to you.
— Ovid
The same situation may be seen by one person as a burden, while another person sees it as an exciting challenge. It is the attitude that makes the difference.
Still, dealing with Legacy Code implies dealing with frustration. So it helps to apply the robustness principle not only to code but also on a personal level: "Be conservative in what you do, be liberal in what you accept from others."
Anyway, the more difficult things get, the more we can learn from them.
Simplify the problem
A model is a simplified representation of reality that leaves out everything that is not relevant to the problem at hand. This can make it easier to think about the problem and find a solution.
One way to create a model of a software system is to sketch it on scrap paper or a whiteboard. Find out the names of important things and write them down. Draw circles and rectangles around them. Draw lines and arrows between them. Use different colors.
Modeling notations like the Unified Modeling Language (UML), C4, Entity Relationship Modeling (ERM), Data Flow Diagrams, etc. can also be used. When the sketches get complex, it may be useful to switch to modeling tools like Visual Paradigm, Dia, or draw.io.
Improve the documentation
One of the best ways to learn something is to explain it to someone else. Furthermore, capturing in writing the things that were hard for you to understand may help both you and the people who come after you.
- Look at the HTML Sanity Checker Architecture Documentation for a good example.
- Create a README file for every sub-project.
- Create a glossary for the project's domain language.
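As a starting point for such a glossary, the identifiers in the code base can be mined for recurring words. The following is only a minimal sketch under stated assumptions: the function name `glossary_candidates` and the `min_length` threshold are made up for illustration, and the output is a list of candidates that still needs human judgment.

```python
import re
from collections import Counter

# Matches identifiers as used in most programming languages.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Splits camelCase, PascalCase and snake_case identifiers into words.
WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def glossary_candidates(source: str, min_length: int = 4) -> Counter:
    """Count the words that occur inside identifiers of the given source text.

    Hypothetical helper: words shorter than `min_length` are skipped
    because they are rarely domain terms.
    """
    words = Counter()
    for identifier in IDENTIFIER.findall(source):
        for word in WORD.findall(identifier):
            if len(word) >= min_length:
                words[word.lower()] += 1
    return words
```

Calling `glossary_candidates(text).most_common(30)` on the concatenated source files gives a first list to review. Language keywords such as `class` or `self` will show up as well, so a stop list is likely needed before the result is useful.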
Be a power user of your text editor
The developers of text editors (incl. IDEs) spend a lot of time and effort on making code reading efficient. So the investment in learning how to use your favorite editor's advanced features will most likely pay off.
Examples for IntelliJ IDEA:
Get an overview of the code base
As with reading books, different reading techniques can be applied to reading code. Set a time box (e.g. an hour) and skim the whole code base. Get a sense of the architectural patterns and the coding style.
The following strategies might be applied:
- Top down: Look at the API/UI first, then business processes, and use cases.
- Bottom up: Look at the database tables first, then work your way up to the use cases and business processes.
- Identify the hotspots, the areas with the highest complexity and most frequent changes.
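The hotspot strategy can be approximated with the version control history. The sketch below assumes a Git repository and uses the number of lines per file as a crude stand-in for complexity; the function names `read_git_log`, `count_changes`, and `rank_hotspots` are invented for this example.

```python
import subprocess
from collections import Counter


def read_git_log(repo_dir: str = ".") -> str:
    """Return the changed file paths of every commit, one path per line."""
    result = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


def count_changes(git_log_output: str) -> Counter:
    """Count how often each file path appears in the log output."""
    changes = Counter()
    for line in git_log_output.splitlines():
        path = line.strip()
        if path:  # blank lines separate commits
            changes[path] += 1
    return changes


def rank_hotspots(changes: Counter, line_counts: dict) -> list:
    """Rank files by change count times line count (assumed complexity proxy)."""
    scores = {path: n * line_counts.get(path, 0) for path, n in changes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For a quick look at change frequency alone, `count_changes(read_git_log()).most_common(10)` lists the ten most frequently changed files. Renamed or deleted files appear under their old paths, so the result is a hint about where to look, not a precise measurement.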
Understanding by refactoring
When tidying we are not here to fix the code, we are here to understand the code. — Kent Beck
When the code is hard to understand, it may get easier with an improved code structure. The techniques to get there are known as Refactoring: Rename variables, Extract Methods, etc. This helps to build a mental model of the application step-by-step.
If applying the changes to the main code base is too risky, they can be discarded once they have served their purpose of improving your understanding.
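A small, made-up example of what such a tidying step could look like: the first function stands for the code as found, the other two for the result after renaming variables and extracting a function. The meaning assigned to the names (price, quantity, loyalty discount) is an assumption for illustration; in real legacy code, finding out that meaning is exactly what the exercise is for.

```python
# Before: terse names, and one function doing two things.
def calc(d, f):
    t = 0
    for x in d:
        t += x["p"] * x["q"]
    if f:
        t = t - t * 0.1
    return t


# After: intention-revealing names and an extracted helper.
LOYALTY_DISCOUNT = 0.1


def subtotal(order_lines):
    """Sum of price times quantity over all order lines."""
    return sum(line["p"] * line["q"] for line in order_lines)


def order_total(order_lines, is_loyal_customer):
    """Subtotal, reduced by the discount for loyal customers."""
    total = subtotal(order_lines)
    if is_loyal_customer:
        total -= total * LOYALTY_DISCOUNT
    return total


# Cheap safety net: old and new version must agree on sample input.
sample = [{"p": 2, "q": 3}, {"p": 5, "q": 1}]
assert calc(sample, True) == order_total(sample, True)
assert calc(sample, False) == order_total(sample, False)
```

Comparing the old and the new version on a few sample inputs, as in the last lines, reduces the risk of changing behavior by accident, even if the refactored code is thrown away afterwards.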
Apply static analysis tools
There are probably tools available in your language's ecosystem that make understanding legacy code a bit easier. Here are some ideas for such tools:
- Code search engines, e.g. Sourcetrail and Sourcegraph.
- Code-as-a-city visualizers, e.g. CodeCharta.
- Write your own utilities with a scripting language, e.g. Python.
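As an illustration of the last idea, here is a minimal sketch of a home-made utility that lists which modules each file of a Python code base imports, using the `ast` module from the standard library. It assumes the legacy code is itself written in Python; the function names `imported_modules` and `dependency_map` are invented for this example.

```python
import ast
from pathlib import Path


def imported_modules(source: str) -> set[str]:
    """Return the top-level names of all modules imported by the source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            modules.add(node.module.split(".")[0])
    return modules


def dependency_map(root: str) -> dict[str, set[str]]:
    """Map every parseable .py file below `root` to the modules it imports."""
    result = {}
    for path in Path(root).rglob("*.py"):
        try:
            source = path.read_text(encoding="utf-8")
            result[str(path)] = imported_modules(source)
        except (SyntaxError, UnicodeDecodeError):
            continue  # e.g. files written for an older Python version
    return result
```

Printing the result of `dependency_map(".")` gives a rough picture of which files depend on which libraries, and it can serve as raw material for one of the sketches described earlier.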