Software systems are a collection of bits recorded on a storage device. It might seem that once a software application works, as long as the hardware it runs on is functional, it should just keep working for eternity. So, why is "software maintenance" even a thing? There is no physical wear and tear, and there are no materials that can rot or rust away. Software systems are not affected by the atmosphere, pollution, or the weather, and there isn't anything that can physically break. However, this argument doesn't take into consideration an essential aspect: the context.
While software applications do not have moving parts that are subject to physical wear and tear, they almost always depend on their working environment. With very few exceptions, as the working environment inevitably changes, the assumptions the application was built upon crumble. Eventually, the app stops working or loses its value.
Let me give you a few examples of context changes that can cause a software system to lose its value or break:
- A website that allows users to log in using Google authentication stops working when Google changes its authentication protocol. (External dependencies change)
- A weather prediction software system becomes outdated when the implemented mathematical weather models become outdated. (State-of-the-art knowledge change)
- The navigation software of a car becomes obsolete when more recent navigation software products released by competitors are significantly better. (Market expectation change)
- The software that manages bank accounts in a financial institution must be updated when the hardware it runs on breaks, and compatible hardware is no longer available. (Hardware change)
- An application stops working when any one of the third-party libraries it uses requires an update to a new version that is not backward compatible. (Software dependency change)
- An application needs to be modified to comply with new laws and regulations. (Compliance change)
- An application needs to be fixed because the definition of leap year changed. (External concepts and definitions change)
- An application needs to integrate with a new and popular third-party system. (Market environment change)
- Traffic is increasing on a website, and the website software needs to be rearchitected to be able to scale to the new traffic levels. (Usage patterns change)
So, in principle, it is true that software applications are not subject to physical wear and tear. However, non-physical processes cause software applications to age and expire if they are not rigorously maintained.
Maintenance of software systems comes in four different flavors: adaptive, corrective, perfective and preventive. Please note that I did not make up these names. If I did, I would have chosen something different. For example, I find the term "perfective" to be cringy, but I am going to stick with it for compatibility with the standard industry jargon.
A maintenance action is always a software change, but its classification depends on the reasons that triggered it. Let's take a deep dive into the four types of maintenance, and how to recognize them.
A corrective maintenance action is a software change that you make because:
- there is an acute issue that prevents the software from working as expected
- the problem is actively affecting users, or you suspect that it is.
For example, if you release a software system and your users run into a bug, a corrective maintenance action is required to fix it. Note that if users were never affected by the bug and you resolve it before anybody notices it, the maintenance action is preventive or adaptive. However, if even one single user might have been affected, then fixing the problem is a corrective maintenance action.
If you spend the majority of the time dealing with corrective maintenance tasks, your engineering team is running in firefighting mode. Pay attention to that situation, as something might be wrong with your testing practices, your ability to anticipate problems, the seniority and skill level of your team, the quality of your code or your engineering capacity.
Adaptive maintenance is something you do when you change a software system because:
- you have definite information that the landscape in which your software operates is changing (e.g., market, technology, laws)
- you have a good understanding of the direction and timeline of the landscape change
- your customers will be affected at some predictable point in the future, or they are affected now but have trivial and inexpensive workarounds
For example, let's say that you are running a website and you discover that it is not compatible with a new version of Safari that was pre-released by Apple in developer-only beta. Based on historical patterns, you know that Apple will release the incompatible version of Safari in the next few months. If you address the problem before the new version of Safari goes mainstream, then you are performing an adaptive maintenance task. That's because:
- Safari is changing
- you have a good estimate of when it will be released
- your customers are not using the developer version of Safari; if they are, they are probably aware of the risks and can switch to the generally available release.
If you wait for the new version of Safari to become generally available to the public, and your customers run into issues, then you have to take a corrective maintenance action. Knowingly delaying adaptive maintenance until it becomes necessary to perform corrective maintenance is costly and should always be avoided.
A very famous example of a massive worldwide adaptive maintenance rush was caused by the "Y2K bug" in the late 1990s. The calendar year was guaranteed to change from 1999 to 2000 on 1/1/2000. Many software systems were clearly not ready for it, but customers of those systems were not affected yet. At exactly midnight on January 1st, 2000, any software maker who had not invested in the adaptive maintenance work to fix the Y2K flaw was now dealing with a corrective maintenance emergency.
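To make the Y2K flaw concrete: many affected systems stored the year as two digits, so date arithmetic broke once the year rolled over from 99 to 00. The sketch below is only an illustration. The function names and the pivot value of 50 are assumptions picked for the example, and "windowing" is shown as one common style of fix, not the one any particular system used.

```python
def age_two_digit(birth_yy: int, current_yy: int) -> int:
    """Age computed directly from two-digit years (the flawed approach)."""
    return current_yy - birth_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Map a two-digit year to four digits around an assumed pivot.

    Years at or above the pivot are read as 19xx, the rest as 20xx.
    """
    return 1900 + yy if yy >= pivot else 2000 + yy


def age_windowed(birth_yy: int, current_yy: int, pivot: int = 50) -> int:
    """Age computed after expanding both years to four digits."""
    return expand_year(current_yy, pivot) - expand_year(birth_yy, pivot)


# Someone born in 1965, evaluated in 1999 and then in 2000.
before_rollover = age_two_digit(65, 99)  # 34, correct
after_rollover = age_two_digit(65, 0)    # -65, the Y2K flaw
fixed = age_windowed(65, 0)              # 35, correct again
```

Shipping something like `age_windowed` before the rollover would have been adaptive maintenance; shipping it after customers saw negative ages would have been corrective.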
You perform perfective maintenance when you change a software system because you decide to improve its value by improving something that is already working. Perfective maintenance includes things like speed optimizations, UI and usability improvements, etc. Perfective maintenance is often, but not always, initiated by customer feedback. Well-run software organizations listen carefully to their customers and invest in perfective maintenance tasks until just before they hit a point of diminishing returns.
Preventive maintenance is a software change you make to avoid potential (but not guaranteed) future issues.
The difference between adaptive and preventive maintenance can be fuzzy at times. In general, preventive maintenance is not bound to a trend that is sure to occur due to changing contextual conditions; instead, it is linked to an event that might or might not happen in the future. For example, if you are concerned about traffic spikes due to unforeseen and unpredictable future circumstances, and you decide to reinforce your software to deal with it, you are performing preventive maintenance. On the other hand, if your traffic growth patterns are somewhat predictable, and you have a good idea when your systems will start having scalability issues, any change you make to avoid those issues is adaptive maintenance.
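The criteria for the four types can be read as a small decision procedure. The following is a rough sketch of one possible reading, not a standard algorithm. The `Change` fields and the order of the checks are my own assumptions, and real cases will often be fuzzier than four booleans.

```python
from dataclasses import dataclass
from enum import Enum


class MaintenanceType(Enum):
    CORRECTIVE = "corrective"
    ADAPTIVE = "adaptive"
    PERFECTIVE = "perfective"
    PREVENTIVE = "preventive"


@dataclass
class Change:
    # All field names are hypothetical, chosen for this illustration.
    users_affected_now: bool = False        # users hit the issue, or you suspect they did
    trivial_workaround: bool = False        # affected users have a cheap workaround
    context_change_certain: bool = False    # a known landscape shift with a rough timeline
    addresses_possible_issue: bool = False  # guards against something that may never happen


def classify(change: Change) -> MaintenanceType:
    """Classify a change by the reason that triggered it."""
    # Users affected now is still adaptive only when a known context
    # change is the cause and the workaround is trivial.
    tolerable = change.context_change_certain and change.trivial_workaround
    if change.users_affected_now and not tolerable:
        return MaintenanceType.CORRECTIVE
    if change.context_change_certain:
        return MaintenanceType.ADAPTIVE
    if change.addresses_possible_issue:
        return MaintenanceType.PREVENTIVE
    # Nothing is broken or at risk: the change improves something that works.
    return MaintenanceType.PERFECTIVE


# Fixing a site for a Safari beta before it goes mainstream.
safari_fix = classify(Change(context_change_certain=True))
```

Here `safari_fix` comes out as `MaintenanceType.ADAPTIVE`. The same fix made after the release, with `users_affected_now=True` and no trivial workaround, would be classified as corrective.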
Every time that you make changes to a software system, it is essential to identify the type of maintenance that you are performing. Keeping track of it gives excellent insights into your engineering practices. For example, the percentage of time that you spend in each of the four maintenance types gives you an idea of the maturity and skill level of your software engineering organization. It can also give you an idea of the maturity level of your product and codebase.
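As a minimal sketch of what that tracking could look like, assume each piece of work is logged as a pair of maintenance type and hours spent. The log format, the function name and the numbers are all made up for illustration; in practice the data would come from whatever issue tracker or time-tracking tool you already use.

```python
from collections import defaultdict


def maintenance_share(entries):
    """Return the percentage of total hours spent on each maintenance type.

    `entries` is an iterable of (maintenance_type, hours) pairs.
    """
    totals = defaultdict(float)
    for kind, hours in entries:
        totals[kind] += hours
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # nothing logged yet
    return {kind: 100.0 * hours / grand_total for kind, hours in totals.items()}


# Hypothetical work log for one iteration.
work_log = [
    ("corrective", 30),
    ("adaptive", 20),
    ("perfective", 10),
    ("preventive", 10),
    ("corrective", 10),
]
shares = maintenance_share(work_log)
```

With this sample log, half of the logged hours are corrective, which is the kind of signal that would prompt a closer look at testing practices.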
How much time your engineering team should spend on each of the four types of maintenance depends on many factors. For example, immediately after a release, you should not be surprised to see a spike in corrective maintenance work. Despite your best attempts to release bug-free code, issues always crop up after a release. There are no hard and fast rules, but there are general considerations and principles to keep in mind.
- Software systems age. Maintenance is part of a software system's cost of ownership. You can somewhat keep it under control, but you cannot avoid it.
- As your software systems become bigger and more sophisticated, or as your customer base grows, the amount of maintenance required grows with them (though not necessarily linearly). Without growth (AKA, more people), engineering teams that keep adding new features to a software system, or supporting a growing customer base, sooner or later end up spending 100% of their time on maintenance tasks. I call that the "zero-growth" stage, and it is not a good place to be. How quickly you'll get there depends on many factors.
- Investing in preventive maintenance is like taking a bet. To win you have to bet and be right more often than you are wrong.
- Do not let issues go from "potential" to "predictable" to "acute." Most corrective maintenance tasks are very costly because they affect your brand. You can usually avoid them with less expensive adaptive or preventive maintenance work if you pay attention. Take bets, move quickly, don't ignore clear signals.
- If you are not working on any adaptive or preventive maintenance task, you are probably not paying attention to the context in which your software operates.
- There is a point of diminishing returns in perfective maintenance tasks. You can polish an apple only so much before it becomes a futile exercise.
- There are always going to be parts of a software system that you prefer not to touch. That is not necessarily a bad thing; however, you should watch out for how much time you spend working around problems in those areas.
- Sometimes it is not immediately obvious how to categorize maintenance work, as it seems to fall into multiple buckets. That's ok. Take your best shot. The more you do it, the easier it gets.