Oh no! You have to rewrite that huge legacy app! You already argued that you can fix the legacy code and you've sent the managers the Joel Spolsky "never rewrite" article. Management responds by pointing out that Basecamp and Visual Studio both had famously successful rewrites.
Do you update your CV or do you tuck in and get the job done?
Sometimes the rewrite is where you're at. Either you've made the decision or you were given the order. Maybe it's because your product is written in VB 6, or UniBasic, or Easytrieve (all of which I've programmed in) and you can't find developers. Or maybe the software is a slow, CPU-bound memory hog and Python or Perl just aren't great choices there.
Whatever your reasons, you need a clean strategy. A problem this hard can't be covered in a couple of paragraphs, but by the time we get to the end, I'll give you a fighting chance of not only making that rewrite succeed, but also avoiding exchanging one plate of spaghetti code for another. This isn't the only way to approach this problem, but I've seen this work and as an added bonus, you'll be developing a SOA (service-oriented architecture). If you're unsure why you should do that, read Steve Yegge's epic rant on the topic.
We start by giving a quick refresher on structured programming, along with vertical and horizontal layering of applications. If you're comfortable with that, you can jump straight to Part 2.
I learned to program in the early 80s, first with BASIC before moving on to 6809 assembler. Being self-taught, "structured programming" wasn't a concept I was familiar with and neither BASIC nor assembly prepared me for that. But moving on to C and later learning Warnier/Orr diagramming (pdf) taught me the basics of structured programming. We all learn this after a while. This module handles payments while that module handles orders and this is how you use subroutines, and so on.
Unfortunately, for some programmers, that's where their design experience stops. They're not concerned with separation of concerns. They see no problem with that huge subroutine that concatenates a bunch of strings, based on conditionals, to form SQL, and then returns an HTML table that will be concatenated with other HTML snippets to try to make a web page.
Eventually we start learning about the vertical layers that our application might have. For example, in a classic MVC pattern, you might have a view layer, which accepts JSON from a controller and renders it as HTML, all kept very separate from the main logic. The controller merely dispatches requests from the view to the business model layer and that layer gets its data from a lower-level data layer (often a database).
The layers have distinct boundaries around their responsibilities. Ignoring this approach makes it much harder to manage different parts of an application. For example, I frequently see people using ORMs and then embedding business logic in them (I've made this mistake more than once). Thus, when you try to change that logic, or change the data layer, you're often working on code with two sets of responsibilities, and it gets hard to untangle them.
A key point to layering is to remember that, for vertical layering, each layer can only talk to adjacent layers. The view can talk to a controller, but never the data layer. Keeping vertical layers separate helps to minimize spaghetti code.
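To make the "adjacent layers only" rule concrete, here is a minimal sketch in Python. Every name in it (ProductData, ProductModel, ProductController, render_product) is made up for illustration, and a dict stands in for the database. Treat it as one possible shape, not a prescribed design.

```python
import json
from html import escape


# Data layer: the only layer that knows how records are stored.
# A dict stands in for a real database (an assumption for this sketch).
class ProductData:
    def __init__(self):
        self._rows = {1: {"name": "Widget", "price_cents": 1250}}

    def fetch(self, product_id):
        return self._rows.get(product_id)


# Business model layer: talks only to the data layer below it.
class ProductModel:
    def __init__(self, data):
        self._data = data

    def describe(self, product_id):
        row = self._data.fetch(product_id)
        if row is None:
            return None
        return {"name": row["name"], "price": f"{row['price_cents'] / 100:.2f}"}


# Controller: dispatches the request to the model and hands JSON to the view.
class ProductController:
    def __init__(self, model):
        self._model = model

    def show(self, product_id):
        return json.dumps(self._model.describe(product_id))


# View layer: accepts JSON from the controller and renders HTML.
# It holds no reference to the model or the data layer.
def render_product(controller, product_id):
    product = json.loads(controller.show(product_id))
    if product is None:
        return "<p>Not found</p>"
    return f"<p>{escape(product['name'])}: {escape(product['price'])}</p>"


controller = ProductController(ProductModel(ProductData()))
print(render_product(controller, 1))
```

Because the view only ever sees the controller, you could swap the dict for a real database, or change how prices are formatted, without touching the rendering code.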
For an example of how bad layering can make your life miserable, you can read my Project 500 case study.
Horizontal layering is less well-known, but it can be a great tool.
Imagine you have a ShoppingCart class for a simple e-commerce web site. Is that useful? Not by itself. You probably need plenty of other classes to make it useful. You might need a Customer object, and Product objects, and Currency objects, and all sorts of other things (assuming you're going OO).
So let's step back and ask ourselves what kinds of things that site might need. This example is obviously not a huge system, but we're keeping things simple:
- Product search
- Shopping cart
For many monolithic sites, those are often all lumped together in one big code base, with their edges overlapping. But if you start to think of those as services, you could look at those as horizontal layers.
You might think that you only want horizontal layers to talk to adjacent layers, but this time it's different. You have a variety of services and they often need to share domain knowledge with each other. For example, if you also have a blog "service", when someone is searching for a product, you might show related blog entries. When reading the blog, you might be able to add an item it mentions directly to the shopping cart.
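Here's a rough sketch of that idea in Python, assuming three tiny in-memory services. The class names, method names, SKUs, and blog title are all invented for the example. The point is only that any service may call any other, but always through a public method and never by reaching into another service's internals.

```python
class ProductSearch:
    """Product search service. Callers see search(); the catalog stays private."""

    def __init__(self):
        self._catalog = {"sku-1": "Espresso machine", "sku-2": "Coffee grinder"}

    def search(self, term):
        term = term.lower()
        return [sku for sku, name in self._catalog.items() if term in name.lower()]


class ShoppingCart:
    """Shopping cart service. Other services may only add items or list them."""

    def __init__(self):
        self._items = []

    def add(self, sku):
        self._items.append(sku)

    def contents(self):
        return list(self._items)


class Blog:
    """Blog service. It uses the cart, but only through the cart's public methods."""

    def __init__(self, cart):
        self._cart = cart
        self._entries = {"Pulling a better shot": "sku-1"}  # title -> product

    def related_entries(self, sku):
        return [title for title, entry_sku in self._entries.items() if entry_sku == sku]

    def add_item_to_cart(self, title):
        self._cart.add(self._entries[title])


cart = ShoppingCart()
blog = Blog(cart)
search = ProductSearch()

# Searching for a product can surface related blog entries...
hits = search.search("espresso")
related = [title for sku in hits for title in blog.related_entries(sku)]

# ...and reading a blog entry can put its product straight into the cart.
blog.add_item_to_cart("Pulling a better shot")
print(related, cart.contents())
```

In a real SOA these would more likely be separate processes talking over HTTP or a message queue, but the boundary is the same: a small public interface with everything else hidden.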
What's important is that your layers are separated cleanly, with each preferably being a black box. If you have a monolithic, legacy codebase, there's a good chance layers aren't separated cleanly (or at all). But you need to at least understand the concepts.