This post was first published on CoderHood as “Only three numbers matter: zero, one, and more-than-one.” CoderHood is a blog dedicated to the human dimension of software engineering.
Three friends are thirsty. They decide to drink some water from a nearby fountain.
The first friend scoops water using his hands. He starts drinking without delay, well before the others have begun. However, it takes him a while to drink his fill, and the process is messy. When he is done he is all wet, and his hands are freezing.
The second one goes to buy one single sheet of paper; then he folds it into a paper cup. It takes a while to do it right, but when he is done he fills it with water and drinks. In the end, he is dry, his hands are warm, but the cup is soaked and ruined.
The third one buys a box of paper, then builds a machine that can quickly fold paper cups. It takes him the whole day to do it, but when he is done, he uses the first cup to drink. When the first cup starts falling apart, he makes more. Then, he invites everybody to join him, making more cups using his machine. Eventually, he starts a small business selling his cups to thirsty people.
This story illustrates a pattern often found in software development: when designing a solution, the “numbers” you need to pay attention to, at least at first, are zero, one, and more than one. Forget the rest for now.
If you want to test a sorting algorithm, you should test it with zero items, one item, and many items. Zero items is a particular case that usually causes problems. One item is also a special case that needs to be verified. After that, little difference exists, for example, between 3 and 5 items. Zero, one, and N, with N>1, are the three quantities that you need to pay attention to, at least from a functional standpoint.
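As a minimal sketch of that testing idea, the snippet below checks a sort against the three quantities. The `insertion_sort` function and the sample data are hypothetical stand-ins for whatever algorithm you are actually testing.

```python
def insertion_sort(items):
    """Return a new sorted list; a stand-in for the sort under test."""
    result = []
    for item in items:
        # Walk back from the end until the insertion point is found.
        pos = len(result)
        while pos > 0 and result[pos - 1] > item:
            pos -= 1
        result.insert(pos, item)
    return result


# The three quantities that matter functionally: zero, one, and more than one.
cases = {
    "zero": [],
    "one": [42],
    "many": [5, 3, 8, 1, 3],
}

for name, data in cases.items():
    # Compare against Python's built-in sort as the reference result.
    assert insertion_sort(data) == sorted(data), name
```

Adding a fourth case with, say, seven items would exercise the same code paths as the “many” case, which is the point of the rule.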
If you want to create an online editor, you could design it to support only one user, a limitation that would let you cut many corners. It would be easy to start that way, but difficult to change later to make it work for two or more users. However, if you design it for more than one user, you have a generic solution ready to scale. In this example, planning support for zero users is not compelling. Zero-user support is what you start with: no online editor at all.
If you want to design a computationally heavy application, you can design it to run as one single-threaded process. Things are easy that way, and you don’t need to worry about thread safety. If you want the application to be multithreaded, things get complicated quickly. However, designing an application to run on two threads or on 100 threads is equally involved. As in the previous example, the zero-thread case is what you start with: no application at all.
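To illustrate, here is a rough sketch in which the number of threads is just a parameter. The `sum_of_squares` function and its chunking scheme are made up for this example. Once the work is split so that no state is shared, the code is the same for 2 workers or 100. Note that in standard CPython the global interpreter lock keeps threads from speeding up CPU-bound work, so this shows the structure of the design rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(numbers, workers=1):
    """Sum n*n over numbers, splitting the work across `workers` threads."""
    # Each worker gets its own slice, so no state is shared between threads.
    chunks = [numbers[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(n * n for n in chunk), chunks)
        return sum(partials)


data = list(range(1000))
# The design effort went into going from 1 to N; the value of N is a detail.
assert sum_of_squares(data, workers=2) == sum_of_squares(data, workers=100)
```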
In general, when you have a particular problem you have three primary paths you can take for its resolution:
The first path is the “zero” solution. When you take this route, you choose not to do any work to design and build a solution for the problem. You just do nothing, or you do what is most convenient, without any particular investment in design, reusability, or innovation. This path has no direct cost and adds no value, but it carries a big hidden cost: dealing with the problem without a real solution is inefficient, disorganized, and unpredictable.
- If you need shelter, you find a cave.
- If you need to unscrew a screw, you try using your hands.
- If you need to find a number in a list of numbers, you do a linear search (the naive solution).
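For the last example, a naive linear search might look like the sketch below. The function name and the choice to return -1 when nothing is found are illustrative choices, not anything prescribed here.

```python
def linear_search(numbers, target):
    """Return the index of the first occurrence of target, or -1 if absent."""
    for index, value in enumerate(numbers):
        if value == target:
            return index
    return -1
```

It works and needs no design at all, but every lookup walks the list from the start.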
The second path is the “one” solution. When you take this route, you craft a narrow and specialized solution to the particular problem at hand. This is a low-cost and low-value approach. Building a dedicated solution for a particular issue does not require any significant modeling or design of an abstraction. While a solution designed for one single problem gets you moving quickly, the result won’t apply to similar problems in the future.
- If you are looking for shelter, you pile up whatever stuff you have around and create a temporary cover.
- If you need to unscrew a screw, you create a screwdriver that fits exactly the screw you have.
- If you need to find a number in a list of numbers, you implement a quicksort algorithm to sort the number set, and a binary search algorithm to locate the number.
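A sketch of that last example, written only for lists of numbers, could look like this. The function names are hypothetical, and the quicksort is a simple copying version chosen for readability rather than speed.

```python
def quicksort(numbers):
    """Return a sorted copy of a list of numbers (simple, not in-place)."""
    if len(numbers) <= 1:
        return list(numbers)
    pivot = numbers[len(numbers) // 2]
    smaller = [n for n in numbers if n < pivot]
    equal = [n for n in numbers if n == pivot]
    larger = [n for n in numbers if n > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def contains_number(sorted_numbers, target):
    """Binary search: report whether target is in an ascending list."""
    low, high = 0, len(sorted_numbers) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_numbers[mid] == target:
            return True
        if sorted_numbers[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return False


ordered = quicksort([5, 3, 8, 1, 3])
assert contains_number(ordered, 8)
assert not contains_number(ordered, 4)
```

This solves the problem at hand well, but nothing in it was designed with other kinds of data or other search strategies in mind.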
The third path is the “more than one” solution. When you go in this direction, you come up with a generic abstraction that describes and resolves an entire class of problems, including the particular issue at hand. This is a high-cost and high-value approach. Once you have this work done, you can apply it to anything similar, and build more value on top of it.
- If you are looking for shelter, you cut wood into dimensional lumber. Once you have dimensional lumber, you have the material to build anything you want, now, and in the future. You can even sell the timber, as it would be useful to others.
- If you need to unscrew a screw, you create a screwdriver with changeable heads. With this tool, you can deal with any screw that you’ll ever encounter.
- If you need to find a number in a list, you first abstract the concept of a list to any data type (numbers, strings, objects, etc.). Then, you abstract the idea of a search algorithm to take a generic list and find an item. For algorithms that require sets to be ordered, you abstract the concept of a sorting algorithm. At this point, you have all you need to implement any algorithm to resolve any variation of the original problem, with any data type.
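One possible shape for that generic search is sketched below, assuming items are compared through a caller-supplied `key` function. The `find` helper and its signature are hypothetical; it leans on Python's built-in `sorted` and the standard `bisect` module for the sorting and searching abstractions.

```python
from bisect import bisect_left
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")
K = TypeVar("K")


def find(items: Sequence[T], target_key: K,
         key: Callable[[T], K] = lambda item: item) -> Optional[T]:
    """Return an item whose key equals target_key, or None if there is none."""
    # Sorting abstraction: order items of any type by their key.
    ordered = sorted(items, key=key)
    keys = [key(item) for item in ordered]
    # Search abstraction: binary search over the ordered keys.
    pos = bisect_left(keys, target_key)
    if pos < len(keys) and keys[pos] == target_key:
        return ordered[pos]
    return None


# The same function now handles numbers, strings, and objects.
assert find([5, 3, 8], 8) == 8
assert find(["pear", "apple"], "apple") == "apple"
assert find([{"id": 2}, {"id": 1}], 1, key=lambda d: d["id"]) == {"id": 1}
```

The extra cost shows up in the type parameters and the `key` indirection; the payoff is that new data types need no new search code.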
When you are faced with designing a solution to a problem, this principle will remind you to ask if you need to model an abstraction for the problem or parts of it.
Designing something to deal with N possible variations of a concept is more expensive than designing it for only 1. However, if you design something for 1 variation only and then you realize you need to support N, changing your solution to do so can be orders of magnitude more expensive.
My recommendation is to go with the “more than one” solution if you can afford it and if you have a hunch that you will need such an abstraction at some point in the future.
If you go with the “one” solution, do it deliberately, not accidentally. You need to know the compromise you are making, and the ramifications it will have in the future.
If you go with the zero solution, you are approaching the problem like a cowboy shooting from the hip, and you are adding to a spaghetti ball of challenges that will likely cost you later. Do this only if you know what you are doing.
In the future, I will write about various aspects of the “more than one” solutions, and how their cost grows with the size of N. For example, think of Twitter. Building a system that performs the essential Twitter functions is straightforward if you are designing it for a few users. However, if you intend it to work for millions of users using it all day, things get dramatically more complicated. Scalability is the single most expensive requirement in the “more than one” range of solutions.
Stay tuned for more on this topic!