The physical world and the rules that regulate it are complicated and very difficult to grasp. To make sense of it, we --- humans --- attempt to explain it with mental models. Models are logical constructions defined with terms that we created and understand. We use those models as proxies for reality in an attempt to grasp and predict its complicated and multifaceted rules.
Models are not an engineering-only concept; they are fundamental to everyday life and are natural simplifications of the world that we experience. We build models for everything, every day. For example, the memory we have of someone we met is a model of that person that we constructed in our mind. It is a description that is simple enough to store away in our memory, and sophisticated enough to be nuanced.
Given the limitations of our brains, we create logical models of many aspects of reality. For example, we have models for objects based on their most prominent characteristics. A very efficient model of a rock is the word "rock" itself.
A more nuanced model could be a simple shape, which gives us more information on what the rock looks like.
But that is still very rough. It is hard to tell if that's a rock, or an egg or something else. Anything with that shape can be described as an oval, which makes the model unable to capture the distinctive characteristics of a rock. A more nuanced model could be visualized as a drawing of it.
The drawing is not much more detailed than the oval, but it carries information to remind us that a rock does not have smooth contours, and shows some of the imperfections of the surface. Still, due to my poor artistic skills, that drawing could represent many different things.
A picture of a round rock is a better and more detailed model, which includes color, texture and the surrounding context. Also, a photograph has the advantage of being captured by a machine without the filter of human interpretation.
However, this model falls short of being a full description of the rock. For example, it is two dimensional and shows only a particular side. It does not contain any information about the weight, three-dimensional characteristics, internal composition, position in space, etc.
A full description of the rock at a particular moment in time would require mapping every one of its atoms and their relative positions in space. Given that the stone changes over time, a complete description would require such a map for every instant of the rock's existence. That would require defining when the rock started existing, and the description would be complete only when the rock is no longer recognizable as a rock.
That is clearly an absurd model. Such a description would be far too complex to use with our limited mental processing power. In other words, it would be useless for most purposes.
The terms "model" and "abstraction" are often overloaded; it is important to clarify their definition in the context of this document. Warning: the following definitions are more formal than I'd like them to be; however, I find it necessary to define them precisely to avoid confusion.
- Domain: a specified sphere of concepts, objects, activities or knowledge.
  - For example, "building" is a domain composed of all possible buildings.
- Model: a set of distinctive characteristics of a subset of a domain, including their constraints. The constraints are necessary, in a well-defined model, to limit the number of variations to a finite set.
  - For example, a house model could contain a set of distinctive characteristics of a house: things like color (one of the 16,777,216 24-bit RGB values), size (integers from 50 ft² to 10,000 ft²), number of rooms (from 1 to 10).
  - The subset of the building domain covered by the house model is the set of all houses that can be described by it. In the example, that is every possible combination of color, size and number of rooms: 16,777,216 × 9,951 × 10 = 1,669,500,764,160 possible houses.
- Instance of a model: one particular variation that can be described by a model.
  - For example, an instance of the house model could be a house that is red (RGB #FF1010), measures 4,000 ft² and has 6 rooms.
- Explosion of a model: the set of all the possible instances of a model.
  - In the house model example, all the 1,669,500,764,160 possible houses.
- Abstraction: model A is called an abstraction of model B, and B is a model abstracted by A, if the explosion of A is a superset of the explosion of B.
  - For example, a house is an abstraction of a studio apartment. A studio apartment is a model in the house abstraction.
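The definitions above can be made concrete with a small sketch. The Python below is an illustration only: `HouseModel`, `explosion_size`, `contains`, and `is_abstraction_of` are hypothetical names, and the studio-apartment constraints (one room, at most 600 square feet) are assumptions made up for this example. Both endpoints of every range are counted as allowed values, so the size constraint contributes 9,951 values. The superset check compares constraint ranges instead of enumerating every instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HouseModel:
    """A model: distinctive characteristics plus their constraints."""
    color: range   # allowed 24-bit RGB values
    size: range    # allowed sizes in whole square feet
    rooms: range   # allowed number of rooms

    def explosion_size(self) -> int:
        # The explosion is every combination of allowed values.
        return len(self.color) * len(self.size) * len(self.rooms)

    def contains(self, color: int, size: int, rooms: int) -> bool:
        # An instance is one variation that satisfies every constraint.
        return color in self.color and size in self.size and rooms in self.rooms

    def is_abstraction_of(self, other: "HouseModel") -> bool:
        # A is an abstraction of B when A's explosion is a superset of B's.
        # Assumes non-empty, step-1 ranges, so checking both ends is enough.
        pairs = [(self.color, other.color),
                 (self.size, other.size),
                 (self.rooms, other.rooms)]
        return all(b[0] in a and b[-1] in a for a, b in pairs)


# Python ranges exclude the stop value, hence 10_001 and 11.
house = HouseModel(color=range(0, 2**24), size=range(50, 10_001), rooms=range(1, 11))
# Hypothetical studio apartment: any color, 50 to 600 square feet, exactly 1 room.
studio = HouseModel(color=range(0, 2**24), size=range(50, 601), rooms=range(1, 2))

print(house.explosion_size())             # 1669500764160
print(house.contains(0xFF1010, 4000, 6))  # True
print(house.is_abstraction_of(studio))    # True
print(studio.is_abstraction_of(house))    # False
```

Comparing constraints works here because every characteristic is an independent range; a model with dependent constraints would need a different check.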
Carbon-Based Lifeform is an abstraction for many models, such as Plant, Animal, Microorganism, etc. Animal is a model that can describe a set of distinctive characteristics of a, well, animal. One of the domains of Animal is the Carbon-Based Lifeform domain. Animal is also an abstraction of Dog, and Dog is a model in the Animal abstraction. Dog is also an abstraction of the German Shepherd model. A particular German Shepherd named Cody is an instance of the German Shepherd model.
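The same chain can be sketched with class inheritance, which is one common (though not the only) way to express abstraction in code. The class names mirror the paragraph above; the `name` attribute and the `describe` method are assumptions added for illustration.

```python
class CarbonBasedLifeform:
    """The most general abstraction in this example."""


class Animal(CarbonBasedLifeform):
    """Animal is a model in the Carbon-Based Lifeform abstraction."""

    def __init__(self, name: str) -> None:
        self.name = name

    def describe(self) -> str:
        return f"{self.name} is a {type(self).__name__}"


class Dog(Animal):
    """Dog is a model in the Animal abstraction."""


class GermanShepherd(Dog):
    """German Shepherd is a model in the Dog abstraction."""


# Cody is an instance of the German Shepherd model.
cody = GermanShepherd("Cody")

print(cody.describe())                        # Cody is a GermanShepherd
print(isinstance(cody, Dog))                  # True
print(isinstance(cody, CarbonBasedLifeform))  # True
print(issubclass(Animal, GermanShepherd))     # False
```

Every `GermanShepherd` is also a `Dog`, an `Animal`, and a `CarbonBasedLifeform`, which matches the superset relation in the definition of abstraction; the reverse does not hold.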
If you made it this far, the rest is much more comfortable. I promise.
The level of abstraction that you should choose when modeling something depends on what you know about the context in which you are operating, and how things might change in the future.
For example, if you wanted to create software to manage the participants of a "Best in Show" dog competition, modeling a dog would be sufficient. The chance that a dog-centric competition will expand to include other animals is so low that abstracting to all animals is not something you need to plan for.
On the other hand, if you were writing software to manage the guests of a pet shelter, Dog is probably not the right abstraction, even if all the current guests are, indeed, dogs. You need to look a step ahead. Soon the shelter might host cats, birds, lizards, and who knows what else. In this case, Animal is probably a better abstraction.
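A minimal sketch of the shelter case, assuming a hypothetical `Shelter` class and made-up fields (`name`, `species`, `breed`): because guests are modeled as `Animal`, the first cat can be admitted without changing the shelter code.

```python
from dataclasses import dataclass, field


@dataclass
class Animal:
    name: str
    species: str


@dataclass
class Dog(Animal):
    # A Dog is an Animal with a fixed species and one dog-specific detail.
    species: str = "dog"
    breed: str = "unknown"


@dataclass
class Shelter:
    """Guests are modeled as Animal, not Dog."""
    guests: list[Animal] = field(default_factory=list)

    def admit(self, guest: Animal) -> None:
        self.guests.append(guest)

    def species_present(self) -> set[str]:
        return {guest.species for guest in self.guests}


shelter = Shelter()
shelter.admit(Dog("Rex", breed="beagle"))
shelter.admit(Animal("Tom", species="cat"))  # no change to Shelter needed

print(sorted(shelter.species_present()))     # ['cat', 'dog']
```

For the "Best in Show" software, the equivalent registry could reasonably hold `Dog` directly and require `breed`, since nothing suggests it will ever need other species.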
In software development, choosing the right abstraction can be tricky. If you make it too simple, it won't let you create a model to satisfy even the immediate requirements. If you restrict it to the immediate needs, you might have to change it almost immediately to implement the next iteration of the model. However, if you make your abstraction too generic and all-encompassing, modeling solutions might get so complicated that you'll go out of business before you are finished.
In my experience, the level of abstraction that engineers choose is determined, for better or worse, by:
- Their skills and experience.
- Their personality.
- Data available about the domain.
- Time and resources.
- Accidental or deliberate assumptions.
In the rest of this post, I use a concrete example and an analogy to visualize what it means to choose an abstraction.
Imagine that a customer asks you to build a calculator application (like a pocket calculator); something that would satisfy the needs of grade-school students.
The customer has a high-level ideal in mind. They don't know it, and can't clearly describe all the details, but if they took the time to document the ideal model, it would look like this:
However, instead of a design document, the customer invites you for lunch and gives you a verbal description that you try to jot down as best as you can; as a result, your "spec" is a sketch on the back of a coffee-stained napkin.
They tell you that they want to get something out soon, and it doesn't have to be perfect. You accept the challenge.
Now you need to figure out everything else. The data you have, in this case, is a vague idea of what the goal is. Your rough sketch of the calculator gives you some information about the "reality" that you need to model, but not a complete description. You have more questions than answers.
As a developer, you need to use the information you have to create the proper abstraction. A useful abstraction is one that allows implementing a simple model to satisfy the initial needs and gives space to refine that model until it approximates the ideal.
To recap, your domain is the universe of Calculators. Your goal is to define and build the ideal model; however, you don't know what that looks like. What you have is a vague sketch of a model scribbled on the back of a napkin.
To draw an analogy, imagine that the ideal model is this black ink splatter:
The splatter represents the functionality, features, and behaviors of the perfect solution. The ideal solution is a model that uses all aspects of an abstraction to satisfy all the needs, present and future, of the problem that you are trying to solve.
The jagged edges of the splatter represent the complications and details of the ideal abstraction. The shape is complex and has many aspects that are difficult to model correctly.
A first simple approximation could be an application that emulates a calculator that looks like this:
That is a start, but it covers only a small part of the ideal target. The picture below shows that small part in the context of the black ink splatter analogy; visualized in blue:
The customer is not going to be satisfied with it. The model is too simple, and it covers only a few of the features of the ideal.
A second attempt could expand on that idea, but in the process, you might make some choices that are a deviation from the ideal concept; for example, you could choose to add some new buttons and a usage workflow that is not quite what the customer wants. The result might look like the following model:
This model covers more of the original black ink splatter compared to the over-simplification, but it also expands outside of the boundaries of the ideal.
Overall, this model is incomplete and a little off.
After several iterations, a close study of the original sketch, and lots of conversations with the customer, you could come up with a better approximation of the ideal model; it is not perfect, but it captures most of the needed characteristics. It misses some of the buttons and functions, but it adds a larger screen which the customer likes.
This approximation models the ideal solution much more closely than the previous iterations. Visualized in blue on top of the black ink splatter, this model's coverage of the perfect abstraction looks like this:
Over-simplification is not acceptable, but over-engineering and over-abstraction are even more dangerous. You fall into that trap when you make things much more complicated than the ideal solution, adding unnecessary cost and missing the spirit of the original target.
As a result, if you were asked to create a calculator, you might end up with a super-computer.
Which, in the ink analogy, covers the entire black splatter and a chunk of the observable universe around it.
In this exaggerated example, the model grew much larger than the ideal abstraction, meaning that you over-abstracted the problem.
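One way to make the ink-splatter picture concrete is to treat the ideal and each candidate model as sets of features. This is only a sketch of the analogy: the feature names are invented for illustration (the customer's real list is not given here), and `coverage` and `overshoot` are hypothetical helper names.

```python
# Hypothetical feature list standing in for the black ink splatter.
IDEAL = {"digits", "add", "subtract", "multiply",
         "divide", "percent", "memory", "clear"}


def coverage(model: set[str], ideal: set[str] = IDEAL) -> float:
    """Fraction of the ideal (black splatter) that the model (blue area) covers."""
    return len(model & ideal) / len(ideal)


def overshoot(model: set[str], ideal: set[str] = IDEAL) -> set[str]:
    """Features that fall outside the splatter: work nobody asked for."""
    return model - ideal


candidates = {
    "too simple": {"digits", "add"},
    "a little off": {"digits", "add", "subtract", "multiply", "unit_conversion"},
    "supercomputer": IDEAL | {"graphing", "matrix_algebra", "weather_simulation"},
}

for name, model in candidates.items():
    print(f"{name}: coverage={coverage(model):.0%}, "
          f"overshoot={sorted(overshoot(model))}")
```

With these made-up sets, the supercomputer reaches full coverage but also has the largest overshoot, which is why coverage alone is not a sufficient goal.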
When you are presented with a problem, you need to think carefully about the right level of abstraction. The goal is to create one that:
- Can model an initial solution to satisfy the absolute minimum requirements.
- Can be refined into new models to resolve known foreseeable future needs at an acceptable cost.
- Does not attempt to solve problems that you'll never need to address.
To evaluate an abstraction and choose the initial model, you need to have two lists:
- List of the minimum requirements; this list is required to determine the proper initial model.
- List of all foreseeable future needs; this list is needed to choose the appropriate abstraction.
When evaluating an abstraction and an initial model based on it, ask yourself the following questions:
- Does the model satisfy the minimum list of requirements? If not, the model is too simple and needs to be improved. That is, you need to expand the initial blue area on the black ink splatter.
- Does the abstraction allow for an eventual expansion of the model to cover the foreseeable future needs at a reasonable cost? If not, you might need to expand your abstraction. That is, you may need to enlarge the area that the black ink splatter covers in the domain.
- Is the model solving many problems that are in neither of the two lists? If so, you might have created an over-abstraction. Consider reducing it. Over-abstraction increases long-term costs.
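The three questions can be written down as a small checklist function. This is a sketch under assumptions: requirements, needs, and capabilities are represented as plain sets of feature names, and `evaluate`, `model`, and `abstraction` are hypothetical names. Here `abstraction` stands for everything the design could be extended to cover at a reasonable cost.

```python
def evaluate(model: set[str], abstraction: set[str],
             minimum: set[str], future: set[str]) -> dict[str, set[str]]:
    """Report findings for a model/abstraction pair; empty sets mean no findings."""
    return {
        # Question 1: minimum requirements the initial model does not satisfy.
        "missing_minimum": minimum - model,
        # Question 2: foreseeable needs the abstraction cannot grow to cover.
        "unreachable_future": future - abstraction,
        # Question 3: capabilities in neither list, a hint of over-abstraction.
        "unneeded": (model | abstraction) - (minimum | future),
    }


# Made-up lists for a calculator project.
minimum = {"digits", "add", "subtract"}
future = {"multiply", "divide", "percent"}

report = evaluate(
    model={"digits", "add"},
    abstraction={"digits", "add", "subtract", "multiply", "divide", "graphing"},
    minimum=minimum,
    future=future,
)

for finding, features in report.items():
    print(finding, sorted(features))
```

In this made-up case the model is too simple (it lacks `subtract`), the abstraction cannot reach `percent`, and `graphing` is in neither list. Deciding whether an extension is affordable is still a judgment call that the sets cannot capture.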
Choosing a proper level of abstraction is both a science and an art that you learn with experience. I hope that this dissertation will help you visualize the challenge and give you some additional tools to improve your model abstraction skills.