Expensive abstractions

hvihvi
Updated on ・8 min read

The wrong abstraction topic, as described by Sandi Metz, deserves more attention than it currently gets.

Duplication is far cheaper than the wrong abstraction.

The scenario is pretty simple:

  1. developer A sees duplication -> he enforces an abstraction
  2. developer B tries to use this abstraction, but it is missing something for his use-case, so he modifies the abstraction
  3. time passes, and step (2) repeats multiple times
  4. you join the party and take a look at the code: it's a total mess

To fix it, the initial abstraction needs to be undone.
However, the longer we wait, the harder this abstraction is to undo. This is why it is important to detect wrong abstractions, remove them early, and know when to avoid a hasty abstraction.

An example : The Abstract Comparison Page

Our company compares all kinds of products. Comparison pages all look somewhat the same: a filterable list of items (like most flight/hotel comparison websites). We are in the process of rewriting them from scratch with React.

Our first idea as DRY-obsessed developers was to abstract this in a reusable <FilterableItems/> component.
Its API has 3 inputs:

  • items : the list of items to render
  • renderItem : takes an item as input and decides how to render it: item -> render
  • renderFilters : takes a callback to set a predicate as input and decides how to render the filters: setPredicate -> render

The reusable component handles the merge of all predicates to filter and displays our items.

Simple enough. The code is concise, inversion of control makes it very modular and we can easily reason about it in a functional way.
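
To make the API above more concrete, here is a rough, framework-free sketch of the idea. It is not our actual component: "rendering" is faked with strings so it runs without React, and createFilterableItems, the flights data and the name argument of setPredicate are all assumptions made up for the illustration.

```javascript
// Hypothetical, framework-free sketch of the <FilterableItems/> idea.
// "Rendering" is simulated with strings so the example runs without React.
function createFilterableItems({ items, renderItem, renderFilters }) {
  // One predicate per filter name (the name key is an assumption of this sketch).
  // An item is displayed only if every registered predicate accepts it.
  const predicates = new Map();
  const setPredicate = (name, predicate) => predicates.set(name, predicate);

  return {
    // The caller decides what the filters look like; it only receives setPredicate.
    filters: renderFilters(setPredicate),
    render() {
      const visible = items.filter((item) =>
        [...predicates.values()].every((accepts) => accepts(item))
      );
      return visible.map(renderItem).join("\n");
    },
  };
}

const flights = [
  { to: "Paris", price: 120 },
  { to: "Rome", price: 80 },
  { to: "Paris", price: 300 },
];

const page = createFilterableItems({
  items: flights,
  renderItem: (flight) => `${flight.to}: ${flight.price} EUR`,
  renderFilters: (setPredicate) => ({
    onlyTo: (city) => setPredicate("to", (flight) => flight.to === city),
    maxPrice: (max) => setPredicate("price", (flight) => flight.price <= max),
  }),
});

page.filters.onlyTo("Paris");
page.filters.maxPrice(200);
console.log(page.render()); // Paris: 120 EUR
```

Note that the filtered list only exists inside render(): nothing outside the component can see it, which is exactly where the trouble starts.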

Unfortunately, sometimes filters are on top of the list, and sometimes on the side... We decide to add a "Layout" input to our API so that it can be defined if necessary.
It's a bit confusing to have this separation of layout and actual render but some call it separation of concerns. After all this is what we do when we segregate HTML and CSS...

Then, we notice that the parent component needs to be aware of the inner state of its children, to count displayed items. We could work our way around this issue with some kind of callback...

Luckily, all these new requirements came up quickly so we got to observe the cost of this abstraction early on.
This early attempt at inversion of control went wrong very quickly.
Instead of forcing it in, it is best to give up on this abstraction.

After we inline every usage, the code is much more flexible and straightforward. It uses React composition and state management along with javascript's map/filter/reduce, with no abstraction on top of it. We are now free to lift the state up to share it where it needs to be used. There is going to be a bit of duplication, and this is fine.
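
As a rough idea of what one inlined page can look like, here is a minimal sketch with plain data and strings instead of React components. The hotels data, the state shape and the output format are hypothetical.

```javascript
// Hypothetical inlined version of one comparison page: no shared component.
const hotels = [
  { name: "Astoria", stars: 4, price: 140 },
  { name: "Budget Inn", stars: 2, price: 60 },
  { name: "Central", stars: 4, price: 210 },
];

// The filter state lives in the page itself, so anything on the page can read it.
const state = { minStars: 3, maxPrice: 200 };

const visibleHotels = hotels
  .filter((hotel) => hotel.stars >= state.minStars)
  .filter((hotel) => hotel.price <= state.maxPrice);

// The displayed count needs no callback: the page already owns the filtered list.
const header = `${visibleHotels.length} of ${hotels.length} hotels`;

// reduce works directly on the same list, here to find the cheapest visible price.
const cheapest = visibleHotels.reduce(
  (min, hotel) => Math.min(min, hotel.price),
  Infinity
);

const rows = visibleHotels.map(
  (hotel) => `${hotel.name} (${hotel.stars} stars): ${hotel.price} EUR`
);

console.log([header, ...rows].join("\n"));
```

Another page would repeat the filter/map lines with its own fields. That is the bit of duplication we accept.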

The cost of the wrong abstraction

Our codebase's history has seen a few costly abstractions...
What looked like a smart abstraction at first became a struggle for teams, to the point where no one had the courage to remove it. Whenever a new requirement came up, devs had to explain that "we can't implement it in a decent amount of time, because we'd have to make modifications to the framework, which affects the whole codebase...".
This is the path our little example would probably have taken if we had not reverted it, given its large scope and the intent to use it for every comparison page.

We use duplication and DRY as our primary motivation for creating abstractions, but other fancy keywords like Separation of Concerns and Consistency are also a source of hasty abstractions.

Abstraction generation

It seems that developers go through different phases of abstraction generation as they grow up:

  1. Doesn't care about creating abstractions or duplication
  2. Wants to remove duplication at all cost, enforces many abstractions
  3. Avoids early abstraction

Going from (1) to (2) is pretty straightforward: there is a ton of educational content that will push you in this direction.
Creating a cool abstraction provides satisfaction and you can show it off.
Knowing when not to do it takes more maturity. Unfortunately it isn't as flashy and it might go unnoticed. You might even be blamed for duplication, which is why education on the subject matters.

Detection

There are probably too many ways to detect abstractions that have gone wrong to list them all, but I'll try to name a few of them:

Huge API surface

The surface is how much an API exposes to its clients.
I strongly recommend Sebastian Markbåge's talk on the topic.
When an API's surface is too large, the abstraction under it becomes hard to use. You need to be aware of too many factors to use the API.
A symptom is having too many ifs and parameters... They create a lot of complexity for the user as they need to be aware of inner mechanisms.
Would you rather have a <Button> component and pick its colour and shape, or have a <PrimaryButton> and a <SecondaryButton>?
Most people go straight to the first one because it minimises duplication, but the second version is actually more flexible in the long run.
It's a matter of finding the right balance here and splitting things when necessary. A single Button that holds the logic for every button in your design system becomes a huge pain point when it grows too big.
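
A small sketch of the two options, again with strings standing in for real components. The class names and props are invented for the example.

```javascript
// Option 1: one configurable Button. Every new variant adds a parameter or a branch,
// and every caller has to know which combinations make sense.
function Button({ label, color = "grey", shape = "square", outlined = false }) {
  const classes = ["btn", `btn-${color}`, `btn-${shape}`];
  if (outlined) classes.push("btn-outlined");
  return `<button class="${classes.join(" ")}">${label}</button>`;
}

// Option 2: small dedicated components with almost nothing to configure.
// They duplicate a line of markup, but each one can change on its own.
function PrimaryButton({ label }) {
  return `<button class="btn-primary">${label}</button>`;
}

function SecondaryButton({ label }) {
  return `<button class="btn-secondary">${label}</button>`;
}

console.log(Button({ label: "Save", color: "blue" }));
console.log(PrimaryButton({ label: "Save" }));
console.log(SecondaryButton({ label: "Cancel" }));
```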

Inheritance

Use Composition over Inheritance (aka: avoid the extends keyword).
Unfortunately, OOP languages provide tools that are easily misused and create more problems than they solve, which is probably why quite a few programming figures try to stay away from them (Linus Torvalds, creator of Linux and Git, Joe Armstrong, creator of Erlang...), and why newer languages like Go don't implement any extends feature.
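
Here is a minimal sketch of the difference, with hypothetical BasePage / HotelPage classes on one side and plain functions on the other.

```javascript
// Inheritance version: HotelPage is tied to everything BasePage does.
// Changing sort() or render() for one page means touching the shared parent.
class BasePage {
  constructor(items) {
    this.items = items;
  }
  sort() {
    return [...this.items].sort((a, b) => a.price - b.price);
  }
  render() {
    return this.sort().map((item) => item.name).join(", ");
  }
}
class HotelPage extends BasePage {}

// Composition version: small functions combined at the call site.
// A page that needs another order simply composes with a different function.
const sortByPrice = (items) => [...items].sort((a, b) => a.price - b.price);
const renderNames = (items) => items.map((item) => item.name).join(", ");
const renderHotelPage = (items) => renderNames(sortByPrice(items));

const sampleHotels = [
  { name: "Astoria", price: 140 },
  { name: "Budget Inn", price: 60 },
];

console.log(new HotelPage(sampleHotels).render()); // Budget Inn, Astoria
console.log(renderHotelPage(sampleHotels)); // Budget Inn, Astoria
```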

Logic in your tests

For example, a test that loops over every "Car" instance in your codebase and checks that it has 4 wheels. Maybe a Car should have 4 wheels by design, so that the dev using a Car doesn't have to add the right number of wheels himself. Logic resides in code: you probably don't want to be testing your tests.
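
One possible way to put that rule in the code rather than in a test, assuming a hypothetical createCar factory:

```javascript
// Hypothetical Car factory: the wheel count is decided here, once, by design.
const WHEELS_PER_CAR = 4;

function createCar(model) {
  const wheels = Array.from({ length: WHEELS_PER_CAR }, (_, index) => ({
    position: index,
  }));
  // Freezing prevents callers from adding or removing wheels afterwards.
  return Object.freeze({ model, wheels: Object.freeze(wheels) });
}

const car = createCar("hatchback");
console.log(car.wheels.length); // 4
```

With this, a single small test on createCar is enough, and no test has to loop over every Car in the codebase.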

Unnecessary indirection

It's a bit irritating when you have to navigate a lot for simple things.
For example, extracting random impure functions because your IDE has a shortcut to do so, or extracting single-usage strings into constants, creates unnecessary navigation overhead.

All-in-one functions

For example, if you need to map from type A to D going A -> B -> C -> D, instead of extracting a single function mapAtoBtoCtoD that handles too many transformations, it might be easier to understand successive maps (mapAtoB, mapBtoC, mapCtoD) and compose with them. That way you can inline the trivial ones for minimal navigation overhead.
It's a matter of finding the right balance. It is like someone asking for directions in the street:

  • Lost person: "How do I get to the Eiffel Tower?"
  • Person A: "Take first left, then second right"
  • Person B: "Go to Eiffel Tower"
  • Person C: "Place your left foot in front of your right foot, then..."
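
Coming back to the A -> B -> C -> D mapping, here is what the successive maps could look like. The four "types" are made up for the example: A is a raw line, B its fields, C a parsed offer and D a display label.

```javascript
// Hypothetical types: A = raw line, B = fields, C = parsed offer, D = display label.
const mapAtoB = (line) => line.split(";");
const mapBtoC = ([name, price]) => ({ name, price: Number(price) });
const mapCtoD = (offer) => `${offer.name}: ${offer.price.toFixed(2)} EUR`;

const lines = ["Basic;9.5", "Premium;19"];

// Each step reads on its own, and the chain shows the whole route at a glance.
const labels = lines.map(mapAtoB).map(mapBtoC).map(mapCtoD);

console.log(labels.join(" | ")); // Basic: 9.50 EUR | Premium: 19.00 EUR
```

A step as trivial as mapAtoB could just as well be inlined in the chain, which is the Person A level of detail.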

Excessive Separation of Concerns

Like DRY, Separation of Concerns tends to justify painful architectures.
A metaphor can help demonstrate the absurdity of excessive separation of concerns:

Tidying up your socks in a "left" box and a "right" box separates concerns but it doesn't help you get dressed faster in the morning.

When parts that live together are segregated, navigation overhead happens and mental models become difficult.
For example if you use the component abstraction, having tests, js, css, html and assets for a single component in separate dedicated subtrees becomes a navigation nightmare as they are bound to change together.
CSS-in-JS libraries or JSX (HTML-in-JS) try to break these concern walls to make developers' lives easier.

Excessive Consistency

The socks metaphor also applies to "consistency". You could be consistent in ways that slow you down. It is preferable to stay open to trying alternatives.
For a while, our company used solely GWT as its framework of choice for all of its legacy products.
Choosing exploration over consistency has opened a path toward alternatives such as React, for faster and cheaper delivery. Like duplication, it's a matter of finding the right balance.

The list goes on, but the common pattern here is that when the abstraction is enforced, code becomes harder to use, navigate, understand and therefore change.

Removing an abstraction

Removing an abstraction is actually pretty straightforward but it might take a while.

  • Avoid reuse : It's best not to keep using an abstraction that has gone out of control and explore new options instead
  • Take small steps : Always commit minimal steps with a meaningful message, that way when you break something you can go back in time easily
  • Inline : Inline each usage, maybe one at a time. If it's a function, inline it; if it's inheritance, flatten it into a single class; lift up nested ifs, etc.
  • Cleanup : After an inline you're left with raw code, so you can now hunt for unused variables, unused if branches...
  • Rewrite : The idea here is to not make the same mistake. If the code is still a mess, go with simple functional programming principles, such as enforcing immutability, isolating side-effects from pure functions...
  • Repeat : Rollback when it goes wrong, fight the temptation to build a new abstraction right away, the good ones take time to figure out.
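
A tiny illustration of the Inline and Cleanup steps, assuming a hypothetical formatPrice helper whose flags grew with every new use-case:

```javascript
// Before: one shared helper with a flag for each caller's special need.
function formatPrice(amount, { rounded = false, currency = null } = {}) {
  const value = rounded ? Math.round(amount) : amount;
  let text = String(value);
  if (currency) text = `${text} ${currency}`;
  return text;
}

// After: the helper is inlined at each call site, then the branches that a given
// call site never takes are deleted.
// was: formatPrice(amount, { rounded: true, currency: "EUR" })
const cartLine = (amount) => `${Math.round(amount)} EUR`;
// was: formatPrice(amount)
const exportCell = (amount) => String(amount);

console.log(cartLine(12.6)); // 13 EUR
console.log(exportCell(12.6)); // 12.6
```

Once every call site is inlined this way, formatPrice has no callers left and can be deleted in its own small commit.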

Producing better abstractions

As a disclaimer, let's remember that not all abstractions are harmful. Most of them were actually useful before they went out of control.

From my experience, applying functional programming principles has proven helpful in creating better abstractions, probably thanks to being backed by mathematical models and not "religious" principles like DRY, SoC, TDD, consistency, OOP patterns...

Let's look at a few abstractions produced by functional paradigms that we use on an everyday basis and that help us abstract quite a bit of complexity at different scales.

  • Pure Functions : GIVEN these inputs, WHEN you apply a function, THEN you get this output. Pretty trivial for our minds, no side effects.
  • Immutability : This thing has this value. End of the story. No mental stack storing mutations and local states.
  • Optionals, Promises, streams... : Monads. The name doesn't really matter, what matters here is that we can map, which nicely abstracts all the inner complexity. These abstractions are very powerful, to the point that most modern languages turn them into operators. For example async/await for Promises, or ?. for Optionals. I doubt we'll ever want an abstractfactory operator, but I could be wrong...
  • The Component abstraction : All the current web frameworks (React, Vue, Angular) use the component abstraction. As its name suggests, it enables reuse via composition.
  • Dependency Injection : The most popular OOP pattern is for the most part an attempt at enabling functional composition through instances and methods, in languages where functions are not first-class. It alleviates the burden of instance management.
  • Redux : Isolates state, avoids mutations, extracts pure functions, makes your app predictable... It's a bit of a learning curve, but when you get it, thinking the redux way is a real time-saver.
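
A few of the items above, sketched in plain JavaScript. The names are made up, and the reducer is hand-written in the Redux style rather than using the redux library itself.

```javascript
// Pure function: GIVEN the same inputs, THEN the same output, no side effects.
const applyDiscount = (price, percent) => price - (price * percent) / 100;

// Immutability: build a new value instead of mutating the old one.
const offer = Object.freeze({ name: "Basic", price: 100 });
const discounted = { ...offer, price: applyDiscount(offer.price, 20) };

// Optional chaining (?.): the "is it there?" checks are abstracted by the operator.
const user = { profile: null };
const city = user.profile?.address?.city ?? "unknown";

// Redux-style reducer: state is isolated and every transition is a pure function.
function counterReducer(state, action) {
  switch (action.type) {
    case "increment":
      return { ...state, count: state.count + 1 };
    default:
      return state;
  }
}

const actions = [{ type: "increment" }, { type: "increment" }, { type: "noop" }];
const finalState = actions.reduce(counterReducer, { count: 0 });

console.log(discounted.price); // 80
console.log(offer.price); // 100
console.log(city); // unknown
console.log(finalState.count); // 2
```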

Conclusion

Thanks for reading this far, I hope all these examples open new perspectives for developers looking into producing better code.
Sometimes, no abstraction is good enough.
