As developers, we often hear cliched phrases tossed around like "Don't Repeat Yourself". We take ideas like this and run with them, sometimes a bit too far.
Stepping back and evaluating why we do these things is helpful. So today, let's look at an alternative ideology to DRY programming.
Don't Repeat Yourself (DRY) programming, defined
DRY is defined (according to Wikipedia) as:
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
Some of this might get a bit pedantic, but that can be helpful when considering something like this. Let's break down the parts of the phrasing there.
Every piece
What is "every piece"? Can we never repeat a variable name? An HTML entity?
Ok, ok. So we can repeat <div>
's without much issue and I don't think anyone will take offense at it. But this does bring up the question - when do we decide something has become a "piece of knowledge"? In React, a good example might be a component - but is that PrimaryButton
and SecondaryButton
or does it mean a generalized Button
class? The answer is generally considered to be "Whatever your organization chooses", but this can still leave a good bit of ambiguity around what we choose to abstract.
knowledge
This is another ambiguous point - what do we define as knowledge? Consider a styled button element using some atomic classes and React. If it takes a senior dev 10 seconds to create , they may not consider that knowledge worth abstracting. But to a more junior developer who doesn't know the system well, that knowledge could be a good abstraction. Otherwise, they might have to hunt down the classes, remind themselves of how buttons work, and figure out the syntax for an onClick
. Knowledge is relative and using it in a definition adds ambiguity.
Update: Xander left the following comment below. I think that article does a great job of explaining what "knowledge" should mean.
Just wanted to leave this here for people who are interested.
"DRY is about Knowledge, Code duplication is not the issue."
verraes.net/2014/08/dry-is-about-k...
single, unambiguous, authoritative representation
A "single" representation leaves a lot to be desired. From the view of a devops engineer, a single representation might be an entire application they need to deploy. To a frontend dev, that might be a component. And to a backend dev, that might be a method on a class or an API endpoint. Where does the line get drawn?
We also have the word "unambiguous" - but as I've just pointed out, the rest of this sentence defines more ambiguity. "Authoritative" makes sense - your DRY code should define exactly what it does and be true to that definition. However, that isn't explicitly confined to DRY code.
system
Finally, we have the world "system" - this gets back to the "single" statement we discussed a second ago. What is a "system"? In React, it might be a component or a Redux action/component/reducer. In containerized software, we could be talking about a whole pod or just a single instance.
At the end of the day, DRY all to often promotes pre-optimization, which is unnecessary and sometimes actually hurts your ability to write code. Sometimes it is more difficult to modify an abstracted component to fit a specific use case. You add a lot of complexity or you break that component out into something new - which isn't super DRY. You can't know every use case for your component on day one.
An alternative - Write Everything Twice (WET) programming
Instead, I propose WET programming. To me the definition would be:
You can ask yourself "Haven't I written this before?" two times, but never three.
With this definition the focus moves away from premature optimization and instead allows you to repeat similar code a couple times. It also shifts the focus to a more gut reaction. It allows you to make decisions based on the exact use case you are looking at. If you are building a web app, you probably want to abstract your buttons into a component, because you are going to be using a lot of them. But if there is a single page that has some special styling (maybe a pricing page?), then you don't need to worry too much about abstracting out the components on that page. In fact, under this system, if you needed a new page that was similar to that special page, you could just copy/paste and change the code you need. However, at the moment that that happens a third time, its time to spend a bit of time abstracting out the parts that can be abstracted.
I would also add this stipulation (to both WET and DRY programming):
You must comment your abstractions
Anytime you abstract something out you are reordering the map of your application. If you aren't commenting to discuss your reasons for abstracting, you are doing a disservice to your team (and your future self!).
What do you think? Does this track with how you develop?
Oldest comments (93)
Yes.
So summarizing: only make an abstraction for something if you can remove three repetitions.
I remember this as "Generalize at n = 3"
N = 2 leads to N >= 3 most of the time. I get the point and I also don't abstract out things immediately, but rather than follow hard rules, we can say that ... I don't know it seems like 99% of code is going to be used either once, or more than 2 times. Something to get used exactly 2 times would be very rare.
Also in general programming, abstraction isn't the only way to avoid repetition.
And the main reason for DRY is to not have to change the code in 7 places when something changes.
So you kinda have to know which things logically belong together. Very often that's things that share the exact same code, but sometimes it isn't.
A good reason to wait until you have three or more repetitions is that it helps you prevent writing abstractions that take ten arguments.
*You may think "oh, this code here repeated here, and here" so you write an abstraction. Then you add a third variation, and realize you need a height argument, so you redefine it with a default of
None
. Then repetition four requires a colour. And before you know it, your abstraction ends up with 10 different branches and becomes 🍝.On the other hand, if you do write it in different places, you'll see what the differences are, and maybe abstract it into 3 didn't functions that only do one thing, so you don't end up with abstractions that you end up refactoring anyways.
(*Impersonal you - not saying you're guilty of that, as it gets easier to predict with experience, but good habit to develop early on).
This is very well put. There are many ways to abstract something and you don't know what the best abstraction is going to be until you have a large enough sample of sub-problems to be able to make an informed decision.
Choosing the wrong abstraction is costly and leads to complexity bloat because developers have a natural tendency to keep adding new abstractions on top instead of refactoring or deleting existing ones.
Also, every time you invent a new abstraction to reduce repetition, you introduce one more concept that people have to learn in order to make sense of your code and it adds up; especially if the abstractions are contrived technical concepts and not strongly rooted in an underlying business domain.
This only works if you have excellent communication between your devs. Otherwise if you have three devs, each recreating the wheel three times, that's nine repetitions.
It also depends on what the item is... if you are writing a way to find the next recurring date/time using an RRULE (RFC 5545), then you should definitely write that only once and have everyone use only that one. Otherwise one iteration will work if a daylight savings boundary is crossed, and the next two iterations will not, as an example...
Fabulously articulated Conlin.
Thanks Ben!
So spot on. Thank you for this.
Some devs go to crazy lengths to not write something twice. It's honestly a little silly :).
This! And then you have to deal with their crazy abstraction, because they think method names should be the only real “comment”
Sorry, anecdotal example / rant 😛
No need to apologize. You are not alone. :-)
Good post! I personally prefer to repeat as needed better than one complex code with several conditions just to accommodate all known use cases. Repeat makes code easier to maintain and understand.
If I understand you correctly you’re saying: if I abstract away code, but in that code I have to cater for a couple of different use cases the code inside the abstraction becomes too complex, right?
There’s two answers:
1: it could be that the code you’re trying to deduplicate is not similar enough, in that case don’t deduplicate.
2: there are ways (patterns) to make the code inside your abstractions less complex (basically: less conditionals). For a nice illustrated example of this: youtu.be/8bZh5LMaSmE
This is usually a sign that your abstractions are weak and you should consider refactoring the code around what's repeated, too. Often it means you have too little abstraction and too much specialization in your functions/classes/etc.
Though as Niels points out, you might be trying to de-duplicated code that's not truly duplicated. Always question the code first, but if you can't come up with a better abstraction, it may be best to leave it for another time. I do agree that a little duplication is better than an over-complicated abstraction. The goal is to consolidate the knowledge and authority to one implementation as much as is reasonable.
I agree! question the code first before abstracting code :)
Excellent backronym!
Thank you for putting in words something I try to preach within my team!
I work on more than one web site at a time, all using the same base in-house library toolkit. For each of these sites, they all take a slightly different approach to solving various problems, sometimes related or sometimes not. Once two independent solutions emerge that look like they may be related, the two are pitted against one-another, and then merged together into a single solution, and then pushed into the core toolkit for the rest of the systems to have available as well. This method of parallel development has created significantly better software, because it forces multiple approaches to problems, rather than just a single approach.
The philosophy has been mostly the same as you suggest though, WET rather than DRY. If something is done once? Cool. If it is done twice? Odds are each one will have different advantages and disadvantages. When it comes to doing it a third time, the first two are merged into a single piece of code along with the new requirements, and pushed to the core library.
Doing this, I now have an extremely robust data processing system, user authentication system, path router, HTML templating system, and more all ready to go for future projects.
You're basically building your own framework out of your own business requirements, sweet! :)
Yup, that's exactly it! And the entire framework is full open source BSD licenced on my GitHub account. The main issue is lack of documentation. This has been one of my main focuses this year and into next.
Good luck, after all that's how both Django and Rails started. They were frameworks used internally in companies
In 2019 we plan on experimentally standing up more instances of the underlying platform that powers dev.to (Emphasis on experimentally)
I'm really excited to see what could come out of extracting this very high-level, highly opinionated framework for building online community spaces.
I've also heard this referred to as the rule of three.
I believe this is referred to as the rule of three.
I think WET applies more from a UX perspective, but if you are doing programming in c# or java or any object oriented language, DRY principle applies.
This is a problem with your abstraction and not DRY. Another commenter mentions a similar problem with having lots of conditionals in their abstractions. That's another symptom of the sale problem, the wrong abstraction.
Also, I'd be concerned about your rubric, "haven't I written this before?" That really only applies when you are the only developer in the code. Hopefully, code review would catch this. I think you have to be a little more conscientious about this in shared code base since no one developer is likely to know all the versions of the same code there are floating around.
But, your point is still valid. Nice summary.
One of the arguments for doing it this way is that having three examples to abstract from is more likely to give a generally useful abstraction than the first two. It's not a hard and fast rule though - sometimes when you need the second instance, it's obvious there will be more later, sometimes you find several instances later that the abstraction could be improved and go back to the first few instances to refactor them.
And trying to force a specific use case that doesn't really fit to the wrong abstraction because otherwise you might repeat some bits of it is a symptom of the same problem. Sometimes it means your abstraction would benefit from having parts refactored out separately, sometimes it's better just to have a comment saying "this looks like , but that's not a good fit because ". But if you find the same comment being written three times, rethink the refactor....
This is a super valid point! It's a concern with DRY as well, since if multiple developers don't know the full codebase, they are likely to abstract something multiple times. To me this all comes down to making sure you have well commented code and making team communication easy. Code Reviews also help you catch these problems before they hit the main branch.
I'm for M.O.I.S.T myself.
Make Organization Inconsistent and Stiff Teamwork ?
Not even close.
DRY it's from OOD principle and those principles they are the most powerful design we use to enhance back-end, your suggestion is not valid for this side of programming, it may apply on UI/UX.
While I do think there is more need for DRY principals on the backend, there are many functional languages that do not use OOP at all. Even within OOP languages like Java, premature abstraction can hurt you. Abstracting two classes into a higher order class could cost you if the abstraction is too generalized.
OOD is not OOP first, OOD is general design patterns for software development, you can use OOD in many language that they are not oriented object i.e php...
For abstraction you may right, but also OOD handle that in general matter.
The rule of three that it's similar to your suggestion can apply also with DRY, it seems opposition but DRY doesn't really means DON'T REPEAT THE CODE Like many understand, DRY is DON'T REPEAT YOUR SELF in intelligent manner so when low level piece of code have same sense in one module create abstraction for it.
A slight correlary, in my mind, is “you’re going to throw away the first one”. Now, what constitutes that one differs like you outlined, but I think this principle serves to reinforce the “write everything twice” - if you’re working on the MVP, no matter how much duplication you have in it, you’re going to be rewriting it later, so save it for later. (But still apply WET in the second round! The Second Version Will Be Perfect is a notorious tar pit.)
Just wanted to leave this here for people who are interested.
"DRY is about Knowledge, Code duplication is not the issue."
verraes.net/2014/08/dry-is-about-k...
This is a great link! Thanks so much for sharing. Gonna update the post to add it!
Finally some sense in an article and comment thread filled with anti-sense.
You don't "abstract" to avoid duplicating code, you abstract to give names to bits of code so you can think about them easier; for example you don't make functions to reuse them, you make functions to give a name to that operations so you'll think about "create_basket" instead of whatever the basket code does, and work with that abstraction. You don't even need to have that function called twice, once is enough.
Reusing inapropriately code previously abstracted in an object or function might be bad, but "don't abstract unless you write it twice or three times" is worse.
This is true, but you have to be willing to spend some time making sure you came up with the right name. Which is of course one of the hardest things in programming ;).
True, but many behaviours cannot be boiled down to a three word function name without causing even more confusion and indirection. Sometimes the raw code itself tells the story best. You don't want to force the reader to jump around between different files or parts of the code to figure out how the code accomplishes a certain simple task.
Maybe if the author of the function knew how to find the perfect words to express the exact behavior of the function in a way that everyone could intuitively understand, that would be great, but this is not reality. Human psychology is not so simple - There will always be people who will intuitively misunderstand no matter how careful and precise you are in your choice of terminology; and in these cases, your abstraction will cause indirection and confusion.
Very often, the author uses the wrong sequence of words which have multiple interpretations and this only serves to confuse and misdirect the reader. Writing the correct abstraction requires familiarity with the sub-problem and this familiarity can only be achieved through being exposed to the same sub-problem multiple times, more than 2 times, sometimes even more times.
Also, with a sample size of 2, you cannot always know what kind of abstraction you'll end up needing... Maybe as your system evolves, these 2 currently similar behaviors will diverge rather than converge and sharing an abstraction between them won't make sense; it will mislead the next developer down the wrong path.