First of all, let me apologize for the clickbait-y title, but we need to talk about the Don’t Repeat Yourself (DRY) principle.
A few years ago we were asked to help with further development of an existing platform in the blockchain industry. It was a product aimed at traders that helped them automate tax filings. And it was the biggest copy&paste-based project I have seen in my life.
It was written in PHP. If you know PHP then you know that each .php file acts as an entry point to your code - and from there you usually import (include) utilities and libraries which are common to your project. That is, if you don’t use a framework. This one didn’t.
This project was created from the ground up by its founder who learned programming while creating and maintaining this product.
They didn’t have much programming knowledge beforehand and didn’t know all of these good practices and fancy rules that we have. So each .php file was a copy of a previous one with the modifications needed for the given route/view. Tons of repeated code.
And here’s what’s really important: at this point the product already made millions of dollars.
Let that sink in. No amount of copied code, repetition and what we would call “bad quality code” stopped the product from being successful.
What’s more is that we were able to pretty quickly identify repeated code and introduce the right abstractions because we already knew different use cases from the duplicated code.
But please be aware that I'm not advocating this kind of development. It's an EXTREME example showing that you don't need to be a perfectionist.
I’ve known developers who lived to not repeat themselves. I think that at some point they went from “in general it’s better not to have too much repetition” to “if you repeat any code then you will burn in hell”. This is one of these stories.
We were working in a company which had multiple related products. And something that one of the developers did blew my mind. They were working on a new application and decided that they could use a two-line function from the previous project (3 lines if you count the function’s definition).
The choice here was simple - just copy these 3 lines of code into the new project. But as a lot of developers think - including this one - “repetition is the root of all software evil”. So they spent two days setting up a common library (including the deployment process etc.) just so they wouldn’t duplicate 3 lines of code.
Now ask yourself how that benefits the project and how it gets us closer to delivering on our objectives.
The source of all evil
By simply googling “Don’t repeat yourself” I learned that:
- Repetition is the root of all software evil
- Duplication is waste
- If you do it then you don’t understand how to apply abstractions
- It decreases quality of code
- It should all be eliminated
It does sound like a bad idea, right? But at this point it sounds like we vilify it. It sounds like if you do it then you are a bad developer! Like there isn’t any scenario where you should do it. But y’all use StackOverflow, don’t you?
Did you at any time ask yourself a simple question - is it really THAT bad that I copy and paste a little bit of code? Does it have any benefits?
Only a Sith deals in absolutes
Here’s the thing - every rule in software development makes sense at most 80% of the time. Each was coined in some specific context - for which it made sense. But that context got lost in translation and some people began to follow the rule religiously instead of treating it as a rule of thumb.
This is our fault. We make those principles sound so absolute. Don’t repeat yourself means don’t repeat yourself. Then it’s passed from one person to another, copied over blogs, books over and over until it becomes the truth and all context and all the nuance is lost.
So here’s an alternative that I think some of you should try:
Try not to repeat yourself too much. But sometimes you can. Because sometimes it may make sense.
It’s not as catchy a phrase, though.
Please repeat yourself
This rant is already getting too lengthy for my taste, so let’s get to the point. There are numerous cases where repeating yourself is not only not an anti-pattern but it’s actually a tool.
I love repeating myself. Especially when I’m writing tests. I copy tests all over the place and change what’s needed for the given test. This results in a lot of duplication. When I’m done I simply go through these tests and see what’s the best way to reduce (not completely eliminate) repetition and what can be extracted into separate abstractions.
I could spend a lot of time upfront figuring out how I want to set up the tests and what helpers I need, and then redo everything because it turns out that 2 of 25 tests need a slightly different setup. But why would I do this if I can just see where the code takes me? Why not see what’s actually needed instead of doing all this guesswork?
This is not just for tests, but tests are where this is most obvious and I would encourage you to start there.
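As a rough illustration of that workflow, here is a minimal Python sketch. Everything in it is made up for this example (`apply_discount`, the test names, the `DISCOUNT_CASES` table); it is not taken from any project mentioned in this article. The first group of tests is written by copying and tweaking, and only afterwards is the shared shape pulled into a small table-driven test. The one test that needs a different setup is left alone.

```python
# Hypothetical function under test, invented for this example.
def apply_discount(price: float, percent: float) -> float:
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


# Step 1: copy, paste, tweak. Each test repeats the same shape.
def test_no_discount():
    result = apply_discount(100.0, 0)
    assert result == 100.0


def test_half_discount():
    result = apply_discount(100.0, 50)
    assert result == 50.0


def test_full_discount():
    result = apply_discount(100.0, 100)
    assert result == 0.0


# This test has a different shape, so it is not forced into the table below.
def test_rejects_invalid_percent():
    try:
        apply_discount(100.0, 150)
    except ValueError:
        return
    raise AssertionError("expected ValueError")


# Step 2: with the copies in front of us, the repeated shape is obvious
# and can be reduced (not eliminated) with a simple table of cases.
DISCOUNT_CASES = [
    (100.0, 0, 100.0),
    (100.0, 50, 50.0),
    (100.0, 100, 0.0),
]


def test_discount_cases():
    for price, percent, expected in DISCOUNT_CASES:
        assert apply_discount(price, percent) == expected
```

The table only appeared after three near-identical tests existed, so its columns reflect what the tests actually needed rather than a guess made upfront.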
The overuse of DRY (Don’t Repeat Yourself) is an anti-pattern in itself. Overzealous duplication removal leads to bad abstractions because the developer is creating imaginary abstractions instead of uncovering real ones.
Repeating yourself is basically giving yourself the time and space to come up with the right abstractions instead of engaging in guesswork and clairvoyance. We don’t know what the future code will be. We don’t understand all the use cases at first, or the ways in which our code will be used. If we introduce abstractions too soon then in the best case we end up rewriting everything.
Repeating yourself is a great tool to uncover abstractions.
What other people say
As you may have guessed, I’m not the first person to notice this. There are two great articles about this topic:
- AHA Programming by Kent C. Dodds
- The Wrong Abstraction by Sandi Metz
You should definitely read them as they extend this article nicely and will give you a better understanding of when to use duplication. Here are a few excerpts:
“duplication is far cheaper than the wrong abstraction”
“prefer duplication over the wrong abstraction”
— Sandi Metz, “The Wrong Abstraction”
“Avoid Hasty Abstractions”
“Optimize for change first”
“the big takeaway about AHA Programming is that you shouldn't be dogmatic about when you start writing abstractions but instead write the abstraction when it feels right and don't be afraid to duplicate code until you get there.”
— Kent C. Dodds, “AHA Programming”
I want to be clear that I’m not inviting you to make a mess. I am pointing out that some level of temporary duplication is healthy. The first story is supposed to show you that, despite what you may think, the success of a product does not depend on avoiding duplication; it depends on business development.
So copy code and modify it when necessary to give yourself space so you can uncover real abstractions instead of imaginary ones. Of course - that’s not the only method to uncover better abstractions. Talking to stakeholders and understanding the business better is another way, which we will discuss in the future. You don’t need to pick one over the other; the two methods are complementary.
There are a lot of rules of thumb in software development - or “good practices”. Usually they work when you understand the context and apply them sensibly. Sadly there is a lot of dogma in the software industry; it’s driven by hype and emotions and rarely by pragmatism. So remember that these “good practices” are not “all or nothing” but more of a “try to do this more than the other thing and you’ll be good”.
Oldest comments (68)
The first story explains that the code was written at the time by a novice, should we hold a novice to the same level of competence as a pro? From your quick Google search "it reduces the quality of code", if you believe in repetition you should have left their code as is.
The second story is just a developer who doesn't understand the principles of DRY. It's a function that encapsulates code, an abstraction by all ramifications, according to DRY principles you just use the function. They should have rewritten their entire language API if they feel reusing code is bad.
I agree software principles are correct 80% of the time but these principles need to be understood properly.
I think repetition is a temporary state until you figure out the right abstractions, so leaving that code as is - when it hindered further development of the codebase at this point - would be a bad choice. I don't "believe in repetition" because coding is not a religion. I think it is useful in small, healthy amounts and not something we should avoid at any cost.
I agree. But that's the point though: people treat coding principles as dogma instead of actually understanding them. I wrote this piece to convince people to stop treating them as dogma or "good practices that you just need to apply" and start thinking more about why, how and when to apply them.
Could not finish reading. This is ultimately nonsense. In no case, regardless of your attempts to imprint some rationale on it, is it a good thing to repeat code. I have personally fixed one bug 98 times.
To quote one of your so-called samples, you imply that spending 2 days for 3 lines of code is madness. When you over simplify like that, yes, it does sound like madness. But the DRY principle is not there for 3 lines of code, it is there to be a guiding beacon in your search for perfection and self advancement. Just because you find one case where it sounds bad, it doesn't mean that there will be more cases like that, or that other cases should be considered too.
I have more than once started with an abstraction or a library with a single piece only to see it grow over time while I thank myself for DRY'ing it 3 months later.
In short, nonsense. Apologies for the blunt comment, but it needs to be said.
I think that you should at least read the whole thing before labeling something as nonsense. Otherwise it's dishonest and it's clear that you are reacting emotionally, not having a rational conversation. I encourage you to read the whole thing because our views may be more similar than you think - and maybe it's not all nonsense :)
I also encourage you to read content I linked as it adds a lot to the argument.
If you read the whole thing you'll see that it's not the scenario I'm advocating for. What I'm advocating for is giving yourself space to uncover abstractions through iteration instead of imagining them. That's it.
Yep, it is madness. And it's there to show how DRY is often misunderstood and taken to extremes by people who ultimately do not understand it.
I don't think "search for perfection" is what a software developer should do. People who search for perfection often can risk the entire project because they can't focus on the task at hand. That said I'm not saying that's your case, just what comes to my mind when I hear these words.
Hi. I think you have a point. Maybe I should have finished reading. In all honesty, though, you get to a point where you just say "no more". I'll try.
Let me explain the faulty part in your plan: Code has the potential to have bugs in it. Any code, big or small. Code has also the potential to grow, to become more complex in order to better serve its ever-changing purpose. These two reasons alone are enough to invalidate what you state. Why? Because if I have a copy/paste party, I will have to have a copy/paste enhancement/bug fixing party. The 98 bugs I corrected? The bug was there unintentionally. The persons who copied over that many times did not notice it. It was a 7-line handler. Would you say that 7 lines are worth the trouble if we were to follow your logic? Probably not worth the trouble right? Well, 98 times the bug repeated.
So yes, I stand by my conclusion.
Depending on the situation - I would say around the 3rd to 5th time I repeat something I would put it into a separate function. It depends on the context though. 98 times is just careless and way too much. As I said - there is a healthy amount of repetition, and 98 is not healthy. And repetition should be a temporary state until the right abstraction is uncovered. Sometimes it's before you commit, sometimes it's before you make a PR, sometimes longer. The point is - don't be afraid to copy stuff because of the dogma and stigma.
Do you think different developers in the team will be counting the number of times the code has been duplicated, in order to say "wait a minute, let's just stop for a second", and then refactor? Once a copy/paster, always a copy/paster. That's not real life Mr. Rafal. Your argument still doesn't stand.
Reflect on this:
It's all about the long run. Copy/pasters are doomed to always be copy/pasters.
I had so many arguments with colleagues about copy/pasting! Even wrote an article about it:
I would add:
Why do we need to factorize code? Because we don't want two parts of the code that have the same goal to diverge. But what about two parts of the code that have different goals but are implemented with the same code? Should we factorize those two identical pieces of code?
Hell, no! If you do this, you will end up one day or another with a single function (a module, an entry point, whatever) that implements two different behaviors with a big if statement. And not a single if, but a lot of small ifs interleaved with common behavior.
Factorizing code is a good practice, but it is not a dogma.
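A small Python sketch of that failure mode, with every name (`format_invoice_total`, `format_report_total`, `format_total`, the tax note, the rounding rule) invented purely for illustration. The first two functions have different goals but identical bodies today. The third is what the merged version tends to look like once the two goals drift apart: small `if`s interleaved with the shared lines.

```python
# All names and rules below are hypothetical, for illustration only.

# Two functions with different goals that happen to share an implementation today.
def format_invoice_total(amount: float) -> str:
    return f"{amount:.2f} USD"


def format_report_total(amount: float) -> str:
    return f"{amount:.2f} USD"


# The merged version after the requirements diverge: invoices now need
# a tax note, reports now round to whole units. One flag, several small ifs.
def format_total(amount: float, for_invoice: bool) -> str:
    if not for_invoice:
        amount = round(amount)
    text = f"{amount:.2f}" if for_invoice else f"{amount:.0f}"
    text += " USD"  # the only line that is still truly shared
    if for_invoice:
        text += " (incl. tax)"
    return text
```

Keeping the two original functions separate would have let each one change on its own; the merged function has to be read twice, once per value of the flag.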
A false abstraction is a great example of that!
Sadly dogma in software development often kills rational discussion and pragmatism. People are way too emotionally attached to some things.
More than all those rules and acronyms, it's better to follow the kaizen methodology of continuous improvement, or what some people like Uncle Bob call the "Boy Scout rule" of "leaving the code better than how you found it".
I do however like to point: just because it works, doesn't mean it's done. Some people see it working and commit without even giving it a little refactoring.
Another reason you may want to duplicate the code is that the copies have two different consumers, which can change independently. The effects of this are twofold:
Of course you can refactor code into reusable code units, but it ultimately comes down to balancing the qualities of the software system like maintainability and stability, which we get to do by writing and refactoring the code.
So tired of the over-abstracting over-engineering over-thinking.
Write WET code (Write Everything Twice), then DRY it out as duplicate loc shows strong signs of needing an abstraction. Live a little devs, stop fussing over perfection until perfection is worth it.
Of course you are entitled to your opinion. In my opinion and experience, this line of thinking, practical as it may sound, will lead you to mediocrity. I stand on the other side. I studied Chemical Engineering, but I am a senior software developer who has been nominated twice for Microsoft MVP in C#. Why? Because I cannot stand mediocrity. I am the developer I am because I forbid myself from copying/pasting.
Your kind of talk leads to developers I have personally recommended to be fired. No hate, just the harsh truth.
Well, if throwing titles around and fallacy of authority is your argument then there isn't anything left to discuss I guess.
It is not a fallacy. It is a story of success. Bring me a copy/paster that has been nominated. Please. I'll wait.
Just because you were nominated to something doesn't mean you are always right or even right in this discussion. Also I don't understand how would I know other nominated people and what their coding style is. The point is - it's not an argument.
I am sure that you are a good engineer but at the same time I can see that you are set in your ways. It's not my place to change your mind I guess, it's up to you if you read it and try to see the world from different perspective or not. Nonetheless I wish you good luck :)
I do not intend to "pull rank" in order to be right. I merely explain with an example the potential great benefits of renouncing the terrible, terrible practice of copying/pasting.
My point here is that no copy/paster will ever stand out. Why? Because it is not their code. That simple. If you want to cruise through your career being barely average, by all means, copy and paste. That's what I'm saying. I'm not pulling rank.
Please you need to update your understanding of a developer 'standing-out'.
If you think the only way a developer can 'stand-out' is by being nominated as a Microsoft MVP n-number of times, then you need to take a sabbatical.
By no means did I ever try to imply such a thing. Your conclusion cannot possibly be following a logical path. As stated, it is an example of what you can achieve if you set yourself into a disciplined path.
Sounds good. I'll ship and you can spend your time fussing over whether or not your code could be even more DRY. Maybe you'll even get nominated DRYest Coder Ever?
Sarcasm. The tool of the weak. No worries. I'll let you know if I'm nominated. Cheers!
Hard to believe that that alone is what makes you a great developer.
It does. Debug the statement. If you cannot copy and paste, you must do it some other way. This invariably takes you down the path from simple things like inheritance and virtual methods to complex design patterns. While you learn all this you learn all sorts of "peripheral" knowledge. Yes, you must be knowledge-thirsty.
I think we are referring to different things when we say "copy/paste", I don't mean blindly copying stuff from StackOverflow. I mean copy/pasting code you wrote yourself, and I think that's what the article also refers to.
Exactly. I am talking about the bad practice of copying and pasting pieces of code to create repetitions as opposed to applying a better coding practice.
Get bent with your fancy nominations. Your argument is INVALID.
Loving all the hate. LOL. It just tells me how much I have surpassed the crowd.
I mean… Donald Trump was nominated (and won, once) for President of the US. I think we can agree that a mere nomination, or even a win, doesn’t automatically imply qualification.
Your need to state and inflate your own ‘achievements’ detracts from your argument at best; at worst, it reveals your own mediocrity and insecurities.
The author makes very good points and backs them up with actual appeals to authority. He uses actual logic and gives good examples.
Take yourself down a peg, bro.
Since you are so fond of logic, you should apply it while you read. I am not trying to gain traction with my nomination. It is an example of how much you can advance yourself when you forbid yourself from copying and pasting.
Try reading things correctly next time. You'll find yourself in a much better position.
Trump was qualified. He proved virtually every critic and “expert” wrong on economy, foreign policy, racism, energy, etc.:
Thanks for the piece of data. I really cannot care any less since I'm not a US citizen.
Your ego is so high, that you better watch out to not fall from it - you may hurt yourself ;]
Yeah, that only applies to things that are volatile. My "ego", as you say, is "flying high" based on rock-solid knowledge acquisition. Unless I have a stroke or something, I'll remain "high". But thanks for worrying! 😏 And welcome to my fan base. It is amazing how people try to bring you down. As I have told others, hate demonstrations just tell me how much higher I am than the average crowd.
Finally, you may want to read more than one message in a thread. Cheers, fan.
I kind of informally follow this pattern - you can't really create the right abstraction until you understand at least 2 different use cases - but you are essentially moving towards DRY code, which is not what this article suggests.
The problem with taglines like "please repeat yourself" is that developers will read that and take it to heart. Our industry doubles in numbers every 3-4 years, meaning most developers are still pretty young, and they should not be armed with this type of mentality.
Furthermore, every line of code is maintenance, and maintenance is 70% of our lives in software. Every time you delete a line of code, that's one less thing to maintain. Let's say someone finds a bug and fixes it, but they don't go searching around the entire codebase to see if the bug exists somewhere else. Now you're just deploying software with the same bug... fixed in one place but not others. It's a false sense of "it's fixed".
Maybe the product mentioned in the article could have made many more millions if the developers weren't making the same feature enhancements 20 times across 20 nearly identical files.
Maybe that first abstraction took 2 days to implement, but now the shared library is in place and all shared code can now be put there with even less effort than copy/paste. Those 2 days ended up saving them weeks in the long run.
I'm not advocating in any way that what was done in that case was a good idea, just that you can still succeed even without perfect code. I'll add a sentence there to explain that. It's fair point that this could be misunderstood.
Write WET code, then DRY it out as the benefits become compelling enough. That could mean enough devs need it now so you make it a shared lib, or it's short and simple but has been repeated like 4-5 times now, or it's long, complicated and error-prone and is needed in a 2nd place.
I've seen excellent DRY abstractions pay for themselves 100 times over while building large apps and internally-shared libs. And I've seen folks waste hours and hours on academically perfect code only for it all to get deleted a month later when the Product team pivots.
So, you like to work twice and make the development process slower to avoid thinking? I know that you don't have to overthink everything, but it's a basic thing to think before coding to avoid making a bulky and heavy project.
No, but I'm happy to move fast and work twice when it avoids sinking waaaaaaay too much effort into something prematurely. Things change, tradeoffs have to be made. It's about maximizing my efforts when and where it truly is needed vs. treating everything as equal. Not all code is equal. For example:
Got some repetition in your app's HTML classes? No biggie, do not waste your time DRYing that markup.
Got some repetition in your shared design system? Oooh, better clean that up.
Got an inefficient algorithm in an internal tool that never deals with large datasets? Forget about it!
Got an inefficient algorithm behind a customer-facing feature that hogs resources and takes 3x as long to execute? Drop what you're doing and perfect that algorithm and make sure it has full test coverage. Now!
I was one of those Devs that wrote code that was ALWAYS DRY.
I'll be honest it's so hard for me to write WET code because it's been indoctrinated into my mind all throughout college.
Most of the codebases I've worked in don't keep things DRY, I've always found myself saying "oh man, you could encapsulate this into a function and then clean this up quite a bit, why didn't they keep things dry?"
Up until recently, my new boss started to tell me, "You know those devs that are DRY devs, they need to let the application cook"
"Once you have an application working really well, you then can clean it up , but it needs to cook!"
Just this year I've realized this is true and I'm starting to like it.
You build an application with well-written code and let it cook/bake and once you have a good working product, you can identify where you can keep things DRY and clean things up a bit.
Definitely an interesting subject.
"Once you have an application working really well, you then can clean it up , but it needs to cook!".
Now I love Tailwind CSS.
Sometimes, I modify duplicate code using find and replace all.
Worth nothing. Anyway all duplications are a technical debt. You can fix it later or sooner. Just don't collect it.
There are times when repeating yourself can be desirable or necessary, but all good software development starts with minimising duplication. That's because all code requires testing and maintenance, so it's a liability. You want less of it, not more.
Re-thinking the DRY pattern is aligned with the school within microservices around "share nothing".
I think the main point is what makes your code base simpler and easier to maintain. Sometimes repeating a line of code is a lot clearer than writing an abstraction, but repeating too much will make it harder to maintain.
I totally agree with your point.
I fully agree. DRY is not a monster nor should it be seen as such. The key is knowing how to balance between the two worlds and build the expected solution.
I've just spent a few months writing an abstraction framework for interacting with our simulation hardware - we have hardware simulating fluidic movements for a medical testing device, and as part of our CI, we want to automatically test changes to the product without having someone hooked up to an eternal drip feed...
In constructing this framework, I had a few considerations, not least the assurance of maintainability, and certainly including concerns of maintainability.
One of the things is that there are several channels into the device, in the form of binary or text streams, operating in the background so we didn't lose any outputs. For those, I made sure to have abstractions in place in terms of a threaded stream reader that many other types of streams could benefit from.
On the other end, we had several types of streams that would use this same threaded stream mechanism with a couple of small variations. Each read method was coded nearly exactly the same in about 4 lines, but with a few adjustments. I unified those adjustments into a single method (and thus, an additional abstraction).
The end API functions I exposed are coded each nearly identically, except for any of their corner cases. I could actually have abstracted those further.... I did not.
At some point I decided that the abstractions and redirections were too deep and accepted a certain level of repetition at the higher levels, whilst keeping the "conceptual" (repetitious) elements genericised.
I am a fool for following a (self-coined) mantra: "Solve the general case".
Sometimes that mantra is unhelpful. There have been some projects that didn't require such zeal and precision. But I was never able to know this in advance. So I have applied those rules. Abstractions, abstractions, abstractions.
At the end of it, to figure out where something comes from, I have to delve down several layers before figuring out a single issue.
I'm not advocating anything here really.... just a personal diatribe on how it is difficult sometimes to decide when to "just do it"... and when to "think architecturally."