This blog post first appeared on The Valuable Dev.
Once upon a time, a fearful young developer (me) wanted to write magnificent code. I saw generations of developers speaking about it the way we speak about the pyramids thousands of years after their construction. I wanted to leave my mark on the world!
Therefore, I did what I thought would be best: avoid every trap everybody else fell into by following the holy coding principles, created by the ones who have the Knowledge.
I followed them religiously. It was my religion, my cult, my way of life. After hearing about the DRY principle, my code suddenly became as DRY as the Sahara. I swore to the moon, the sky and the twelve deities: nothing will be repeated in my code! Never!
This led to many, many problems.
I understood that principles need to be well understood, by actually reading the books where they are defined, not just random comments on Stack Overflow.
I understood that principles should be used depending on the context.
A little reminder for those in the back who don't follow: the DRY principle means "Don't Repeat Yourself" and was first introduced in the book The Pragmatic Programmer.
The principle itself was known and applied before this book came to life. However, The Pragmatic Programmer defined it precisely and put a name on it.
Without further ado, let's dive into the wonderful land of DRY!
Don't Repeat the Knowledge
Even if the sentence "don't repeat yourself" sounds simple enough, it also sounds a bit too... generic.
In The Pragmatic Programmer, DRY is defined as
Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
That's great but... what's a piece of knowledge?
I would define it either as:
- A precise functionality in the business domain of your application
- An algorithm
To take an overused e-commerce example, a shipment class and its behavior would be part of the business domain of your application. A shipment is something real your company uses to send products to its customers.
Therefore, the logic of this shipment should only appear once in the application.
The reason is obvious: imagine that you need to send shipments to a warehouse. You need to trigger this logic in 76 different places in your application.
No problem: you repeat the behavior 76 times.
After a while, your boss comes to you and asks you to change this behavior. Instead of sending shipments to one warehouse, you need to send them to three different ones.
The result? You will spend a lot of time on your shipment logic, since you will have to change it in 76 places! This is a pure waste of time, a good way to produce bugs and the best method to piss your boss off.
The solution: create a single representation of your knowledge. Put the logic to send the shipment in one place and then use the representation of this knowledge anywhere you need it. For example, sending a shipment could be a method of the class Shipment that you can reuse everywhere you want.
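As a minimal sketch of that idea (the warehouse name, the `send()` method and its log output are hypothetical, they only stand in for real dispatch logic), the knowledge of where shipments go could live in a single method:

```php
<?php

/** Hypothetical sketch: the knowledge "where do shipments go" lives in one place. */
class Shipment
{
    /** Warehouses every shipment is sent to (assumed names). */
    private const WAREHOUSES = ['Berlin'];

    /** @var string */
    private $product;

    public function __construct(string $product)
    {
        $this->product = $product;
    }

    /**
     * The single representation of the shipping logic.
     *
     * @return string[] One log line per warehouse, instead of a real dispatch.
     */
    public function send(): array
    {
        $log = [];
        foreach (self::WAREHOUSES as $warehouse) {
            $log[] = sprintf('%s sent to %s', $this->product, $warehouse);
        }

        return $log;
    }
}

// Each of the 76 call sites only needs this line:
$log = (new Shipment('plastic duck'))->send();
```

If your boss asks for three warehouses instead of one, only the `WAREHOUSES` list changes. The call sites stay untouched.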
DRY and Code Duplication
So DRY is all about knowledge? All about business logic?
Let's begin with the obvious:
<?php

interface Product
{
    public function displayPrice();
}

class PlasticDuck implements Product
{
    /** @var int */
    private $price;

    public function __construct(int $price)
    {
        $this->price = $price;
    }

    public function displayPrice()
    {
        echo sprintf("The price of this plastic duck is %d euros!", $this->price);
    }
}

$plasticDuck = new PlasticDuck(2);
$plasticDuck->displayPrice();
This code doesn't look that bad, does it?
Dave, your colleague developer, thinks it does. Seeing this code you wrote, he comes to your desk and screams:
- The word price is repeated 6 times!
- The displayPrice() method is repeated in the interface, the implementation and called at runtime!
And now you, wonderful developer, look at Dave like an experienced gardener looks at a slug, answering:
- Variables (and properties) like price need to be repeated in your code. It's not a functionality.
- The knowledge (displaying the price) is only present once, in the method itself.
No DRY violation here!
Dave is speechless, feeling your powerful aura illuminating the whole room.
However, you attacked Dave's expertise: he's angry. He wants to win the argument. Soon, he finds another piece of code you've written and comes back to your desk, slapping it in your face:
<?php

class CsvValidation
{
    public function validateProduct(array $product)
    {
        if (!isset($product['color'])) {
            throw new \Exception('Import fail: the product attribute color is missing');
        }

        if (!isset($product['size'])) {
            throw new \Exception('Import fail: the product attribute size is missing');
        }

        if (!isset($product['type'])) {
            throw new \Exception('Import fail: the product attribute type is missing');
        }
    }
}
Dave, full of himself, claims: "You little pig! This code is not DRY!".
And you, aware of what the DRY principle is really about, answer: "But the business logic, the knowledge, is still not repeated!".
Again, you're right. The method validates some CSV parsing output in only one place (validateProduct()). This is the knowledge, and it's not repeated.
Dave is not ready to accept it, though. "What about all those conditionals everywhere? Those if statements? Isn't it an obvious DRY violation?"
You take a deep voice to answer, pronouncing every word perfectly, your knowledge bouncing on the wall to create an infinite echo of awareness:
"Well... no. It's not. I would call that unnecessary code duplication, but not a violation of the DRY principle".
Suddenly, your fingers type on your keyboard, at the speed of light, the following code:
<?php

class CsvValidation
{
    private $productAttributes = [
        'color',
        'size',
        'type',
    ];

    public function validateProduct(array $product)
    {
        foreach ($this->productAttributes as $attribute) {
            if (!isset($product[$attribute])) {
                throw new \Exception(sprintf('Import fail: the product attribute %s is missing', $attribute));
            }
        }
    }
}
It looks better, doesn't it? There is no code duplication anymore!
To summarize:
- Knowledge duplication is always a DRY principle violation.
- Code duplication doesn't necessarily mean violation of the DRY principle.
Dave is still not convinced. With a serenity defying the highest spiritual masters through the ages, you give him the final blow. You take a book from your desk and read:
Many people took it [the DRY principle] to refer to code only: they thought that DRY means “don’t copy-and-paste lines of source.” [...] DRY is about the duplication of knowledge, of intent. It’s about expressing the same thing in two different places, possibly in two totally different ways.
This is from the 20th anniversary edition of The Pragmatic Programmer.
DRY Everything: the Recipe for Disasters
Dangerous Generalities
Let's take a real-life, more interesting example:
I'm currently working on an application for filmmakers. They can easily upload their movies and the related metadata (title, description, cast and crew of the movie...) to it. This information is then displayed on a VOD platform.
This is an MVC application looking like this:
The content team of my company can also use the same application, to create the movie's metadata when the filmmakers don't want to do it themselves.
Filmmakers and our content team have very different needs. The content team is used to working with content management systems; the filmmakers are not.
Therefore, we decided to create two interfaces:
- The first one for the content team, without guidance or explanation, where you can enter content as fast as you can.
- Another one for the filmmakers, with a more friendly user experience.
Here's what we did:
The controllers from the two different applications are almost the same. It's not only their names: their implementations are nearly identical as well. We basically copy-pasted them.
This looks like an obvious and ugly violation of the DRY principle: views and controllers repeated all over the place.
What were the other solutions? We could have grouped the common logic by using something like a template method, putting all the common logic in an abstract class. However, this would have coupled the controllers of the two different applications together.
Change the abstract class and every single one of your controllers needs to support the change.
In many cases, we knew that the interface would become different in the future, depending on the application. If we had only one set of controllers for both applications, it would have created a lot of if statements in the controller actions, which is not something we want. The code would have been way more complex.
Moreover, the controllers shouldn't contain any business logic. If you recall the definition of the DRY principle, it's this knowledge, this business logic which should not be duplicated.
In short, trying to apply DRY everywhere can have two results:
- Unnecessary coupling
- Unnecessary complexity
You don't want either of these in your application! If you want to know why, I wrote quite a bit about KISS and complexity here.
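To make the complexity part concrete, here is a rough sketch of what a single shared controller could have turned into. The class name, the `$application` flag and the keys of the returned array are invented for illustration; the real controllers look different.

```php
<?php

/** Hypothetical shared controller serving both applications. */
class MovieController
{
    /** @var string Either 'filmmaker' or 'content-team' (assumed flag). */
    private $application;

    public function __construct(string $application)
    {
        $this->application = $application;
    }

    /** Returns the data a view would receive when editing a movie. */
    public function edit(string $title): array
    {
        $viewData = ['title' => $title];

        // Every difference between the two interfaces becomes a conditional.
        if ($this->application === 'filmmaker') {
            $viewData['template'] = 'filmmaker/edit';
            $viewData['showGuidance'] = true;
        } else {
            $viewData['template'] = 'content/edit';
            $viewData['showGuidance'] = false;
        }

        // A feature only one team needs adds yet another branch.
        if ($this->application === 'content-team') {
            $viewData['bulkEdit'] = true;
        }

        return $viewData;
    }
}

$filmmakerView = (new MovieController('filmmaker'))->edit('My Movie');
$contentView = (new MovieController('content-team'))->edit('My Movie');
```

With two separate controllers, each `edit()` action would be a few straight lines, and a change requested by one team could not break the interface of the other.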
Premature Refactoring
You shouldn't apply the DRY principle if your business logic doesn't have any duplication yet. Again, "it depends", but, as a rule of thumb, trying to apply DRY to something which is used only once can lead to premature generalization.
If you begin to generalize something because "it could be useful later", please don't. Why?
- You will spend time creating abstract classes and whatnot which might only be used in one place. Business needs can change very quickly and drastically.
- Again, you will possibly introduce complexity and coupling in your code for... nothing.
Code reuse and code duplication are two different things. DRY states that you shouldn't duplicate knowledge, not that you should code to be able to reuse everything.
This is what I learned over the years: code for the specific, don't try to generalize, even if your managers would love to have 90% of your application reusable for every single use case. In practice, this is almost never possible.
Two functionalities, even if they look very similar at first glance, can become very different in the future. If you have any doubt, it's better to copy your code and let it take a different path. In the long run, it's way simpler than having a forest of if statements which turns your application into a machine for confusion and headaches.
Sandi Metz said it better than me:
Duplication is far cheaper than the wrong abstraction.
Code first, make it work, and then, in a second step, refactor with all these principles you know (DRY, SOLID and so on) in mind.
DRY principle violations should be handled when the knowledge is already and obviously duplicated.
Similar Domain Knowledge... ?
You remember when I stated above that repetition of business logic is always a violation of the DRY principle? Obviously, this only applies when the same business logic is repeated.
An example:
<?php

/** Shipment from the warehouse to the customer */
class Shipment
{
    public $deliveryTime = 4; //in days

    public function calculateDeliveryDay(): \DateTime
    {
        return new \DateTime("now +{$this->deliveryTime} day");
    }
}

/** Order return of a customer */
class OrderReturn
{
    public $returnLimit = 4; //in days

    public function calculateLastReturnDay(): \DateTime
    {
        return new \DateTime("now +{$this->returnLimit} day");
    }
}
You can hear Dave, your colleague developer, gently screaming in your ears once again: "This is an obvious violation of everything I believe in! What about the DRY principle? My heart is bleeding!".
However, Dave is wrong again. From an e-commerce perspective, the delivery time of a shipment to a customer (Shipment::calculateDeliveryDay()) has nothing to do with the last day the customer can return their ordered products (OrderReturn::calculateLastReturnDay()).
These are two different functionalities. What appears to be a code duplication is just a pure coincidence.
What can happen if you combine those two methods into one? If your company decides that the customer now has one month to return their products, you will have to split the method again. If you don't, the shipment delivery will take one month as well!
This is not the best way to please your customers.
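Here is a sketch of what such a merge could look like. The `DateCalculator` class and its `$days` property are hypothetical, and a fixed start date is passed in (an assumption made only to keep the result reproducible) instead of using "now":

```php
<?php

/** Hypothetical "DRY" merge of two unrelated pieces of knowledge. */
class DateCalculator
{
    /** @var int Serves as both the delivery time and the return limit, in days. */
    public $days = 4;

    public function calculateDay(\DateTimeImmutable $from): \DateTimeImmutable
    {
        return $from->modify("+{$this->days} day");
    }
}

$calculator = new DateCalculator();
$orderDate = new \DateTimeImmutable('2024-01-01');

// The business extends the return limit to 30 days...
$calculator->days = 30;
$lastReturnDay = $calculator->calculateDay($orderDate);

// ...and the delivery day silently moves by 30 days as well.
$deliveryDay = $calculator->calculateDay($orderDate);

echo $deliveryDay->format('Y-m-d'), PHP_EOL; // 2024-01-31
```

Keeping `Shipment` and `OrderReturn` separate means each number can change for its own business reason without dragging the other along.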
DRY is not only a principle for coding nerds
Even the Gin can be DRY nowadays!
DRY is not something you should only respect in your code. You shouldn't repeat the knowledge of your business domain in anything related to your project.
To quote Dave Thomas again:
A system's knowledge is far broader than just its code. It refers to database schemas, test plans, the build system, even documentation.
The idea of DRY is simple in theory: you shouldn't need to update multiple things in parallel when one change occurs.
If your knowledge is repeated twice in your code and you forget to update one of the occurrences, expect bugs and tears. In your documentation, it could lead to misconceptions, confusion and ultimately wrong implementations.
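A small sketch of what this can mean outside of pure business code: the return limit below is written once, and the customer-facing help text is generated from it instead of restating the number by hand. `ReturnPolicy`, its constant and its methods are hypothetical names used for illustration.

```php
<?php

/** Hypothetical single source of truth for the return policy. */
class ReturnPolicy
{
    public const RETURN_LIMIT_IN_DAYS = 4;

    /** Business rule: can the order still be returned? */
    public static function isReturnable(int $daysSinceDelivery): bool
    {
        return $daysSinceDelivery <= self::RETURN_LIMIT_IN_DAYS;
    }

    /** Documentation for the customer, derived from the same constant. */
    public static function helpText(): string
    {
        return sprintf(
            'You can return your order within %d days of delivery.',
            self::RETURN_LIMIT_IN_DAYS
        );
    }
}

echo ReturnPolicy::helpText(), PHP_EOL;
```

If the limit changes, the rule and the text change together. Nothing needs to be updated in parallel.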
DRY is a principle, not a hard rule
At the beginning of my career, I was often a victim of analysis paralysis. All those principles were holding me back from being productive and efficient. It was too complex and I didn't want to screw everything up.
However, principles are not rules. They are just tools to help you go in the right direction. It's your job to adapt them depending on the situation.
Everything has a cost in development. The DRY principle is not an exception.