Derek D.

Nobody Likes a DRY PASTRY

Welcome to another one of my vendetta articles. Last time it was A Pythonic Guide to SOLID Design Principles, proving the SOLID Design Principles are indeed Pythonic. This time I'm taking aim at the most abused clean code principle: Don't Repeat Yourself, or DRY, as it is commonly abbreviated. It's a pretty simple principle to follow. All you have to do is not repeat yourself, meaning don't write the same code twice. It even comes with catchy phrases like "Can you DRY this code up?" or "This code could be DRYer", which I often see in code review. I don't disagree with the DRY principle. I do disagree with how it is used as a stand-in for clean code, as if code must be clean just because it contains no duplication. That's why I like to say nobody likes a DRY PASTRY. Let me say that again: nobody likes a DRY PASTRY. I had to repeat myself because PASTRY stands for Please Always Stop To Repeat Yourself.

History

Let's start with the origin of the DRY Principle. It comes from The Pragmatic Programmer, written by Andrew Hunt and David Thomas and published in 1999. In the book, the DRY Principle is defined as

"Every piece of knowledge must have a single, unambiguous, authoritative representation within a system."

Ironically, over the years the DRY Principle has been the subject of many technical articles, each echoing the definition found in The Pragmatic Programmer and then proceeding to interpret what that definition means through code snippets. So what has happened is that we've repeated the very principle that says Don't Repeat Yourself! Since most devs are now introduced to DRY by reading someone's interpretation of it instead of the authoritative definition from The Pragmatic Programmer, the understanding of DRY has been corrupted. The DRY principle has been reduced to mean: duplicate code is bad, don't write duplicate code or you are bad, and, as this popular image for the DRY Principle shows, "Repetition is the root of all software evil".

Therein lies my disdain for how DRY is being used. Repeating code is not bad, and I can prove it.

Why Repeated Code is Good

I like the analogy of baking to help explain why repeated code is a good thing. Take chocolate chip cookies, for instance. You need wet ingredients like eggs, butter, and oil. They are necessary for the chemical reactions that make cookies soft, chewy, and delicious. Have you ever tried making cookies without the eggs, butter, or oil? They are not very good. The moisture is not just a good thing, it's vital for a quality cookie. The same is true for code: just as cookies dry out as they bake, our code should dry out as it is developed. When you write code multiple times and repeat yourself, you are introducing your wet ingredients. Once the code is functional, you go back and refactor. While refactoring you'll see the actual use cases of your code, allowing you to create DRY abstractions that avoid some of the gotchas you couldn't have anticipated if you had started with an abstraction. That being said, here are some practical tips to help you avoid ending up with a DRY PASTRY.

  1. Code for your use case not your reuse case
  2. Premature abstraction is the root of all fragile software
  3. Duplication is cheaper than the wrong abstraction
  4. Code is not reusable. Abstractions are.

Code for your Use Case not your Reuse Case

This is a quote from a senior developer I worked with at my first job out of college. He would always tell me, "code for your use case and not your reuse case". I didn't understand what it meant until I experienced DRY being done wrong. In essence, he was saying: don't try to anticipate use cases that are not a current requirement. Anticipating use cases is often what happens when developers try to DRY up their code. They see similar code being written and try to "abstract" it by putting it all in a class or a function. I put "abstract" in quotes because these functions and classes are often more of a distraction than an abstraction. Consider this example, where a developer has implemented two functions, post and put, that use the same code to get the query string parameters and body from an HTTP request.

import json


def post(request):
    # Parse an "a=1&b=2" style query string into a dict.
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    # Decode the JSON request body.
    body = json.loads(request.body)
    ...


def put(request):
    # Same parsing logic, duplicated from post().
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    body = json.loads(request.body)
    ...

To DRY up this code, they might create a function to abstract the logic of parsing HTTP requests.

import json

def parse_request(request):
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    body = json.loads(request.body)
    return params, body

def post(request):
    params, body = parse_request(request)
    ...


def put(request):
    params, body = parse_request(request)
    ...

That looks pretty good, but there is a problem. What happens when you want to use parse_request() for an HTTP GET request? You can't, because GET requests typically don't carry a body. Assuming the framework hands over None for a missing body, json.loads() would receive None and raise a TypeError. Just because code is DRY doesn't mean it's reusable. What would have been more reusable is a get_query_params function and a get_body function. The only way the developer would have seen this is by duplicating the code for getting query string parameters and request bodies in the post, put, and get functions, then going back to refactor and creating abstractions for the actual patterns in the code, not for what they anticipated the patterns to be.

It's worth noting that many IDEs have refactoring features that can find duplicated code and extract it for you. Very few IDEs have features that try to predict how code will be used, though. An IDE's refactoring features would likely have produced two functions, one for getting the query string parameters and another for getting the request body, because those were the repeated patterns in the code. The get_query_params and get_body functions are good abstractions because they are decoupled from one another and each has a single responsibility, making them SOLID as well. They also happen to be DRY code. The point is that when you try to predict use cases, you create premature abstractions, which lead to bloated classes and functions.
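To make the failure concrete, here is a minimal sketch. The Request dataclass is a hypothetical stand-in for a framework's request object, and it assumes a missing body shows up as None; a real framework may hand over an empty string or bytes instead, which fails in a different way.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """Hypothetical stand-in for a framework's request object."""
    query: str
    body: Optional[str] = None  # assumption: None when no body was sent


def parse_request(request):
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    body = json.loads(request.body)
    return params, body


# A POST with a JSON body parses fine.
post_result = parse_request(Request(query='id=1&lang=en', body='{"name": "cookie"}'))

# A GET without a body breaks the supposedly reusable helper.
try:
    parse_request(Request(query='id=1'))
    get_error = None
except TypeError as error:
    get_error = error  # json.loads() does not accept None
```

The query string parsed without trouble in both calls; only the bundled body parsing made the helper unusable for GET.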

Premature Abstraction is the Root of all Fragile Software

This brings me to my second point. Premature abstractions are the root of all fragile software. I say this because premature abstractions put you at risk of coding wrong assumptions. Wrong assumptions lead to bad abstractions, bad abstractions lead to hacks, and too many hacks lead to fragile code that is hard to maintain. Trying to always parse the request body was a wrong assumption, because not every HTTP request has a body. In this case there's a simple solution, but that's not always so, and the more features are stuffed into an abstraction, the more hacks will be needed to support all of its use cases. This creates a knot in your code, especially when the bad abstraction is used a lot. When fixing a heavily used bad abstraction you have to change a lot of calling code. Not only is that tedious, it also violates the open/closed principle from SOLID and increases the risk of introducing a bug. This is why I say duplicate code is far cheaper than the wrong abstraction.

Duplication is cheaper than the wrong abstraction

This could also be phrased as: duplication is easier to correct than the wrong abstraction. Removing duplication is like picking up toys and putting them in a toy box, while fixing the wrong abstraction, as I hinted at earlier, is like untying a knot. That's because duplicate code follows a pattern, and the abstraction that replaces it follows the same pattern, so the fix is to call the function or class everywhere the duplicate code was and pass in the appropriate variables when needed. You saw this previously when the parse_request() function was created. Untying the knot is usually more difficult, because you need to identify all the use cases and where they need to be handled. You then need to find and remove any hacks that were put in place to work with the bad abstraction. Here is an example of untying the knot that would have been created if the developer had forced the abstraction to handle the special case of GET requests.

Here is the code with a try/except block around the json.loads() call which is the hack.

import json


def parse_request(request):
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    # The hack: swallow every error so body-less GET requests don't crash.
    try:
        body = json.loads(request.body)
    except:
        body = {}

    return params, body


def post(request):
    params, body = parse_request(request)
    ...


def put(request):
    params, body = parse_request(request)
    ...


def get(request):
    params, body = parse_request(request)
    ...
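Part of what makes this hack costly is that a bare except hides real errors along with the expected one. The sketch below reuses the hacked parse_request() with a hypothetical Request class (an assumption, standing in for a framework's request object, with None for a missing body) to show that a POST with truncated JSON becomes indistinguishable from a GET with no body.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """Hypothetical stand-in for a framework's request object."""
    query: str
    body: Optional[str] = None  # assumption: None when no body was sent


def parse_request(request):
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    try:
        body = json.loads(request.body)
    except:  # the hack: hides TypeError and JSONDecodeError alike
        body = {}

    return params, body


# The case the hack was written for: a GET without a body.
get_result = parse_request(Request(query='page=2'))

# A POST whose JSON was cut off in transit. This is a client error,
# yet the caller receives an empty dict and never finds out.
post_result = parse_request(Request(query='page=2', body='{"name": "cook'))
```

Both calls return the same value, so post() and put() can no longer tell a broken payload from a missing one.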

Here is the code after untying the knot.

import json


def get_query_params(request):
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    return params


def get_body(request):
    return json.loads(request.body)


def post(request):
    params = get_query_params(request)
    body = get_body(request)
    ...


def put(request):
    params = get_query_params(request)
    body = get_body(request)
    ...


def get(request):
    # GET has no body to parse, so it only asks for the query parameters.
    params = get_query_params(request)
    ...
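As a rough check that the two decoupled helpers behave as intended, the sketch below runs them against a hypothetical Request class (an assumption, not any particular framework's API). The handlers return their parsed values only so the results can be inspected.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """Hypothetical stand-in for a framework's request object."""
    query: str
    body: Optional[str] = None  # assumption: None when no body was sent


def get_query_params(request):
    params = {}
    for param in request.query.split('&'):
        field, value = param.split('=')
        params[field] = value

    return params


def get_body(request):
    return json.loads(request.body)


def get(request):
    # Only the query string is needed, so the body is never touched.
    return get_query_params(request)


def post(request):
    return get_query_params(request), get_body(request)


get_result = get(Request(query='page=2'))
post_result = post(Request(query='page=2', body='{"name": "cookie"}'))

# With the try/except gone, a truncated payload fails loudly again.
try:
    post(Request(query='page=2', body='{"name": "cook'))
    malformed_error = None
except json.JSONDecodeError as error:
    malformed_error = error
```

Each handler now asks only for the knowledge it needs, and a malformed body surfaces as an error at the call that actually parses it.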

Untying this knot wasn't too bad, but it still required

  • creating a new function
  • renaming an old one
  • removing the try/except block

After all that, you still had to make sure the functions are called in the correct places, which was more work than DRYing up the duplicate code, which was basically copy/paste. The longer the bad abstraction lives on, the more use cases it accumulates and the more places it gets called from. Every time the bad abstraction is touched or used, the knot gets tighter and harder to untie.

Code is not Reusable, Abstractions are

It all comes down to this: code is NOT reusable, abstractions are. The definition of DRY even says as much: "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system". That means there should be one abstraction for each piece of knowledge: one abstraction that knows how to parse query string parameters, one that knows how to parse a request body, one abstract representation of a user, and so on. Maintaining one abstraction per piece of knowledge keeps code SOLID and DRY, and it all starts when you repeat yourself, so Please Always Stop To Repeat Yourself.

Top comments (1)

Turcu Laurentiu

Amazing article, this is so underrated!