Maxi Contieri

Posted on • Originally published at maximilianocontieri.com

Code Smell 231 - Redundant Data

Where are your sources of truth?

TL;DR: Say it only once

Problems

  • Don't Repeat Yourself principle violation

  • Consistency problems

  • Maintainability

  • Testing and Debugging

Solutions

  1. Keep responsibilities in the relevant objects and delegate to a single source of truth

Context

The principle of "Don't Repeat Yourself" (DRY) encourages you to avoid redundancy and duplication of behavior.

Redundant data can lead to inconsistencies because updates or changes need to be made in multiple places.

If you update one instance of the data and forget to update another, your system can become inconsistent, which can lead to errors and unexpected behavior.

Maintaining redundant data can be a nightmare when you make changes or updates, since it increases the workload and the likelihood of introducing errors during maintenance.

With a single source of truth, you only need to make changes in one place, simplifying the maintenance process.

When data is repeated in multiple places, it becomes difficult to identify the authoritative source of that data, leading to confusion for developers.

Sample Code

Wrong

class Transfer:
    def __init__(self, amount, income, expense):
        self.amount = amount
        self.income = income
        self.expense = expense

class Income:
    def __init__(self, amount):
        self.amount = amount
        # amount is the same for party and counterparty

class Expense:
    def __init__(self, amount):
        self.amount = amount

transfer_amount = 1000  
# simplification: should be a money object with the currency

income = Income(transfer_amount)
expense = Expense(transfer_amount)
transfer = Transfer(transfer_amount, income, expense)

print("Transfer amount:", transfer.amount)
print("Income amount:", transfer.income.amount)
print("Expense amount:", transfer.expense.amount)
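A minimal sketch of how this design can drift, reusing the same three classes. The later correction to 1200 is a hypothetical change added only for illustration.

```python
class Income:
    def __init__(self, amount):
        self.amount = amount

class Expense:
    def __init__(self, amount):
        self.amount = amount

class Transfer:
    def __init__(self, amount, income, expense):
        self.amount = amount
        self.income = income
        self.expense = expense

transfer = Transfer(1000, Income(1000), Expense(1000))

# Hypothetical correction: only the transfer's own copy is updated
transfer.amount = 1200

# The three copies no longer agree, and nothing in the model notices
amounts = {transfer.amount, transfer.income.amount, transfer.expense.amount}
is_consistent = len(amounts) == 1
print("Consistent:", is_consistent)
```

After the update, the income and the expense still hold the stale value, so the reader cannot tell which amount is authoritative.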

Right

class Transfer:
    def __init__(self, amount):
        self.amount = amount
        self.income = Income(self)
        self.expense = Expense(self)

class Income:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        return self.transfer.amount

class Expense:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        return self.transfer.amount

transfer_amount = 1000  
transfer = Transfer(transfer_amount)

print("Transfer amount:", transfer.amount)
print("Income amount:", transfer.income.get_amount())
print("Expense amount:", transfer.expense.get_amount())
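Because Income and Expense delegate to the transfer, a later change to the amount is visible everywhere. A short sketch, reusing the classes from this version; the update to 1200 is a hypothetical value chosen for illustration.

```python
class Income:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        # Delegates to the single source of truth
        return self.transfer.amount

class Expense:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        return self.transfer.amount

class Transfer:
    def __init__(self, amount):
        self.amount = amount
        self.income = Income(self)
        self.expense = Expense(self)

transfer = Transfer(1000)

# Hypothetical correction: there is only one place to change
transfer.amount = 1200

income_amount = transfer.income.get_amount()
expense_amount = transfer.expense.get_amount()
print("Income amount:", income_amount)
print("Expense amount:", expense_amount)
```

Both parties report the updated amount without any synchronization code.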

Detection

[X] Manual

This is a semantic smell

Exceptions

  • For performance reasons, you can add caches and other redundancy, but keeping the copies synchronized takes extra effort
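One way to pay that synchronization cost is to invalidate the cached copy on every change. The sketch below is an assumption for illustration: the `Account` class and its method names are hypothetical and do not come from the article.

```python
class Account:
    """Hypothetical example: the balance is derived from the movements."""

    def __init__(self):
        self._movements = []         # single source of truth
        self._cached_balance = None  # redundant copy, kept only for speed

    def add_movement(self, amount):
        self._movements.append(amount)
        # Every mutation must invalidate the cache, or it goes stale
        self._cached_balance = None

    def balance(self):
        if self._cached_balance is None:
            self._cached_balance = sum(self._movements)
        return self._cached_balance

account = Account()
account.add_movement(1000)
account.add_movement(-300)
first_balance = account.balance()

account.add_movement(50)
second_balance = account.balance()
print(first_balance, second_balance)
```

The cache stays private to the object, so callers only ever see the derived value, and the extra effort is confined to the methods that mutate the movements.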

Tags

  • Data

Conclusion

In larger and more complex systems, redundancy becomes a significant problem.

As your system grows, the challenges associated with maintaining and synchronizing redundant data also increase.

Redundant data also increases the surface area for testing and debugging.

You need to ensure that all copies of the data behave consistently, which can be a challenging task.

Disclaimer

Code Smells are my opinion.

Credits

Photo by Jørgen Håland on Unsplash


Everything will ultimately fail. Hardware is fallible, so we add redundancy. This allows us to survive individual hardware failures, but increases the likelihood of having at least one failure at any given time.

Michael Nygard


This article is part of the CodeSmell Series.
