
Avoid dict overuse in Python

Victor Silva ・2 min read

Python's dicts are great and we all love them, but on large projects we can easily end up with dozens of functions like this:
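As a minimal sketch of such a function: the name calculate_loan, its parameters, the "amount" key and the 30-day schedule are assumptions made up for illustration; only the keys daily_interest_rate and due_dates come from the discussion that follows.

```python
from datetime import date, timedelta


def calculate_loan(amount, yearly_interest_rate, installments, start):
    """Hypothetical loan calculation that returns a plain dict."""
    daily_interest_rate = yearly_interest_rate / 365
    # Installment number -> due date, assuming one installment every 30 days.
    due_dates = {
        number: start + timedelta(days=30 * number)
        for number in range(1, installments + 1)
    }
    return {
        "amount": amount,
        "daily_interest_rate": daily_interest_rate,
        "due_dates": due_dates,
    }


result = calculate_loan(1000.0, 0.073, 3, date(2020, 1, 1))
```

Nothing in the signature tells the caller which keys come back or what due_dates looks like inside.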

There's nothing inherently wrong with the function above, but there are some little things that may complicate our lives as the project grows.

We may mistype the name of a key when accessing the returned data (like typing daily_interest_rates instead of daily_interest_rate). Because we're dealing with a dict, we would only find this error (that is, if we find it at all) at run time, and only when the faulty line actually executes.
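To make the "if we find it" part concrete, here is a small sketch. The describe function and its verbose flag are hypothetical; the point is only that the typo stays hidden until the branch containing it runs.

```python
# A dict shaped like the one discussed above (values are placeholders).
result = {"daily_interest_rate": 0.0002, "due_dates": {}}


def describe(result, verbose):
    if verbose:
        # Deliberate typo: "daily_interest_rates" instead of "daily_interest_rate".
        return f"rate: {result['daily_interest_rates']}"
    return "ok"


# The non-verbose path never touches the bad key, so nothing fails here.
summary = describe(result, verbose=False)

# Only the verbose path raises, and only at run time.
caught = None
try:
    describe(result, verbose=True)
except KeyError as error:
    caught = error
```

If no test or user ever takes the verbose path, the mistake can sit in the code base unnoticed.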

Or maybe we could start calling this function and using its returned dict in a lot of places. In this case, we would find ourselves repeatedly coming back to the implementation just to remember the dict's keys, or to check the due dates' structure.

We can get rid of these potential problems in advance, and make this code a lot more explicit, if we

Use proper classes instead of using dicts everywhere

Let's try it:

Note: below, I will use dataclasses, added in Python 3.7, to avoid some boilerplate __init__ code.
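A minimal sketch of what this could look like, under stated assumptions: the class name CalculationResult and the fields daily_interest_rate and due_dates come from the text, while calculate_loan, its parameters, the amount field and the 30-day schedule are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict


@dataclass
class CalculationResult:
    amount: float
    daily_interest_rate: float
    # Installment number -> due date of that installment.
    due_dates: Dict[int, date]


def calculate_loan(
    amount: float,
    yearly_interest_rate: float,
    installments: int,
    start: date,
) -> CalculationResult:
    """Hypothetical loan calculation that returns a typed result."""
    due_dates = {
        number: start + timedelta(days=30 * number)
        for number in range(1, installments + 1)
    }
    return CalculationResult(
        amount=amount,
        daily_interest_rate=yearly_interest_rate / 365,
        due_dates=due_dates,
    )


result = calculate_loan(1000.0, 0.073, 3, date(2020, 1, 1))
```

The return annotation now documents the shape of the data, and the fields are read as attributes (result.due_dates) rather than string keys.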

Now our function returns instances of a class, and thanks to the type hints that were added, we know the exact structure of these instances.

Note: if you have never heard of type hints, check my really short introduction to them.

By using any Python-aware IDE, the two problems that I described before are no more. Because of the new class CalculationResult combined with type hints, PyCharm will warn me if I mistype a field name:

[Screenshot: PyCharm - mistyped field]

If I forget the structure of the due dates, type hints have got me covered. PyCharm will rely on them to remind me that the dates are the values of a dict whose keys are integers:

[Screenshot: PyCharm - field structure]
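For readers without the screenshots, the same two points can be sketched in code. The helper first_due_date and the sample values are assumptions for illustration; CalculationResult is trimmed to the two fields named in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict


@dataclass
class CalculationResult:
    daily_interest_rate: float
    due_dates: Dict[int, date]


def first_due_date(result: CalculationResult) -> date:
    # Dict[int, date] tells the reader (and the IDE) that keys are
    # installment numbers and values are dates.
    return result.due_dates[min(result.due_dates)]


result = CalculationResult(
    daily_interest_rate=0.0002,
    due_dates={1: date(2020, 1, 31), 2: date(2020, 3, 1)},
)

# A type-aware tool can point out the unknown attribute below before the
# code runs; without such a tool it still fails, as an AttributeError.
try:
    result.daily_interest_rates  # deliberate typo
except AttributeError:
    typo_detected = True
else:
    typo_detected = False
```

The class does not make typos impossible; it gives the tooling enough information to catch them early.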

Now, I'm not saying we should stop using dicts altogether. My point is that, for larger projects, maybe it's better to represent our data as proper classes. With the new Python dataclasses, we can do this in a jiffy.

Nowadays I tend to prefer dicts only in places like the field CalculationResult.due_dates above: when all the keys are of the same type and represent the same thing (e.g. the installments) and all the values are of the same type and represent the same thing (e.g. the due date of the corresponding installment).

I find that my code is getting more expressive, explicit and maintainable by adopting this strategy.


Discussion


Thanks for the article, I completely agree. For myself, even on very small projects, the benefits of using an explicit class make themselves apparent very quickly. My rule of thumb is that if I know ahead of time what all the keys are, it shouldn't be a dictionary, it should be a class.

With good tools for quickly defining classes, there is no reason not to do this any more. I tend to use the attrs library instead of the standard library dataclass, but it works very similarly, especially if you use auto_attribs=True.