Have you ever encountered a message like TypeError: unhashable type: 'set' and couldn't figure out why exactly this exception was raised? Well, I'm here to help you!
When does it usually happen
Among other situations, Python raises this exception as soon as you try to store something that cannot be hashed (AKA unhashable) in a structure implemented as a HashMap. The most common Python types implemented as HashMaps are dict and set.
For dicts, the TypeError: unhashable exception is raised when one tries to use an unhashable object as a dictionary key.
For sets, the TypeError: unhashable exception is raised when one tries to add an unhashable object to the set.
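Here is a minimal sketch that triggers both cases described above. The variable names are just for illustration, and the exact wording of the message can differ between Python versions, so I only check for the "unhashable type" part.

```python
unhashable_key = {1, 2, 3}  # a set is mutable, so it cannot be hashed

# Case 1: using an unhashable object as a dict key
try:
    lookup = {unhashable_key: "value"}
except TypeError as error:
    dict_message = str(error)

# Case 2: adding an unhashable object to a set
try:
    collection = set()
    collection.add([1, 2, 3])
except TypeError as error:
    set_message = str(error)

print(dict_message)  # mentions: unhashable type: 'set'
print(set_message)   # mentions: unhashable type: 'list'
```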
What does it mean "can (not) be hashed"
In Python, we say that something can be hashed if it is possible to calculate its hash with the built-in hash(obj) function. This function returns an integer for a given object, respecting the following rule:
Two objects that compare equal must also have the same hash value, but the reverse is not necessarily true.
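A quick way to see this rule in action is to hash a few values that compare equal. The second half of this sketch relies on a CPython implementation detail (hash(-1) and hash(-2) happen to collide), so treat it as an illustration rather than a guarantee.

```python
# Objects that compare equal must share a hash value.
assert 1 == 1.0 == True
same_hash = hash(1) == hash(1.0) == hash(True)
print(same_hash)  # True

# The reverse does not hold: different objects may share a hash.
# In CPython, hash(-1) and hash(-2) collide (an implementation detail).
collision = (-1 != -2) and hash(-1) == hash(-2)
print(collision)
```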
In computer science, hash functions are used in multiple contexts (like cryptography and compression), but in HashMaps (such as set and dict) they are used to compute a kind of ID for each object. This ID is used to organize data in such a way that we can do fast insertions and lookups.
This means that for every key in a dict (and for every object in a set) Python runs the function hash(obj), and that is when it actually raises the TypeError: unhashable.
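To make the idea of "hash as an ID" more concrete, here is a toy sketch of a hash-based set. ToyHashSet and its methods are hypothetical names I made up for this example, and the real dict and set implementations are far more elaborate. The point is only to show where hash(obj) gets called and why an unhashable object fails right there.

```python
class ToyHashSet:
    """Hypothetical, simplified hash set (not how CPython implements set)."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, item):
        # hash() is called here; unhashable items raise TypeError
        index = hash(item) % len(self._buckets)
        return self._buckets[index]

    def add(self, item):
        bucket = self._bucket_for(item)
        # Only the items in one bucket are compared, not the whole set
        if item not in bucket:
            bucket.append(item)

    def __contains__(self, item):
        return item in self._bucket_for(item)

    def __len__(self):
        return sum(len(bucket) for bucket in self._buckets)


toy = ToyHashSet()
toy.add("apple")
toy.add("apple")   # duplicate, ignored
toy.add("banana")
print(len(toy))        # 2
print("apple" in toy)  # True

try:
    toy.add(["not", "hashable"])
except TypeError:
    rejected = True
else:
    rejected = False
print(rejected)  # True
```

The hash picks the bucket, so an insertion or lookup only has to inspect a small part of the data.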
Hashable and unhashable types in Python
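As a general rule, Python's built-in immutable objects (such as int, str and frozenset) are hashable, and its built-in mutable containers (list, dict and set) are unhashable. A tuple is hashable only if all of its elements are hashable.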
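If you want to check this rule yourself, a small helper does the job. is_hashable is a hypothetical name of mine, not a built-in: it simply tries hash(obj) and reports whether it worked. Notice the last sample, where an immutable tuple is still unhashable because it holds a list.

```python
def is_hashable(obj):
    """Hypothetical helper: report whether hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


samples = {
    "int": 42,
    "str": "text",
    "tuple of ints": (1, 2),
    "frozenset": frozenset({1, 2}),
    "list": [1, 2],
    "dict": {"a": 1},
    "set": {1, 2},
    "tuple holding a list": (1, [2]),
}

results = {label: is_hashable(value) for label, value in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'hashable' if ok else 'unhashable'}")
```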
Actually, since what is really happening is a call to the object's __hash__(self) magic method, anything can be hashable in Python (as long as we implement it)!
So, to summarize it all, let's see a custom class with a custom __hash__(self) method:
# I'm using dataclass here to avoid coding
# __init__ and __repr__ methods
from dataclasses import dataclass


@dataclass
class User:
    name: str
    email: str

    def __hash__(self):
        # Here we define how we should compute
        # the hash for a given User. In this example
        # we use the hash of the 'email' attribute
        return hash(self.email)

    def __eq__(self, other):
        # This method is necessary if you want to
        # guarantee that User objects with same
        # email will not duplicate in a set or
        # as a dict key, for example
        if not isinstance(other, User):
            return NotImplemented
        return self.email == other.email


main_account = User("Vitor Buxbaum", "vitor.buxbaum@gmail.com")
gmail_account = User("Vitor Buxbaum", "vitor.buxbaum@gmail.com")
work_account = User("Vitor Buxbaum", "vitor.buxbaum@betrybe.com")

my_accounts = {main_account, gmail_account, work_account}
# my_accounts (order may vary) =
# {
#     User(name='Vitor Buxbaum', email='vitor.buxbaum@gmail.com'),
#     User(name='Vitor Buxbaum', email='vitor.buxbaum@betrybe.com')
# }
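One caveat: a hash that depends on an attribute you can still change is fragile, because the object gets "lost" in the set if that attribute changes after insertion. As a possible alternative (my own sketch, not the only way to do it), a frozen dataclass lets Python generate __eq__ and __hash__ for you and blocks later changes. FrozenUser and the example.com addresses are made-up names for illustration, and field(compare=False) is used so that only the email takes part in equality and hashing.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(frozen=True)
class FrozenUser:
    # compare=False keeps 'name' out of the generated __eq__ and __hash__
    name: str = field(compare=False)
    email: str


first = FrozenUser("First Name", "someone@example.com")
second = FrozenUser("Other Name", "someone@example.com")

accounts = {first, second}
print(len(accounts))  # 1, both share the same email

try:
    first.email = "changed@example.com"
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
print(mutation_blocked)  # True
```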