get_or_create is an awesome helper utility, until it’s not. There is a nod to what we’re about to talk about in the docs:
This method is atomic assuming correct usage, correct database configuration, and correct behavior of the underlying database. However, if uniqueness is not enforced at the database level for the kwargs used in a get_or_create call (see unique or unique_together), this method is prone to a race-condition which can result in multiple rows with the same parameters being inserted simultaneously.
Let’s talk about it in more detail. Here are the interesting bits of the source code for get_or_create:
    lookup, params = self._extract_model_params(defaults, **kwargs)
    try:
        return self.get(**lookup), False
    except self.model.DoesNotExist:
        return self._create_object_from_params(lookup, params)
Pretty self-explanatory, and does exactly what the name implies. And if you go further into _create_object_from_params, you’ll notice that it does a lot more than just make a call to .create(). Here’s what happens there:
    try:
        with transaction.atomic(using=self.db):
            obj = self.create(**params)
        return obj, True
    except IntegrityError:
        exc_info = sys.exc_info()
        try:
            return self.get(**lookup), False
        except self.model.DoesNotExist:
            pass
        six.reraise(*exc_info)
This is cool: it’s actually accounting for race conditions. It tries to create the object, but if that operation throws an IntegrityError, it does the lookup again and tries to return what it finds.
The problem is this: suppose one thread reaches this part of the code (meaning its lookup has already run and returned nothing), and the model has no uniqueness constraint on the attributes you’re doing the lookup based on. If a matching object is created in another thread in the meantime, the creation in this thread will not throw an IntegrityError, and you’ll end up with two! This may be fine, for now. After all, your call to get_or_create returned an instance matching your parameters, and so did the call in the other thread, and both will carry on their merry way.
The problem arises the next time you try to retrieve the object using the same lookup params with get_or_create. Because you now have two objects in your database, when get_or_create tries its .get() (note, not .filter()), you’ll get a MultipleObjectsReturned error, and no way out unless you catch this exception yourself.
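To make the failure concrete, here is a minimal sketch of the same sequence of events. It is not Django code: it uses the standard library's sqlite3 module as a stand-in, and the tag table, the get and create helpers, and the MultipleRowsReturned exception are all hypothetical names invented for this illustration. The two "threads" are interleaved by hand so the outcome is deterministic.

```python
import sqlite3


class MultipleRowsReturned(Exception):
    """Hypothetical stand-in for Django's MultipleObjectsReturned."""


def get(conn, name):
    """Return the id of the single matching row, or None if there is none.

    Like .get(), this refuses to pick one when several rows match.
    """
    rows = conn.execute("SELECT id FROM tag WHERE name = ?", (name,)).fetchall()
    if len(rows) > 1:
        raise MultipleRowsReturned(f"{len(rows)} rows match name={name!r}")
    return rows[0][0] if rows else None


def create(conn, name):
    """Insert a row and return its id."""
    cur = conn.execute("INSERT INTO tag (name) VALUES (?)", (name,))
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
# Assumption for the demo: no UNIQUE constraint on name.
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT)")

# Both callers do their lookup before either one creates anything.
found_a = get(conn, "django")  # caller A finds nothing
found_b = get(conn, "django")  # caller B finds nothing either
id_a = create(conn, "django")  # A inserts; the database does not complain
id_b = create(conn, "django")  # B inserts too; still no integrity error

# The next lookup with the same parameters is the one that blows up.
try:
    get(conn, "django")
    later_lookup_failed = False
except MultipleRowsReturned:
    later_lookup_failed = True
```

Both callers walk away with a valid id, which is why the duplicate tends to go unnoticed until a later lookup hits it.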
The moral of the story? Don’t use get_or_create on objects that don’t have uniqueness constraints, at the database level, on the attributes you’re doing the lookup based on.
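For contrast, here is a rough sketch of why the database-level constraint matters. Again this is plain sqlite3 rather than Django, and the tag table, the simplified get_or_create function, and its before_create hook (used only to simulate another caller winning the race) are assumptions made up for the example. With UNIQUE on the lookup column, the losing insert raises an integrity error and the fallback lookup can recover.

```python
import sqlite3


def get_or_create(conn, name, before_create=None):
    """Simplified get-then-create with an integrity-error fallback.

    Returns (row_id, created), loosely mirroring the shape of Django's helper.
    """
    row = conn.execute("SELECT id FROM tag WHERE name = ?", (name,)).fetchone()
    if row is not None:
        return row[0], False
    if before_create is not None:
        before_create()  # demo hook: another caller inserts right here
    try:
        cur = conn.execute("INSERT INTO tag (name) VALUES (?)", (name,))
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The constraint rejected our insert, so fetch the row that won.
        row = conn.execute("SELECT id FROM tag WHERE name = ?", (name,)).fetchone()
        if row is None:
            raise
        return row[0], False


conn = sqlite3.connect(":memory:")
# This time the database itself enforces uniqueness on name.
conn.execute("CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")


def rival_insert():
    """Simulates the other thread creating the row between lookup and insert."""
    conn.execute("INSERT INTO tag (name) VALUES ('django')")


tag_id, created = get_or_create(conn, "django", before_create=rival_insert)
row_count = conn.execute(
    "SELECT COUNT(*) FROM tag WHERE name = 'django'"
).fetchone()[0]
```

Here the caller that lost the race gets back the existing row with created set to False, and the table still holds exactly one matching row.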