When I’m trying to find a topic for this series, I either decide to write about something I just learned, or I choose to write about something I found from the list of top Python questions on Stack Overflow. Today, I’m hitting both by covering how to merge two dictionaries in Python.
Problem Introduction
Earlier in this series, I covered a similar problem where I wanted to convert two lists into a dictionary. In that article, I covered various methods for mapping one list onto the other. This time around I want to convert two dictionaries into a single dictionary like so:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
# Insert merge code here
powers = { "Yusuke Urameshi": "Spirit Gun", "Hiei": "Jagan Eye"}
Here, we have two dictionaries: yusuke_power and hiei_power. Each dictionary maps a Yu Yu Hakusho character to one of their abilities. In this case, I chose Yusuke and his Spirit Gun as well as Hiei and his Jagan Eye. Ultimately, we want to be able to merge these dictionaries, so we have a collection of characters and their powers. Let’s see if we can accomplish that below.
Solutions
As always, I like to list off a few possible ways to solve the problem. To start, we’ll try a brute force solution, then we’ll dig into some more sophisticated solutions.
Merge Two Dictionaries with Brute Force
As is tradition in this series, I always like to kick things off with a roll-your-own solution. In this case, we’re looking to iterate over one dictionary and add its items to the other dictionary:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
for key, value in hiei_power.items():
    yusuke_power[key] = value
Naturally, this solution leaves a lot to be desired, but it gets the job done. At the end of the day, yusuke_power should look like the powers dictionary we want. The catch is that we modified yusuke_power in place instead of creating a new dictionary.
To accomplish something closer to what we want, we would have to iterate over both dictionaries separately:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = dict()
for dictionary in (yusuke_power, hiei_power):
    for key, value in dictionary.items():
        powers[key] = value
Unfortunately, this solution is pretty verbose for such a common task. That said, there are better ways to solve this problem.
Merge Two Dictionaries with a Dictionary Comprehension
Since I’m a big fan of comprehensions, I think it’s worth mentioning that the solution above can be written in a single line with a dictionary comprehension:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = {key: value for d in (yusuke_power, hiei_power) for key, value in d.items()}
Here, we have written a dictionary comprehension that iterates over both dictionaries and copies each item into a new dictionary. Naturally, it works just like the brute force solution.
Merge Two Dictionaries with Copy and Update
Like many of the collections in Python, dictionaries have a builtin copy method. As a result, we can leverage that method to generate a new dictionary which includes all the items of the original dictionary. In addition, dictionaries have an update method which can be used to add all the items from one dictionary into another:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = yusuke_power.copy()
powers.update(hiei_power)
With this solution, we’re able to generate that powers dictionary which contains all the items from the original two dictionaries. As an added benefit, copy and update are backwards compatible, so Python 2 users won’t feel left out.
It’s worth noting that we can extend this solution to merge any number of dictionaries with a custom function:
def merge_dicts(*dicts: dict):
    merged_dict = dict()
    for dictionary in dicts:
        # Later dictionaries overwrite earlier ones on duplicate keys
        merged_dict.update(dictionary)
    return merged_dict
Now, we can generate a new dictionary which contains all the items in any number of dictionaries.
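As a quick usage sketch, here is how that helper might be called with three dictionaries. The third dictionary, kurama_power, is my own addition for illustration (it borrows the Yoko Kurama entry that shows up later in this article), and the function is repeated so the snippet runs on its own:

```python
def merge_dicts(*dicts: dict) -> dict:
    """Merge any number of dictionaries into a new one; later ones win."""
    merged_dict = dict()
    for dictionary in dicts:
        merged_dict.update(dictionary)
    return merged_dict


yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
kurama_power = {"Yoko Kurama": "Rose Whip"}  # hypothetical third dictionary

powers = merge_dicts(yusuke_power, hiei_power, kurama_power)
print(powers)
```

Since the function starts from an empty dictionary, none of the inputs are modified along the way.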
Merge Two Dictionaries with Dictionary Unpacking
When Python 3.5 rolled out, it introduced a dictionary unpacking syntax which allows us to merge dictionaries with a new operator:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = {**yusuke_power, **hiei_power}
Naturally, this solution scales for any number of arguments:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = {**yusuke_power, **hiei_power, "Yoko Kurama": "Rose Whip"}
Of course, the drawback is backwards compatibility. If you’re still rocking Python 2 or even older versions of Python 3, this feature may not be available to you. Regardless, I think it’s a pretty clever piece of syntax, and I like how it looks.
Performance
For the first time in this series, I thought it would be beneficial to take a look at the performance of each of the methods above (if you’re lucky, I might update the old articles to include performance as well). To do that, I’m going to use the builtin timeit library.
To use the timeit library, we have to set up some strings for testing:
setup = """
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = dict()
"""
brute_force = """
for dictionary in (yusuke_power, hiei_power):
    for key, value in dictionary.items():
        powers[key] = value
"""
dict_comprehension = """
powers = {key: value for d in (yusuke_power, hiei_power) for key, value in d.items()}
"""
copy_and_update = """
powers = yusuke_power.copy()
powers.update(hiei_power)
"""
dict_unpacking = """
powers = {**yusuke_power, **hiei_power}
"""
With our strings set up, we can begin our performance test:
>>> import timeit
>>> timeit.timeit(stmt=brute_force, setup=setup)
1.517404469999974
>>> timeit.timeit(stmt=dict_comprehension, setup=setup)
1.6243454339999062
>>> timeit.timeit(stmt=copy_and_update, setup=setup)
0.7273476979999032
>>> timeit.timeit(stmt=dict_unpacking, setup=setup)
0.2897768919999635
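As it turns out, dictionary unpacking is very fast. It finished timeit’s default one million runs in about 0.29 seconds, which is roughly 2.5 times faster than copy and update and over five times faster than both the brute force loop and the dictionary comprehension. For reference, I performed the testing on a Surface Go with Windows 10 and Python 3.7.1.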
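If you want to repeat the measurement on your own machine, the strings can be bundled into a small script. This is only a sketch: the snippets dictionary, the reduced run count of 100,000, and the use of timeit.repeat with the minimum of three repeats are my own choices here, so the numbers won’t line up with the ones reported in this article:

```python
import timeit

setup = """
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = dict()
"""

# Same four statements as in the article, keyed by name (assumed layout).
snippets = {
    "brute_force": """
for dictionary in (yusuke_power, hiei_power):
    for key, value in dictionary.items():
        powers[key] = value
""",
    "dict_comprehension": """
powers = {key: value for d in (yusuke_power, hiei_power) for key, value in d.items()}
""",
    "copy_and_update": """
powers = yusuke_power.copy()
powers.update(hiei_power)
""",
    "dict_unpacking": """
powers = {**yusuke_power, **hiei_power}
""",
}

results = {}
for name, stmt in snippets.items():
    # Take the best of three repeats to reduce noise from other processes.
    timings = timeit.repeat(stmt=stmt, setup=setup, repeat=3, number=100_000)
    results[name] = min(timings)

# Print the fastest solution first.
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:>20}: {seconds:.4f} s")
```

The absolute times depend on the hardware and Python version, so the ordering is the more interesting part of the output.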
A Little Recap
Well, that’s all I have in terms of typical solutions. All that said, be aware that all of these solutions will overwrite the values of duplicate keys. In other words, if two dictionaries contain the same key, the last dictionary to be merged will overwrite the previous dictionary’s value.
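To make that overwrite behavior concrete, here is a small sketch with a deliberately duplicated key. The dictionaries first and second are made up for illustration and don’t appear anywhere else in this article:

```python
first = {"Hiei": "Jagan Eye"}
second = {"Hiei": "Dragon of the Darkness Flame"}  # same key, different value

# Three of the solutions from this article, applied in the same order.
unpacked = {**first, **second}
updated = first.copy()
updated.update(second)
comprehension = {key: value for d in (first, second) for key, value in d.items()}

# Swapping the order swaps the winner.
reversed_order = {**second, **first}

print(unpacked)        # {'Hiei': 'Dragon of the Darkness Flame'}
print(reversed_order)  # {'Hiei': 'Jagan Eye'}
```

If losing a value like this is a problem, the merge order (or a check for shared keys before merging) is the thing to look at.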
Also, it’s worth noting that all of these solutions perform a shallow copy of the dictionaries. As a result, dictionaries that may be nested or store objects will only have their references copied, not the actual values. If that’s a constraint in your application, you may need to write your own recursive copy function.
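Here’s a rough sketch of what the shallow copy looks like in practice. The list-valued dictionaries team_a and team_b are invented for this example, and I’m using copy.deepcopy from the standard library as one possible alternative to a hand-written recursive copy:

```python
import copy

team_a = {"Yusuke Urameshi": ["Spirit Gun"]}
team_b = {"Hiei": ["Jagan Eye"]}

# Shallow merge: the new dictionary shares the list objects with the originals.
shallow = {**team_a, **team_b}
shallow["Hiei"].append("Sword")
print(team_b["Hiei"])  # ['Jagan Eye', 'Sword'], so the original changed too

# Deep copy of the merged result: the nested lists are duplicated as well.
deep = copy.deepcopy({**team_a, **team_b})
deep["Yusuke Urameshi"].append("Shotgun")
print(team_a["Yusuke Urameshi"])  # ['Spirit Gun'], so the original is untouched
```

Keep in mind that a deep copy duplicates every nested object, so it may cost noticeably more time and memory than the shallow merges above.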
At any rate, here are all the solutions:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
powers = dict()
# Brute force
for dictionary in (yusuke_power, hiei_power):
    for key, value in dictionary.items():
        powers[key] = value
# Dictionary Comprehension
powers = {key: value for d in (yusuke_power, hiei_power) for key, value in d.items()}
# Copy and update
powers = yusuke_power.copy()
powers.update(hiei_power)
# Dictionary unpacking (Python 3.5+)
powers = {**yusuke_power, **hiei_power}
# Backwards compatible function for any number of dicts
def merge_dicts(*dicts: dict):
    merged_dict = dict()
    for dictionary in dicts:
        merged_dict.update(dictionary)
    return merged_dict
And, that’s it! As always, I appreciate the support. If you liked this article, do me a favor and share it with someone. For those feeling extra generous, consider becoming a member of The Renegade Coder.
Once again, thanks for the support! Before you go, share your recommendation for a topic you’d like to see in the comments.
The post How to Merge Two Dictionaries in Python appeared first on The Renegade Coder.
Top comments (6)
Assuming you are going to discard the original dicts in favor of the merged one (which is typically the case), there is no need to "copy and update". You can simply run the update() method on one of the dicts and treat it as the final one, which will be faster and more efficient:
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}
hiei_power.update(yusuke_power)
powers = hiei_power
Yeah, that works too! That said, I’m not a huge fan of introducing an alias just for the sake of performance, but that’s just me.
Also, if we’re talking hypotheticals, it might be useful to maintain the old dictionaries since an update is going to overwrite any duplicate keys.
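For what it’s worth, the aliasing concern can be seen in a few lines. This is just a sketch of the in-place approach from the comment above, and the Kazuma Kuwabara entry is a made-up addition for illustration:

```python
yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}

hiei_power.update(yusuke_power)  # mutates hiei_power in place
powers = hiei_power              # no copy, just a second name

print(powers is hiei_power)  # True

# Any later change through one name is visible through the other.
powers["Kazuma Kuwabara"] = "Spirit Sword"  # hypothetical extra entry
print("Kazuma Kuwabara" in hiei_power)  # True
```

In other words, the original hiei_power dictionary no longer exists in its pre-merge form once update has run.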
Do not forget about ChainMap. ChainMap is a really memory-efficient way of working with multiple dictionaries as a single data structure. It is extremely powerful.
From the Python documentation:
In a lot of your examples the dictionaries will be duplicated (or you will have duplication of key-value pairs). This is fine for smaller datasets, but when you start working with large dictionaries, you might start chewing through a lot more memory than you would like.
In your examples you are also dealing with a small set of dictionaries (2-3). Where ChainMap is useful is when you have many (10s, 100s, or 1,000s) of dictionaries.
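For anyone curious, a minimal sketch of the ChainMap approach might look like the following. It uses collections.ChainMap from the standard library with the two dictionaries from the article. One difference worth flagging: ChainMap searches its mappings from left to right, so on duplicate keys the first dictionary wins, which is the reverse of the merges in the article:

```python
from collections import ChainMap

yusuke_power = {"Yusuke Urameshi": "Spirit Gun"}
hiei_power = {"Hiei": "Jagan Eye"}

# No items are copied; the ChainMap only holds references to both dictionaries.
powers = ChainMap(yusuke_power, hiei_power)
print(powers["Hiei"])  # Jagan Eye

# Because it is a view, later changes to the originals show up immediately.
hiei_power["Hiei"] = "Sword"
print(powers["Hiei"])  # Sword

# A regular dictionary can still be built from the view if one is needed.
merged = dict(powers)
print(merged)
```

Whether the memory savings matter probably depends on how large and how numerous the dictionaries are.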
It always amazes me just how big the standard library is. For instance, I’ve never heard of ChainMap. Thanks for the tip.
Also, good considerations! To be fair, I was only considering two dictionaries based on the title, and I usually don’t worry too much about premature optimization. If these solutions ended up being a bottleneck, I might look into ChainMap.
Great article. I liked the many options to solve the problem. I feel the dictionary comprehension is too much of a cool trick that is hard to understand. The scaling version with the kwargs is beautiful.
As with some other articles, I'm a bit disappointed with using timing libraries to measure the performance of code because the running time is biased by and tied to the machine's performance. Big O notation would be more suitable.
Big O is great, but you'd need to know how these solutions work internally to measure it. That's why we use empirical measurements like the timeit library to get a rough idea of each solution's performance.
Personally, I prefer correctness and readability over performance anyway. If something becomes a major bottleneck, then I would start looking into more performant solutions.