
Pythonic way to aggregate or group elements in a list using dict.get and dict.setdefault

Micheal Ojemoron ・2 min read

Using dict.get

How often have you aggregated an item by its group like this:

ar = [1, 1, 3, 1, 2, 1, 3, 3, 3, 3]
pairs = {}
for i in ar:
    if i not in pairs:
        pairs[i] = 0
    pairs[i] = pairs[i] + 1

print(pairs)
{1: 4, 3: 5, 2: 1}

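The above code shows how to get the frequency of elements in a list.
While this code works, it needs a separate membership test (in) before every update. On a dict that test is O(1) on average and O(n) only in the worst case of many hash collisions, the same as dict.get, so the gain from dropping it is brevity and fewer lookups rather than a better complexity class.
We can make this code shorter and more Pythonic by using Python's dict.get(k, d) method.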

Using the same list, we have:

pairs = {}
for i in ar:
    pairs[i] = pairs.get(i, 0) + 1
print(pairs)
{1: 4, 3: 5, 2: 1}

The trick here is that dict.get(k, d) returns the value stored under k if the key exists; otherwise it returns the default d, or None if no default was given.
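
As a small sketch of that behaviour (the `stock` dictionary and its keys are made up for illustration and are not part of the example above), note that `get` only reads and never inserts the missing key:

```python
# Hypothetical data, used only to illustrate dict.get.
stock = {"apple": 3}

present = stock.get("apple", 0)   # key exists, so its value is returned
missing = stock.get("pear")       # no default given, so None is returned
fallback = stock.get("pear", 0)   # default supplied, so 0 is returned

print(present, missing, fallback)  # 3 None 0
print("pear" in stock)             # False: get() did not add the key
```

This is why the counting loop still has to assign the result back with `pairs[i] = ...`.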

Using dict.setdefault

Similarly, when grouping things together, we usually do this:

items_by_type = {}
for item in items:
    if item.type not in items_by_type:
        items_by_type[item.type] = list()
    items_by_type[item.type].append(item)

Let's make it better and more Pythonic:

items_by_type = {}
for item in items:
    items_by_type.setdefault(item.type, list()).append(item)

The trick here is that setdefault(k, d) only stores d under k if the key does not
exist, and then returns it. If the key does exist, it simply returns the value already stored and leaves the dict unchanged.
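
The snippets above assume an `items` collection that is never defined. A minimal runnable sketch, assuming each item is a simple record with `name` and `type` fields (the `Item` namedtuple and the sample data are hypothetical), could look like this:

```python
from collections import namedtuple

# Item and its fields are hypothetical stand-ins for the article's `items`.
Item = namedtuple("Item", ["name", "type"])
items = [
    Item("apple", "fruit"),
    Item("carrot", "vegetable"),
    Item("pear", "fruit"),
]

items_by_type = {}
for item in items:
    # First time a type is seen: stores [] under it and returns that list.
    # Afterwards: returns the list already stored and ignores the default.
    items_by_type.setdefault(item.type, []).append(item)

names_by_type = {t: [i.name for i in group] for t, group in items_by_type.items()}
print(names_by_type)  # {'fruit': ['apple', 'pear'], 'vegetable': ['carrot']}
```

One design note: the default argument is evaluated on every call, so a fresh empty list is built even when the key already exists and the list is thrown away. For a list that cost is negligible, but it matters if the default is expensive to construct.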

Final Thoughts

You can also use the Counter class from the collections module to get the frequency of elements in a list.
Just do this:

from collections import Counter
# using our example above:
print(Counter(ar))
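
For readers who want to try this without the earlier snippets, here is a self-contained sketch using the same list. It also shows two Counter conveniences that the plain dict versions do not offer: missing keys count as zero, and `most_common` returns the most frequent elements.

```python
from collections import Counter

ar = [1, 1, 3, 1, 2, 1, 3, 3, 3, 3]
counts = Counter(ar)

print(counts[3])              # 5
print(counts[7])              # 0: missing keys count as zero, no KeyError
print(counts.most_common(1))  # [(3, 5)]
print(dict(counts) == {1: 4, 3: 5, 2: 1})  # True
```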

Happy coding! ✌

Please drop your comments :)


Micheal Ojemoron

@mojemoron

Problem solver | Technical Writer | Voracious Reader interested in Cloud Computing, Algorithms & Data Structures, ML

Discussion

 

This is just too magical, and hard to understand.

items_by_type = {}
for item in items:
    items_by_type.setdefault(item.type, list()).append(item)

A less magical version is defaultdict (from collections):

from collections import defaultdict

items_by_type = defaultdict(list)
for item in items:
    items_by_type[item.type].append(item)
 

Ditto the swap from

for i in range(len(items)):
   ...

to

for item in items:
   ...
 

You are absolutely right, I have updated the code. Thanks.

 

You are right 😁. Fortunately, there are several ways to do things in Python. In this post I am more concerned with using dictionary methods. Thank you.

 

Good walk-through of the different dictionary methods.

 
 

I love the Counter object. I've had a REST API get bogged down with pandas trying to count something, and it took a few seconds for each response. Counter took just a few ms.