How to Sort a List of Dictionaries in Python

Jeremy Grifski. Originally published at therenegadecoder.com ・5 min read

You may recall that I recently published an article on parsing a spreadsheet, and the output ended up being a list of dictionaries. Of course, for data processing purposes, it’s always nice to be able to sort that data, so I thought it would be fun to share a few options for sorting a list of dictionaries in Python.

Problem Introduction

As mentioned before, I was working on parsing a CSV file for data visualization, and I ended up getting everything I wanted in the following format:

csv_mapping_list = [
  {
    "Name": "Jeremy",
    "Age": 25,
    "Favorite Color": "Blue"
  },
  {
     "Name": "Ally",
     "Age": 41,
     "Favorite Color": "Magenta"
  },
  {
    "Name": "Jasmine",
    "Age": 29,
    "Favorite Color": "Aqua"
  }
]

Of course, having the data in a nice format and actually using that data for visualization are very different problems. In other words, we have our data, but we might want to use only a subset of it. Likewise, the order of the data might matter.

As an example, we could order our data points by age. That way we could plot them in order of increasing or decreasing age to see if we could spot any trends. For instance, maybe older individuals prefer certain colors, or perhaps younger individuals have certain types of names.

In any case, we always have to start with data processing. Today, I want to focus on sorting a list of dictionaries.

Solutions

As always, I like to share many possible solutions. It’s normal for me to share a brute force method followed by a couple more elegant methods, so take care to skip ahead if needed.

Sorting a List of Dictionaries by Hand

Sorting is probably one of the most researched areas of Computer Science, so we won’t dive into the philosophy. Instead, we’ll leverage one of the more popular algorithms, selection sort:

size = len(csv_mapping_list)
for i in range(size):
    min_index = i
    for j in range(i + 1, size):
        if csv_mapping_list[min_index]["Age"] > csv_mapping_list[j]["Age"]:
            min_index = j    
    temp = csv_mapping_list[i]
    csv_mapping_list[i] = csv_mapping_list[min_index]
    csv_mapping_list[min_index] = temp

Here, we’ve sorted the list of dictionaries in place by age. To do that, we leverage the “Age” field of each dictionary as seen in line 5.
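If we wanted to reuse this loop for other fields, one option is to wrap it in a function. The following is only a sketch: the name selection_sort_by and its field parameter are made up for illustration, and the sample list is a trimmed copy of the data above.

```python
def selection_sort_by(items, field):
    """Sort a list of dictionaries in place by the given field (selection sort)."""
    size = len(items)
    for i in range(size):
        min_index = i
        # Find the smallest remaining value of the field.
        for j in range(i + 1, size):
            if items[min_index][field] > items[j][field]:
                min_index = j
        # Move it into position i.
        temp = items[i]
        items[i] = items[min_index]
        items[min_index] = temp


people = [
    {"Name": "Jeremy", "Age": 25},
    {"Name": "Ally", "Age": 41},
    {"Name": "Jasmine", "Age": 29},
]

selection_sort_by(people, "Age")
names_by_age = [person["Name"] for person in people]
print(names_by_age)  # ['Jeremy', 'Jasmine', 'Ally']
```

Like the loop above, this runs in quadratic time, so it's really only meant as a learning exercise.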

Since looking into this topic, I’ve found that Python has a nice way of handling the variable swap in a single line of code:

size = len(csv_mapping_list)
for i in range(size):
    min_index = i
    for j in range(i + 1, size):
        if csv_mapping_list[min_index]["Age"] > csv_mapping_list[j]["Age"]:
            min_index = j
    csv_mapping_list[i], csv_mapping_list[min_index] = csv_mapping_list[min_index], csv_mapping_list[i]

Clearly, I didn’t pick that great of a variable name for the swap, but you get the idea. To accomplish the swap, we leverage tuple packing and unpacking. In other words, we create a tuple on the right side of the assignment and unpack it into the targets on the left side. Pretty cool stuff!
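To see packing and unpacking without the sorting noise, here's a minimal sketch that swaps two plain variables. The names left and right are just placeholders for this example.

```python
# Placeholder values used only to illustrate the swap.
left, right = "Jeremy", "Ally"

# The right side packs (right, left) into a tuple;
# the left side unpacks that tuple back into the two names.
left, right = right, left

print(left, right)  # Ally Jeremy
```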

Sorting a List of Dictionaries With Sort Function

Luckily for us, we don’t have to implement sorting by hand in Python. Instead, we can use the built-in sort method that every list provides. In the following snippet, we sort the list of dictionaries by age.

csv_mapping_list.sort(key=lambda item: item.get("Age"))

Here, we have to specify the key parameter as dictionaries cannot be naturally sorted. Or, as the Python interpreter reports:

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    csv_mapping_list.sort()
TypeError: '<' not supported between instances of 'dict' and 'dict'

To solve this problem, we use the key parameter. The key parameter allows us to provide a function which returns some value for each item in our list, and the list is then ordered by those values. In this case, each dictionary is mapped to its age field using an inline lambda function.

As expected, the list of dictionaries is sorted in place as follows:

[
  {
    'Name': 'Jeremy', 
    'Age': 25, 
    'Favorite Color': 'Blue'
  }, 
  {
    'Name': 'Jasmine', 
    'Age': 29, 
    'Favorite Color': 'Aqua'
  }, 
  {
    'Name': 'Ally', 
    'Age': 41, 
    'Favorite Color': 'Magenta'
  }
]
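Since the key parameter accepts any callable that takes one item, a regular named function works just as well as a lambda. Here's a small sketch of that idea; the function name get_age and the trimmed sample list are assumptions made for this example.

```python
people = [
    {"Name": "Jeremy", "Age": 25},
    {"Name": "Ally", "Age": 41},
    {"Name": "Jasmine", "Age": 29},
]


def get_age(item):
    """Return the value that sort should compare for this dictionary."""
    return item["Age"]


# sort calls get_age once per item and orders the list by the results.
people.sort(key=get_age)

names_by_age = [person["Name"] for person in people]
print(names_by_age)  # ['Jeremy', 'Jasmine', 'Ally']
```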

And, it’s just as easy to sort by any other key for that matter:

csv_mapping_list.sort(key=lambda item: item.get("Name"))
csv_mapping_list.sort(key=lambda item: item.get("Favorite Color"))

In both cases, the list will be sorted “alphabetically” as the values are strings. However, be aware that this sort method is case sensitive. I wrote a whole separate article for dealing with String sorting if you’re interested in that.
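To make the case sensitivity concrete, here's a quick sketch using a made-up list where one name is lowercase. One common workaround, assuming plain string values, is to normalize each value with str.casefold inside the key function.

```python
# Hypothetical data: note the lowercase "ally".
names = [{"Name": "Jeremy"}, {"Name": "ally"}, {"Name": "Jasmine"}]

# Default string comparison puts uppercase letters before lowercase ones.
names.sort(key=lambda item: item["Name"])
case_sensitive = [item["Name"] for item in names]
print(case_sensitive)  # ['Jasmine', 'Jeremy', 'ally']

# Comparing casefolded values ignores the difference in case.
names.sort(key=lambda item: item["Name"].casefold())
case_insensitive = [item["Name"] for item in names]
print(case_insensitive)  # ['ally', 'Jasmine', 'Jeremy']
```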

If you're not a fan of lambda functions, you're welcome to take advantage of the operator module which contains the itemgetter function. In short, the itemgetter function provides the same functionality with better performance in a more convenient syntax:

from operator import itemgetter
f = itemgetter('Name')
csv_mapping_list.sort(key=f)

Thanks, dmitrypolo, for the tip!
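If you're curious about the performance difference on your own machine, a rough way to check is with the timeit module. This is just a sketch: the generated records list and the repetition count are arbitrary choices, and the timings will vary from run to run, so no numbers are shown here.

```python
from operator import itemgetter
from timeit import timeit

# Hypothetical data set, generated only for the timing comparison.
records = [{"Name": f"Person {i}", "Age": i % 90} for i in range(10_000)]

by_lambda = lambda item: item["Age"]
by_itemgetter = itemgetter("Age")

# Both key functions produce the same ordering.
same_result = sorted(records, key=by_lambda) == sorted(records, key=by_itemgetter)

# Time 20 sorts with each key function.
lambda_seconds = timeit(lambda: sorted(records, key=by_lambda), number=20)
itemgetter_seconds = timeit(lambda: sorted(records, key=by_itemgetter), number=20)

print(f"lambda: {lambda_seconds:.4f}s, itemgetter: {itemgetter_seconds:.4f}s")
```

One behavioral difference to keep in mind: itemgetter("Age") raises a KeyError if a dictionary is missing that key, whereas item.get("Age") returns None.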

Sorting a List of Dictionaries With Sorted Function

A more generic version of the builtin sort method is the builtin sorted function. It accepts the same key parameter as sort, but it works for all iterables. In other words, if our list were actually a tuple, we'd have another option:

csv_mapping_list = sorted(csv_mapping_list, key=lambda item: item["Age"])

As you can see, sorted is a little different than the regular sort method in that it returns a new sorted list. To be clear, sorted does not sort the list in place. Instead, it constructs an entirely new list. As a result, we’re able to sort any iterable including tuples.
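Here's a short sketch of that behavior with a tuple of dictionaries. The sample data is a trimmed copy of the list from earlier, and the variable names are made up for this example.

```python
people = (
    {"Name": "Jeremy", "Age": 25},
    {"Name": "Ally", "Age": 41},
    {"Name": "Jasmine", "Age": 29},
)

# sorted accepts the tuple and hands back a brand new list.
by_age = sorted(people, key=lambda item: item["Age"])

print(type(by_age).__name__)  # list
print([person["Name"] for person in by_age])  # ['Jeremy', 'Jasmine', 'Ally']

# The original tuple keeps its order.
print([person["Name"] for person in people])  # ['Jeremy', 'Ally', 'Jasmine']
```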

Like sort, sorted has a ton of custom options, so I recommend checking out the Python documentation if you have a more specific situation. Alternatively, you can reach out in the comments!
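As a taste of those options, here's a sketch of two that come up often: the reverse parameter for descending order, and a tuple key for breaking ties. The extra "Adam" entry is hypothetical and only exists to create a tie on age.

```python
people = [
    {"Name": "Jeremy", "Age": 25},
    {"Name": "Ally", "Age": 41},
    {"Name": "Jasmine", "Age": 29},
    {"Name": "Adam", "Age": 29},  # hypothetical entry to create a tie
]

# reverse=True sorts from oldest to youngest; ties keep their original order.
oldest_first = sorted(people, key=lambda item: item["Age"], reverse=True)
print([person["Name"] for person in oldest_first])
# ['Ally', 'Jasmine', 'Adam', 'Jeremy']

# A tuple key sorts by age first, then by name when ages are equal.
by_age_then_name = sorted(people, key=lambda item: (item["Age"], item["Name"]))
print([person["Name"] for person in by_age_then_name])
# ['Jeremy', 'Adam', 'Jasmine', 'Ally']
```

Both parameters work the same way with the list sort method.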

A Little Recap

While writing this article, I started to get a feeling of déjà vu. Then, I remembered that I already wrote an article about sorting a list of strings in Python. Apparently, all the methods from there were just as applicable here. At any rate, here are all the solutions discussed in this article:

# Custom sorting
size = len(csv_mapping_list)
for i in range(size):
    min_index = i
    for j in range(i + 1, size):
        if csv_mapping_list[min_index]["Age"] > csv_mapping_list[j]["Age"]:
            min_index = j
    csv_mapping_list[i], csv_mapping_list[min_index] = csv_mapping_list[min_index], csv_mapping_list[i]

# List sorting function
csv_mapping_list.sort(key=lambda item: item.get("Age"))

# List sorting using itemgetter
from operator import itemgetter
f = itemgetter('Name')
csv_mapping_list.sort(key=f)

# Iterable sorted function
csv_mapping_list = sorted(csv_mapping_list, key=lambda item: item["Age"])

As usual, I appreciate your support. If you have any recommendations for future articles, let me know in the comments!

Jeremy Grifski

@renegadecoder94

Engineering Education PhD student interested in challenging cultural issues in the tech community.

Discussion

You don't need to use lambda here at all. In fact you would be better off using itemgetter from the operator module in the standard library.

from operator import itemgetter

f = itemgetter('Name')
csv_mapping_list.sort(key=f)

[{'Name': 'Ally', 'Age': 41, 'Favorite Color': 'Magenta'},
 {'Name': 'Jasmine', 'Age': 29, 'Favorite Color': 'Aqua'},
 {'Name': 'Jeremy', 'Age': 25, 'Favorite Color': 'Blue'}]
 

Define "better off." Using itemgetter is definitely another way to do it, but it's almost exactly the same as the lambda option. Likewise, I could have also written a function of my own to pass as the key.

Of course, I'm happy to add itemgetter as another option if you want.

EDIT: I added your example to the article.

 

itemgetter is faster than using lambda specifically because all the operations are performed on the C side. I should have clarified when I made my response. Thanks for the shoutout!

Thanks for the clarification! I wasn't aware of that. The article has been updated to include a note about performance.