Martin Heinz

Posted on Jul 26, 2021 • Originally published at martinheinz.dev

The Unknown Features of Python's Operator Module

#python

At first glance, Python's operator module might not seem very interesting. It includes many operator functions for arithmetic and bitwise operations, plus a couple of convenience and helper functions. They might not seem so useful, but with the help of just a few of these functions you can make your code faster, more concise, more readable and more functional. So, in this article we will explore this great Python module and make the most of every function included in it.

Use Cases

The biggest part of the module consists of functions that wrap/emulate basic Python operators, such as +, << or not. It might not be immediately obvious why you would need or want to use any of these when you can just use the operator itself, so let's first talk about some of the use cases for all these functions.

The first reason you might want to use some of these in your code is that you need to pass an operator to a function:

def apply(op, x, y):
    return op(x, y)

from operator import mul
apply(mul, 3, 7)
# 21

We need to do this because Python's operators (+, -, ...) are not functions, so you cannot pass them directly to functions. Instead, you can pass in the version from the operator module. You could easily implement a wrapper function that does this for you, but no one wants to create a function for each arithmetic operator, right? Also, as a bonus, this allows for a more functional style of programming.
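The same idea carries over to the higher-order functions in the standard library. A minimal sketch, using functools.reduce and map (neither appears in the example above, they are only here for illustration):

```python
from functools import reduce
from operator import add, mul

# reduce() expects a two-argument function, so operator.mul fits directly.
product = reduce(mul, [1, 2, 3, 4, 5])

# map() with two iterables passes one item from each to operator.add.
pairwise_sums = list(map(add, [1, 2, 3], [10, 20, 30]))

print(product)        # 120
print(pairwise_sums)  # [11, 22, 33]
```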

You might also think: "I don't need the operator module for this, I can just use a lambda expression!" Yes, but here comes the second reason why you should use this module: its functions are faster than lambdas. You won't notice that with a single execution, but if you run it in a loop enough times, the difference adds up:

python -m timeit "(lambda x,y: x + y)(12, 15)"
10000000 loops, best of 3: 0.072 usec per loop
python -m timeit -s "from operator import add" "add(12, 15)"
10000000 loops, best of 3: 0.0327 usec per loop

So if you're used to writing something like (lambda x,y: x + y)(12, 15), you might want to switch to operator.add(12, 15) for a little performance boost.

The third and, for me, the most important reason to use the operator module is readability. This is more of a personal preference, and if you use lambda expressions all the time, they might feel more natural to you. In my opinion, though, functions from the operator module are generally more readable than lambdas. Consider the following:

(lambda x, y: x ^ y)(7, 10)

from operator import xor
xor(7, 10)

Clearly the second option is more readable.

Finally, unlike lambdas, operator module functions are picklable, meaning that they can be saved and later restored. This might not seem very useful, but it's necessary for distributed and parallel computing, which requires the ability to pass functions between processes.
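A quick way to see the difference is to round-trip both through the pickle module. This is only a sketch; the exact exception type raised for the lambda can vary between Python versions, so both likely candidates are caught:

```python
import pickle
from operator import add

# Functions from the operator module are pickled by reference to their name.
restored = pickle.loads(pickle.dumps(add))
print(restored(2, 3))  # 5

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x, y: x + y)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(lambda_pickled)  # False
```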

All The Options

As I already mentioned, this module has a function for every Python arithmetic, bitwise and truth operator, as well as some extras. For the full list of mappings between the functions and the actual operators, see the table in the docs.

Along with all the expected functions, this module also features their in-place versions that implement operations such as a += b or a *= b. If you want to use these you can just prefix the basic versions with i, for example iadd or imul.
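One detail worth keeping in mind: the in-place functions return their result, and only mutable operands are actually modified. A small sketch of both cases:

```python
from operator import iadd

# Mutable operand: the list is changed in place and the same object is returned.
items = [1, 2]
result = iadd(items, [3])
print(items)            # [1, 2, 3]
print(result is items)  # True

# Immutable operand: nothing can change in place, so only the return value
# carries the new value and it has to be assigned.
count = 5
new_count = iadd(count, 2)
print(count, new_count)  # 5 7
```

In other words, a = iadd(a, b) is the safe equivalent of a += b for any type.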

Finally, in operator you will also find dunder versions of many of these functions, for example __add__ or __mod__. These are kept for backward compatibility, and the versions without underscores should be preferred.

Apart from all the actual operators, this module has some more features that can come in handy. One of them is the little-known length_hint function, which can be used to get a vague idea of the length of an iterator:

from operator import length_hint
iterator = iter([2, 4, 12, 5, 18, 7])
length_hint(iterator)
# 6
iterator.__length_hint__()
# 6

I want to highlight the word "vague" here: don't rely on this value, because it really is a hint and makes no guarantees of accuracy.
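To illustrate why it is only a hint, here is a short sketch showing that the value changes as an iterator is consumed, and that objects offering no hint at all simply return the optional default argument:

```python
from operator import length_hint

# A list iterator reports how many items are left, so the hint shrinks
# as the iterator is consumed.
iterator = iter([2, 4, 12, 5])
next(iterator)
remaining = length_hint(iterator)
print(remaining)  # 3

# A generator offers neither __len__ nor __length_hint__,
# so the default (0 unless specified) is returned.
squares = (n * n for n in range(100))
print(length_hint(squares))      # 0
print(length_hint(squares, 50))  # 50
```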

Another convenience function that we can grab from this module is countOf(a, b), which returns the number of occurrences of b in a, for example:

from operator import countOf
countOf([1, 4, 7, 15, 7, 5, 4, 7], 7)
# 3

And the last of these simple helpers is indexOf(a, b), which returns the index of the first occurrence of b in a:

from operator import indexOf
indexOf([1, 4, 7, 15, 7, 5, 4, 7], 7)
# 2
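Two behaviours of these helpers are easy to miss, so here is a small sketch of them: both accept any iterable rather than just lists, and indexOf raises ValueError when the value is not found:

```python
from operator import countOf, indexOf

# Both helpers iterate over their first argument, so a generator works too.
evens = (n for n in range(10) if n % 2 == 0)
four_count = countOf(evens, 4)
print(four_count)  # 1

# indexOf raises ValueError when the value is absent, like list.index.
try:
    position = indexOf([1, 4, 7], 99)
except ValueError:
    position = None
print(position)  # None
```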

Key Functions

Apart from the operator functions and the couple of utility functions mentioned above, the operator module also includes functions designed to work with higher-order functions. These are attrgetter and itemgetter, which are most often used as key functions, usually in conjunction with functions such as sorted or itertools.groupby.

To see how they work and how you can use them in your code, let's look at a couple of examples.

Let's say we have a list of dictionaries, and we want to sort them by a common key. Here's how we can do it with itemgetter:

rows = [
    {"name": "John", "surname": "Doe", "id": 2},
    {"name": "Andy", "surname": "Smith", "id": 1},
    {"name": "Joseph", "surname": "Jones", "id": 3},
    {"name": "Oliver", "surname": "Smith", "id": 4},
]

from operator import itemgetter
sorted_by_name = sorted(rows, key=itemgetter("surname", "name"))
# [{"name": "John", "surname": "Doe", "id": 2},
#  {"name": "Joseph", "surname": "Jones", "id": 3},
#  {"name": "Andy", "surname": "Smith", "id": 1},
#  {"name": "Oliver", "surname": "Smith", "id": 4}]

min(rows, key=itemgetter("id"))
# {"name": "Andy", "surname": "Smith", "id": 1}

In this snippet we use the sorted function, which accepts an iterable and a key function. This key function has to be a callable that takes a single item from the iterable (rows) and extracts the value used for sorting. In this case we pass in itemgetter, which creates the callable for us. We also give it dictionary keys from rows, which are then fed to the object's __getitem__, and the results of the lookup are used for sorting. As you probably noticed, we used both surname and name; this way we can sort on multiple fields at once.

The last lines of the snippet also show another use for itemgetter: looking up the row with the minimum value of the id field.
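It can help to call the generated callables directly to see what sorted and min actually compare. A minimal sketch (the names get_id and get_full_name are made up for this illustration):

```python
from operator import itemgetter

row = {"name": "John", "surname": "Doe", "id": 2}

# With one key, the callable returns the looked-up value itself.
get_id = itemgetter("id")
# With several keys, it returns a tuple of the looked-up values.
get_full_name = itemgetter("surname", "name")

print(get_id(row))         # 2
print(get_full_name(row))  # ('Doe', 'John')
```

Because tuples compare element by element, the tuple returned for multiple keys is what makes sorting by surname first and name second work.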

Next up is the attrgetter function, which can be used for sorting in a similar way to itemgetter above. More specifically, we can use it to sort objects that don't have native comparison support:

class Order:
    def __init__(self, order_id):
        self.order_id = order_id

    def __repr__(self):
        return f"Order({self.order_id})"

orders = [Order(23), Order(6), Order(15), Order(11)]
from operator import attrgetter
sorted(orders, key=attrgetter("order_id"))
# [Order(6), Order(11), Order(15), Order(23)]

Here we use the order_id attribute to sort orders by their IDs.
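attrgetter also accepts several attribute names and dotted names for nested attributes. The sketch below assumes a hypothetical Customer class attached to each order, which is not part of the example above:

```python
from operator import attrgetter

class Customer:  # hypothetical helper class, for illustration only
    def __init__(self, name):
        self.name = name

class Order:
    def __init__(self, order_id, customer):
        self.order_id = order_id
        self.customer = customer

orders = [
    Order(23, Customer("Zoe")),
    Order(6, Customer("Adam")),
    Order(15, Customer("Adam")),
]

# Sort by the nested customer name first, then by order ID.
by_customer = sorted(orders, key=attrgetter("customer.name", "order_id"))
sorted_ids = [order.order_id for order in by_customer]
print(sorted_ids)  # [6, 15, 23]
```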

Both of the functions shown above are very useful when combined with functions from the itertools module, so let's see how we can use itemgetter to group elements by one of their fields:

orders = [
    {"date": "07/10/2021", "id": 10001},
    {"date": "07/10/2021", "id": 10002},
    {"date": "07/12/2021", "id": 10003},
    {"date": "07/15/2021", "id": 10004},
    {"date": "07/15/2021", "id": 10005},
]

from operator import itemgetter
from itertools import groupby

orders.sort(key=itemgetter("date"))
for date, rows in groupby(orders, key=itemgetter("date")):
    print(f"On {date}:")
    for order in rows:
        print(order)
    print()

# On 07/10/2021:
# {"date": "07/10/2021", "id": 10001}
# {"date": "07/10/2021", "id": 10002}
# On 07/12/2021:
# {"date": "07/12/2021", "id": 10003}
# On 07/15/2021:
# {"date": "07/15/2021", "id": 10004}
# {"date": "07/15/2021", "id": 10005}

Here we have a list of rows (orders) which we want to group by the date field. To do that, we first sort the list and then call groupby to create groups of items with the same date value. If you're wondering why we needed to sort the list first, it's because the groupby function works by looking for consecutive records with the same value, so all the records with the same date need to be adjacent beforehand.

In the previous examples we worked with lists of dictionaries, but these functions can also be applied to other iterables. For example, we can use itemgetter to sort a dictionary by its values, find the index of the minimum/maximum value in a list, or sort a list of tuples based on some of their fields:
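To make the effect of sorting visible, the following sketch runs groupby on the same kind of data with and without sorting first. The helper group_ids is a made-up name for this illustration:

```python
from itertools import groupby
from operator import itemgetter

orders = [
    {"date": "07/10/2021", "id": 10001},
    {"date": "07/12/2021", "id": 10003},
    {"date": "07/10/2021", "id": 10002},
]

def group_ids(rows):
    # Collect (date, [ids]) pairs in the order groupby emits them.
    return [
        (date, [row["id"] for row in group])
        for date, group in groupby(rows, key=itemgetter("date"))
    ]

unsorted_groups = group_ids(orders)
sorted_groups = group_ids(sorted(orders, key=itemgetter("date")))

print(unsorted_groups)
# [('07/10/2021', [10001]), ('07/12/2021', [10003]), ('07/10/2021', [10002])]
print(sorted_groups)
# [('07/10/2021', [10001, 10002]), ('07/12/2021', [10003])]
```

Without sorting, the same date shows up as two separate groups because its records are not adjacent.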

# Sort dict by value
from operator import itemgetter
products = {"Headphones": 55.90, "USB drive": 12.20, "Ethernet Cable": 8.12, "Smartwatch": 125.80}

sort_by_price = sorted(products.items(), key=itemgetter(1))
# [('Ethernet Cable', 8.12), ('USB drive', 12.2), ('Headphones', 55.9), ('Smartwatch', 125.8)]

# Find index of maximum value in array
prices = [55.90, 12.20, 8.12, 99.80, 18.30]
index, price = max(enumerate(prices), key=itemgetter(1))
# 3, 99.8

# Sort list of tuples by surname (index 1), then first name (index 0)
names = [
    ("John", "Doe"),
    ("Andy", "Jones"),
    ("Joseph", "Smith"),
    ("Oliver", "Smith"),
]

sorted(names, key=itemgetter(1, 0))
# [("John", "Doe"), ("Andy", "Jones"), ("Joseph", "Smith"), ("Oliver", "Smith")]

Methodcaller

The last function from the operator module that needs to be mentioned is methodcaller. It can be used to call a method on an object using its name supplied as a string:

from operator import methodcaller

methodcaller("rjust", 12, ".")("some text")
# "...some text"

column = ["data", "more data", "other value", "another row"]
[methodcaller("rjust", 12, ".")(value) for value in column]
# ["........data", "...more data", ".other value", ".another row"]

In the first example above we essentially use methodcaller to call "some text".rjust(12, "."), which right-justifies the string to a length of 12 characters with . as the fill character.

Using this function makes more sense in situations where you have the name of the desired method as a string and want to supply the same arguments to it over and over again, as in the second example above.
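Since methodcaller stores its arguments, it is usually enough to build the callable once and reuse it. A small sketch, which also shows that keyword arguments are forwarded (the names split_once and to_upper are made up for this illustration):

```python
from operator import methodcaller

# Build the callable once; positional and keyword arguments are stored with it.
split_once = methodcaller("split", ",", maxsplit=1)

lines = ["a,b,c", "key,value", "single"]
parts = list(map(split_once, lines))
print(parts)
# [['a', 'b,c'], ['key', 'value'], ['single']]

# The same callable works on any object that has a method with that name.
to_upper = methodcaller("upper")
print(to_upper("text"))  # TEXT
```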

Another, more practical example of methodcaller is the following code. Here we feed the lines of a text file to the map function together with our desired method, in this case strip, which strips whitespace from each of the lines. We then pass the result to filter, which removes all the empty lines (empty lines are empty strings, which are falsy, so filter removes them).

from operator import methodcaller

# `path` is a placeholder for the location of a text file
with open(path) as file:
    items = list(filter(None, map(methodcaller("strip"), file.read().splitlines())))
    print(items)

Closing Thoughts

In this article we took a quick tour of the (in my opinion) underrated operator module. It shows that even a small module with just a couple of functions can be very useful in your daily Python programming tasks. There are many more useful modules in Python's standard library, so I recommend checking the module index and diving in. You can also check out my previous articles, which explore some of these modules, such as itertools or functools.

Top comments (3)

xtofl • Jul 31 '21

Thanks for this post! I noticed that itemgetter("x")(dict(x=10, y=20)) would return a single element (10), while itemgetter("x", "y")(dict(x=10, y=20)) returns a tuple. Do you happen to know a way to streamline this and force it to always return the same type (i.e. a tuple)?