Martin Heinz

Posted on Jun 28, 2021 • Originally published at martinheinz.dev

Functools - The Power of Higher-Order Functions in Python

#python #tutorial

The Python standard library includes many great modules that can help you make your code cleaner and simpler, and functools is definitely one of them. This module offers many useful higher-order functions that act on or return other functions, which we can leverage to implement function caching, overloading and decorators, and in general to make our code a bit more functional. So let's take a tour of it and see all the things it has to offer...

Caching

Let's start off with the simplest yet quite powerful functions of the functools module. These are the caching functions (and also decorators) - lru_cache, cache and cached_property. The first of them - lru_cache - provides a least recently used cache of function results, or in other words - memoization of results:

from functools import lru_cache
import requests

@lru_cache(maxsize=32)
def get_with_cache(url):
    try:
        r = requests.get(url)
        return r.text
    except requests.RequestException:
        return "Not Found"


for url in ["https://google.com/",
            "https://martinheinz.dev/",
            "https://reddit.com/",
            "https://google.com/",
            "https://dev.to/martinheinz",
            "https://google.com/"]:
    get_with_cache(url)

print(get_with_cache.cache_info())
# CacheInfo(hits=2, misses=4, maxsize=32, currsize=4)
print(get_with_cache.cache_parameters())
# {'maxsize': 32, 'typed': False}

In this example we are doing GET requests and caching their results (up to 32 cached results) using the @lru_cache decorator. To see whether the caching really works we can inspect the cache info of our function using the cache_info method, which shows the number of cache hits and misses. The decorated function also provides cache_clear and cache_parameters methods for invalidating cached results and inspecting parameters respectively.

If you want to have a bit more granular caching, then you can also include the optional typed=True argument, which makes it so that arguments of different types are cached separately.
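To make the effect of the typed flag more concrete, here is a minimal sketch. The describe function is a made-up helper for illustration only, not something from functools:

```python
from functools import lru_cache

@lru_cache(maxsize=32, typed=True)
def describe(value):
    # Hypothetical helper, used only to illustrate typed=True
    return f"{type(value).__name__}: {value}"

describe(1)    # miss, cached under the int key
describe(1.0)  # miss, 1 == 1.0 but the float gets its own entry
describe(1)    # hit

info = describe.cache_info()
print(info)
# CacheInfo(hits=1, misses=2, maxsize=32, currsize=2)
```

Without typed=True, the implementation will usually treat describe(1) and describe(1.0) as the same call and keep only a single cached result for both.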

Another caching decorator in functools is a function simply called cache (available since Python 3.9). It's a simple wrapper on top of lru_cache, equivalent to lru_cache(maxsize=None), which makes it smaller and faster as it doesn't need to evict old values.
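As a quick illustration of an unbounded cache, here is a sketch that memoizes a naive recursive Fibonacci function. The fib function is just an example I picked, and the snippet assumes Python 3.9 or newer:

```python
from functools import cache

@cache
def fib(n):
    # Naive recursion, made cheap because every result is memoized
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
print(result)
# 832040

info = fib.cache_info()
print(info.maxsize, info.misses)
# None 31
```

Each value from fib(0) to fib(30) is computed exactly once, hence the 31 misses. Keep in mind that an unbounded cache never shrinks on its own, so it's best suited for functions with a limited set of possible arguments.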

There's one more decorator that you can use for caching and it's called cached_property. This one - as you can probably guess - is used for caching the result of a method that is exposed as an instance attribute. This is very useful if you have a property that is expensive to compute while also being effectively immutable.

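from functools import cached_property

class Page:

    def __init__(self, value):
        self.value = value

    @cached_property
    def render(self):
        # A cached_property method takes no arguments besides self,
        # so the supplied value is stored on the instance instead.
        # Long computation that renders HTML page...
        html = f"<html><body>{self.value}</body></html>"
        return html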

This simple example shows how we could use cached_property to - for example - cache a rendered HTML page which would get returned to the user over and over again. The same could be done for certain database queries or long mathematical computations.

A nice thing about cached_property is that it runs only on lookups, and only when an attribute of the same name doesn't already exist on the instance, which allows us to modify the attribute. After the attribute is assigned a new value, that value simply takes the place of the cached one and the method is not called again. It's also possible to clear the cache - all we need to do is delete the attribute, and the value will be computed and cached again on the next lookup.
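Here's a small sketch of that lookup, assignment and deletion behavior. The Square class and its computations counter are hypothetical, added only so we can see when the method really runs:

```python
from functools import cached_property

class Square:
    def __init__(self, side):
        self.side = side
        self.computations = 0  # counts how often area really runs

    @cached_property
    def area(self):
        self.computations += 1
        return self.side ** 2

s = Square(2)
s.area
s.area
print(s.computations)
# 1 (second lookup was served from the cache)

s.area = 100           # assignment replaces the cached value
print(s.area, s.computations)
# 100 1

del s.area             # clears the cache
s.side = 3
print(s.area, s.computations)
# 9 2
```

Note that the cached value is stored in the instance's __dict__, so this sketch assumes a regular class without __slots__.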

I would end this section with a word of caution for all of the above decorators - do not use them if your function has any side effects or if it creates mutable objects with each call, as those are not the types of functions that you want to have cached.
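To show why mutable return values are a problem, consider this sketch. The default_tags function is made up for the example:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def default_tags(kind):
    # Hypothetical function that builds a new list on each real call
    return ["python", kind]

first = default_tags("tutorial")
first.append("oops")   # mutates the object stored in the cache

second = default_tags("tutorial")
print(second)
# ['python', 'tutorial', 'oops']
print(first is second)
# True
```

Every caller receives the very same list object, so a change made by one caller leaks to all the others. If you really need caching here, one option is to return an immutable value such as a tuple.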

Comparing and Ordering

You probably already know that it's possible to implement comparison operators in Python such as <, >= or == using __lt__, __ge__ or __eq__. It can be quite annoying to implement every single one of __eq__, __lt__, __le__, __gt__, or __ge__ though. Luckily, the functools module includes the @total_ordering decorator that can help us with that - all we need to do is implement __eq__ and one of the remaining methods, and the rest will be automatically provided by the decorator:

from functools import total_ordering

@total_ordering
class Number:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return self.value < other.value

    def __eq__(self, other):
        return self.value == other.value

print(Number(20) > Number(3))
# True
print(Number(1) < Number(5))
# True
print(Number(15) >= Number(15))
# True
print(Number(10) <= Number(2))
# False

The above shows that even though we implemented only __eq__ and __lt__ we're able to use all of the rich comparison operations. The most obvious benefit of this is the convenience of not having to write all those extra magic methods, but probably more important is the reduction of code and its improved readability.
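One thing the Number class glosses over is comparison with unrelated types. Below is a sketch of a slightly more defensive variant. The Version class is hypothetical, and returning NotImplemented for foreign types is a common convention rather than something total_ordering requires:

```python
from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented  # let Python try the other operand
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

versions = [Version(1, 4), Version(0, 9), Version(1, 0)]
ordered = [(v.major, v.minor) for v in sorted(versions)]
print(ordered)
# [(0, 9), (1, 0), (1, 4)]
print(Version(1, 4) >= Version(1, 0))
# True
```

With this in place, something like Version(1, 0) > 5 raises a TypeError instead of failing with an AttributeError somewhere inside the comparison method.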

Overloading

Probably all of us were taught that function overloading isn't possible in Python, but there's actually an easy way to implement it using two functions in the functools module - singledispatch and/or singledispatchmethod. These functions help us implement what is called single dispatch: a generic function whose implementation is chosen at runtime based on the type of its first argument (the first argument after self or cls in case of singledispatchmethod). This is a way for dynamically-typed programming languages such as Python to differentiate between types at runtime.

Considering that function overloading is pretty big topic on its own, I dedicated a separate article to Python's singledispatch and singledispatchmethod, so if you want to know more about this, then you can read more about it here.
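For a quick taste, here is a minimal sketch of singledispatch. The to_text function and its registered variants are made up for illustration:

```python
from functools import singledispatch

@singledispatch
def to_text(value):
    # Fallback used when no more specific type is registered
    return f"object: {value!r}"

@to_text.register
def _(value: int):
    return f"integer: {value}"

@to_text.register
def _(value: list):
    return f"list of {len(value)} items"

print(to_text(3))
# integer: 3
print(to_text([1, 2]))
# list of 2 items
print(to_text("hi"))
# object: 'hi'
```

The implementation is picked based on the type of the first argument only, and the type annotation on each registered function tells register which type it handles.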

Partial

We all work with various external libraries or frameworks, many of which provide functions and interfaces that require us to pass in callback functions - for example for asynchronous operations or for event listeners. That's nothing new, but what if we also need to pass in some arguments along with the callback function? That's where functools.partial comes in handy - partial can be used to freeze some (or all) of the function's arguments, creating a new object with a simplified function signature. Confusing? Let's look at some practical examples:

import logging
from multiprocessing import Pool
from functools import partial

def output_result(result, log=None):
    if log is not None:
        log.debug(f"Result is: {result}")

def concat(a, b):
    return a + b

# The guard is needed on platforms that spawn worker processes (e.g. Windows, macOS)
if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger("default")

    p = Pool()
    p.apply_async(concat, ("Hello ", "World"), callback=partial(output_result, log=logger))
    p.close()
    p.join()

The above snippet demonstrates how we could use partial to pass a function (output_result) along with its argument (log=logger) as a callback function. In this case we use Pool.apply_async from multiprocessing, which asynchronously computes the result of the supplied function (concat) and passes it to the callback function. apply_async will however always pass the result as the first and only argument, so if we want to include any extra one - as in this case log=logger - we have to use partial.

This was a fairly advanced use case, so a more basic example might be simply creating a function that prints to stderr instead of stdout:

import sys
from functools import partial

print_stderr = partial(print, file=sys.stderr)
print_stderr("This goes to standard error output")

With this simple trick we created a new callable (function) that will always pass the file=sys.stderr keyword argument to print, allowing us to simplify our code by not having to specify the keyword argument every time.
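If you're curious what a partial object actually holds, here is a short sketch. The parse_binary name is my own; the func, args and keywords attributes are part of partial objects:

```python
from functools import partial

parse_binary = partial(int, base=2)

print(parse_binary("10010"))
# 18

# The frozen pieces are available as attributes
print(parse_binary.func is int)
# True
print(parse_binary.args, parse_binary.keywords)
# () {'base': 2}

# Keyword arguments given at call time override the frozen ones
print(parse_binary("ff", base=16))
# 255
```

The last call shows that frozen keyword arguments act more like defaults than hard constraints, which is worth remembering when handing a partial to code you don't control.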

And one last example for good measure. We can also use partial to utilize a little-known feature of the iter function - it's possible to create an iterator by passing a callable and a sentinel value to iter, which can be useful in the following application:

from functools import partial

RECORD_SIZE = 64

# Read binary file...
with open("file.data", "rb") as file:
    records = iter(partial(file.read, RECORD_SIZE), b'')
    for r in records:
        # Do something with the record...
        print(len(r))

Usually, when reading a file, we want to iterate over lines, but in case of binary data, we might want to iterate over fixed-size records instead. This can be done by using partial to create a callable that reads a specified chunk of data, and passing it to iter, which then creates an iterator out of it. This iterator keeps calling the read function, always taking only the specified chunk of data (RECORD_SIZE). Finally, when the end of file is reached, read returns the sentinel value (b'') and iteration stops.
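If you want to try this without having a binary file at hand, here is a self-contained sketch that uses an in-memory io.BytesIO stream instead. The 10-byte payload and the record size of 4 are arbitrary values chosen for the demo:

```python
import io
from functools import partial

RECORD_SIZE = 4

# In-memory stand-in for a binary file opened with "rb"
stream = io.BytesIO(b"abcdefghij")

records = list(iter(partial(stream.read, RECORD_SIZE), b""))
print(records)
# [b'abcd', b'efgh', b'ij']
```

Notice that the last record is shorter than RECORD_SIZE because the data doesn't divide evenly, so real code might need to handle a trailing partial record.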

Decorators

We already spoke about some decorators in the previous sections, but not about decorators for creating more decorators. One such decorator is functools.wraps. To understand why we need it, let's first take a look at the following example:

def decorator(func):
    def actual_func(*args, **kwargs):
        """Inner function within decorator, which does the actual work"""
        print(f"Before Calling {func.__name__}")
        func(*args, **kwargs)
        print(f"After Calling {func.__name__}")

    return actual_func

@decorator
def greet(name):
    """Says hello to somebody"""
    print(f"Hello, {name}!")

greet("Martin")
# Before Calling greet
# Hello, Martin!
# After Calling greet

This example shows how you could implement a simple decorator - we wrap the call to the function that does the actual task (func) in an inner function (actual_func), and the outer decorator function returns it, which gives us a decorator that we can then attach to other functions - as for example to the greet function here. When the greet function is called you will see that it prints both the messages from actual_func as well as its own. Everything looks fine, no problem here, right? But what if we try the following:

print(greet.__name__)
# actual_func
print(greet.__doc__)
# Inner function within decorator, which does the actual work

When we inspect the name and docstring of the decorated function, we find that they were replaced by the values from the inner function of the decorator. That's not good - we can't have all our function names and docs overwritten every time we use some decorator. So, how do we solve this? - With functools.wraps:

from functools import wraps

def decorator(func):
    @wraps(func)
    def actual_func(*args, **kwargs):
        """Inner function within decorator, which does the actual work"""
        print(f"Before Calling {func.__name__}")
        func(*args, **kwargs)
        print(f"After Calling {func.__name__}")

    return actual_func

@decorator
def greet(name):
    """Says hello to somebody"""
    print(f"Hello, {name}!")

print(greet.__name__)
# greet
print(greet.__doc__)
# Says hello to somebody

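The only job of the wraps function is to copy the name, docstring, annotations and other metadata from the original function onto the wrapper to prevent them from being overwritten. It also stores the original function in the wrapper's __wrapped__ attribute, which is what lets tools such as inspect.signature report the original argument list. And considering that wraps is also a decorator we can just slap it onto our actual_func and the problem is solved!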
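Here is one more sketch that shows two details worth knowing: returning the wrapped function's result from the wrapper, and reaching the undecorated function through __wrapped__. The logged decorator and add function are hypothetical examples:

```python
import inspect
from functools import wraps

def logged(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        return func(*args, **kwargs)  # pass the result back to the caller
    return wrapper

@logged
def add(a, b):
    """Adds two numbers"""
    return a + b

print(add(1, 2))
# Calling add
# 3
print(add.__name__, "-", add.__doc__)
# add - Adds two numbers
print(inspect.signature(add))
# (a, b)
print(add.__wrapped__(1, 2))  # bypasses the decorator, nothing is logged
# 3
```

Forgetting the return in the wrapper is an easy mistake to make: the decorated function would then silently return None.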

Reduce

Last but not least in the functools module is reduce. You might know it from other languages as fold (Haskell). What this function does is take an iterable and reduce (or fold) all its values into a single one. This has many different applications, so here are some of them:

from functools import reduce
import operator

def product(iterable):
    return reduce(operator.mul, iterable, 1)

def factorial(n):
    return reduce(operator.mul, range(1, n + 1), 1)

def sum(numbers):  # Shadows the built-in; use the built-in `sum` function instead
    return reduce(operator.add, numbers, 0)

def reverse(iterable):
    return reduce(lambda x, y: y+x, iterable)

print(product([1, 2, 3]))
# 6
print(factorial(5))
# 120
print(sum([2, 6, 8, 3]))
# 19
print(reverse("hello"))
# olleh

As you can see from the above code, reduce can simplify and oftentimes compress code that would otherwise be much longer into a single line. With that said, overusing this function just for the sake of shortening code, making it "clever" or making it more functional is usually a bad idea as it gets ugly and unreadable really quickly, so in my opinion - use it sparingly.

Also, considering that usage of reduce generally produces one-liners, it's an ideal candidate for partial:

from functools import partial, reduce
import operator

product = partial(reduce, operator.mul)

print(product([1, 2, 3]))
# 6
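One caveat with this shortcut is that no initial value is frozen in, so an empty iterable has nothing to fall back on. A small sketch of what happens and how to work around it:

```python
from functools import partial, reduce
import operator

product = partial(reduce, operator.mul)

try:
    product([])
except TypeError as exc:
    # reduce can't produce a value from an empty iterable without an initial value
    print(f"TypeError: {exc}")

# Extra positional arguments are appended, so the initial value can be passed at call time
print(product([], 1))
# 1
print(product([2, 3, 4], 1))
# 24
```

If empty input is a realistic case for you, a small regular function with an explicit initial value (like the product function shown earlier) is probably the more readable choice.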

And finally, if you need not only the final reduced result, but also the intermediate ones, then you can use accumulate instead - a function from another great module, itertools. This is how you can use it to compute a running maximum:

from itertools import accumulate

data = [3, 4, 1, 3, 5, 6, 9, 0, 1]

print(list(accumulate(data, max)))
# [3, 4, 4, 4, 5, 6, 9, 9, 9]
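To see how accumulate relates to reduce, here is a short sketch with a running product and a running total. The sample numbers are arbitrary, and the initial keyword assumes Python 3.8 or newer:

```python
from functools import reduce
from itertools import accumulate
import operator

values = [1, 2, 3, 4]

running_product = list(accumulate(values, operator.mul))
print(running_product)
# [1, 2, 6, 24]

# The last accumulated value is what reduce would return
print(running_product[-1] == reduce(operator.mul, values))
# True

# Without a function accumulate adds; initial prepends a starting value
running_total = list(accumulate([1, 2, 3], initial=100))
print(running_total)
# [100, 101, 103, 106]
```

Note that with initial the output has one more element than the input, because the starting value itself is emitted first.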

Closing Thoughts

As you could see here, functools features a lot of useful functions and decorators that can make your life easier, but this module is really just the tip of the iceberg. As I mentioned in the beginning, the Python standard library includes many modules that can help you build better code, so besides functools which we explored here, you might also want to check out other modules such as operator or itertools (I wrote an article about this one too, you can check it out here) or just go straight to the Python module index and click on whatever catches your attention. I'm sure you will find something useful in there.

Top comments (3)

Alex Martelli • Jun 30 '21

To decorate a method with cached_property, the method must be callable without arguments (just like it must be in order to decorate it with built-in property), so the example of cached_property given in the article is wrong.

The sum function the article builds with reduce is tricky, since it does NOT return the sum of the items (like the built-in sum would), but rather one more than that (as can be seen in the very example the article gives), making the function's name extremely misleading -- I recommend removing that 1 initializer!