In this oxymoronically titled article, we study laziness as a core aspect of functional programming in Python. I'm not talking about hammock-driven development or some such leisurely titled paradigm, but lazy evaluation. We'll see how lazy evaluation can make you more productive by improving reusability and composability, through refactoring a small example and introducing laziness along the way.
Simply put, lazy evaluation means that expressions are not evaluated until their results are needed. Contrast this with eager evaluation, which is the norm in imperative programming. Under eager evaluation, functions immediately compute their results (and perform their side effects) when they are called. As an example, consider this Python function called `get_json`, which calls a web API as a side effect and parses the response as JSON:
```python
import urllib.request
import json

def get_json(url: str) -> dict:
    with urllib.request.urlopen(url) as response:
        content = response.read()
        return json.loads(content)
```
Now, imagine that we want to implement a retry strategy with a simple back-off mechanism. We could take the eager approach and adapt `get_json` directly:
```python
import time

def get_json(url: str, retry_attempts: int = 3) -> dict:
    last_exception: Exception
    for attempt in range(1, retry_attempts + 1):
        try:
            with urllib.request.urlopen(url) as response:
                content = response.read()
                return json.loads(content)
        except Exception as e:
            time.sleep(attempt)
            last_exception = e
    raise last_exception
```
That works, but the solution has a huge shortcoming: we can't reuse the retry strategy for other types of HTTP requests. Put differently, `get_json` now violates the single responsibility principle because it has three responsibilities:
- Calling a web API
- Parsing JSON
- Retrying failed requests
This makes it hard to reuse. Let's fix it by being lazy. To keep things general, we'll define a type alias that models lazy values that are produced with or without side effects. Let's call this alias `Effect`, since it allows us to treat side effects as first-class values that can be manipulated by our program, thereby taking the "side" out of "side effect":
```python
from typing import Callable, TypeVar

A = TypeVar('A')
Effect = Callable[[], A]
```
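To make the laziness concrete, here's a small toy example (the `greet` function and `log` list are mine, not from the article) showing that an `Effect` does nothing until it is called:

```python
from typing import Callable, TypeVar

A = TypeVar('A')
Effect = Callable[[], A]

log: list = []

def greet(name: str) -> Effect[str]:
    # Returns a lazy description of a side effect; nothing runs yet
    def effect() -> str:
        log.append(f'greeting {name}')
        return f'Hello, {name}!'
    return effect

lazy = greet('world')   # log is still empty: the effect hasn't run
result = lazy()         # only now does log become ['greeting world']
# result == 'Hello, world!'
```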
We'll use this alias to implement a function called `retry`, which can take any `Effect` and retry it with the same back-off mechanism as before:
```python
def retry(request: Effect[A], retry_attempts: int = 3) -> A:
    for attempt in range(1, retry_attempts + 1):
        try:
            return request()
        except Exception as e:
            time.sleep(attempt)
            last_exception = e
    raise last_exception

def get_json_with_retry(url: str, retry_attempts: int = 3) -> dict:
    return retry(lambda: get_json(url), retry_attempts=retry_attempts)
```
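To see `retry` in action without hitting the network, here's a hypothetical flaky effect of my own invention that fails once and then succeeds (note the one-second back-off sleep after the failure):

```python
import time
from typing import Callable, TypeVar

A = TypeVar('A')
Effect = Callable[[], A]

def retry(request: Effect[A], retry_attempts: int = 3) -> A:
    for attempt in range(1, retry_attempts + 1):
        try:
            return request()
        except Exception as e:
            time.sleep(attempt)
            last_exception = e
    raise last_exception

attempts = 0

def flaky() -> str:
    # Hypothetical effect: fails on the first call, succeeds on the second
    global attempts
    attempts += 1
    if attempts < 2:
        raise ConnectionError('transient failure')
    return 'ok'

result = retry(lambda: flaky())  # sleeps 1 second after the first failure
# result == 'ok'; flaky was called twice
```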
`retry` treats the results of (for example) HTTP requests as lazy values that can be manipulated. I'm using the term lazy values here because it fits nicely with our theme, but really I could just have described `retry`, which takes the function `get_json` as an argument, as a higher-order function. By treating functions as lazy values that can be passed around, `retry` achieves more or less total reusability.
So far so good. Now let's implement a function called `run_async` that executes lazy values in parallel. Naturally, we want to be able to use it together with `retry`:
```python
from typing import Iterable
from multiprocessing import Pool

def run_async(effects: Iterable[Effect[A]]) -> Iterable[A]:
    with Pool(5) as pool:
        results = [pool.apply_async(effect) for effect in effects]
        return [result.get() for result in results]

def get_json_async_with_retry(urls: Iterable[str],
                              retry_attempts: int = 3) -> Iterable[dict]:
    # lambda url=url is a small "hack" that prevents url from being
    # mutated in the closures of the lambdas by the for loop
    effects = [lambda url=url: get_json_with_retry(url, retry_attempts)
               for url in urls]
    return run_async(effects)
```
(This won't actually work, since we have lambdas in the mix and `multiprocessing` can't pickle them without third-party libraries, but bear with me.)
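If you want to run the example for real, one workaround is to swap the process pool for a thread pool: threads share memory, so no pickling is involved. This substitution is mine, not part of the original design, though for I/O-bound work like HTTP requests threads are arguably the better fit anyway:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

A = TypeVar('A')
Effect = Callable[[], A]

def run_async(effects: Iterable[Effect[A]]) -> Iterable[A]:
    # Threads share memory, so closures and lambdas need no pickling
    with ThreadPoolExecutor(max_workers=5) as pool:
        futures = [pool.submit(effect) for effect in effects]
        return [future.result() for future in futures]

results = run_async([lambda: 1 + 1, lambda: 2 + 2])
# results == [2, 4]
```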
That's all fine, but you might object that we've made a mess of our code. I agree. In particular, I think that our "glue" functions `get_json_with_retry` and `get_json_async_with_retry` are embarrassingly clumsy. What's missing, in my view, is a general solution for gluing lazy values together that would make these specialized glue functions redundant.
To achieve that, we'll use the following dogma:
- Functions that perform side effects return `Effect` instances. In other words, rather than performing a computation or side effect directly, they return a lazy description of a result (and/or side effect) that can be combined with functions that operate on `Effect` instances.
- Functions that operate on `Effect` instances return new `Effect` instances.
With this scheme, any lazy result can be composed infinitely with functions that operate on lazy results. Indeed, functions that operate on lazy results can be composed with themselves!
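As a sketch of what this buys us (the combinator name `map_effect` is mine, not from the article), consider a function that maps a plain function over an `Effect`. Because it both consumes and produces `Effect` instances, it composes with itself indefinitely:

```python
from typing import Callable, TypeVar

A = TypeVar('A')
B = TypeVar('B')
Effect = Callable[[], A]

def map_effect(f: Callable[[A], B], effect: Effect[A]) -> Effect[B]:
    # Consumes an Effect and produces a new Effect; nothing runs yet
    return lambda: f(effect())

# Two nested applications: double a lazy number, then render it as a string.
program: Effect[str] = map_effect(str, map_effect(lambda x: x * 2, lambda: 21))
# Nothing has been computed so far; forcing the value yields '42'
result = program()
```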
So let's maximize the reusability of our solution by turning our laziness up to The Dude levels of lethargy.
Let's start by refactoring `get_json` to return an `Effect`:
```python
def get_json(url: str) -> Effect[dict]:
    def effect() -> dict:
        with urllib.request.urlopen(url) as response:
            content = response.read()
            return json.loads(content)
    return effect
```
Pretty straightforward. Now let's do the same for `retry` and `run_async`:
```python
def retry(request: Effect[A], retry_attempts: int = 3) -> Effect[A]:
    def effect() -> A:
        for attempt in range(1, retry_attempts + 1):
            try:
                return request()
            except Exception as e:
                time.sleep(attempt)
                last_exception = e
        raise last_exception
    return effect

def run_async(effects: Iterable[Effect[A]]) -> Effect[Iterable[A]]:
    def effect() -> Iterable[A]:
        with Pool(5) as pool:
            results = [pool.apply_async(e) for e in effects]
            return [result.get() for result in results]
    return effect
```
With this in hand, we can compose any variation of our functions to our hearts' desire with minimal effort:
```python
url = ...
get_json_with_retry: Effect[dict] = retry(get_json(url))

urls = [...]
get_json_async_with_retry: Effect[Iterable[dict]] = run_async(
    [retry(get_json(url)) for url in urls]
)
```
First, realise that we could further reuse both `get_json_with_retry` and `get_json_async_with_retry` with any function that operates on `Effect` instances. Also, notice that laziness (or higher-order functions) is what enables us to program with this degree of reuse, and what allows compositionality at this high level of abstraction (ultimately at the level of entire programs).
When functional programmers claim that programming in functional style makes you more productive, this is the reason: functional programming (which often involves lazy evaluation) can drastically improve reusability and compositionality, which means you can do much more with much less. All of these advantages were part of my motivation for authoring the library pfun, which makes it possible to write Python in functional style without all of the boilerplate and ceremony in this example.
As an added bonus, functional programs are often more predictable and easier to reason about. Also, with few modifications, the pattern we have developed here can be extended to enable completely type-safe dependency injection and error handling. Moreover, static type checking becomes more useful because all functions must return values, even if they merely produce side effects (in which case they'll return `Effect[None]`).
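For instance, a function that exists only for its side effect can still return a value that the type checker can track. This is a hypothetical example of my own, assuming the `Effect` alias from earlier:

```python
from typing import Callable, TypeVar

A = TypeVar('A')
Effect = Callable[[], A]

def log_line(message: str) -> Effect[None]:
    # Describes the side effect lazily; the type checker sees Effect[None]
    def effect() -> None:
        print(message)
    return effect

effect = log_line('hello')  # nothing printed yet
effect()                    # prints 'hello', evaluates to None
```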
In our effort to refactor our example, we have halfway re-invented a common functional pattern: the enigmatic IO type (which we call `Effect` in our example, and which I hope you don't find mysterious at all at this point). There is much confusion about what `IO` is and does, especially outside the functional programming clubs of Haskell and Scala programmers. As a consequence, you'll sometimes hear `IO` explained as:
- Functions that perform side effects return `IO` instances
- Functions that do not perform side effects do not
While this explanation is not, strictly speaking, incorrect, it's wildly inadequate, because it doesn't mention anything about laziness, which is a core feature of `IO`. Based on this naive explanation, you might be tempted to try something like:
```python
from typing import TypeVar, Generic

A = TypeVar('A')

class IO(Generic[A]):
    def __init__(self, value: A):
        self.value = value

def get_json(url: str) -> IO[dict]:
    with urllib.request.urlopen(url) as response:
        content = response.read()
        return IO(json.loads(content))
```
This `IO` implementation simply tags return values of functions that perform IO, making it clear to the caller that side effects are involved. Whether this is useful or not is a matter of some controversy, but it doesn't bring any of the functional programming benefits that we have discussed in this article, because this version of `IO` is eager.
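To see why eagerness defeats the purpose, consider trying to retry an eager `IO` value (a toy illustration of mine, not from the article): by the time any retry function receives the value, the side effect has already happened, so there is nothing left to retry:

```python
from typing import Generic, TypeVar

A = TypeVar('A')

class IO(Generic[A]):
    def __init__(self, value: A):
        self.value = value

calls = 0

def flaky_io() -> IO[int]:
    # The side effect runs here, before any retry logic can intervene
    global calls
    calls += 1
    if calls == 1:
        raise ConnectionError('transient failure')
    return IO(42)

def retry_io(io: IO[A]) -> IO[A]:
    # Powerless: its argument was already evaluated eagerly at the call site
    return io

try:
    retry_io(flaky_io())  # raises before retry_io is even entered
except ConnectionError:
    pass
```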
In summary: laziness (or higher-order functions) enables radical reuse and compositionality, both of which make you more productive. To get started with functional programming in Python, check out the pfun documentation, or the GitHub repository.