In this oxymoronically titled article, we study laziness as a core aspect of functional programming in Python. I'm not talking about hammock-driven development or some other leisurely titled paradigm, but lazy evaluation. We'll see how lazy evaluation can make you more productive by improving reusability and composability, as we refactor a small example and introduce laziness along the way.
Simply put, lazy evaluation means that expressions are not evaluated until their results are needed. Contrast this with eager evaluation, which is the norm in imperative programming. Under eager evaluation, functions immediately compute their results (and perform their side-effects) when they are called. As an example, consider this Python function called `get_json`, which calls a web API as a side-effect and parses the response as JSON:
```python
import urllib.request
import json


def get_json(url: str) -> dict:
    with urllib.request.urlopen(url) as response:
        content = response.read()
        return json.loads(content)
```
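Since `get_json` is eager, simply calling it performs the request on the spot. A minimal usage sketch (the endpoint is a hypothetical placeholder):

```python
# The HTTP request happens right here, at call time:
data = get_json('https://api.example.com/items')  # hypothetical URL
print(data)
```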
Now, imagine that we want to implement a retry strategy with a simple back-off mechanism. We could take the eager approach and adapt `get_json`:
```python
import time


def get_json(url: str, retry_attempts: int = 3) -> dict:
    last_exception: Exception
    for attempt in range(1, retry_attempts + 1):
        try:
            with urllib.request.urlopen(url) as response:
                content = response.read()
                return json.loads(content)
        except Exception as e:
            time.sleep(attempt)
            last_exception = e
    raise last_exception
```
That works, but the solution has a huge shortcoming: we can't reuse the retry strategy for other types of HTTP requests. Or alternatively, but equivalently: `get_json` violates the single responsibility principle because it now has three responsibilities:

- Calling an API
- Parsing JSON
- Retrying
This makes it hard to reuse. Let's fix it by being lazy. To keep things general, we'll define a type alias that models lazy values that are produced with or without side-effects. Let's call this alias `Effect`, since it allows us to treat side-effects as first-class values that can be manipulated by our program, thereby taking the "side" out of "side-effect".
```python
from typing import Callable, TypeVar

A = TypeVar('A')
Effect = Callable[[], A]
```
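To make the alias concrete, here's a minimal sketch of an `Effect[str]` built from a plain lambda (the greeting is just an illustrative value). Nothing runs until the effect is called:

```python
# An Effect[str]: a lazy description of a value, not the value itself
delayed_greeting: Effect[str] = lambda: 'hello, world'

# The expression is only evaluated at this point:
print(delayed_greeting())  # hello, world
```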
We'll use this alias to implement the function `retry`, which can take any `Effect` and retry it with the same back-off mechanism from before:
```python
def retry(request: Effect[A],
          retry_attempts: int = 3) -> A:
    last_exception: Exception
    for attempt in range(1, retry_attempts + 1):
        try:
            return request()
        except Exception as e:
            time.sleep(attempt)
            last_exception = e
    raise last_exception


def get_json_with_retry(url: str, retry_attempts: int = 3) -> dict:
    return retry(lambda: get_json(url), retry_attempts=retry_attempts)
```
`retry` treats the results of (for example) HTTP requests as lazy values that can be manipulated. I'm using the term lazy values here because it fits nicely with our theme, but really I could just have said that `retry` uses `get_json` as a higher-order function. By treating functions as lazy values that can be passed around, `retry` achieves more or less total reusability.
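To see that reusability in action, here's a sketch of `retry` applied to a completely different kind of effect, reading a file (the path is a hypothetical placeholder):

```python
def read_config() -> str:
    # Any flaky side-effect works, not just HTTP
    with open('/etc/myapp/config.json') as f:  # hypothetical path
        return f.read()


# retry neither knows nor cares what the effect actually does:
config = retry(read_config)
```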
So far so good. Now let's implement a function that executes lazy values in parallel, called `run_async`. Obviously, we want to be able to use `run_async` with `get_json` and `retry`:
```python
from typing import Iterable
from multiprocessing import Pool


def run_async(effects: Iterable[Effect[A]]) -> Iterable[A]:
    with Pool(5) as pool:
        results = [pool.apply_async(effect) for effect in effects]
        return [result.get() for result in results]


def get_json_async_with_retry(urls: Iterable[str],
                              retry_attempts: int = 3) -> Iterable[dict]:
    # "lambda url=url" is a small hack that binds the current value of url
    # in each lambda's closure; without it, every lambda would see the
    # final value assigned by the for loop
    effects = [lambda url=url: get_json_with_retry(url, retry_attempts)
               for url in urls]
    return run_async(effects)
```
(This won't actually work, since we have lambdas in the mix and `multiprocessing` doesn't like that without third-party libraries, but bear with me.)
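If you want something that actually runs, one possible workaround (a sketch, not the only option) is to replace the lambdas with `functools.partial` over our module-level function, since `partial` objects wrapping top-level functions can be pickled by `multiprocessing`:

```python
from functools import partial


def get_json_async_with_retry(urls: Iterable[str],
                              retry_attempts: int = 3) -> Iterable[dict]:
    # partial(...) is picklable as long as the wrapped function is
    # defined at module level, unlike a lambda
    effects = [partial(get_json_with_retry, url, retry_attempts)
               for url in urls]
    return run_async(effects)
```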
That's all fine, but you might object that we've made a mess of our code. I agree. In particular, I think that our "glue" functions `get_json_with_retry` and `get_json_async_with_retry` are embarrassingly clumsy. What's missing, in my view, is a general solution for gluing lazy values together that would make these specialized glue functions redundant.
To achieve that, we'll use the following dogma:

- Functions that perform side-effects return `Effect` instances. In other words, rather than performing a computation or side-effect directly, they return a lazy description of a result (and/or side-effect) that can be combined with functions that operate on `Effect` instances.
- Functions that operate on `Effect` instances return new `Effect` instances.
With this scheme, any lazy result can be composed infinitely with functions that operate on lazy results. Indeed, functions that operate on lazy results can be composed with themselves!
So let's maximize the reusability of our solution by turning our laziness up to The Dude levels of lethargy.
Let's start by refactoring `get_json` to return an `Effect`:
```python
def get_json(url: str) -> Effect[dict]:
    def effect() -> dict:
        with urllib.request.urlopen(url) as response:
            content = response.read()
            return json.loads(content)
    return effect
```
Pretty straightforward. Now let's do the same for `retry` and `run_async`:
```python
def retry(request: Effect[A],
          retry_attempts: int = 3) -> Effect[A]:
    def effect() -> A:
        last_exception: Exception
        for attempt in range(1, retry_attempts + 1):
            try:
                return request()
            except Exception as e:
                time.sleep(attempt)
                last_exception = e
        raise last_exception
    return effect
```
```python
def run_async(effects: Iterable[Effect[A]]) -> Effect[Iterable[A]]:
    def effect() -> Iterable[A]:
        with Pool(5) as pool:
            # submit every effect to the pool without running any eagerly
            results = [pool.apply_async(e) for e in effects]
            return [result.get() for result in results]
    return effect
```
With this in hand, we can compose any variation of our functions to our heart's desire with minimal effort:
```python
url = ...
get_json_with_retry: Effect[dict] = retry(get_json(url))

urls = [...]
get_json_async_with_retry: Effect[Iterable[dict]] = run_async(
    [retry(get_json(url)) for url in urls]
)
```
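Note that at this point nothing has actually executed: both names are merely descriptions of programs. A small sketch of how you would eventually run them:

```python
# Evaluation is deferred until the effects are called:
json_result = get_json_with_retry()
all_results = get_json_async_with_retry()
```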
First, realise that we could further reuse both `get_json_with_retry` and `get_json_async_with_retry` with any function that operates on `Effect` instances. Also, notice that laziness (or higher-order functions) is what enables us to program with this degree of reuse, and what allows compositionality at this high level of abstraction (ultimately at the level of entire programs).
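As a sketch of that point: since `run_async` itself returns an `Effect`, we can feed it straight back into `retry` and get a retry strategy for the entire parallel batch, with no new glue code:

```python
# Retry the whole parallel batch as a single lazy value:
resilient_batch: Effect[Iterable[dict]] = retry(
    run_async([get_json(url) for url in urls])
)
```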
When functional programmers claim that programming in functional style makes you more productive, this is the reason: functional programming (which often involves lazy evaluation) can drastically improve reusability and compositionality, which means you can do much more with much less. All of these advantages were part of my motivation for authoring the library pfun, which makes it possible to write Python in functional style without all of the boilerplate and ceremony in this example.
As an added bonus, functional programs are often more predictable and easier to reason about. Also, with few modifications, the pattern we have developed here can be extended to enable completely type-safe dependency injection and error handling. Moreover, static type checking becomes more useful because all functions must return values, even if they merely produce side-effects (in which case they'll return `Effect[None]`).
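For illustration, here's a minimal sketch of what an `Effect[None]` might look like (the `log_line` function is hypothetical, not part of the example above):

```python
def log_line(message: str) -> Effect[None]:
    def effect() -> None:
        # purely a side-effect, but still a first-class lazy value
        print(message)
    return effect
```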
In our effort to refactor our example, we have halfway re-invented a common functional pattern: the enigmatic IO type (which we call `Effect` in our example, and which I hope you don't find mysterious at all at this point). There is much confusion about what `IO` is and does, especially outside the functional programming clubs of Haskell and Scala programmers. As a consequence, you'll sometimes hear `IO` explained as:
- Functions that perform side-effects return `IO`
- Functions that do not perform side-effects do not
While this explanation is not, strictly speaking, incorrect, it's wildly inadequate because it doesn't mention anything about laziness, which is a core feature of `IO`. Based on this naive explanation, you might be tempted to try something like:
```python
from typing import TypeVar, Generic

A = TypeVar('A')


class IO(Generic[A]):
    def __init__(self, value: A):
        self.value = value


def get_json(url: str) -> IO[dict]:
    with urllib.request.urlopen(url) as response:
        content = response.read()
        return IO(json.loads(content))
```
This `IO` implementation simply tags the return values of functions that perform IO, making it clear to the caller that side-effects are involved. Whether this is useful or not is a matter of some controversy, but it doesn't bring any of the functional programming benefits that we have discussed in this article, because this version of `IO` is eager.
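A short sketch of why eagerness spoils the fun: by the time you hold the `IO` value, the side-effect has already happened, so there is nothing left for a combinator like `retry` to re-run (the URL is a hypothetical placeholder):

```python
# The HTTP request fires immediately, before retry could intervene:
result = get_json('https://api.example.com/items')  # hypothetical URL

# result.value is already computed; there is no way to repeat the
# request, so retrying or parallelizing it is impossible
```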
In summary: laziness (or higher-order functions) enables radical reuse and compositionality, both of which make you more productive. To get started with functional programming in Python, check out the pfun documentation, or the github repository.