Programming in functional style has never been popular among Python programmers. Even though libraries such as
toolz exist, imperative and object-oriented style is still the norm. Being a functional programming convert myself, I'm not sure why that is. I suspect that the lack of a static type system has been a big reason that Python programmers generally haven't recognized the advantages of functional programming, since the marriage between the two is what brings many of the benefits.
With PEP 484 and friends this has changed. The ecosystem around the typing module has matured to the point where a full fledged functional library with static types is practical. As a true functional programming believer this motivated me to author a library called
pfun which brings functional programming with a strong emphasis on static types to Python.
pfun distinguishes itself from other libraries such as
toolz in that
- It's designed to provide the most accurate typing possible within the limitations of the type system
- It provides a unified functional system for managing side-effects in a module called
This article will explain the
pfun effect system in enough detail that you should be able to use it, and maybe even contribute to it. we'll start with an introduction to functional programming and its benefits. If you're already a true believer and functional programming aficionado, you can probably skip ahead to A Functional Effect System for Python.
In functional programming, we build our programs by combining functions.
But we only use special functions: We only allow functions that compute new results based on their arguments. Functions that do other things besides computing results, such as mutating program state or writing to a file, are not allowed.
We call these "other things" side effects. The reason we don't like functions that perform side effects is that they are harder to re-use since the programmer needs to keep track of which parts of the program (or external state that the program depends on) are affected by the side effect when invoking the function. We prefer functions without side effects to the point that we have named them pure functions, and dislike functions that perform side effects to the extent that we have named them impure functions.
Look at these two functions that both add an element to a list for example:
from typing import List def add_with_side_effect(l: List[int], e: int) -> List[int]: l.append(e) return l def add_without_side_effect(l: List[int], e: int) -> List[int]: return l + [e]
add_with_side_effect mutates the list passed as an argument. You need to consider which other parts of the program has a reference to
l when calling it. In other words you need to think about your program globally in order to figure out if calling
add_with_side_effect is safe.
add_without_side_effect returns a new list. The list the caller has and passes as the argument
l is left intact. It doesn't matter if any other part of your program has a reference to
l since it's left unchanged. In other words you can reason about your program locally, which is much, much easier most of the time.
Functions that are side-effect-free have a number of significant advantages:
- They are truly modular
- They are easy to reason about
- They are easy to test
- They are more robust
- And more
Limiting ourselves to using pure functions also presents a challenge, since most programs in the real world need to read and write files, raise exceptions and use the network. Given that all of these things are inherently impure, it would seem that functional programming is close to useless.
In the next section we'll see how to do all of those things in a purely functional context.
How do we write functions that can handle side-effects and are still pure? To achieve this seemingly impossible feat we're going to use a clever trick. The trick is to change our impure functions such that instead of performing a side effect, they return a description of a side effect that can be executed at a later time. The most straight-forward way of doing that is to write functions that return other functions that performs the side effects. To that end we'll use the following
Effect type alias:
from typing import TypeVar, Callable A = TypeVar('A') Effect = Callable[, A]
Lets define a function
read_file_pure that returns a description of the side effect of reading a file:
def read_file_pure(path: str) -> Effect[str]: def effect() -> str: with open(path) as f: return f.read() return effect
read_file_pure really is pure: for the same argument path it always returns the same function regardless of the actual contents of the file. The effect function returned by
read_file_pure is not pure of course.
So far so good, but how do we actually do anything with the contents of a file? We could of course run the function returned by
read_file_pure immediately after calling it to get the content and then perform whatever transformation is necessary - but then our program is no longer pure!
What we'll do instead is to write a new function that takes an
Effect and applies a function to its result in a new effect. Lets call this function
map (because it maps a function over the result of executing an impure function):
from typing import TypeVar A = TypeVar('A') B = TypeVar('B') def map(effect: Effect[A], f: Callable[[A], B]) -> Effect[B]: def new_effect() -> B: return f(effect()) return new_effect
Again, notice that
map is pure because it is lazy: No side-effects are carried out when calling
map. It's only when the function returned by
map is called that the side-effect described by the
effect parameter is carried out.
This means that we can manipulate values returned by
read_file_pure in a pure fashion, for example in:
def lower_case_file(path: str) -> Effect[str]: return map(impure_read_file(path), lambda contents: contents.lower())
Most functional programs take the shape:
1) Create an
Effect to get input for the computation
2) Transform the input using
map (or a similar function)
3) Run the
Effect returned by
Or in code:
def transform(input: str) -> str: ... if __name__ == '__main__' program = map(impure_read_file(path), transform) _ = program()
In other words,
program is an impure function, built by cleverly combining lazy, pure functions. Calling
map doesn't actually do any real computation, except whats required to put together the
program function. The real computation, including side-effects, are only executed when
program is called at the edge of our python script (in the
That's great, but what if we want to combine several effects? For example, what if we had a function for writing strings to files without side-effects:
def write_file_pure(path: str, contents: str) -> Effect[None]: def effect() -> None: with open(path, 'w') as f: f.write(contents) return effect
If we try to use
map, we end up with an
map only takes care of "unwrapping" the effect it takes as a parameter, not the effect returned by the function that is mapped over the result of the
effect parameter. This means that our program now looks like this:
if __name__ == '__main__': program = map( read_file_pure('foo.txt'), lambda content: write_file_pure('bar.txt', content) ) _ = program()() # program() returns an Effect
Notice that we have to call the
Effect returned by
program() to actually run the effect returned by
write_file_pure. Also realize that it's possible to produce an
Effect that wraps an arbitrary number of nested effects, for example if
map is called in a loop. We might end up in a situation where it's not easy to figure out precisely how many functions we need to call at the edge of our script!
To fix this, we introduce another function that knows how to combine an
Effect with a function that returns an
Effect. Lets call it
and_then since it's chaining together two effects sequentially:
def and_then(effect: Effect[A], f: Callable[[A], Effect[B]]) -> Effect[B]: def new_effect() -> B: return f(effect())() return new_effect
Now we can rewrite our program from before as:
if __name__ == '__main__': program = and_then( read_file_pure('foo.txt'), lambda content: write_file_pure('bar.txt', content) ) _ = program()
Notice that we only have to call
program to run all the effects involved in the computation since
and_then knows how to combine two effects. Also realize that if we use
and_then appropriately, we can combine an arbitrary number of effects, even in a loop. In other words, we have a completely general framework for working with side-effects in a pure fashion:
- Functions that need to perform side-effects return
- We transform the result of effects using
- We combine effects with functions that return effects using
Since functional programs are generally composed of chained calls to
and_then, lets refactor the functions as instance methods so that we can use a nicer dot notation:
from dataclasses import dataclass from typing import Callable, TypeVar, Generic A = TypeVar('A', covariant=True) B = TypeVar('B') @dataclass(frozen=True) class Effect(Generic[A]): effect: Callable[, A] def __call__(self): return self.effect() def map(self, f: Callable[[A], B]) -> Effect[B]: return Effect(lambda: f(self())) def and_then(self, f: Callable[[A], Effect[B]]) -> IO[B]: return Effect(lambda: f(self())())
(We've decorated the class with
dataclass to remove some boilerplate and avoid mutation of the
effect field). Now our program from before becomes:
def read_file_pure(path: str) -> Effect[str]: def effect() -> str: with open(path) as f: return f.read() return Effect(effect) def write_file_pure(path: str, content: str) -> Effect[None] def effect() -> None: with open(path, 'w') as f: f.write(content) return Effect(effect) if __name__ == '__main__': program = read_file_pure( 'foo.txt' ).and_then( lambda content: write_file_pure('bar.txt', content) ) program()
Which arguably reads a little nicer.
So far we have only been concerned with one particular side-effect: IO. Lets now turn our attention to error handling that is also normally modeled as a side-effect in imperative programming.
Our requirement for pure functions is that they can only compute values based on their arguments. To comply with this dogma, lets define wrapper types that can be used to 'tag' values returned from functions.
from dataclasses import dataclass from typing import Union, TypeVar, Generic R = TypeVar('R', covariant=True) L = TypeVar('L', covariant=True) @dataclass(frozen=True) class Right(Generic[R]): get: R @dataclass(frozen=True) class Left(Generic[L]): get: L Either = Union[Left[L], Right[R]]
For convenience, we've also defined a type-alias
Either that we can use to more easily write type signatures for functions that can fail. Since it's a union type, it gives us the added benefit that our type checker can help us figure out if we are handling errors correctly.
With this in hand we can do error handling in a purely functional way:
def divide(nume: float, denom: float) -> Either[str, float]: try: return Right(nume / denom) except ZeroDivisionError: return Left('Division by zero')
That's great, but what if we need to write a function that needs to model both the IO side-effect and the error handling side-effect? We can of course wrap one in the other, e.g:
def read_file_pure(path: str) -> Effect[Either[IOError, str]]: def effect() -> Either[IOError, str]: try: with open(path) as f: return Right(f.read()) except IOError as e: return Left(e) return Effect(action)
Except now we have to deal with error handling every time we need to use
and_then, because neither of them know how to unpack the value wrapped by an
def lower_case_file(path: str) -> Effect[Either[IOError, str]]: read_file_pure(path).map( lambda either: Right(either.get.lower()) if isinstance(either, Right) else either )
Yuck. Thankfully, there are a number of ways to avoid having this error handling code everywhere in our functional program. The solution we use in
pfun.effect is to "bake error handling into our
Effect type", in order that
and_then can take care of the wrapping/unwrapping for us:
from dataclasses import dataclass from typing import Union, TypeVar, Generic A = TypeVar('A', covariant=True) E = TypeVar('E', covariant=True) B = TypeVar('B') E2 = TypeVar('E2') @dataclass(frozen=True) class Effect(Generic[E, A]): effect: Callable[, Either[E, A]] def __call__(self) -> Either[E, A]: return self.effect() def map(self, f: Callable[[A], B]) -> Effect[E, B]: def effect(): either = self() if isinstance(either, Left): return either else: return Right(f(either.get)) return Effect(effect) def and_then(self, f: Callable[[A], Effect[E2, B]]) -> Effect[Union[E, E2], B]: def effect(): either = self() if isinstance(either, Left): return either else: return Right(f(either.get)()) return Effect(effect)
We have added a type parameter
E to represent the error type. The implementations of
and_then are more less identical to our previous versions, except that they expect an
Either returned from the wrapped
effect function and handle them appropriately. Notice also that the return type of
Effect[Union[E, E2], B], because the resulting
Effect can fail if either
self() returns a
Left value or if
f(...)() returns a
Now the mess from before simply becomes
def read_file_pure(path: str) -> Effect[IOError, str]: def effect() -> Either[IOError, str]: try: with open(path) as f: return Right(f.read()) except IOError as e: return Left(e) return Effect(effect) def lower_case_file(path: str) -> Effect[IOError, str]]: read_file_pure(path).map(lambda content: content.lower())
Effect type we've cooked up so far is pretty close to what you'll find in
pfun.effect. The main difference is that the wrapped
effect function in
pfun.effect takes a generic argument
from dataclasses import dataclass from typing import TypeVar, Generic, Union R = TypeVar('R', contravariant=True) E = TypeVar('E', covariant=True) E1 = TypeVar('E1') A = TypeVar('A', covariant=True) B = TypeVar('B') @dataclass(frozen=True) class Effect(Generic[R, E, A]): effect: Callable[[R], Either[E, A]] def __call__(self, r: R) -> Either[E, A]: return self.effect(r) def map(self, f: Callable[[A], B]) -> Effect[R, E, B]: def effect(r: R) -> Either[E, B] either = self(r) if isinstance(either, Left): return either return Right(f(either.get)) return Effect(effect) def and_then(self, f: Callable[A, Effect[R, E1, B]] ) -> Effect[R, Union[E, E1], B]: def effect(r: R) -> Either[E, B]: either = self(r) if isinstance(either, Left): return either return Right(f(either.get)(r)) return Effect(effect)
Allowing the wrapped
effect function to take a parameter enables dependency injection for increased code re-use and improved test-ability
For historical reasons, this is known as the reader effect. We will go into what
pfun.effect can do with this argument in a lot of detail in future posts, but for now lets just exemplify the dependency injection feature.
Say for example that you want to implement an effect that reads from a database. Using the
R parameter you can pass the connection string to the action, making it possible to re-use the same
Effect instance for reading from different databases:
def read_from_database(query: str) -> Effect[str, IOError, dict]: def effect(connection_string: str) -> Either[IOError, dict]: try: database = connect(connection_string) return Right(database.execute(query)) except IOError as e: return Left(e) return Effect(effect)
This allows us to combine
read_from_database with other functions through
and_then, and then re-use the entire function call chain against different databases in a completely type-safe manner.
if __name__ == '__main__': program = read_from_database('select * from user').map(lambda users: users['user1']) test_results = program('test_user@test_database') production_results = program('prod_user@prod_database')
Take a look at the documentation for more information on how to use the
R parameter to its full potential.
There is still a significant problem with the
Effect type we have built in this post: long chains of effects combined with
RecursionError because one effect needs to be executed before the next one can be executed.
pfun.effect solves this problem and more:
pfun.effectcan build sequences of effects of arbitrary length without causing a
RecursionErrorwhen executed, because effects are run using a mechanism called trampolines
pfun.effectis compatible with
asynciofor high performance asynchronous IO
pfun.effectprovides a number of useful helper functions to help you write code at a very high level of abstraction
pfun.effectprovides a plugin for mypy that enables fine grained type inference of helper functions
pfun.effect provides a full fledged purely functional, statically typed effect system for Python. Head over to the documentation to go more in depth, or to the github repo for issues, questions and contributions.