Protocols and Composition in Python

#python #codequality #architecture #programming

It's a source of heartbreak and distress for me when instrumentation-related side-effects (like logging, metrics, tracing, retrying logic, error handling, etc.) begin making their way inside the business logic layer of an application.
If only there was a way to overlay instrumentation on top of business logic without touching it directly... It would be even better if we could keep the various instrumentation concerns separate from one another, while we're at it...

Right, let's write some code!

Naive Implementation

Say, we're writing an advanced service that adds numbers. We could jump in right away and have a working implementation in no time at all:

def add(a: int, b: int) -> int:
    return a + b

Great! Now we want to deploy it to production, so we're asked to add logging - to make it possible to debug production failures:

logger = logging.getLogger(__name__)

def add(a: int, b: int) -> int:
    result = return a + b
    logger.debug("adding %s and %s gives %s", a, b, result)
    return result

It works! But, now we've run into performance issues so we need to add collection of performance metrics, like the ability to time the execution of the add method:

logger = logging.getLogger(__name__)

def add(a: int, b: int) -> int:
    t_start = time.perf_counter()
    result = return a + b
    t_end = time.perf_counter()
    logger.debug("adding %s and %s gives %s", a, b, result)
    took = t_end - t_start
    logger.debug("took %s seconds", took)
    return result

Still works, but things are definitely getting a bit messy - in fact, it's becoming difficult to understand what the service is doing in the first place. Not to mention the impact on testability of our code! Granted, if we're logging to the local filesystem or stdout testing is probably not much of an issue at this stage, but if we're storing logs remotely our tests are at risk of becoming slow, fragile, and we're likely spamming logs with useless messages every time we run our tests. It's a slippery slope towards spaghetti code - there must be a better way!

Protocol-based implementation

When I was a kid, my dad used to tell me: "hurry slowly". I find this to be a reasonable principle when applied to coding. Let us start slowly then and begin by defining our service as a pure interface expressing the desired functionality - or, in Python parlance - as a "protocol":

# service.py

from typing import Protocol


class AddServiceProtocol(Protocol):
    "Represents functionality of adding two numbers."

    def add(self, a: int, b: int) -> int:
        ...

The protocol of our service is simple: we take two int values and we return an int value representing their sum. Given the protocol, the concrete implementation might look as follows:

# service.py

class AddService:
    "Implements AddServiceProtocol."

    def add(self, a: int, b: int) -> int:
        return a + b

A protocol is Python's take on "structural subtyping" - it's a type that's effectively implemented by anything that matches the signature of the protocol's methods. Concrete implementations can subclass the protocol, in which case implementation correctness will be enforced at runtime on instantiation of classes inheriting from a protocol. That said, explicit subclassing is entirely optional. A tool like mypy will be able to reason about protocols and their implementations based on method signatures alone. Think - abstract base classes light. Or, think - pythonic duck-typing augmented with static verification tooling.

All that sounds very fancy, but what's the benefit of doing things this way? Let's have a look at what adding logging to our implementation might look like. Instead of adding the logging logic inside the add method of the main implementation let's create a separate implementation that will satisfy the service protocol while also wrapping the service itself:

# service.py

class LoggingAddService:
    """
    Implements AddServiceProtocol. Wraps AddService and adds basic logging.
    """

    def __init__(self, service: AddServiceProtocol, logger: Logger) -> None:
        self._inner = service
        self._logger = logger

    def add(self, a: int, b: int) -> int:
        result = self._inner.add(a, b)
        self._logger.debug("[add] adding %s and %s gives %s", a, b, result)
        return result

We use dependency injection and initialize the LoggingAddService with a reference to an instance of a class that fulfills the AddServiceProtocol contract and an instance of a logging.Logger. When called, the add method
on LoggingAddService runs the add method on the _inner class, while also logging the details of the call using the reference to the _logger.

What we have effectively created something like a middleware for our service, one we can safely compose with other similar wrappers as long as they also implement AddServiceProtocol. Since we can, let's create another middleware, one that records how long it takes to add numbers:

# service.py

class TimingAddService:
    """
    Implements AddServiceProtocol. Wraps AddService and adds timing of method calls.
    """

    def __init__(self, service: AddServiceProtocol, logger: Logger) -> None:
        self._inner = service
        self._logger = logger

    def add(self, a: int, b: int) -> int:
        start = time.perf_counter()
        result = self._inner.add(a, b)
        end = time.perf_counter()
        elapsed = end - start
        self._logger.debug(f"[add] took {elapsed:0.8f} seconds")
        return result

Yes, it's all repetitive and a bit boring, but I'd argue this is a good thing! It doesn't take a lot of effort to understand what this code is doing, and once we've grokked the pattern, we'll recognize it immediately wherever it's applied. Since each layer exists as a separate unit with explicitly defined
dependencies, each layer can be unit-tested in isolation. Finally, we've effectively deferred the decision about how the service should be configured - or wrapped - when it's executed. This decision can be in fact left to the user
as the feature-dependent wrapping can be performed at runtime based on user-selected options.

As means of demonstrating this last property, let's create a simple CLI tool for adding numbers - we'll add logging debug messages and timing reports as optional features that can be enabled using flags. I'm going to use typer to turn our service into a CLI application:

# main.py


import typer

from logger import std_out_logger
from service import AddService, AddServiceProtocol, LoggingAddService, TimingAddService


def main(a: int, b: int, debug: bool = False, timing: bool = False) -> None:
    """
    Adding 'a' to 'b' made easy!
    """
    service: AddServiceProtocol = AddService()
    if timing:
        service = TimingAddService(service=service, logger=std_out_logger("timing"))
    if debug:
        service = LoggingAddService(service=service, logger=std_out_logger("logging"))
    print(service.add(a, b))


if __name__ == "__main__":
    typer.run(main)

The main function is where we wire the parts of our application together. The individual components don't need to know about each other otherwise - all they care about is the contract represented by the protocol and whatever additional dependencies they require to do their thing. The various middlewares are layered on top of the core service based on the values of debug and timing flags. The main function becomes the only place where we use the conditionals that toggle the timing and logging features - just imagine what our code would look like if these had to be colocated with our business logic!

Wrapping up

I first learned of this pattern in Go's go-kit where I've seen it called "service middlewares". The Design Patterns book describes something similar as the "decorator" pattern - not to be confused with Python decorators, which have the unfortunate property of melding themselves with what they decorate, which limits their practical usefulness, at least as far as reducing coupling is concerned.

A similar effect can also be achieved by means of class inheritance, although the composition-based solution will be more light-weight and flexible as you won't have to choose between pre-creating classes representing each possible permutation of wrappers (i.e. AddService, LoggingAddService, TimingAddService, LoggingAndTimingAddService, etc.) and overloading classes with features and responsibilities which might not be required in many runtime scenarios. Indeed, this particular pattern is a fine example of why composition might be preferable to inheritance in many cases.

References

A working example of the protocol-based implementation can be found on GitHub.
This post was originally posted on my blog.
Cover photo by Ryan Quintal.

Top comments (5)

Steve B • Nov 6 '21

Nice post 👍 I think you've got a small mistake on your first "class LoggingAddService". It doesn't use the inner for doing the add.

Filip Wojciechowski • Nov 6 '21

Yes, of course - thanks for spotting it! Fixed.

Steve B • Nov 6 '21

Glad I could help. I enjoyed reading this post. This kind of thing is the reason I created: github.com/meadsteve/lagom. Although I've not settled on a pattern for dealing automatically with decorators like in your example.