Filip Geppert


Advanced python typing structures - how to express your types better and make your code more robust.

Static typing has had a rough road in the Python community. It's finally becoming a production standard, because typed code lets you catch bugs faster. Moreover, I bet that anyone who started using it and was tasked with refactoring the code months later can clearly see its huge benefits.

In this post, we're going to focus on the advanced structures offered by Python's typing module. These structures help you express your typing intent more precisely.

Here's a list of topics we'll cover in this post:

- Generics
- Protocols
- Callable
- Overload
- Stub files

Join my newsletter to stay up to date with posts on python, programming, data engineering, and machine learning: https://filipgeppert.com
Additionally, I'll send you a free copy of my eBook on data science.

Generics

Let's imagine the following scenario:

from typing import Any

def first(seq: list[Any]) -> Any:
    return seq[0]

First of all, try to avoid using Any in your codebase. According to the Python typing documentation:

A static type checker will treat every type as being compatible with Any and Any as being compatible with every type.
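
To make the quoted rule concrete, here is a minimal sketch of a bug that Any lets through. The variable name item and the call to upper() are only illustrative choices; the point is that the type checker has nothing to object to, while the program still fails at runtime.

```python
from typing import Any


def first(seq: list[Any]) -> Any:
    return seq[0]


item = first([1, 2, 3])

try:
    # A type checker accepts this line because item is Any,
    # but at runtime item is an int, which has no upper() method.
    item.upper()
except AttributeError as error:
    print(f"Runtime failure the type checker could not see: {error}")
```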

In the long run, this lets bugs hide in your code. In simple terms, Any switches the type checker off for every value it touches.
To solve this problem, you can use generics.
The code would look like this:

from typing import TypeVar

T = TypeVar('T')
def first(seq: list[T]) -> T:
    return seq[0]

first_item = first([1,2,3])

mypy now knows that first_item has type int and will use this information for further type checking. That's the
power of generics. They let you preserve information about types. Additionally, you can restrict TypeVar to specific types.

from typing import TypeVar

T = TypeVar('T', int, float)
def first(seq: list[T]) -> T:
    return seq[0]

first_item = first([1, 2, 3])  # OK
another_first_item = first(["1", "2", "3"])  # mypy reports an error: str is neither int nor float

If you need to be less restrictive, you can use T = TypeVar('T', bound=float) to allow all subtypes for a specified type.
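
As a rough sketch of what bound gives you, the example below uses a small class hierarchy instead of numbers. Animal and Dog are hypothetical classes invented for illustration. Because the TypeVar is bounded rather than restricted to a fixed list, the checker keeps the exact subtype that was passed in.

```python
from typing import TypeVar


class Animal:
    def __init__(self, name: str) -> None:
        self.name = name


class Dog(Animal):
    def bark(self) -> str:
        return f"{self.name} says woof"


# Any subtype of Animal is accepted, and the subtype is preserved.
A = TypeVar("A", bound=Animal)


def first(seq: list[A]) -> A:
    return seq[0]


dog = first([Dog("Rex")])
# The result is inferred as Dog, not Animal, so bark() type checks.
print(dog.bark())
```

Calling first(["a", "b"]) with this definition should be rejected by the type checker, since str is not a subtype of Animal.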

Finally, generics help you solve a common problem that you'll eventually see in any large codebase.
Imagine that a developer intends to create a function that adds two numbers of the same type, and it should support
both int and float. Here's an implementation using basic types:

from typing import Union

def add(first: Union[int, float], second: Union[int, float]) -> Union[int, float]:
    return first + second

add(1, 2.1) # Shouldn't be allowed, yet mypy doesn't complain

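Unfortunately, with these type annotations, nothing stops a client from calling add(1, 2.1). Worse, the return type
is always Union[int, float], so even add(1, 2) gives the caller no precise type to work with. Here's how you can
improve this using generics:

from typing import TypeVar

T = TypeVar('T', int, float)

def add(first: T, second: T) -> T:
    return first + second

add(1, 2)      # inferred as int
add(1.5, 2.5)  # inferred as float
add(1, 2.1)    # still accepted: mypy treats int as compatible with float, so T is solved as float

This implementation shows the developer's intent more clearly: both arguments and the result share one type.
Note that int and float are a special case, because mypy lets an int stand in wherever a float is expected.
With constraints that are unrelated to each other, such as TypeVar('T', int, str), a mixed call like add(1, "2") is rejected.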

Protocols

We have all gone down the road of writing code where objects inherit through multiple levels (or worse, we're asked to take over a codebase that does it).
This becomes a problem when we start typing our code, as mypy errors start to scream. Protocol is a great way to add type annotations for only the attributes and methods
that really matter to you. Here's an example:

from typing import Protocol

class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...

def some_function_gazylion_modules_away(obj: MyProtocol) -> str:
    obj.set_a("1")
    return obj.get_a()

Now, any time some_function_gazylion_modules_away is used, it expects an object that has an attribute a of type str and two methods with exactly these type signatures.
It doesn't matter if the object that is passed implements additional methods, and it does not need to inherit from MyProtocol.
Only the characteristics specified in the Protocol definition matter to the type checker.

Additionally, Protocol can be checked at runtime when you decorate it with runtime_checkable.

from typing import Protocol, runtime_checkable

@runtime_checkable
class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...
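
One caveat is worth a short sketch: the runtime check only looks for the presence of the protocol's members, not their signatures or types. The three classes below are hypothetical and exist only to show which objects pass isinstance.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...


class Complete:
    def __init__(self) -> None:
        self.a = "x"

    def set_a(self, new_value: str) -> None:
        self.a = new_value

    def get_a(self) -> str:
        return self.a


class MissingMethods:
    a = "x"  # has the attribute, but neither method


class WrongSignatures:
    a = "x"

    def set_a(self) -> None:  # takes no new_value
        pass

    def get_a(self) -> int:  # returns int instead of str
        return 0


complete_ok = isinstance(Complete(), MyProtocol)
missing_ok = isinstance(MissingMethods(), MyProtocol)
# Passes at runtime because only member names are checked;
# a static type checker would still reject WrongSignatures.
wrong_ok = isinstance(WrongSignatures(), MyProtocol)

print(complete_ok, missing_ok, wrong_ok)
```

In other words, isinstance with a runtime-checkable protocol is a quick shape check, and the static checker remains the place where signatures are verified.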

Callable

The typing module offers the Callable type. At runtime, an object is callable if it implements the __call__ method; functions, lambdas, and classes all qualify.
Callable has the following syntax:

Callable[[<list of input argument types>], <return type>]
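
As a minimal sketch of that syntax, the hypothetical apply function below accepts any callable that takes two ints and returns an int. The names apply and multiply are made up for this illustration.

```python
from typing import Callable


def apply(operation: Callable[[int, int], int], left: int, right: int) -> int:
    # operation must accept two ints and return an int.
    return operation(left, right)


def multiply(x: int, y: int) -> int:
    return x * y


product = apply(multiply, 3, 4)
difference = apply(lambda x, y: x - y, 10, 4)
print(product, difference)
```

Passing a function with a different shape, for example one that takes a single str, should be flagged by the type checker at the call to apply.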

Here's how you can use it to type check your decorators:

from time import time, sleep
from typing import Callable

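def time_it(func: Callable[[], int]) -> Callable[[], int]:
    def wrapper() -> int:
        start_time = time()
        result = func()  # keep the wrapped function's return value
        end_time = time()
        duration = end_time - start_time
        print(f'This function took {duration:.2f} seconds.')
        return result  # wrapper is annotated to return int, so pass the result through

    return wrapper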

@time_it
def computation() -> int:
    sleep(10)
    return 10

computation()
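
The decorator above is tied to functions that return int. One possible way to loosen that, sketched here by combining Callable with the TypeVar idea from the Generics section, is to make the return type generic. The names R, load_name and load_count are assumptions made for this example, and perf_counter is swapped in for time only because it is better suited to measuring durations.

```python
from time import perf_counter
from typing import Callable, TypeVar

R = TypeVar("R")


def time_it(func: Callable[[], R]) -> Callable[[], R]:
    def wrapper() -> R:
        start_time = perf_counter()
        result = func()
        duration = perf_counter() - start_time
        print(f"This function took {duration:.2f} seconds.")
        # Returning the result keeps the decorated function's return type.
        return result

    return wrapper


@time_it
def load_name() -> str:
    return "mypy"


@time_it
def load_count() -> int:
    return 3


name = load_name()    # inferred as str
count = load_count()  # inferred as int
```

This sketch still only covers functions without arguments, matching the empty argument list in the original example.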

Overload

Sometimes Union is not enough to properly express the behavior of a function. A common use case is a file-reading function like this:

from typing import BinaryIO, TextIO, Union

def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    data = file.read()
    return data

Type annotations do not express what is exactly returned when either TextIO or BinaryIO is passed.
We already mentioned that using generics is one solution for this problem, but there's another one that you might like more.
You can fix this by adding overload decorators:

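from typing import BinaryIO, TextIO, Union, overload

@overload
def read_file(file: TextIO) -> str: ...
@overload
def read_file(file: BinaryIO) -> bytes: ...
def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    # The single real implementation; the overloads above only guide the type checker.
    return file.read()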

Now mypy knows exactly which return type belongs to each input type.
In my experience, overload annotations can quickly add a lot of repeated code to your production files.
Consequently, this reduces readability. To fix this, you could move them into stub files.
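
To see what the overloads buy you at the call site, here is a short sketch that feeds in-memory streams to read_file. Using io.StringIO and io.BytesIO instead of real files is an assumption made to keep the example self-contained.

```python
import io
from typing import BinaryIO, TextIO, Union, overload


@overload
def read_file(file: TextIO) -> str: ...
@overload
def read_file(file: BinaryIO) -> bytes: ...
def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    # Only this definition runs; the overloads exist for the type checker.
    return file.read()


text = read_file(io.StringIO("hello"))    # inferred as str
raw = read_file(io.BytesIO(b"\x00\x01"))  # inferred as bytes

# Both calls below type check without any isinstance narrowing.
print(text.upper())
print(raw.hex())
```

With the plain Union signature, both text.upper() and raw.hex() would be reported as errors, because each result could be either str or bytes.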

Stub files

Often, we face projects that were developed before mypy was popular. Adding types gradually is the best way
to make a project more robust, but let's face the truth: sometimes we just don't have time for it.

Luckily, mypy offers a "workaround" that allows us to type a project without modifying the original source code:
stub files. Let's imagine having production code (or a third-party library) you can't touch:

cache = {}

def some_function(untyped_kwarg):
    return 

You can create a stub file by adding a .pyi file, named after the module it describes, to your project.

from typing import Dict

cache: Dict[int, str]
def some_function(untyped_kwarg: int) -> int: ...

Note: The .pyi file takes precedence if a directory contains both a .py and a .pyi file.
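
Stubs are also a convenient home for the overload signatures from the previous section. As a sketch, assuming the implementation lives in a hypothetical module read_utils.py, a matching read_utils.pyi could hold only the signatures, so the production file keeps a single plain definition:

```python
# read_utils.pyi (hypothetical stub stored next to read_utils.py)
from typing import BinaryIO, TextIO, overload

# Stub files contain signatures only; every body is just "...".
@overload
def read_file(file: TextIO) -> str: ...
@overload
def read_file(file: BinaryIO) -> bytes: ...
```

The type checker reads the signatures from the stub, while the interpreter keeps running the untouched code in the .py file.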

Summary

Static type hints require a lot of effort and have a steep learning curve, but can eventually save you tons of time.

Hopefully, structures presented in this post can add new superpowers to your Python stack!


Happy coding!
