Static typing has had a rough ride in the Python community. It is finally becoming a production standard, because typed code lets you catch bugs sooner. I'd bet that anyone who started using it and had to refactor the same code months later can clearly see the benefit.
In this post, we're going to focus on the more advanced structures offered by Python's typing module. These structures help you express your typing intent more precisely.
Here's a list of topics we'll cover in this post: Generics, Protocols, Callable, Overload, and Stub files.
Join my newsletter to stay up to date with posts on python, programming, data engineering, and machine learning: https://filipgeppert.com
Additionally, I'll send you a free copy of my eBook on data science.
Generics
Let's imagine the following scenario:
from typing import Any
def first(seq: list[Any]) -> Any:
    return seq[0]
First of all, try to avoid using Any in your codebase. According to the Python typing docs:
A static type checker will treat every type as being compatible with Any and Any as being compatible with every type.
In the long run, this lets bugs hide in your code. In simple terms, Any switches off the type checker for every value it touches.
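To make the risk concrete, here is a minimal sketch of a bug that Any lets through. The helper shout_first is a hypothetical name introduced only for this illustration; a type checker should accept it because first() returns Any, yet it fails at runtime for a list of integers.

```python
from typing import Any


def first(seq: list[Any]) -> Any:
    return seq[0]


def shout_first(seq: list[Any]) -> str:
    # Hypothetical helper: first() returns Any, so a type checker
    # accepts .upper() even when the list actually holds integers.
    return first(seq).upper()


print(shout_first(["a", "b"]))  # fine: the first element is a str

try:
    shout_first([1, 2, 3])
except AttributeError as error:
    # The mistake only surfaces when the code runs.
    print(f"runtime failure: {error}")
```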
To solve this problem, you can use generics.
The code would look like this:
from typing import TypeVar
T = TypeVar('T')
def first(seq: list[T]) -> T:
    return seq[0]
first_item = first([1,2,3])
mypy now knows that first_item has type int and will use this information for further type checking. That's the power of generics: they let you preserve information about types. Additionally, you can restrict a TypeVar to specific types.
from typing import TypeVar
T = TypeVar('T', int, float)
def first(seq: list[T]) -> T:
    return seq[0]
first_item = first([1,2,3]) # Ok
another_first_item = first(["1","2","3"]) # mypy reports an error!
If you need to be less restrictive, you can use T = TypeVar('T', bound=float) to allow a specified type and all of its subtypes.
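As a rough sketch of what a bound buys you, the example below uses a hypothetical float subclass called Celsius and a hypothetical helper called largest. With a bound TypeVar, a type checker is expected to keep the subclass as the return type instead of widening it to float.

```python
from typing import TypeVar

# Hypothetical TypeVar: accepts float and any subtype of float.
Number = TypeVar("Number", bound=float)


class Celsius(float):
    """A float subclass used only for illustration."""


def largest(values: list[Number]) -> Number:
    # max() returns one of the elements, so the element type is preserved.
    return max(values)


# A type checker should infer Celsius here, not plain float.
warmest = largest([Celsius(21.5), Celsius(30.0)])
print(type(warmest).__name__, float(warmest))
```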
Finally, generics help you solve a common problem that you'll eventually see in any large codebase.
Imagine that a developer wants a function that adds two numbers and returns a result of the same type as its arguments, and that it should work for both int and float. Here's an implementation using basic types:
from typing import Union
def add(first: Union[int, float], second: Union[int, float]) -> Union[int, float]:
    return first + second
total = add(1, 2) # mypy sees Union[int, float], although the result is an int
Unfortunately, with these type annotations the add function is unstable in terms of the type that is returned: every call yields Union[int, float], whatever the arguments were. Here's how you can fix this using generics:
from typing import TypeVar
T = TypeVar('T', int, float)
def add(first: T, second: T) -> T:
    return first + second
add(1, 2) # mypy infers int
add(1.0, 2.0) # mypy infers float
add(1, 2.1) # accepted as float, because mypy treats int as compatible with float
This implementation clearly shows the intention of a developer: the result has the same type as the arguments. Keep in mind that a mixed call such as add(1, 2.1) is not rejected, since mypy allows an int wherever a float is expected and resolves T to float. With unrelated types, for example TypeVar('T', str, bytes), mixing the two in one call is reported as an error.
Protocols
We have all written code where objects inherit through many layers (or worse, taken over a codebase that does it).
This becomes a problem once we start typing the code, because mypy errors start to scream. Protocol is a great way to add type annotations for only the attributes and methods that really matter to you. Here's an example:
from typing import Protocol
class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...
def some_function_gazylion_modules_away(obj: MyProtocol) -> str:
    obj.set_a("1")
    return obj.get_a()
Now, any time some_function_gazylion_modules_away is used, it expects an object that has an attribute a and two methods with exactly the same type signatures.
It doesn't matter if the object that is passed implements additional methods.
Only the characteristics specified in the Protocol definition matter to a type checker.
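Here is a small sketch of that structural matching in action. The Settings class is a hypothetical example: it never inherits from MyProtocol and carries an extra reset method, yet it should still be accepted because it provides a, set_a, and get_a.

```python
from typing import Protocol


class MyProtocol(Protocol):
    a: str

    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...


class Settings:
    """Hypothetical class: it does not inherit from MyProtocol."""

    def __init__(self) -> None:
        self.a = ""

    def set_a(self, new_value: str) -> None:
        self.a = new_value

    def get_a(self) -> str:
        return self.a

    def reset(self) -> None:
        # Extra methods are fine; the protocol ignores them.
        self.a = ""


def some_function_gazylion_modules_away(obj: MyProtocol) -> str:
    obj.set_a("1")
    return obj.get_a()


result = some_function_gazylion_modules_away(Settings())
print(result)
```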
Additionally, a Protocol can be checked at runtime with isinstance when you decorate it with runtime_checkable.
from typing import Protocol, runtime_checkable
@runtime_checkable
class MyProtocol(Protocol):
    a: str
    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...
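A runtime check could then look like the sketch below. Complete and Incomplete are hypothetical classes made up for the illustration. Note that the runtime check only looks at whether the members exist; it does not verify their type signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class MyProtocol(Protocol):
    a: str

    def set_a(self, new_value: str) -> None: ...
    def get_a(self) -> str: ...


class Complete:
    """Hypothetical class that provides every protocol member."""

    def __init__(self) -> None:
        self.a = ""

    def set_a(self, new_value: str) -> None:
        self.a = new_value

    def get_a(self) -> str:
        return self.a


class Incomplete:
    """Hypothetical class that has the attribute but lacks both methods."""

    a = ""


matches = isinstance(Complete(), MyProtocol)
rejected = isinstance(Incomplete(), MyProtocol)
print(matches, rejected)
```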
Callable
The typing module offers the Callable type. For an object to be a valid Callable, it must implement the __call__ method (plain functions and methods already do).
Callable has the following syntax:
Callable[[<list of input argument types>], <return type>]
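As a quick illustration of that syntax, the sketch below annotates a parameter that must be a function taking two ints and returning an int. The names apply and multiply are hypothetical and exist only for this example.

```python
from typing import Callable


def apply(operation: Callable[[int, int], int], left: int, right: int) -> int:
    # operation must accept two ints and return an int.
    return operation(left, right)


def multiply(x: int, y: int) -> int:
    return x * y


product = apply(multiply, 3, 4)
print(product)
```

Passing a function with a different signature, such as len, should be flagged by the type checker.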
Here's how you can use it to type check your decorators:
from time import time, sleep
from typing import Callable
def time_it(func: Callable[[], int]) -> Callable[[], int]:
    def wrapper() -> int:
        start_time = time()
        result = func()  # keep the result so the wrapper can return it
        end_time = time()
        duration = end_time - start_time
        print(f'This function took {duration:.2f} seconds.')
        return result
    return wrapper
@time_it
def computation() -> int:
    sleep(10)
    return 10
computation()
Overload
Sometimes Union is not enough to properly express the behavior of a function. A common use case is a file-reading function like this:
from typing import BinaryIO, TextIO, Union
def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    data = file.read()
    return data
These type annotations do not express exactly what is returned when either TextIO or BinaryIO is passed.
We already saw that generics are one solution to this kind of problem, but there's another one that you might like more.
You can fix this by adding overload decorators:
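from typing import BinaryIO, TextIO, Union, overload
@overload
def read_file(file: TextIO) -> str: ...
@overload
def read_file(file: BinaryIO) -> bytes: ...
# The overloads only describe types; one undecorated implementation must follow them.
def read_file(file: Union[TextIO, BinaryIO]) -> Union[str, bytes]:
    return file.read()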
Now mypy knows exactly what the return type is for each particular input.
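For comparison, the generics route mentioned earlier could be sketched as follows. It relies on AnyStr, a TypeVar from the typing module that is restricted to str and bytes, and on the generic IO type. The in-memory streams are used here only so the example runs without touching the file system.

```python
import io
from typing import IO, AnyStr


def read_file(file: IO[AnyStr]) -> AnyStr:
    # AnyStr ties the return type to the kind of stream that was passed in.
    return file.read()


text = read_file(io.StringIO("hello"))   # a type checker should infer str
raw = read_file(io.BytesIO(b"hello"))    # a type checker should infer bytes
```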
In my experience, overload annotations can quickly add a lot of repeated code to your production files, which reduces readability. To fix this, you could start using stub files.
Stub files
Often, we face projects that were developed before mypy was popular. Adding types gradually is the best way to make a project more robust, but let's face the truth: sometimes we just don't have time for it.
Luckily, mypy offers a "workaround" that allows us to type a project without modifying the original source code: stub files. Let's imagine some production code (or a third-party library) that you can't touch:
cache = {}
def some_function(untyped_kwarg):
    return
You can create a stub file by adding a .pyi file with the same name as the module to your project:
from typing import Dict
cache: Dict[int, str]
def some_function(untyped_kwarg: int) -> int: ...
Note: The .pyi file takes precedence if a directory contains both a .py and a .pyi file.
Summary
Static type hints require a lot of effort and have a steep learning curve, but can eventually save you tons of time.
Hopefully, structures presented in this post can add new superpowers to your Python stack!
Happy coding!