DEV Community

Aaron Harris for Kite

Originally published at kite.com

Python Type Hints Guide: Type Hinting in Python

Introduction

With the release of version 3.5, Python introduced type hints: code annotations that additional tooling can use to check whether you’re using your code correctly.

Long-time Python users might cringe at the thought of new code needing type hinting to work properly, but we need not worry: Guido himself wrote in PEP 484, “no type checking happens at runtime.”

The feature was proposed mainly to open up Python code for easier static analysis and refactoring.

For data science (and for the data scientist), type hinting is invaluable for a couple of reasons:

  • It makes it much easier to understand the code, just by looking at the signature, i.e. the first line(s) of the function definition;

  • It creates a documentation layer that can be checked with a type checker, i.e. if you change the implementation, but forget to change the types, the type checker will (hopefully) yell at you.

Of course, as is always the case with documentation and testing, it’s an investment: it costs you more time at the beginning, but saves you (and your co-worker) a lot in the long run.

Note: Type hinting has also been ported to Python 2.7 (a.k.a. Legacy Python). There, however, the hints have to be written as comments. Furthermore, no one should be using Legacy Python in 2019: it’s less beautiful and only has a couple more months of updates before it stops receiving support of any kind.
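
As a rough illustration of the comment-based syntax mentioned in the note, PEP 484 describes `# type:` comments placed right below the `def` line. The `greet` function here is a hypothetical example, not taken from Kite's repository. Because the hint is just a comment, the sketch should run unchanged on Python 2.7 and Python 3.

```python
# Comment-style type hints as described in PEP 484.
# `greet` is a hypothetical function used only for illustration.
def greet(name='Joe'):
    # type: (str) -> str
    return 'Hello ' + name


print(greet('Mark'))
```

A type checker reads the comment in the same way it would read an inline annotation; the interpreter ignores it.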

Getting started with types

The code for this article may be found at Kite’s GitHub repository.

The hello world of type hinting is:

# hello_world.py
def hello_world(name: str = 'Joe') -> str:
    return f'Hello {name}'

We have added two type hint elements here. The first one is : str after name and the second one is -> str towards the end of the signature.

The syntax works as you would expect: we’re marking name to be of type str and we’re specifying that the hello_world function should return a str. If we use our function, it does what it says:

> hello_world(name='Mark')
'Hello Mark'

Since Python remains a dynamically typed language, we can still shoot ourselves in the foot:

> hello_world(name=2)
'Hello 2'

What’s happening? Well, as I wrote in the introduction, no type checking happens at runtime.

So as long as the code doesn’t raise an exception, things will continue to work fine.
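
One way to see this for yourself is to look at where the hints end up. The sketch below assumes a version of hello_world that returns its greeting; the hints are stored as plain metadata on the function object and are not consulted when the function is called.

```python
def hello_world(name: str = 'Joe') -> str:
    return f'Hello {name}'


# The hints are kept as ordinary metadata on the function object...
print(hello_world.__annotations__)
# {'name': <class 'str'>, 'return': <class 'str'>}

# ...and nothing compares them against the argument at call time.
print(hello_world(name=2))
# Hello 2
```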

What should you do with these type definitions then? Well, you need a type checker, or an IDE that reads and checks the types in your code (PyCharm, for example).

Type checking your program

There are at least four major type checker implementations: Mypy, Pyright, Pyre, and Pytype:

  • Mypy is actively developed by, among others, Guido van Rossum, Python’s creator;
  • Pyright has been developed by Microsoft and integrates very well with their excellent Visual Studio Code;
  • Pyre has been developed by Facebook with the goal of being fast (even though Mypy recently got much faster);
  • Pytype has been developed by Google and, besides checking the types as the others do, it can run type checks (and add annotations) on unannotated code.

Since we want to focus on how to use typing from a Python perspective, we’ll use Mypy in this tutorial. We can install it using pip (or your package manager of choice):

$ pip install mypy
$ mypy hello_world.py 
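
To give the checker something to complain about, here is a small sketch of an implementation that has drifted away from its signature. The function name `count_words` and the scenario are hypothetical, and the exact wording of the report depends on the Mypy version, so no output is reproduced here.

```python
# `count_words` is a hypothetical function used only for illustration.
def count_words(text: str) -> int:
    # The implementation was changed to return the words themselves,
    # but the declared return type was left as int.
    return text.split()


# Python runs this without complaint; a static checker such as Mypy
# is expected to report the mismatch between the list and the int hint.
print(count_words('type hints'))
```

This is the "documentation layer" from the introduction at work: the program still runs, and it is the checker that points out the signature no longer tells the truth.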

More advanced types

In principle, all Python classes are valid types, meaning you can use str, int, float, etc. Annotating dictionaries, tuples, and similar containers along with their contents is also possible, but you need to import the corresponding types (Dict, Tuple, and so on) from the typing module.

# tree.py
from typing import Tuple, Iterable, Dict, List, DefaultDict
from collections import defaultdict

def create_tree(tuples: Iterable[Tuple[int, int]]) -> DefaultDict[int, List[int]]:
    """
    Return a tree given tuples of (child, father)

    The tree structure is as follows:

        tree = {node_1: [node_2, node_3], 
                node_2: [node_4, node_5, node_6],
                node_6: [node_7, node_8]}
    """
    tree = defaultdict(list) 
    for child, father in tuples:
        if father:
            tree[father].append(child)
    return tree

print(create_tree([(2, 1), (3, 1), (4, 3), (1, 6)]))
# will print
# defaultdict(<class 'list'>, {1: [2, 3], 3: [4], 6: [1]})

While the code is simple, it introduces a couple of extra elements:

  • First of all, the Iterable type for the tuples variable. This type indicates that the object should conform to the collections.abc.Iterable specification (i.e. implement __iter__). This is needed because we iterate over tuples in the for loop;

  • We specify the types inside our container objects: the Iterable contains Tuple, each Tuple is a pair of int, and so on.
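
The practical benefit of Iterable is that callers are not tied to one container. A minimal sketch, using a hypothetical helper `children_of` that is not part of the article's code, shows a list, a tuple, and a generator all being accepted:

```python
from typing import Iterable, List, Tuple


# `children_of` is a hypothetical helper used only for illustration.
def children_of(tuples: Iterable[Tuple[int, int]], father: int) -> List[int]:
    """Collect the children of `father` from (child, father) pairs."""
    return [child for child, parent in tuples if parent == father]


pairs = [(2, 1), (3, 1), (4, 3)]

print(children_of(pairs, 1))                # list          -> [2, 3]
print(children_of(tuple(pairs), 1))         # tuple         -> [2, 3]
print(children_of((p for p in pairs), 1))   # generator     -> [2, 3]
```

Had the parameter been annotated as List[Tuple[int, int]], a checker would be entitled to reject the last two calls even though they work at runtime.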

Ok, let’s try to type check it!

$ mypy tree.py
tree.py:14: error: Need type annotation for 'tree'

Uh-oh, what’s happening? Basically Mypy is complaining about this line:

tree = defaultdict(list)

While we know that the return type should be DefaultDict[int, List[int]], Mypy cannot infer that tree is indeed of that type. We need to help it out by specifying tree’s type, using the same annotation syntax we use in the signature:

tree: DefaultDict[int, List[int]] = defaultdict(list)

If we now re-run Mypy, all is well:

$ mypy tree.py
$
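
The same variable annotation syntax is handy for other containers that start out empty, where a checker may ask for help in the same way. A short sketch, with hypothetical variable names:

```python
from typing import Dict, List, Set

# Empty containers carry no element type of their own,
# so each one is annotated explicitly.
names: List[str] = []
ages: Dict[str, int] = {}
seen: Set[int] = set()

names.append('Mark')
ages['Mark'] = 42
seen.add(7)

print(names, ages, seen)
```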

Type aliases

Sometimes our code reuses the same composite types over and over again. In the above example, Tuple[int, int] might be such a case. To make our intent clearer (and shorten our code), we can use type aliases. Type aliases are very easy to use: we just assign a type to a variable, and use that variable as the new type...

Relation = Tuple[int, int]

def create_tree(tuples: Iterable[Relation]) -> DefaultDict[int, List[int]]:
    """
    Return a tree given tuples of (child, father)

    The tree structure is as follows:

        tree = {node_1: [node_2, node_3], 
                node_2: [node_4, node_5, node_6],
                node_6: [node_7, node_8]}
    """
    # convert to dict
    tree: DefaultDict[int, List[int]] = defaultdict(list) 
    for child, father in tuples:
        if father:
            tree[father].append(child)

    return tree
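
The return type is repeated in the signature and in the body, so it is a candidate for an alias as well. The sketch below adds a second alias, `Tree`, which is a hypothetical name and not part of the article's code:

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple

Relation = Tuple[int, int]
Tree = DefaultDict[int, List[int]]  # hypothetical second alias


def create_tree(tuples: Iterable[Relation]) -> Tree:
    """Return a tree given tuples of (child, father)."""
    tree: Tree = defaultdict(list)
    for child, father in tuples:
        if father:
            tree[father].append(child)
    return tree


# An alias is just another name for the same type object.
print(Relation == Tuple[int, int])  # True
print(create_tree([(2, 1), (3, 1), (4, 3)]))
```

Since an alias is interchangeable with the type it names, a checker treats Iterable[Relation] and Iterable[Tuple[int, int]] identically.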

Generics

Experienced programmers of statically typed languages might have noticed that defining a Relation as a tuple of integers is a bit restrictive. Can’t create_tree work with a float, or a string, or the ad-hoc class that we just created?

In principle, there’s nothing that prevents us from using it like that:

# tree.py
from typing import Tuple, Iterable, Dict, List, DefaultDict
from collections import defaultdict

Relation = Tuple[int, int]

def create_tree(tuples: Iterable[Relation]) -> DefaultDict[int, List[int]]:
    ...

print(create_tree([(2.0,1.0), (3.0,1.0), (4.0,3.0), (1.0,6.0)]))
# will print
# defaultdict(<class 'list'>, {1.0: [2.0, 3.0], 3.0: [4.0], 6.0: [1.0]})

However, if we ask Mypy’s opinion of the code, we’ll get an error:

$ mypy tree.py
tree.py:24: error: List item 0 has incompatible type 'Tuple[float, float]'; expected 'Tuple[int, int]'
...

There is a way in Python to fix this. It’s called TypeVar, and it works by creating a generic type variable that makes no assumption about the concrete type: it only requires that the type stays the same everywhere the variable appears. Usage is pretty simple...
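
The article's own TypeVar code is not reproduced in this excerpt, so what follows is only a sketch of how it might look, under the assumption that a single type variable (called `T` here, a hypothetical name) stands in for the node type:

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple, TypeVar

# T is a placeholder: whatever type the nodes have, it must be the
# same for children, fathers, keys, and list items.
T = TypeVar('T')


def create_tree(tuples: Iterable[Tuple[T, T]]) -> DefaultDict[T, List[T]]:
    """Return a tree given tuples of (child, father)."""
    tree: DefaultDict[T, List[T]] = defaultdict(list)
    for child, father in tuples:
        if father:
            tree[father].append(child)
    return tree


print(create_tree([(2.0, 1.0), (3.0, 1.0), (4.0, 3.0), (1.0, 6.0)]))
# defaultdict(<class 'list'>, {1.0: [2.0, 3.0], 3.0: [4.0], 6.0: [1.0]})
print(create_tree([('b', 'a'), ('c', 'a')]))
# defaultdict(<class 'list'>, {'a': ['b', 'c']})
```

With this signature, floats and strings should both be acceptable to a checker, while the link between the input and output types is preserved.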

Check out the code in “Guide: Type Hinting in Python 3.5” on Kite's blog.
Giovanni Lanzani is the Director of Learning and Development at GoDataDriven.

Top comments (2)

zchtodd

Nice guide to type hints. I've always been on the fence about it myself. I haven't tried it yet but mypy seems kinda like TypeScript in that it's adding optional type enforcement on top of an existing language.

Aaron Harris

Thanks! I'll let Giovanni know you liked it. Part of the allure is adding type checking to the CI workflow, though I don't think this was specifically discussed in the article. A lot of these "tacked on" features seem to find their homes in linting and deployment.