
Guillaume Pasquet


Why I stay away from Python type annotations

Ever since optional type annotations were added in Python 3.5, the question of using them keeps creeping back everywhere I work. Some see them as a step forward for the Future of Python™, but to me and many others it's a step back from what coding in Python fundamentally is. I've been in a number of debates over type annotations at work, so I decided to compile some of the recurring points of discussion here.

Static typing will protect you

This is the argument universally put forward for type annotations: they'll save us from ourselves. Someone has already studied this claim for TypeScript, but I think looking at some code will suffice.

Let's take some of the examples of type annotations you can easily find out there:

def concat(a: int, b: int) -> str:
    return str(a) + str(b)

Okay, so you've written a custom concat that only operates on integers. But does it really? Python's str() will work with just about anything, not just int, so really this function will work with any two arguments that can be converted to strings. Here's how this function should be written so that the typing is enforced at runtime:

def concat(a: int, b: int) -> str:
    "Raises TypeError" # <- Type annotations don't support this
    if type(a) != int or type(b) != int:
        raise TypeError()
    return str(a) + str(b)

The parameter types aren't checked at runtime, so it is essential to check them yourself. This is particularly true if the code you're writing will be used as a library, and given Python's modular nature, any code can be imported and re-used. Relying on type annotations alone therefore isn't enough to ensure that your code is safe and honours the contract outlined by those annotations.
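
To make the point concrete, here is what the first, annotation-only version of concat actually does at runtime (a minimal sketch, with no external type checker in play):

def concat(a: int, b: int) -> str:
    return str(a) + str(b)

# Both calls run without error, despite violating the annotations;
# only a separate tool such as mypy would flag them.
print(concat("foo", [1, 2]))  # prints "foo[1, 2]"
print(concat(3.5, None))      # prints "3.5None"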

Readability

Another claim I see often is that type hints improve readability. Let's take a look.

def concat(a, b):
    ...

def concat(a: int, b: int) -> str:
    ...

Okay, at face value this is actually clearer, or at the very least the typing doesn't hurt readability. Now let's look at real life.

def serialize(instance, filename, content, **kwargs):
    ...

def serialize(instance: Instance, filename: str, content: Optional[Dict[str, Any]] = None, **kwargs: Any) -> bool:
    ...

Now that's becoming hairy. Don't laugh, this is inspired by real code I see daily.

So we have a function that serializes god knows what; it takes an instance, a filename and some content. With the type-annotated version we can tell that the instance is, confusingly, an Instance, that the filename is a str, and that content is a horrible optional mess; it probably goes deeper, which is why the author gave up and just put Any. It returns a boolean, but we have no idea what that boolean value means.

So in this case, the type hints just let us ask more questions, which could be a good thing. However, let's be honest: this function wouldn't pass code review in either case.

Here's a slightly better one:

def serialize_foo_on_instance(instance, filename, content, **kwargs):
    ...

class Foo:
    data: dict[str, Any] = {}
    ...

def serialize_foo_on_instance(instance: Instance, filename: str, content: Optional[Foo], **kwargs: Any) -> bool:
    ...

Okay that's slightly better. The secret sauce here was just to improve our naming to make the function's role more explicit -- a best practice.

Note that to get rid of the lengthy type annotation we had to define a new class in the second version. From what I've found, this is the recommended approach. However, there are times when adding abstraction layers isn't right: they divorce the code from the original data and carry a certain performance cost.

It's also possible to alias the type; but I still feel the typing is pushing me towards more abstraction.
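
For illustration, an aliased version might look something like this (a quick sketch; the Content alias name is purely illustrative, and Instance is the same class as in the earlier signatures):

from typing import Any, Dict, Optional

Content = Optional[Dict[str, Any]]  # illustrative alias name

def serialize_foo_on_instance(instance: Instance, filename: str,
                              content: Content = None, **kwargs: Any) -> bool:
    ...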

Self-documenting code?

Let's have one more go to see if we can improve readability further:

def serialize_foo_on_instance(instance, filename, content, **kwargs):
    """
    Serializes foo on a specific instance of bar.
    Takes a foo data, serializes it and saves it as ``filename`` on
    an instance of bar.
    :instance: instance to serialize the foo on
    :filename: file name to serialize to
    :content: foo data, just creates the file if None
    :returns: True on success, False on error
    """
    ...

Okay, now we know what the function does, and what the parameters are supposed to be. Let's see how that looks with type annotations:

def serialize_foo_on_instance(instance: Bar, filename: str, content: Optional[Foo], **kwargs: Any) -> bool:
    """
    Serializes foo on a specific instance of bar.
    Takes a foo data, serializes it and saves it as ``filename`` on
    an instance of bar.
    :instance Bar: instance to serialize the foo on
    :filename str: file name to serialize to
    :content Optional[Foo]: foo data, just creates the file if None
    :returns bool: True on success, False on error
    """
    ...

Right, so we're a bit more verbose and have specified the types each parameter takes. We've introduced docstrings for both our definitions, and they explain what the function does, the role of the parameters, what happens to the optional ones and what the returned bool values mean.

Could we do away with the docstring and solely rely on "self-documentation" through type annotations? Not a chance: -> bool doesn't say anything about what it means to receive either True or False. In the same way, Optional[Foo] doesn't give us a clue about what happens when the value is None.

Write generic code = reuse

Python is magnificent in how reusable it is. Every file you write is a module and can be reused for any purpose. Ages ago I wrote a software forge for Bazaar just by reusing modules from Bazaar itself, even though they were never intended to be used that way. This permeates the entire language, including function definitions.

By clamping down on types, are we making our code less reusable? Possibly; let's experiment. Let's assume that instance is an object obtained from a string ID, and that we'd really like to use some kind of string generator for filename. Let's have a look:

class FilenameGenerator:
    def __str__(self):
        return "blah.txt"

def serialize_foo_on_instance(instance, filename, content, **kwargs):
    if type(instance) == str:
        instance = Bar.by_name(instance)
    ...

filename_gen = FilenameGenerator()
serialize_foo_on_instance("bob", filename_gen, content)

Pretty straightforward. Now let's annotate this.

def serialize_foo_on_instance(instance: Union[Bar, str], filename: Union[str, ??], content: Foo, **kwargs: Any):
    if type(instance) == str:
        instance = Bar.by_name(instance)

Wow, that's already more involved. But is it even truly generic? In other languages we'd use interfaces or abstract types and inheritance to make functions generic. I couldn't find the type name for any object that can be cast to a str so I put ?? for now.

Without type annotations, our code is generic right off the bat. Possibly overly so, which is why we need to do some pre-flight checks. With annotations, our code is "specific" by default and we have to work hard to make it generic. Note the quotes around "specific": this is only enforced by static analysis tools like mypy, so you still need to do your pre-flight checks. This is a fundamental shift in the nature of the language.

Python vs the world

A lot of developers like to claim that type hints are the best thing for the language since sliced bread. However I get the feeling those people only look at Python in a vacuum.

Python's selling points as a language are its readability and ease of coding, which translate into speed for developers. Adding type annotations greatly reduces those advantages, and Python becomes less attractive in the wider programming world.

Below is a Fibonacci implementation in Python:

from typing import List

def fibonacci(previous: List[int], length: int, depth: int = 0) -> List[int]:
    if depth == length:
        return previous
    previous.append(previous[-1] + previous[-2])
    return fibonacci(previous, length, depth + 1)

if __name__ == "__main__":
    start: List[int] = [1, 2]
    print(fibonacci(start, 100))

And the same with Rust:

fn fibonacci(previous: &mut Vec<u128>, length: u32, depth: u32) -> &Vec<u128> {
    if depth == length {
        return previous;
    }
    previous.push(previous[previous.len() - 2] + previous[previous.len() - 1]);
    fibonacci(previous, length, depth + 1)
}

fn main() {
    let mut start = vec![1, 2];
    println!("sequence: {:?}", fibonacci(&mut start, 100, 0))
}

Here we see that the difference between Python and a more advanced, powerful language has become minimal. Rust is much faster and allows me to do more than Python, so the question becomes: why choose Python?

Please don't pick on my choice of Rust; this argument would work just as well with Go, Java, C#, etc.

Conclusion

From my perspective, type annotations provide little benefit at the cost of extra work. They can lead to a false sense of security that the parameter types you're getting in a function are guaranteed, but there is no such check performed at runtime.

There's also a false sense that type annotations provide documentation, but they never explain what a function does or how the data within those types is affected during a function call. So they're no substitute for good docstrings.

Given this, I prefer not to use them, and to keep writing clean, well-documented code.

Top comments (8)

Martin Thoma

I disagree with pretty much every point you made:

  1. Protection: It's not about the runtime. In my projects, I let mypy run in CI. This means the types in all production code are checked. Yes, there are caveats, like needing to set certain mypy flags so that everything is actually checked. The case that happens most often to me is a check for "None" which I forgot.
  2. Readability: You picked a pretty bad usage of types and still it helps. In the typed example, I know that a dictionary is expected. Later in that section you also mentioned a couple of ways the annotation can be improved. One part that I didn't see was the usage of pydantic/dataclasses and NewType (see the sketch after this list). For Dict[str, Any] and str types you can typically improve this quite a bit. This also typically changes the architecture a bit, moving validation of data to an earlier stage. Hence the usage of type annotations leads to a better architecture and thus more readable/maintainable code.
  3. Documentation: I wouldn't get rid of documentation - if it exists. If you use type annotations, you can automatically check whether they are correct and whether they exist at all. For docstrings ... well, it's mixed. I've seen developers who add them everywhere, developers who don't add them anywhere, teams using numpy/sphinx/google style... that's a whole story of its own.
  4. Python vs the world: Libraries, personal preference, speed of development, the ecosystem. If none of those is a good reason for you to use Python, then maybe you should use another language.
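
A rough sketch of what the NewType/dataclass approach from point 2 can look like (the names here are illustrative only):

from dataclasses import dataclass
from typing import NewType, Optional

Filename = NewType("Filename", str)  # illustrative name

@dataclass
class FooContent:  # could just as well be a pydantic model
    payload: dict

# checked in CI with something like `mypy --strict .`
def serialize_foo(filename: Filename, content: Optional[FooContent]) -> bool:
    ...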

In case you want to learn more about type annotations, have a look at medium.com/analytics-vidhya/type-a...

trodiz

You can read the docstring and still understand that a dictionary is expected, which is exactly what developers have been doing since well before version 3.5. And you can be as verbose or succinct as you like. Type annotations can only work with previously known types, and Python's ability to do polymorphism will be stifled by your type checkers: a dynamically defined type that would work with the function will still raise an error in your type checker. This is what OP meant by saying "going specific to generic".

In your final paragraph you tell OP to choose another language even though they clearly stated what makes Python a great language. You even repeated those points yourself. If I'm willing to annotate and type check everything in my code, I'd rather choose Rust or Go for the added benefit of performance. If you're the one looking for static typing from a slow, dynamically typed language, maybe, just maybe, it is you who needs to look for another language.

SENHAJI RHAZI Hamza

I totally agree with these arguments. What I like about Python is that I write code as fast as I think; I do both at the same time. Having to handle type constraints when my code is not stable yet takes away my freedom to express myself as I want.

Rocky Bronzino

Good programmers always write pseudocode first, then implement that in the language of choice. A while back I wrote pseudocode for a function I was explaining to someone, and realized it was syntactically valid Python code. So Python's huge advantage is that it is pseudocode that runs. This makes development much easier as we can start interacting with the pseudocode instead of waiting for it to be implemented. Putting types into Python completely strips this use case.

I will go further and say that anyone who thinks Python should have static types has no idea what static types really do. I am all for statically typed languages, the compile time guarantee and the machine sympathy that they allow. But putting types into Python is giving programmers the worst of both worlds: the slow iteration time of formal languages, and the slow run time of scripted languages. None of the benefits of either.

vladhaidukkk

I expected something different from the article. I would say that I disagree rather than agree with the arguments presented here. There was already a comment below that clearly describes why.

At this point, I think that type hints have many disadvantages, but it depends on what you are using Python for. If your goal is to write a web scraper/crawler, then doing a lot of checks for None can be very annoying, given that you are sure there will be a value there. But for large or medium-sized systems, this is a warning that you may have done something wrong. And yes, this is not a warning at the interpreter level, but at the level of some static code analyzer, but it is there. I understand that Python may not always be the best choice for large systems, but sometimes it is really convenient for them because of its other advantages.

In general, I want to say that using type hints is not bad and in most cases makes the code self-documenting and more understandable for other developers, but when they start to cause a lot of discomfort due to the dynamic nature of the language, then either you have designed your code poorly or type annotations are really unnecessary. That is, in certain cases, type annotations can be abandoned.

vladhaidukkk

Read this to understand when type hinting can be annoying: reddit.com/r/Python/comments/10zdi...

Arnaud Paran

Okay, you should check out protocols and PEP 544.

It is the way to implement duck typing in type annotations, so anything that can be cast to a str could be defined by:

import typing

class CastableToStr(typing.Protocol):
    def __str__(self) -> str:
        ...
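
Plugged into the signature from the article, that would give something like this (just a sketch reusing the article's Bar, Foo and Any annotations):

def serialize_foo_on_instance(instance: Union[Bar, str], filename: CastableToStr,
                              content: Foo, **kwargs: Any):
    ...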

Now, all the other points are overly personal, and the comparison with Rust is funny because Rust is one of those languages which prove that a good compiler with static typing etc. prevents mistakes.

And the final question: why choose Python? Well, I never understood why people would choose Python; using that language because "Python is easier to use" is just downright shooting yourself in the foot. Saving one month of coaching for your new developers and ending up with more bugs and issues does not seem like a great choice to me anyway.

daniele-niero

the comparison with Rust is funny because Rust is one of those languages which prove that a good compiler with static typing etc. prevents mistakes.

But that is exactly the point. Rust has a compiler; it is a statically typed language. Python is interpreted, and the interpreter doesn't understand these type annotations: they are just hints for other tools, like mypy, and they are not taken into consideration during program execution.
The annotations may say one thing, yet at run time you can pass something else and Python will run anyway, or at least it will run the function until it finds another problem, if it finds one.
In this sense they are a "false security mechanism".

There are other languages with an easy, expressive syntax similar to Python, but statically typed and compiled: Nim, Swift, Crystal, Julia and more. If there is a need for static types without losing the simplicity of the language, there are alternatives. I really don't understand why it has to be forced into Python, in a hacky way, making Python look like something it is not and losing sight of the dynamic nature of the language, which can be, given the right circumstances, a huge benefit.

The right tool for the right job.
Type hinting in Python feels like using a screwdriver as if it was a hammer.