Python’s “Type Hints” are a bit of a disappointment to me

source link: https://www.uninformativ.de/blog/postings/2022-04-21/0/POSTING-en.html

2022-04-21

In Python 3.5, “type hints” were introduced. I was really excited when I first heard this. You can now annotate functions:

def greeting(name: str) -> str:
    return 'Hello ' + name

And, since Python 3.6, variables:

foo: str = greeting('penguin')

That looks very promising. I was expecting that I could finally do static typing in Python. I’m a big fan of that, because it allows me to catch lots of errors in advance. You don’t need to run every single line of code to make sure that your types are correct – instead, a static analyzer can now inspect your code.

There are several limitations, though.

I had already made up my mind about this a while ago and decided to not use type hints in my own projects, but recently, we started a new project at work and people wanted to use type hints again. I decided to give them another chance. It didn’t go so well. I stand by my initial decision, sadly.

Despite all that, what I’m hoping for is that someone will come along and tell me that I got it all wrong. “When you do it as follows, the system works: $explanation” Because, you know, I’d really like to have static typing in Python.

The short version

The documentation says:

The Python runtime does not enforce function and variable type annotations. They can be used by third party tools such as type checkers, IDEs, linters, etc.

So, this executes perfectly fine:

foo: int = 'hello'
print(foo)

The fact that you can put nonsensical types wherever you want and still get a working program has consequences. Let’s explore them.
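As a minimal sketch of what "not enforced" means for functions as well: the hints end up as plain metadata on the function object, and the call goes through regardless. The function `double` is a made-up example, not something from the documentation.

```python
def double(value: int) -> int:
    # Nothing in here checks that `value` really is an int.
    return value * 2


# The hints are stored as metadata on the function object ...
print(double.__annotations__)

# ... but this call still runs, because str happens to support `*` too.
print(double('ab'))
```

The second call prints `abab`, even though both hints say `int`.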

Worse than comments and worse than Hungarian Notation

You sometimes see this:

# returns str
def greeting(name):
    return 'Hello ' + name

Since the language’s syntax itself has no concept of typing, people resort to adding a comment which states the return type. Similar to Python’s type hints, nothing ever checks these comments. But the thing is: They are comments. It is obvious that they are not binding.

Hungarian Notation, e.g. strName, can be deceptive, too. I think it’s a little worse than comments, because every time you read strName you are being nudged into thinking it’s a string, even though it might actually be something else. But at least it’s just a name and the reader knows that.

Type hints, on the other hand, appear to be binding. They are part of the official language and, to the uninitiated, they create the impression that they matter. Except they only kind of somewhat matter sometimes, as we’ll see. They are essentially a comment in disguise.

Type checking is your job after all

Now, if Python itself does not care about those hints in the slightest way, then what does? Well, maybe an IDE does or maybe an external tool like mypy does. I’ll be talking about mypy a lot now, but I think that all of this applies to every other tool as well. I don’t blame mypy for any of this.

But as a general rule: You don’t know if there really is a tool in place to check them. You especially don’t know this when you join a new project, but it’s also a little hard to tell afterwards. Did the mypy task in your build pipeline silently break? Maybe mypy is misconfigured? Maybe it spits out errors but does not cause build jobs to actually fail? What if mypy has a bug? What if mypy is not even complete and doesn’t cover all cases?

You see, unlike traditional compiled languages, where there is absolutely no way around type checks being performed, because you then wouldn’t get an executable program at all, you manually have to make sure that things like mypy do their job and do it correctly. It is very easy for things to go wrong here, because running mypy is not an essential step – it’s an additional, optional step, which requires maintenance.
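To make the "does it actually fail the build?" question concrete, here is a rough sketch of a gate script. It assumes the checker signals problems through a non-zero exit status (mypy does) and that the pipeline runs this script as a mandatory step. The names `run_gate` and `MYPY_COMMAND` are hypothetical.

```python
import subprocess
import sys
from typing import List

# Hypothetical invocation; flags and paths depend on the project.
MYPY_COMMAND: List[str] = [sys.executable, '-m', 'mypy', '--strict', '.']


def run_gate(command: List[str]) -> int:
    """Run a checker and hand its exit status back unchanged."""
    completed = subprocess.run(command)
    return completed.returncode


# In a pipeline, the last line of the script would be something like:
#     sys.exit(run_gate(MYPY_COMMAND))
# so that a non-zero status from the checker fails the job.
```

Even then, someone has to notice when this step gets removed or stops being run.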

Mypy and the Python runtime are also inherently out of sync and just because one of them says it’s fine, doesn’t mean the other one will agree.

My favorite story about this: Back when I was young and naïve, I joined a Python project which was full of type hints. I knew that, unlike me, these guys were very fond of type hints. As such, I assumed that they had tooling in place to check the hints (and that the hints were correct as a result). Guess what, they didn’t. Well, they did, but only in their IDEs – and the warnings of IDEs are simple to ignore. As a result, I read their code, saw those type hints, and my brain started to assume, “aha, this is of type foo, that’s type bar, …”. Most of this turned out to be wrong and it threw me off the track during my debugging session.

Note that their program worked and it had a good test suite and everything. Just the type hints were wrong. (Well, and there was this one bug that I was going to fix, which was an error in the program’s logic, though, unrelated to typing.)

It is much harder to fall into this trap when using a compiled language: When you write int instead of struct foo somewhere, it is very unlikely to go unnoticed. Your program will either not compile or it will show up when running your test suite. (Yes, C has its pitfalls in this regard.)

I don’t want to blame the other guys in the project here. Mistakes happen, it’s fine. Us humans need tools to avoid making them. My point is that Python’s type hints are probably too fragile for this job.

There is an Any type and it renders everything useless

Python is a dynamically typed language. By definition, you don’t know the real types of variables until runtime. This is a feature.

So, naturally, there now is an Any type. The following program passes mypy validation:

from random import randint
from typing import Any


def foo() -> Any:
    if randint(0, 1) == 0:
        return 42
    else:
        return 'foo,bar,baz'


bar = foo()
print(bar.split(','))

Of course, in 50% of the cases you get an exception (an AttributeError, because an int has no split() method).
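For comparison, a sketch of the same function with an honest return type. With `Union[int, str]`, a checker has enough information to complain about a bare `.split()` call, and you have to narrow the type first. The helper `split_if_str` is an invented name.

```python
from random import randint
from typing import List, Union


def foo() -> Union[int, str]:
    if randint(0, 1) == 0:
        return 42
    return 'foo,bar,baz'


def split_if_str(value: Union[int, str]) -> List[str]:
    # isinstance() narrows the Union, so .split() is only called on a str.
    if isinstance(value, str):
        return value.split(',')
    return [str(value)]


print(split_if_str(foo()))
```

This only helps if whoever wrote `foo()` bothered to spell out the Union instead of reaching for Any.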

Any also goes both ways, which was pretty surprising to me:

A special kind of type is Any. A static type checker will treat every type as being compatible with Any and Any as being compatible with every type.

So, this is valid code:

from typing import Any


def foo() -> int:
    bar: Any = 'hello'
    return bar


result = foo()
print(result)

In other words, even if your project uses type hints all over the place and has a checker like mypy in place, your type hints might still be wrong. foo() does not return an integer.

Mypy has an option to turn this into an error:

$ mypy --warn-return-any foo.py
foo.py:9: error: Returning Any from function declared to return "int"
Found 1 error in 1 file (checked 1 source file)

I guess they cannot turn this on by default, because, well, the Python docs say it’s valid. So, yes, you can somehow avoid this trap, but it reminds me a lot of having to do things like -pedantic -Wall -Werror -Wyes-really -Wyeah-i-mean-it --dont-you-listen-to-me in C projects. Flags like these are problematic because you have to remember to use them – and in new versions of your type checker, you have to make sure that they didn’t change their meaning and that they didn’t introduce new flags that you really, really should use.

Also, what if some of your code relies on using Any? Python is a dynamically typed language and it’s perfectly legal to do these kinds of things. You will then probably proceed to tell mypy to turn off the Any checks just for some modules … which is a recipe for chaos.

You must also be aware that Any can sneak in through libraries. This happened to me and it went unnoticed for quite a while. Let’s take the example from above again and make it more explicit:

from typing import Any


def an_imported_library_function() -> int:
    bar: Any = 'hello'
    return bar


result: int = an_imported_library_function()
print(result, type(result))

Suppose you look at the documentation for that library: It says it returns int. You might even glance at the source code and see that -> int. You make your own variable result an int. Everything screams int. Mypy does not report any errors by default. And yet, when you run it, you get this:

$ ./foo.py
hello <class 'str'>
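If you cannot trust the hint, the only thing left is a check at runtime. A minimal sketch, assuming you are willing to wrap calls at the library boundary. `checked_int` is a hypothetical helper, not part of any library.

```python
from typing import Any


def an_imported_library_function() -> int:
    bar: Any = 'hello'
    return bar


def checked_int(value: object) -> int:
    """Hypothetical helper: fail loudly if `value` is not really an int."""
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f'expected int, got {type(value).__name__}')
    return value


try:
    result: int = checked_int(an_imported_library_function())
except TypeError as error:
    print(f'caught at the boundary: {error}')
```

Which, of course, is exactly the kind of manual checking that type hints were supposed to make unnecessary.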

Last but not least, Any lurks in a couple of places. When you write foo: Dict = {}, it’s actually foo: Dict[Any, Any] = {}, and boom.

Duck type compatibility

Example:

foo: int = 123
bar: float = foo

if isinstance(foo, int):
    print('foo is an int')
if isinstance(foo, float):
    print('foo is a float')

if isinstance(bar, int):
    print('bar is an int')
if isinstance(bar, float):
    print('bar is a float')

Passes validation:

$ mypy --strict numeric.py
Success: no issues found in 1 source file

Unexpected result:

$ python3 numeric.py
foo is an int
bar is an int

How can I declare bar as float, but then it accepts an int and actually is an int at runtime (so it’s not like it’s being converted automatically – of course not, the runtime does not care about type hints)?

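The reason is duck type compatibility: PEP 484 specifies that an int is acceptable wherever a float is expected, even though int is not a subclass of float.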

This is probably not a big deal in Python, though. When you divide two integers, 1 / 3, the result is a float and not accidentally an int, like in some other languages. And luckily this is limited to just a few built-in types.

And yet … It says bar: float but it’s an int. They could have just called it number, if it’s ambiguous anyway.
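If you really need a float at runtime, you have to convert explicitly. A small sketch:

```python
foo: int = 123
bar: float = float(foo)  # explicit conversion instead of relying on the hint

print(isinstance(foo, float))  # False
print(isinstance(bar, float))  # True
print(issubclass(int, float))  # False: the compatibility exists only in the checker
```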

Most projects need third-party type hints

Most Python projects out there pre-date type hints, so they are completely untyped. The default of mypy is to go “🤷” and accept untyped code almost as if it was declared Any. Again, there is an option to turn on these checks, so this is another thing to know and remember.

Now, if you do want to make use of type hints when using such libraries, well, you have to add the hints. typeshed contains hints for a bunch of popular projects.

I hope you realize what this means: The library itself and its type hints are out of sync. When mypy does not report any errors for your code, what does that mean? Do you actually call that library correctly? Do you really know that now, just because mypy spits out a green line? Are you sure?

Sadly, dataclasses ignore type hints as well

So we were making a client for a REST API. Traditionally, we would have built dicts and then POSTed them:

payload = {
    'cars': [
        {
            'name': 'toy yoda',
            'wheels': 4,
        },
    ],
    'salad': 'potato',
    'version': 8,
}

Code like this can get really messy really fast. Python 3.7 introduced dataclasses. Together with type hints, it might look like this:

from dataclasses import dataclass
from typing import List, Literal


@dataclass
class Car:
    name: str
    wheels: int


@dataclass
class APIRequest:
    cars: List[Car]
    salad: Literal['potato', 'italian']
    version: Literal[8]


payload = APIRequest(
    cars=[
        Car(
            name='toy yoda',
            wheels=4,
        ),
    ],
    salad='potato',
    version=8,
)

If you now turned wheels=4 into wheels='4', mypy would report an error.

I’d argue that this is better code. Dataclasses and type hints allow the reader to know how an API request is composed. Looks like it’s self-documenting, right?

Of course, writing wheels='4' can be reported as an error, but how much of your code handles static data like this? Isn’t it much more likely that this '4' is actually a variable? The big question then becomes where this variable comes from and whether it’s covered by (correct) type hints. If it isn’t (or if it’s Any), then your build is green, but your code might be wrong and will fail somewhere down the road (just as if you didn’t have type hints in the first place).

It would have been really nice if dataclasses automatically honored their type hints and raised errors on mismatches.
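You can build this yourself with `__post_init__`, which dataclasses call after the generated `__init__`. The following is only a sketch for the two simple fields of `Car`; nested or generic types would need more work.

```python
from dataclasses import dataclass


@dataclass
class Car:
    name: str
    wheels: int

    def __post_init__(self) -> None:
        # Hand-written checks; nothing derives them from the hints above.
        if not isinstance(self.name, str):
            raise TypeError(f'name must be str, got {type(self.name).__name__}')
        if not isinstance(self.wheels, int):
            raise TypeError(f'wheels must be int, got {type(self.wheels).__name__}')


Car(name='toy yoda', wheels=4)  # accepted

try:
    Car(name='toy yoda', wheels='4')
except TypeError as error:
    print(error)  # wheels must be int, got str
```

Note that the checks and the hints are two separate things that can drift apart again.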

Literal is extra deceptive: It not only states the type of something, but also declares which contents are allowed. Again, this is not relevant at runtime. All your type hints can be perfectly fine, but if you read the string potatu instead of potato from a file or an HTTP response or whatever, then nothing will ever complain about it.
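A sketch of what checking such a value at runtime could look like, using `typing.get_args` to read the allowed values back out of the Literal. `Salad` and `parse_salad` are invented names, and the `cast` is only there to tell the checker what the manual test has just established.

```python
from typing import Literal, cast, get_args

Salad = Literal['potato', 'italian']


def parse_salad(raw: str) -> Salad:
    """Hypothetical helper: validate external input against the Literal."""
    if raw not in get_args(Salad):
        raise ValueError(f'unknown salad: {raw!r}')
    return cast(Salad, raw)


print(parse_salad('potato'))

try:
    parse_salad('potatu')
except ValueError as error:
    print(error)
```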

I’d still argue that it’s better to use dataclasses (maybe without type hints) than to compose chaotic dicts, but this is probably a personal preference of mine. Many Python programmers praise the language for not having to use classes like this. People like that they can just throw around dicts.

Type inference and lazy programmers

Consider this:

foo = {
    'hello': 'world',
    'bar': ['baz'],
}

foo['bar'].append('potato')

print(foo)

Since all programmers are as lazy as they can get away with, this code does not contain any type hints. Most code probably looks like this. Mypy then has no other choice but to guess what the types must be, because, well, you want to do type checking, don’t you?

So, while the code snippet executes just fine:

$ ./bar.py
{'hello': 'world', 'bar': ['baz', 'potato']}

Mypy is not happy with it:

$ mypy bar.py
bar.py:9: error: "Sequence[str]" has no attribute "append"
Found 1 error in 1 file (checked 1 source file)

The detected type isn’t wrong per se, because both 'world' and ['baz', 'potato'] are technically sequences of str (run python -c 'for i in "foo": print(type(i))' to see it in action). It’s not quite what the code intended to say, though, because the latter is clearly supposed to be a list which indeed has an .append() method.

How do you solve this? The obvious answer is to add the correct hints:

from typing import Dict, List, Union


foo: Dict[str, Union[str, List[str]]] = {
    'hello': 'world',
    'bar': ['baz'],
}

foo['bar'].append('potato')

print(foo)

Now, it … wait, it’s still wrong:

$ mypy bar.py
bar.py:12: error: Item "str" of "Union[str, List[str]]" has no attribute "append"
Found 1 error in 1 file (checked 1 source file)

Crap, we couldn’t properly express that only bar is a list and hello is a string. We now first have to resolve the type ambiguity of this Union, so that we’re making sure that what we’re dealing with really is a list:

if isinstance(foo['bar'], list):
    foo['bar'].append('potato')

Note, though, that this is one of the few instances where I think that mypy can be beneficial: It tells you that, whoops, you’re dealing with something that might not be what you think it is. This is good, this is what I want to see. It happened a couple of times in our code base and prevented actual bugs. Great!

Another way might be to use TypedDict:

from typing import List, TypedDict


TypedFoo = TypedDict(
    'TypedFoo',
    {
        'hello': str,
        'bar': List[str],
    },
)

foo = TypedFoo(
    hello='world',
    bar=['baz'],
)

foo['bar'].append('potato')

print(foo)
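Keep in mind that a TypedDict is just a plain dict at runtime, so the same caveat applies. A short sketch, using the class-based syntax for the same declaration:

```python
from typing import List, TypedDict


class TypedFoo(TypedDict):
    hello: str
    bar: List[str]


foo = TypedFoo(hello='world', bar=['baz'])
print(type(foo))  # <class 'dict'>

# A type checker would reject the next line, but it runs without any error.
wrong = TypedFoo(hello=42, bar='not a list')
print(wrong)
```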

What happens in real life, though, will probably be this:

from typing import Dict


foo: Dict = {
    'hello': 'world',
    'bar': ['baz'],
}

foo['bar'].append('potato')

print(foo)

Now, foo is of type Dict[Any, Any], the damn tool is happy, the Python runtime doesn’t care anyway (your program worked the entire time), and the lazy programmer can finally close that annoying Jira ticket. ;-)

See, since good typing isn’t enforced or even encouraged by this whole thing, there are many ways to avoid having to deal with types at all, if you’re just not in the mood right now. You don’t even have to do something along the lines of #pragma disable type_check – you can just throw in some type that’s probably close enough and then claim: “Hey, look, we got type hints!”

It requires a lot of discipline and good faith to get those hints right.

Exceptions are not covered

Exceptions are out of scope of type hints: You cannot annotate a function and say, “this might throw a ValueError”.

This is a bit unfortunate. In Python, exceptions can be very surprising, because you never know when they are being thrown. In, say, Java, checked exceptions are part of a method’s signature, and this helps a lot.
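One workaround is to put the failure into the return type instead of raising, so that the signature at least hints at it. This is only a sketch with a made-up function, and it obviously doesn't help with exceptions raised by code you didn't write.

```python
from typing import Optional


def parse_port(text: str) -> Optional[int]:
    """Hypothetical example: return None instead of raising ValueError."""
    try:
        return int(text)
    except ValueError:
        return None


port = parse_port('eighty')
if port is None:
    print('not a number')
else:
    print(port + 1)
```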

What you really want is a compiler, isn’t it?

If you’re a fan of type hints, you should ask yourself that question.

You must run the compiler on your code, you cannot skip that step. It checks types (and other things) and thus catches a lot of errors in advance. Do you call that function correctly? Does that object have a .split() method? Is this variable even defined? Hey, let’s auto-generate some meaningful API docs directly from the code, which is something we can do, because we know that the types are correct, so the generated HTML files will also be correct. And so on.

It feels really good to see a program compile without warnings or errors. You then know that you got it right. This is especially true for Rust, which really doesn’t let a lot of things slip. (You might get a little bit of a different experience with C, though.)

And this is the kind of thing that I initially hoped could be accomplished by using Python’s type hints, but, sadly, no.

Even if the Python runtime did check all the type hints at runtime, then it would still be too late. I don’t want a fancy type exception at runtime. That already exists (most of the time). I want to know about type mismatches in advance.
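To illustrate, here is a rough sketch of what such runtime checking could look like as a decorator. `enforce` is a hypothetical name, and it only handles hints that are plain classes. The point is that the mismatch still only surfaces once the offending call is actually executed.

```python
import functools
import inspect
from typing import Any, Callable, get_type_hints


def enforce(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical decorator: check plain-class hints on every call."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; Union, List[int] etc. are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f'{name} must be {expected.__name__}')
        return func(*args, **kwargs)

    return wrapper


@enforce
def repeat(text: str, times: int) -> str:
    return text * times


print(repeat('ab', 3))

try:
    repeat(3, 'ab')  # would happily return 'ababab' without the decorator
except TypeError as error:
    print(error)  # only discovered once this line actually runs
```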

What I want is this: Some language that’s as easy to use as Python and it should be compiled and with good static typing, but it should also not be compiled because then it wouldn’t be as easy to use as Python anymore. Whoops?

Still, aren’t type hints better than nothing?

This blog post was originally titled “Python’s Type Hints considered harmful”. Whether they are harmful or not depends on the answer to that question: Isn’t it better to detect at least some errors than to detect none at all?

I’d say: No. Type hints are not binding. You cannot be sure that they are correct. As such, you must always treat them as if they were wrong. And I think that indeed makes them more harmful than useful. They waste mental energy when reading code, they create new maintenance burdens, and they are potentially deceiving: You cannot trust them. There’s too much that can go wrong with them and I don’t think that’s fixable unless you make it a new language.

While writing the code examples for this blog post, I had a nagging feeling at the back of my head: “Is my code even correct?” Yes, even just those few lines. We’re not even talking about Dict[str, Union[Dict[str, Union[int, str, bool]], List[int]]], which I have seen in the wild, even though you could argue that your code shouldn’t deal with that stuff in the first place.

Conclusion

I like Python. It’s a nice and easy to use language. And if foo: int made sure that foo will always be an int and nothing else, then I’d be using type hints everywhere. Would that still be Python?


