This is a piece of Python code that works, and no linter will complain about it.
```python
from collections.abc import Sequence


def average_length(strings: Sequence[str]) -> float:
    total_length = sum(len(string) for string in strings)
    return float(total_length) / len(strings)


print(average_length(["foo", "bar", "spam"]))
#> 3.3333333333333335
print(average_length("Hello World!"))
#> 1.0
```
It looks like "Hello World!" shouldn't be a valid input for the `average_length` function. The intent is obviously to calculate the average length of a sequence of strings like `["foo", "bar", "spam"]`. Why is this happening?
Well, an `str` instance is a sequence of strings as well: it has a length, it's iterable, and it can even be reversed (see https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes). So, from the interpreter's point of view, any string in Python is just a sequence of one-character strings.
I'm not saying that Python is broken. It is the way it is, and that is not changing soon. However, what could be changed that would cause minimal harm but still make things a bit better? I suggest introducing a built-in single-character type, i.e. `char`. This would immediately make `str` a `Sequence[char]` rather than a `Sequence[str]`. Other than requiring the length to be exactly one, the `char` type could behave exactly like `str`.
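Nothing like this exists in the language today, but here is a rough user-level sketch of the idea (the name `Char` and its exact behavior are my assumptions, not an existing API):

```python
class Char(str):
    """A str constrained to exactly one character (hypothetical sketch)."""

    def __new__(cls, value: str) -> "Char":
        if len(value) != 1:
            raise ValueError(f"Char requires exactly one character, got {value!r}")
        return super().__new__(cls, value)


c = Char("a")
print(c.upper(), isinstance(c, str))  # behaves like str otherwise
#> A True
# Char("ab") would raise ValueError.
```

With a real built-in `char`, a type checker could then reject `average_length("Hello World!")`, because a bare `str` would be a `Sequence[char]` and not a `Sequence[str]`.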
Also, functions like `chr` and `ord` could benefit from more precise type annotations using `char` instead of `str`.
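For illustration, reusing the hypothetical `Char` sketch from above, typed wrappers might look like this (the names `chr_typed` and `ord_typed` are made up for the example):

```python
def chr_typed(i: int) -> Char:
    # Wraps the built-in chr, narrowing the return type to a single character.
    return Char(chr(i))


def ord_typed(c: Char) -> int:
    # Mirrors the built-in ord with a stricter parameter type.
    return ord(c)


print(chr_typed(65), ord_typed(Char("A")))
#> A 65
```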
I think this would be a great addition to the language.