Edit(Jul 16th, 2023)
I was too hasty in concluding that "native coroutine in Python is merely an interface". I admit that the expression was too vague and did not say much about the native coroutine concept. I would like to investigate native coroutines in detail in the next post.
Initial Thinking: So generators are iterators but what about coroutines?
In the last post we covered how a generator behaves as an iterator. At the very end of that post, we said we would talk about how the interface of a generator connects to the concept of a “simple” coroutine object in Python.
As we have already seen, PEP 342 extends the generator concept into a coroutine one. If you read the details, you can see that the yield keyword is now treated as an expression (that is, it can itself be used as an r-value).
So rather than
yield foo
we now can see it as
bar = (yield foo)
There is also an important method to introduce, namely send(). These two, the yield expression and send(), are the two main elements that extend the functionality of generators into coroutines in Python.
However, before we deal with send() directly, let us review how __next__() works inside a generator.
Review: how __next__() works
See this simple example code below:
def example():
    print("### start ###")
    a = yield 1
    print(2)
    print("a:", a)
    b = yield 3
    print(4)
    print("b: ", b)
    print("### end ###")
if __name__ == "__main__":
    gen = example()
    print("first call start")
    print("first call: ", next(gen))
    print("second call start")
    print("second call: ", next(gen))
    print("third call start")
    print("third call: ", next(gen))
If you run the code above, you will get a result like this:
first call start
### start ###
first call: 1
second call start
2
a: None
second call: 3
third call start
4
b: None
### end ###
Traceback (most recent call last):
  File "<my_python_file_path>", line 17, in <module>
    print("third call: ", next(gen))
StopIteration
So you can see that each __next__() call returns as soon as the generator function example yields, and the generator stays paused at that yield until the next call. For convenience, I visualize this situation as follows:
- At a = yield 1: stop moving to the next line and wait for the next __next__() call, so that second call start is printed earlier than 2 and a: None
- At b = yield 3: also stop here for the next __next__() call, so that third call start is printed earlier than 4 and b: None
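The StopIteration traceback in the output above is what a for loop normally absorbs for us. As a minimal sketch (the helper name drain is made up for illustration, and the print calls of the original example are left out), the following drives the same kind of generator by hand and stops cleanly once it is exhausted:

```python
def example():
    a = yield 1
    b = yield 3


def drain(gen):
    """Call next() until the generator is exhausted, collecting yielded values."""
    values = []
    while True:
        try:
            values.append(next(gen))
        except StopIteration:
            # The generator body has finished; a for loop catches this silently.
            break
    return values


manual = drain(example())
automatic = list(example())  # list() and for loops handle StopIteration themselves
print(manual, automatic)
```

Both approaches collect the same yielded values; the only difference is who catches StopIteration.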
In the next sections, we will see what the send() method and the yield expression have to do with __next__().
yield expression and send()
You might have noticed that the variables a and b are None in the code above. That was intentional, since we wanted to show how the yield expression behaves with __next__(). Still, it looks quite odd that both a and b are assigned None.
As a matter of fact, __next__() is exactly the same as send(None) according to PEP 342 (you can also confirm this from this CPython code line). So next(gen) in our code is actually gen.send(None), and this None becomes the value of the yield expressions yield 1 and yield 3, which is then assigned to a and b respectively. This is how the Python team extended generators into coroutines: generators are now merely a special case of coroutines.
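To check the equivalence of next(gen) and gen.send(None) ourselves, here is a small sketch. The names recorder and drive are hypothetical helpers introduced only for this comparison; the generator mirrors the two yield expressions of the example above.

```python
def recorder(log):
    a = yield 1
    log.append(a)  # remember what the first yield expression evaluated to
    b = yield 3
    log.append(b)  # remember what the second yield expression evaluated to


def drive(step):
    """Advance a fresh generator three times using the given stepping function."""
    log = []
    gen = recorder(log)
    yielded = [step(gen), step(gen)]
    try:
        step(gen)  # the third step runs the generator to its end
    except StopIteration:
        pass
    return yielded, log


via_next = drive(next)
via_send = drive(lambda g: g.send(None))
print(via_next)
print(via_send)
```

Both drivers see the same yielded values, and in both cases the yield expressions evaluate to None inside the generator.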
But then how does this send() method behave such that next() = send(None) makes sense?
According to the documentation of send(), it “resumes the execution” and passes its argument to the yield expression; it then returns the next value yielded by the generator.
Consider the following code:
def example_coroutine():
    yielded_value = "yielded"
    sended_value = yield yielded_value
    print(sended_value)
if __name__ == "__main__":
    coro = example_coroutine()
    print(coro.send(None))  # same as next(coro)
    print("before send…")
    coro.send("sended")
then the result will be:
yielded
before send…
sended
# omitted: Traceback and StopIteration Exception
Here send("sended") actually “resumes” the control flow that halted at sended_value = yield yielded_value. That is why the "before send…" message is printed earlier than "sended". The string "sended" is then assigned to sended_value and gets printed inside the example_coroutine coroutine body.
However, you’ll also notice that we first called coro.send(None). If we pass anything other than None on that first call, the Python interpreter raises a TypeError:
can't send non-None value to a just-started generator
This is by design. As you have seen, calling send() “resumes” the code and passes the argument of send() to a yield expression. But since a coroutine function (= generator function) starts without such a halted yield expression, the argument of the very first send() would have nowhere to go and would be wasted. So Python requires us to start the coroutine with send(None) first (see this CPython code: if you call send(<not_none_value>), you’ll get exactly the same error message as the one written in the source code).
We can visualize the process above as follows (now doesn’t it make sense to put the arrows right below the = operators?). Note that the code is slightly augmented for a more detailed explanation:
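As a rough illustration of the "just-started" state, the sketch below uses inspect.getgeneratorstate from the standard library to look at the generator before and after the bootstrap call. The variable names are my own, and the behavior shown is what I would expect on CPython.

```python
import inspect


def example_coroutine():
    sended_value = yield "yielded"
    print(sended_value)


coro = example_coroutine()
state_before = inspect.getgeneratorstate(coro)  # not started: no yield is waiting

try:
    coro.send("sended")  # there is no paused yield expression to receive this
except TypeError as exc:
    error_message = str(exc)

first = coro.send(None)  # bootstrap: run up to the first yield
state_after = inspect.getgeneratorstate(coro)  # now paused at a yield

print(state_before, "->", state_after)
print(error_message)
```

Once the generator is suspended at a yield, a non-None send() has an expression to land in, which is why only the first call is restricted.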
(1) We send None first to bootstrap our coroutine
(2) yielded_value goes out to the main routine (i.e. the function that called send), and the coroutine stops here
(3) The value "yielded" gets printed
(4) Now we send any non-None value: here the string "sended" is sent to the coroutine
(5) Now the control flow resumes from where it paused earlier, at (2). The value "sended" is assigned to sended_value and the remaining lines get executed until we meet another yield expression, yield yielded_value2
(6) Then yielded_value2 goes out to the main routine
(7) The string "yielded_value2" gets printed, and the remaining process continues until the coroutine raises the StopIteration exception, which means it has no more values to yield
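The steps above can be sketched in code as follows. This is my reconstruction of the augmented example, assuming that yielded_value2 simply holds the string "yielded_value2"; the comments map each line to the step numbers in the list.

```python
def example_coroutine():
    yielded_value = "yielded"
    sended_value = yield yielded_value  # (2) pause here; (5) resume here
    print(sended_value)
    yielded_value2 = "yielded_value2"  # assumed value, for illustration only
    yield yielded_value2  # (6) hand the second value to the main routine


coro = example_coroutine()
first = coro.send(None)  # (1) bootstrap the coroutine
print(first)  # (3) prints "yielded"
second = coro.send("sended")  # (4) send a non-None value
print(second)  # (7) prints "yielded_value2"

try:
    coro.send(None)  # nothing left to yield
    finished = False
except StopIteration:
    finished = True
```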
Wow, a bit confusing, but that is the way it is!
For your interest, if we tweak the code from the previous section like this, replacing next() with send():
def example():
    print("### start ###")
    a = yield 1
    print(2)
    print("a:", a)
    b = yield 3
    print(4)
    print("b: ", b)
    print("### end ###")
if __name__ == "__main__":
    co = example()
    print("coroutine init: ", co.send(None))
    print("first call start")
    print("first call: ", co.send(2.5))
    print("second call start")
    print("second call: ", co.send(4.5))
then the result will be:
### start ###
coroutine init: 1
first call start
2
a: 2.5
first call: 3
second call start
4
b: 4.5
### end ###
# Traceback messages omitted
The Road to Async
Now the word “coroutine” in the Python glossary makes sense: we can send data to it and get a value back at several “points” (= send method calls).
But we are not done yet. Please note that I called this “generator as a coroutine” a “simple” coroutine, and the official docs still mention coroutines in the context of the async APIs. Hence there seems to be another gap between a simple coroutine and a “native” coroutine (one with async … await …). Our coroutine is not a real “coroutine” yet, at least in a modern Python context.
Let’s read these sentences from the documentation explaining the yield expression:
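A classic way to see this two-way exchange is a running average, where each send() both feeds a number in and gets the updated average back. The following is only an illustrative sketch; the name running_average and the sample numbers are my own.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        # Hand the current average out, then wait for the next number to come in.
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)  # bootstrap: advance to the first yield (same as avg.send(None))
results = [avg.send(10), avg.send(20), avg.send(60)]
print(results)  # [10.0, 15.0, 30.0]
```

The local variables total and count survive between calls because the generator frame is kept alive while it is suspended.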
All of this makes generator functions quite similar to coroutines; they yield multiple times, they have more than one entry point and their execution can be suspended. The only difference is that a generator function cannot control where the execution should continue after it yields; the control is always transferred to the generator’s caller.
So according to the documentation, it sounds like native coroutines can control where execution should continue after they yield control to other routines (= functions).
But how? The await expression (or equivalently, __await__()) works internally such that a coroutine object can resume its control flow after await (expression) has finished its own execution. But how does that mean await gets rid of the need to rely on the caller’s control?
Personally, I was surprised at this point: if you read the docs on coroutine objects, there is barely any technical explanation of how control flow is yielded. Even PEP 492 only explains what a coroutine can do, not how it does it. Whereas the docs on yield describe very specific behavior (retaining the call stack and pausing), the docs on await don't provide many technical details.
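One way to start poking at this question is to drive a native coroutine by hand, without any event loop. The sketch below is an experiment and not a description of how asyncio works: Pause is a hypothetical awaitable whose __await__() is written as a generator, and the strings are arbitrary.

```python
class Pause:
    """Hypothetical awaitable: __await__ is a generator that yields once."""

    def __await__(self):
        received = yield "paused"  # suspends the whole await chain
        return received  # becomes the value of `await Pause()`


async def native():
    got = await Pause()
    return got


coro = native()
first = coro.send(None)  # native coroutine objects expose send() too

try:
    coro.send("resumed")  # resume; the coroutine then runs to completion
except StopIteration as stop:
    result = stop.value  # the coroutine's return value rides on StopIteration

print(first, result)
```

In this hand-driven setup, the suspension inside __await__() surfaces at the code that called send(), just as it did with the generator-based coroutine earlier.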
Hence, our next story must start with demystifying native coroutines.
Conclusion
Historically, the coroutine concept in Python was first implemented on top of generators, and since Python 3.5 the language has had its own notion of a coroutine (the native coroutine). However, it is still concealed from us; we need to figure out what it is in order to really grasp how async logic in Python works.