A little under a year ago, I wrote this IPython extension that enables autoawait for Twisted Deferreds in Jupyter notebooks.
But last Thursday, I attended a Virtual Lightning Talk Night hosted by the NYC Python Meetup group and, while I was planning on just chilling, I ended up giving a talk on this project based on slides I made months ago and hadn't looked at in the interim. It went ok.
The organizers suggested that I turn these slides into a blog post. Funnily enough, I hadn't actually considered this until now - after all, the project itself is fairly well documented and on GitHub, right? But they're probably right that more people will read a blog post than technical documentation. Plus, this is an opportunity to do a deep dive! That, and they said it's a good way to get noticed if you're trying to find work (at the risk of being a little grifty: hire me). So here we go!
To set the scene, I want to talk about three things I'm a big fan of: Twisted, async/await, and Jupyter. If you already know about these things, feel free to skip ahead.
Twisted (and a Primer on Asynchronous IO)
First of all, Twisted is great.
Twisted, for those unfamiliar, is an asynchronous networking library for Python. Most networking in Python is blocking - that is, when you make a web request with the requests library, for example, Python stops doing, well, anything at all until the request is complete. Twisted isn't quite the async library du jour these days - it was first released in 2002, Python core now ships with its own framework (called asyncio), and within the community the new hotness is considered to be Trio. But it has stood the test of time: it was very well-architected from the start, has gotten consistent updates over the years, and on top of that has great maintainers. I submitted a patch to them last year and it was an all-around positive experience.
For an example, let's take a PowerShell project of mine that searches the Merriam-Webster thesaurus API for approved PowerShell verbs, and port its CLI script to Python. In this script, we make an API call to an Azure Function (Microsoft's answer to AWS Lambda), parse the response as JSON, and print the results to the screen:
#!/usr/bin/env python3

import sys

import requests

if len(sys.argv) < 2:
    raise Exception('Need a thing to search for!')

query = sys.argv[1]

# Hanging out, just doing Python things - in this case printing output
print(f'Searching the Verbtionary for {query}...')
print()

# Making a blocking web request - this is an Azure Function that doesn't get
# called very often so due to cold start it will take a few seconds to
# complete...
res = requests.get(
    f'https://verbtionary.azurewebsites.net/api/search?query={query}'
)

# OK, our request is done now - but notice that we didn't /do/ anything during
# that time other than making the web request. No printing, no multitasking,
# just... hanging out.

# But now that this is done, we can do really simple error handling and
# response parsing...
res.raise_for_status()
payload = res.json()

# ...and print out our results!
print('Verb\tAlias Prefix\tGroup\tDescription')
print('----\t------------\t-----\t-----------')

for result in payload['Body']:
    print('\t'.join([
        result['Verb'],
        result['AliasPrefix'],
        result['Group'],
        result['Description']
    ]))
Actually running it looks something like this:
(base) [josh@starmie ~]$ ./verbtionary.py edit
Searching the Verbtionary for edit...
Verb Alias Prefix Group Description
---- ------------ ----- -----------
Edit ed Data Modifies existing data by adding or removing content
Send sd Communications Delivers information to a destination
Stop sp Lifecycle Discontinues an activity
Write wr Communications Adds information to a target
In this case my test word, edit, is in fact an approved verb. Great!
By default, all IO in Python - whether it's network requests like these, database calls, or serving APIs - is blocking. When you serve HTTP off Flask, each process can only handle one request at a time.
This isn't as bad as it sounds - after all, there are plenty of performant APIs written in Flask. For webapps, people tend to use a WSGI server like uWSGI or gunicorn, which typically use a pre-fork model: a pool of worker processes is created up front, and each worker handles one request at a time. For making multiple API calls at the same time (usually due to the n+1 problem) I've often reached for ThreadPoolExecutors, as sketched below. The common thread 😏 here is using processes or threads to run multiple blocking things in parallel.
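As a sketch of the thread pool approach - the queries here are arbitrary, but it's the same API from above:

from concurrent.futures import ThreadPoolExecutor

import requests

queries = ['edit', 'send', 'stop']

def search(query):
    # Each call still blocks - but each one blocks its own thread, so the
    # requests happen in parallel. Since the threads spend nearly all of
    # their time waiting on the network, the GIL isn't a problem here.
    res = requests.get(
        f'https://verbtionary.azurewebsites.net/api/search?query={query}'
    )
    res.raise_for_status()
    return res.json()

with ThreadPoolExecutor(max_workers=3) as executor:
    # executor.map fans the queries out to the pool and yields the results
    # back in order
    for payload in executor.map(search, queries):
        print(len(payload['Body']), 'results')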
Of course, there are drawbacks to both of these approaches. In the case of processes, creating a new process takes time! A very, very casual benchmark shows that Python takes around 20 milliseconds just to start up and do not a whole lot:
(base) [josh@starmie ~]$ time python -c ""
real 0m0.019s
user 0m0.018s
sys 0m0.001s
Threads also have a startup cost, though they're more lightweight than a full-on process. However, Python also has what's called a global interpreter lock (the GIL), which means that even in the best case only one thread can run actual Python at any given time. If you're doing network IO this is often acceptable, and a computing library like numpy can punt on the problem by doing its parallelization in C-land. In addition, threads can be tough to reason about, because any two threads have access to the same memory and are arbitrarily taking and releasing the GIL. Because of this, using threads requires abstractions such as locks, queues, semaphores and, if you're lucky, executors.
An alternative to both of these concurrency models is what's called evented IO. Most operating systems these days have a mechanism for monitoring file descriptors and handing updates to them as events to a given process - for example, epoll on Linux, kqueue on the BSDs and IOCP on Windows. Evented IO frameworks such as Twisted connect to these APIs in order to do their IO.
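Python's standard library exposes these mechanisms through the selectors module, and seeing them raw makes it clearer what frameworks like Twisted are building on. Here's a minimal sketch, using example.com as a stand-in server:

import selectors
import socket

# DefaultSelector picks the best mechanism available on the platform -
# epoll on Linux, kqueue on the BSDs, and so on.
sel = selectors.DefaultSelector()

sock = socket.create_connection(('example.com', 80))
sock.setblocking(False)
sock.send(b'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n')

# Ask the OS to tell us when the socket has data for us...
sel.register(sock, selectors.EVENT_READ)

# ...and wait for that event. A real event loop would do other work here
# instead of blocking, then dispatch each event to a callback.
for key, events in sel.select():
    print(key.fileobj.recv(1024))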
Why is this a good idea? With this system, a given process still only has one Python interpreter and in fact only one thread, but it's also free to do other things while the OS manages IO. This means that concurrency within a process is now casually possible, all without the overhead of multiple processes or threads.
For an example, this is what that script would look like using Twisted and treq instead of requests. It's going to look a lot more complicated - and it is! But rest assured that this is a low level interface and that we'll be showing a much-easier-to-read version of this next:
#!/usr/bin/env python3

import sys

import treq

# Twisted uses a special object called a reactor to manage interacting with
# the OS's event system and runs what is called an event loop.
from twisted.internet import reactor

# We use this little helper to run our code and shut off the event loop
# when everything is done.
from twisted.internet.task import react


# When doing evented IO, we create callback functions, which get called by
# our framework when IO completes. Here we have two non-blocking IO calls -
# one to initiate a request, and one to pull down the body of the response.

# This function makes our initial request...
def makeRequest(query):
    print(f'Searching the Verbtionary for {query}...')
    print()

    # This method call returns an object called a Deferred - we'll come back
    # to how to use this later!
    return treq.get(
        f'https://verbtionary.azurewebsites.net/api/search?query={query}'
    )


# ...this one does simple error handling and loads the JSON response...
def handleResponse(res):
    if res.code != 200:
        raise Exception(f'Status {res.code}!')

    # This *also* returns a Deferred - this is because we can handle the
    # status code before we even read in the response body!
    return res.json()


# ...and this one prints our results!
def printResults(payload):
    print('Verb\tAlias Prefix\tGroup\tDescription')
    print('----\t------------\t-----\t-----------')

    for result in payload['Body']:
        print('\t'.join([
            result['Verb'],
            result['AliasPrefix'],
            result['Group'],
            result['Description']
        ]))


# The Deferred objects allow us to register our callbacks with Twisted to be
# called when their associated IO is complete.
def main(reactor, *argv):
    if len(argv) < 2:
        raise Exception('Need a thing to search for!')

    query = argv[1]

    return makeRequest(query).addCallback(handleResponse).addCallback(printResults)


# So here's the COOL PART: While all that is happening, we can now multitask!
# Here, I use the callLater API to print stuff
reactor.callLater(0.1, print, 'Hanging out!')
reactor.callLater(0.5, print, 'Partying hard!')

# Finally, we tell the reactor to run everything - it'll keep running until
# our API request is done and everything's printed to the screen.
react(main, sys.argv)
In this code, we import Twisted's reactor - this is what's called an event loop. An event loop is in charge of sending asynchronous IO requests to the OS and handling completion events as they occur. Then, we create a bunch of callback functions. These are known as callbacks because the event loop calls them when asynchronous events need to be handled. These functions do all of the stuff that we did in the other script, but break it up into chunks that do some lightweight synchronous things (like print to the screen or parse JSON) before kicking off some async IO and handing control back to the reactor. Finally, we wire them together with the addCallback method on a special object called a Deferred and kick off our task using Twisted's react helper.
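If Deferreds are new to you, here's the mechanism in isolation, with no IO involved at all:

from twisted.internet.defer import Deferred

d = Deferred()

# Callbacks are chained - each one receives the return value of the
# callback before it.
d.addCallback(lambda value: value + 1)
d.addCallback(print)

# Firing the Deferred runs the whole chain. In real code, Twisted fires
# it for you when the associated IO completes.
d.callback(41)  # prints 42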
All of this wiring looks like a bunch of hullabaloo, but it enables something really cool: While treq is calling my super slow Azure Function, we can do other stuff! In this case, I just tell the reactor to call the print function at 0.1 seconds and 0.5 seconds into the future, respectively:
reactor.callLater(0.1, print, 'Hanging out!')
reactor.callLater(0.5, print, 'Partying hard!')
Here's our new script in action:
(korbenware) [josh@starmie ~]$ ./verbtionary-twisted.py edit
Searching the Verbtionary for edit...
Hanging out!
Partying hard!
Verb Alias Prefix Group Description
---- ------------ ----- -----------
Edit ed Data Modifies existing data by adding or removing content
Send sd Communications Delivers information to a destination
Stop sp Lifecycle Discontinues an activity
Write wr Communications Adds information to a target
As you can see, it still works, but now we're able to tell the user about how we're hanging out and partying hard.
Of course, printing to the screen isn't very useful, but we can do all sorts of network-related stuff - we could make more requests, or host a webserver, or send and receive emails, or even idle IRC. There are many possibilities!
async/await in Python
As I mentioned previously, this all looks super complicated. Luckily, Python has special keywords that make this much easier, namely async def and await (collectively referred to as async/await). These keywords let us define coroutines, which can be thought of as special functions that "unpack" objects called awaitables - things like our Deferreds in Twisted. Awaitables work with many async frameworks, including asyncio and Trio, and as of 2016 they also work with Twisted.
An example should make this more clear. Here's our code refactored to use async/await, and I think you'll be as relieved as I am:
#!/usr/bin/env python3

import sys

# This is just like we did before...
import treq
from twisted.internet import reactor
from twisted.internet.task import react

# To make this work, we have to import a handy function that runs coroutines
# for us called ensureDeferred, which takes a coroutine and returns a Deferred
# that we can then use just like before!
from twisted.internet.defer import ensureDeferred


# Now everything is in one tidy function! Note that where we
# needed callbacks before, we now simply use the "await" keyword.
# Much nicer!
async def searchVerbtionary(argv):
    if len(argv) < 2:
        raise Exception('Need a thing to search for!')

    query = argv[1]

    print(f'Searching the Verbtionary for {query}...')
    print()

    res = await treq.get(
        f'https://verbtionary.azurewebsites.net/api/search?query={query}'
    )

    if res.code != 200:
        raise Exception(f'Status {res.code}!')

    payload = await res.json()

    print('Verb\tAlias Prefix\tGroup\tDescription')
    print('----\t------------\t-----\t-----------')

    for result in payload['Body']:
        print('\t'.join([
            result['Verb'],
            result['AliasPrefix'],
            result['Group'],
            result['Description']
        ]))


def main(reactor, *argv):
    # This is where we use ensureDeferred - like in the previous
    # example, the react function needs to be handed a Deferred, not a
    # coroutine, so we make the conversion here.
    return ensureDeferred(searchVerbtionary(argv))


# We're still using the callLater API to do things asynchronously here!
reactor.callLater(0.1, print, 'Hanging out!')
reactor.callLater(0.5, print, 'Partying hard!')

react(main, sys.argv)
Much better! - and if we let 'er rip:
(korbenware) [josh@starmie ~]$ ./verbtionary-asyncawait.py edit
Searching the Verbtionary for edit...
Hanging out!
Partying hard!
Verb Alias Prefix Group Description
---- ------------ ----- -----------
Edit ed Data Modifies existing data by adding or removing content
Send sd Communications Delivers information to a destination
Stop sp Lifecycle Discontinues an activity
Write wr Communications Adds information to a target
The behavior is exactly the same as before, but now, instead of having to manually define callbacks and wire all the Deferreds together with the addCallback API, we can write all of our code in one tidy coroutine and run it when we're ready. async/await turns out to be really convenient for this reason, and I almost always want to use it instead of the Deferred API directly.
Jupyter
Jupyter is what's called a notebook: a literate programming environment that can take the place of a REPL. Jupyter is very popular in the data science space, but I use notebooks often even when I'm not doing data science. For instance, I recently edited a video using MoviePy and Jupyter, and it looked like this:
In this picture, the grey box is called a "cell", and you can edit Python code inside of it. You can then hit shift-enter, and it'll run the code and, for libraries that support it, output rich content - in this example, an embedded video.
The very video I edited is about implementing a Jupyter kernel and includes a demo of a (non-Python) notebook, and I encourage you to check it out if this interests you.
autoawait in Jupyter
Since the 7.0 release of IPython, Jupyter has supported a very cool feature called "autoawait".
Matthias's post goes into detail on what this is and how to use it, but here's the TL;DR: if a cell in a Jupyter notebook contains the await keyword, IPython will attempt to run it in an appropriate event loop. For example, here's a snippet from a notebook running some asyncio coroutines:
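The cell looks something like this (a rough reconstruction - the exact strings don't matter):

import asyncio

print('Going to sleep...')

# autoawait lets us use await at the top level of a cell - IPython wraps
# the cell in a coroutine and runs it on the asyncio event loop
await asyncio.sleep(1)

print('Awake!')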
In this example, I print something, sleep for a second, and then print something again. Because autoawait is configured to work with asyncio, Jupyter transparently figures out how to run the code in the cell as though it's a coroutine. Out of the box, this works with asyncio and can be configured to work with Curio and Trio by using a so-called magic command called %autoawait - for example, %autoawait trio to use it with Trio.
How does this work?
First, when you run a cell, IPython detects the use of the await keyword in your code. Next, it looks up a function called a "loop_runner" that's registered inside IPython for each of the supported backends. Then, it rewrites the cell into a coroutine and passes it to the loop_runner, which does the equivalent of asyncio's run_until_complete - a little like task.react. Finally, the output of this coroutine is returned to the user.
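As a simplified sketch - not the actual IPython source - the asyncio loop_runner boils down to something like this, where _cell stands in for the rewritten cell:

import asyncio

def asyncio_loop_runner(coro):
    # A loop_runner takes a coroutine and synchronously runs it to
    # completion, handing back its return value
    return asyncio.get_event_loop().run_until_complete(coro)

# IPython rewrites a cell containing await into (roughly) a coroutine
# like this one and passes it to the registered runner:
async def _cell():
    await asyncio.sleep(1)
    return 'done'

print(asyncio_loop_runner(_cell()))  # prints 'done'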
The source code most relevant to this is in the IPython.core.async_helpers and IPython.core.interactiveshell modules, and it's earnestly a fun read - I encourage you to take a look.
Limitations to IPython's Approach to Autoawait
Unfortunately, this approach - while requiring minimal refactoring of the IPython codebase - has a few limitations.
First of all, running the loop itself - that is, calling task.react (or reactor.run) - is blocking. That means that nothing can happen inside of Jupyter while the autoawaited cell is running. This puts a real damper on our fun - after all, the point of async IO is that we can do more than one thing at a time!
Second, this approach starts the event loop, runs the coroutine, and then stops the event loop every time an autoawaited cell is evaluated. This not only makes it even harder to do async IO in the background - it also means that a naive implementation of autoawait in Twisted would only be able to run a cell once! This is because Twisted's reactor is designed to start and stop exactly once - it will yell at you very loudly, with a ReactorNotRestartable error, if you try to start it a second time.
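You can see this for yourself in a plain Python session:

from twisted.internet import reactor

# Stop the reactor as soon as it starts, so that run() returns promptly
reactor.callWhenRunning(reactor.stop)
reactor.run()

# Trying to start it a second time raises
# twisted.internet.error.ReactorNotRestartable
reactor.run()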
Unlike the Curio and Trio loop_runners, using asyncio in Jupyter doesn't have these limitations. That's because ipykernel already has a running asyncio loop, and IPython special-cases autoawait to run asyncio coroutines on it as normal rather than using the provided loop_runner. In the future, it's possible that someone will refactor IPython and ipykernel to let arbitrary event loops hook into this machinery, but for today it's not a generally available API.
How Can We Use Twisted with Jupyter?
So IPython doesn't support Twisted in its autoawait implementation. What are we to do?
One approach is to not use autoawait at all. Twisted can optionally use the asyncio event loop that's already running in Jupyter instead of using its own event loop implementation. This will work just fine - no magics needed! - and our services will happily run in the background. In fact, Glyph and Moshe have used this approach in at least one workshop.
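The key move there is to install the asyncio reactor before anything imports the default one. In a notebook, that means something like this in the very first cell (a sketch, assuming the kernel's loop is the current event loop):

import asyncio

# Installing a reactor is one-time-only, so this has to happen before any
# other import of twisted.internet.reactor
from twisted.internet import asyncioreactor
asyncioreactor.install(asyncio.get_event_loop())

# From here on, Twisted schedules all of its IO on the asyncio event loop
# that ipykernel is already running
from twisted.internet import reactor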
However, we do miss out on that sugar - any time we want to use async/await, we need to define a coroutine with async def and then manually schedule it using ensureDeferred, as in our Verbtionary example. We also lose out on a surprising feature of autoawait: the output of async actions staying with their respective cells.
When Python prints output to the screen, Jupyter doesn't have a good way of knowing from which cell that output originated, beyond knowing that a cell is running in the first place. That means that if you create an object that prints asynchronously in the background:
from twisted.internet import reactor

class NoiseMaker:
    tick = 1
    _running = False

    def _loop(self):
        if self._running:
            print('LOUD NOISES!')
            # Schedule the next tick - the reactor will call us again in
            # self.tick seconds
            reactor.callLater(self.tick, self._loop)

    def start(self):
        self._running = True
        self._loop()

    def stop(self):
        self._running = False

noisemaker = NoiseMaker()
noisemaker.start()
It'll print stuff to the output of that cell until you run the contents of another cell:
This isn't harmful per se. There also isn't an obvious way to fix it without making Jupyter aware of a given coroutine via the autoawait mechanism. If you write things that run in the background that don't print to the screen then you won't have this problem, and if you write things that finish printing within a reasonable amount of time then your output will probably go to the right cell for interactive use. It's workable.
This screenshot comes from a notebook I wrote that demonstrates this more fully, which you can find in my GitHub repo.
Using Crochet to Implement Autoawait
Let's say that we're unhappy with the status quo, and we're dead set on implementing autoawait for Twisted. How can we accomplish this?
The trick is to run Twisted's event loop in a thread. In this strategy, our Twisted loop_runner sends our code to this thread, which then runs it with the event loop. It then waits in the main thread for the output to return, at which point it prints to the screen. This means that Twisted's reactor can run continuously, while allowing us to conform to the loop_runner interface.
For this, we can use a library called Crochet, which does exactly this for us and exposes two decorators to help us out: @wait_for and @run_in_reactor. @wait_for runs a wrapped Deferred-returning function in the Twisted thread and blocks the main thread until it's done. It takes a timeout parameter; if Twisted takes longer than that, the Deferred is cancelled and the function raises a TimeoutError. @run_in_reactor is similar, but without the blocking behavior of @wait_for, which makes it useful for truly running Twisted code in the background. Instead, it returns an EventualResult object, which can be used to wait for the output at a time when it's convenient to do so.
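Outside of IPython, using these decorators looks something like this (a sketch - the URL and timeouts are arbitrary):

import crochet

# setup() starts the Twisted reactor in a background daemon thread
crochet.setup()

import treq

@crochet.wait_for(timeout=15)
def search(url):
    # This body runs in the reactor thread; wait_for blocks the calling
    # thread until the returned Deferred fires, raising
    # crochet.TimeoutError if it takes longer than 15 seconds
    return treq.get(url)

@crochet.run_in_reactor
def search_in_background(url):
    # Same idea, but the caller gets an EventualResult back immediately
    # instead of blocking
    return treq.get(url)

res = search('https://verbtionary.azurewebsites.net/api/search?query=edit')
eventual = search_in_background(
    'https://verbtionary.azurewebsites.net/api/search?query=edit'
)
res2 = eventual.wait(timeout=15)  # collect the result when convenient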
Given that we know about Crochet, our approach becomes clear. We write an IPython extension, which can be loaded with the %load_ext magic. This extension sets up Crochet - starting the Twisted reactor in a background thread - and registers an appropriate loop_runner that uses @wait_for to pass our coroutines to Twisted. Finally, we register some more magics that let us tune Crochet and make async background calls with @run_in_reactor.
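The heart of the extension is small. Here's a minimal sketch of the idea - not the actual twisted_ipython source, and the function name here is made up:

import crochet

# The reactor now runs in a background thread and never needs restarting
crochet.setup()

from twisted.internet.defer import ensureDeferred

@crochet.wait_for(timeout=60)
def twisted_loop_runner(coro):
    # This runs in the reactor thread: turn the cell's coroutine into a
    # Deferred, and let wait_for block the IPython thread until it fires
    return ensureDeferred(coro)

IPython exposes loop_runner as a configurable on InteractiveShell, which is the hook an extension like this uses to take over when %autoawait twisted is active.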
Using my implementation of this idea looks like this:
In this example, I load the extension and configure autoawait:
%load_ext twisted_ipython
%autoawait twisted
Then I can casually use await with Twisted:
print('Going to sleep...')
await sleep(1)
Shout('I HAVE AWAKENED!')
I've been using this for my Twisted projects for the last year or so, and for me it's worked pretty well. It gives me fully-featured Twisted-capable %autoawait and it lets me run Twisted code without blocking IPython - all very cool!
There are, however, a few downsides.
One is that this requires both Crochet and Twisted as non-optional dependencies. IPython is able to import Curio and Trio lazily in its implementations, and therefore doesn't have to actually ship them - this makes baked-in support much more tenable than an implementation that would require shipping an entire framework someone might not even use.
The other, which I think is more unfortunate, is that any code we want to run in the background that leverages Twisted now needs Crochet's @run_in_reactor decorator - or, more accurately, my extension's %run_in_reactor magic. For me, this is a worthwhile trade-off, since my more typical use case of making interactive requests is better suited to autoawait than my less common use case of spawning servers in the background.
twisted_ipython lives on GitHub and can be installed from PyPI. I hope that this background is interesting and would be stoked if people ended up using my library!
Top comments (2)
Me, four and a half years in the future: THAT’S NOT HOW GUNICORN WORKS!!
Gunicorn works by creating pools of processes or threads that are fed requests through a lightweight queue abstraction. It's performant in part because those workers are forked ahead of time rather than spawned per request.
Otherwise, I feel like this aged well.
This is amazing. Thank you for writing it up! I've done something much more manual with asyncioreactor, but I'm going to have to try this out instead!