What the heck is gevent? (Part 1 of 4)
Overview
gevent is a coroutine-based cooperative multitasking Python framework that relies on monkey patching to make all code cooperative. Gevent draws its lineage from Eve Online, which was implemented using Stackless Python; Stackless was spun off into the greenlet library, which eventlet built on, and eventlet in turn inspired gevent. Just like Eve Online, gevent is full of skullduggery and confounds outsiders.
There’s a lot in there, so let’s break it down.
Coroutine
Coroutines split up programs into blocking and non-blocking work. Frontend developers are familiar with callback hell: each of those callbacks is, in effect, a coroutine.
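As a rough illustration of that callback style, here is a minimal Python sketch. The names (pending, blocking_call, run_loop, and the three step functions) are hypothetical, and a plain queue stands in for a real event loop. Running it prints the three lines that follow.

```python
from collections import deque

pending = deque()   # callbacks waiting for their "blocking" call to finish
events = []         # record of what ran, in order


def blocking_call(callback):
    """Stand-in for blocking I/O: remember what to run once it finishes."""
    pending.append(callback)


def start():
    events.append("doing some work")
    print(events[-1])
    blocking_call(first_done)    # hand control back to the loop


def first_done():
    events.append("done first")
    print(events[-1])
    blocking_call(second_done)


def second_done():
    events.append("done second")
    print(events[-1])


def run_loop():
    """Toy event loop: run callbacks until none are left."""
    while pending:
        pending.popleft()()


start()
run_loop()
```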
doing some work
done first
done second
In the above example, the functions are coroutines: they do some work, then call a blocking function with a callback for what to do when that blocking function finishes. Importantly, coroutines have to explicitly yield control for another coroutine to run, which makes it easier to reason about where we have to be concerned about interleaving (more on this later). Coroutines are not threads! A coroutine is also known as a greenlet in gevent.
Cooperative Multitasking
Cooperative Multitasking is a method used to enable a computer to handle more concurrent work than there are workers (e.g. cores on your computer). Cooperative Multitasking dates back to the dawn of timesharing: Windows 3.1 relied on cooperative multitasking to allow multiple applications to run at the same time on a computer with a single core. The main downside, then and now, is that cooperative systems rely on tasks to, well, cooperate. In a preemptive multitasking system like a modern operating system, the OS understands priorities and how much time each task has run for, and can interrupt a long-running process to allow another to run. No such fairness mechanism exists for a cooperative system.
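To make the fairness point concrete, here is a toy round-robin scheduler. It is only a sketch: Python generators stand in for greenlets, and run_round_robin, polite, and greedy are hypothetical names, not gevent APIs. A task gives up control only at an explicit yield, so the greedy task finishes all of its work in one turn while the others wait.

```python
from collections import deque


def polite(name, log):
    """Does one unit of work, then yields so other tasks can run."""
    for i in range(2):
        log.append(f"{name} {i}")
        yield


def greedy(log):
    """Does all of its work before yielding; nothing can interrupt it."""
    for i in range(3):
        log.append(f"greedy {i}")
    yield


def run_round_robin(tasks):
    """Resume each task in turn until every task has finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run until the task's next yield
        except StopIteration:
            continue            # task finished, drop it
        queue.append(task)      # still has work, go to the back of the line


log = []
run_round_robin([polite("a", log), greedy(log), polite("b", log)])
print(log)
```

The scheduler has no way to pause greedy partway through its loop; a preemptive scheduler could.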
Monkey Patching
Monkey patching is a technique for modifying the behavior of the standard library or third party libraries at runtime. Monkey patching is generally considered a bad practice, as it can lead to hard-to-understand bugs and requires the monkey patches to be kept in sync with the underlying library being patched. In the case of gevent, monkey patching has to be the absolute first thing a process does; otherwise libraries will get a handle to the real, blocking implementations.
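A minimal sketch of the technique, using only the standard library: it swaps time.sleep for a hypothetical fake_sleep that reports the requested duration instead of waiting, then restores the original. This illustrates monkey patching in general, not how gevent implements its patches.

```python
import time

calls = []                      # durations passed to the patched sleep
real_sleep = time.sleep         # keep a handle to the real implementation


def fake_sleep(seconds):
    """Replacement that reports the request instead of blocking."""
    calls.append(seconds)
    print(f"Should have slept {seconds} seconds")


time.sleep = fake_sleep         # the monkey patch: lookups of time.sleep now find fake_sleep
try:
    time.sleep(10)              # returns immediately
finally:
    time.sleep = real_sleep     # undo the patch
```

Code that ran "from time import sleep" before the patch would still hold the real function, which is why the order of patching matters.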
Should have slept 10 seconds
Bringing it all together
Below we have a full example using gevent: we first monkey patch the standard library, which then magically makes time.sleep cooperative. Instead of blocking the thread, it yields control for at least the sleep time. No other code changes were required. gevent avoids callback hell by monkey patching the standard library so that blocking calls automatically become yield points, with work resuming at the yield point once the call completes. A useful analogy is how C# generates async/await code: it effectively breaks the function up every time there's an await.
We’ll run this code using gevent.spawn to simulate concurrent requests.
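The sketch below is one way such an example could look. It assumes gevent is installed; handle_request, log, and the one-second sleep are illustrative choices rather than the original code, and it patches only the time module with monkey.patch_time() to stay small (monkey.patch_all() is the usual way to patch the whole standard library).

```python
from gevent import monkey

monkey.patch_time()   # must run before anything else grabs time.sleep

import time

import gevent

log = []   # messages in the order they were produced


def handle_request(request_id):
    """Pretend request handler: two quick steps, a blocking wait, then a third step."""
    for step in ("first", "second"):
        log.append(f"{request_id}: Doing {step} step")
        print(log[-1])
    time.sleep(1)     # patched: yields to other greenlets instead of blocking
    log.append(f"{request_id}: Doing third step")
    print(log[-1])


# Spawn two greenlets to simulate two concurrent requests, then wait for both.
greenlets = [gevent.spawn(handle_request, i) for i in range(2)]
gevent.joinall(greenlets)
```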
0: Doing first step
0: Doing second step
1: Doing first step
1: Doing second step
0: Doing third step
1: Doing third step
These coroutines are cooperative: notice we never see anything else run between "first step" and "second step" because control was never yielded.
Gevent — the good parts
gevent isn't all bad. Coroutines are substantially lighter weight than threads or processes: we can run tens of thousands of greenlets on a host that could run hundreds of threads. gevent works with almost every third party library with no code changes. Twisted, Tornado, and even asyncio all require using libraries that play nicely with their event loops for any blocking I/O; in contrast, gevent works with requests out of the gate. gevent promises "Add these two lines of code to your project and it's magically higher throughput" and it mostly delivers.
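As a rough check of the lightweight claim, this sketch spawns ten thousand greenlets that each wait briefly. It assumes gevent is installed; the count, the 0.1 second wait, and the worker function are arbitrary choices for illustration, and timings will vary by machine.

```python
import time

import gevent


def worker(n):
    """Wait cooperatively, then return a value so we can count completions."""
    gevent.sleep(0.1)
    return n


start = time.monotonic()
greenlets = [gevent.spawn(worker, n) for n in range(10_000)]
gevent.joinall(greenlets)
elapsed = time.monotonic() - start

completed = sum(1 for g in greenlets if g.successful())
print(f"{completed} greenlets finished in {elapsed:.2f}s")
```

Because the waits overlap, the total time should be far closer to 0.1 seconds than to the 1,000 seconds the same waits would take back to back.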
Problems with Gevent
Unfortunately it’s not all sunshine and rainbows with gevent. gevent can lead to real correctness problems by interleaving code in unexpected ways. If a library is using native code, we can still face blocking. Lastly, while gevent optimizes for increasing throughput, it can lead to real latency problems.
But this post is already getting quite lengthy — so we’ll go over correctness problems and how to avoid them in part 2 and responsiveness issues in part 3.
If you want to play around with this code yourself, you can download this as a Jupyter Notebook!
This is part of a 4-part series. We suggest you also check out Part 2: Correctness, Part 3: Performance, and Part 4: Applying Learnings to Deliver Value to Users.
Lyft is hiring! If you’re interested in improving people’s lives with the world’s best transportation, join us!