How to think about `async`/`await` in Rust
- `async fn` is an inversion of control
- Hand-rolling an explicit state machine
- Explicit state machines mean your caller has control
- Writing state machines with `async fn`
- `await` is a composition operator
- Summary
(This is a section of the lilos intro guide that people seemed to like, so to increase its visibility, I’m lifting it up into its own post and expanding it a bit. I hope this is a useful companion piece to the post on async debugging I posted this morning.)
Some documentation of Rust `async` and `await` has presented them as a seamless alternative to threads. Just sprinkle these keywords through your code and get concurrency that scales better! I think this is very misleading. An `async fn` is a different thing from a normal Rust `fn`, and you need to think about different things to write correct code in each case.
This post presents a different way of looking at `async` that I think is more useful, and less likely to lead to cancellation-related bugs.
`async fn` is an inversion of control
Here is how I think about `fn` vs `async fn`:
- A Rust `fn` is a function that will execute until it decides to stop executing (ignoring things like threads being preempted), or until it’s interrupted by a panic. In particular, its caller gives up control by calling it, and cannot decide to “un-call” it halfway through. (And likewise, if your `fn` calls another `fn`, you give up control to that `fn`, which can decide to enter an infinite loop or `panic!`.)
- A Rust `async fn` is an explicit state machine that you can manipulate and pass around, that happens to be phrased using normal Rust syntax instead of tables and `match` statements. It generates a hidden type implementing the `Future` trait. The code that calls an `async fn` (or uses any `Future`, for that matter) has ultimate control over that `Future`, and can decide when it runs or doesn’t run, and can even discard it before it completes.
This distinction is subtle but very important: an `async fn` represents an inversion of control compared to a normal `fn`.
You’ve probably run into inversion of control as a pattern before – it’s often used in things that get referred to as “frameworks.” Have you written a request handler that gets initialized and invoked by a webserver when appropriate to handle events? Inversion of control. Done almost anything in React? Same deal. The important part for our purposes is that, where normal code gets control of the computer from its caller, here the caller gets control of the code instead.
To illustrate the difference, let’s talk about state machines.
Hand-rolling an explicit state machine
If you wrote an explicit state machine by hand, this distinction would be clear in the code. For instance, here’s a simple one:
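A minimal sketch of such a state machine, under stated assumptions: the pin is simulated by a `Level` held in a shared `Cell` rather than real hardware, and the `Phase` variant names are illustrative. Only the names `State` and `step`, and the convention that `step` returns `true` on completion, come from the surrounding text.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// What the (simulated) pin is currently doing.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
enum Level {
    High,
    Low,
    #[default]
    Tristate,
}

/// Where we are in the sequence.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
enum Phase {
    #[default]
    Begin,
    PinHigh,
    PinLow,
    Done,
}

/// The state machine: current phase plus a handle to the fake pin.
#[derive(Default)]
struct State {
    phase: Phase,
    pin: Rc<Cell<Level>>,
}

impl State {
    /// Advances by one state. Returns `true` once the machine has completed.
    fn step(&mut self) -> bool {
        match self.phase {
            Phase::Begin => {
                self.pin.set(Level::High);
                self.phase = Phase::PinHigh;
                false
            }
            Phase::PinHigh => {
                self.pin.set(Level::Low);
                self.phase = Phase::PinLow;
                false
            }
            Phase::PinLow => {
                self.pin.set(Level::Tristate);
                self.phase = Phase::Done;
                true
            }
            // Stepping a finished machine is harmless and keeps reporting completion.
            Phase::Done => true,
        }
    }
}
```

Nothing in `step` loops or waits: every call does a bounded amount of work and hands control straight back to whoever called it.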
State machines like this are almost universal in embedded systems, whether they’re phrased explicitly or left implicit. Drivers that have a combination of API entry points and interrupt service routines, for instance, form this kind of state machine. This toy version is written to be small enough to pick apart.
Each time the code that owns your `State` calls `step`, your code gets the opportunity to do stuff. At the end of that stuff, it returns, and the calling code regains control. It can then keep calling `step` until it gets `true`, indicating completion; or it could do something else and never call `step` again; or it could `drop` your state. (Note that it can also choose to keep calling `step` even after getting the `true` result! It’s very much in control here.)
How long will the high and low periods on the pin last? Well, how often will the caller call `step`? Sometimes this is defined by a contract (e.g. “this state machine advances every 100 ms”), but in this code example, we haven’t done anything to control timing. The caller could call `step` in a loop and make the high/low periods as short as possible, or it could sleep for months in between calls…or never call `step` again.
What will the final state of the pin we’re controlling be? Currently, we can’t say. The caller could leave us paused forever without calling `step`, or could drop us before we finish. So the final state of the pin could be high, low, or tristate, depending on what the caller chooses. We could make this better-defined by adding a `Drop` impl, so if the caller were to `drop` the `State` before it finishes, the pin would do something predictable:
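One way such a `Drop` impl might look, as a sketch under the same assumptions as before (a simulated pin in a shared `Cell`, illustrative `Phase` names). The type definitions are repeated in compact form so the block compiles on its own.

```rust
use std::cell::Cell;
use std::rc::Rc;

#[derive(Clone, Copy, Debug, Default, PartialEq)]
enum Level {
    High,
    Low,
    #[default]
    Tristate,
}

#[derive(Clone, Copy, Debug, Default, PartialEq)]
enum Phase {
    #[default]
    Begin,
    PinHigh,
    PinLow,
    Done,
}

#[derive(Default)]
struct State {
    phase: Phase,
    pin: Rc<Cell<Level>>,
}

impl State {
    /// Advances by one state. Returns `true` once the machine has completed.
    fn step(&mut self) -> bool {
        let (level, next) = match self.phase {
            Phase::Begin => (Level::High, Phase::PinHigh),
            Phase::PinHigh => (Level::Low, Phase::PinLow),
            Phase::PinLow | Phase::Done => (Level::Tristate, Phase::Done),
        };
        self.pin.set(level);
        self.phase = next;
        next == Phase::Done
    }
}

impl Drop for State {
    fn drop(&mut self) {
        // Only intervene if the caller abandoned us partway through.
        if self.phase != Phase::Done {
            self.pin.set(Level::Tristate);
        }
    }
}
```

With this in place, stepping once (pin high) and then dropping the `State` leaves the pin tristated instead of stuck high.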
But if your caller decides to hang on to `State` and never call `step`, there’s not really anything `State` itself can do about this.
And you want it this way. Really. Keep reading.
Explicit state machines mean your caller has control
That might sound bad, but it’s really powerful. For instance, imagine that your caller looks like this:
let mut state = default;
loop
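To make the shape of such a caller concrete, here is a rough sketch. It makes several assumptions: `State` is reduced to a bare step counter so the block stands alone, `run_on_keys` is a hypothetical name, and key presses come from an iterator rather than a real (blocking) input source.

```rust
/// Reduced stand-in for the pin-toggling `State`: three steps, then done.
#[derive(Default)]
struct State {
    steps_taken: u8,
}

impl State {
    /// Advances by one state. Returns `true` once the machine has completed.
    fn step(&mut self) -> bool {
        if self.steps_taken < 3 {
            self.steps_taken += 1;
        }
        self.steps_taken == 3
    }
}

/// Steps the machine once per key press and reports whether it finished.
/// `keys` is a hypothetical source of key presses; a real one would block
/// until the user actually pressed something.
fn run_on_keys(state: &mut State, keys: impl IntoIterator<Item = char>) -> bool {
    for _key in keys {
        if state.step() {
            return true;
        }
    }
    // The user stopped pressing keys: the machine simply sits where it is.
    false
}
```

Feeding it two key presses leaves the machine parked after its second step, at no cost, until more keys arrive or the caller drops it.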
If we want to `step` every time the user presses a key, then we have to accept the possibility of never `step`-ping – because we can’t force the user to press a key! Being able to create a state machine and have it sit around waiting forever, at very low cost, is part of the power of writing explicit state machines.
Writing state machines with `async fn`
Writing explicit state machines in “long-hand” like this is error-prone and complex. Let’s rewrite the running example as an `async fn`. (The `pending!` macro is from the `futures` crate, and yields to the caller without waiting for any particular event. It contains an `await`.)
async
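A sketch of what the `async fn` version might look like, using only the standard library. Two substitutions are assumptions made here so the block has no external dependencies: `YieldOnce` is a hand-written stand-in for `pending!()`, and the pin is a simulated `Level` in a `Cell`. The `step` helper is also hypothetical; it polls the future once to mirror the hand-rolled `step`.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Clone, Copy, Debug, PartialEq)]
enum Level {
    High,
    Low,
    Tristate,
}

/// Std-only stand-in for `pending!()`: returns `Pending` exactly once.
/// Like `pending!()`, it does not arrange for a wakeup.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// The pin toggler; each `.await` is a point where the caller regains control.
async fn toggle(pin: &Cell<Level>) {
    pin.set(Level::High);
    YieldOnce(false).await;
    pin.set(Level::Low);
    YieldOnce(false).await;
    pin.set(Level::Tristate);
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Plays the role of `step`: polls once, returns `true` on completion.
fn step<F: Future>(fut: Pin<&mut F>) -> bool {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker)).is_ready()
}
```

Pinning the future (for instance with `Box::pin(toggle(&pin))`) and calling `step` on it three times walks the pin through high, low, and tristate, exactly as the hand-rolled machine did.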
That doesn’t reproduce the `Drop` behavior if we’re cancelled. To do this in an `async fn` you need to have something in the body of the function that will perform an action when destroyed. You can roll this by hand, but I recommend the `scopeguard` crate and its `defer!` macro:
async
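The sketch below takes the roll-it-by-hand route rather than `scopeguard`, purely to stay within the standard library; `TristateOnDrop` is a hypothetical guard type playing the role that `defer!` would. As in the other sketches here, `YieldOnce` stands in for `pending!()` and the pin is simulated.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Clone, Copy, Debug, PartialEq)]
enum Level {
    High,
    Low,
    Tristate,
}

/// Std-only stand-in for `pending!()`: returns `Pending` exactly once.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Hand-rolled guard: tristates the pin whenever it is destroyed.
struct TristateOnDrop<'a>(&'a Cell<Level>);

impl Drop for TristateOnDrop<'_> {
    fn drop(&mut self) {
        self.0.set(Level::Tristate);
    }
}

async fn toggle(pin: &Cell<Level>) {
    pin.set(Level::High);
    // From here on, completion *or* cancellation tristates the pin.
    let _guard = TristateOnDrop(pin);
    YieldOnce(false).await;
    pin.set(Level::Low);
    YieldOnce(false).await;
    // `_guard` is dropped here, tristating the pin.
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls once; returns `true` on completion.
fn step<F: Future>(fut: Pin<&mut F>) -> bool {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker)).is_ready()
}
```

The guard lives inside the future's state, so dropping the future after a single `step` (while the pin is high) runs the guard's `drop` and tristates the pin.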
That’s dramatically less code. It’s also much easier to check for correctness:
- You can tell at a glance that there’s no way to return to an earlier state from a later one, since doing so would require a `for`, `loop`, or `while`, and there isn’t one here.
- You can see (once you’ve read the docs for the `defer!` macro) that, as soon as the pin gets set high and before we yield control back, the state machine will ensure that the pin gets tristated at the end, no-matter-what. You don’t have to go hunting for a separate `Drop` impl.
`await` is a composition operator
Often, an application winds up requiring a hierarchy of state machines.
Imagine that you wanted to take the pin-toggling state machine from the previous
section, and ensure that it waits a certain minimum interval between changes. If
the OS provides a “sleep for a certain time period” state machine (as lilos
does) then the easiest way is to plug that into your state machine. Its states
effectively become sub-states within one of your states. This is
composition.
In a hand-rolled state machine, this is hard enough to get right that I’m not going to present a worked example. (Try it if you’re curious!)
But with a state machine expressed using `async fn`, it’s trivial, because we have an operator for it: `await`. `await` is the most common state machine composition operator (though not the only one!). It says, “take this other state machine, and run it to completion as part of my state machine.”
And so, we can add sleeps to our pin-toggler by changing our `pending!()` to instead `await` a reusable sleep-for-a-duration state machine:
async
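As a sketch of that change, here is a version with a toy sleep state machine. Everything about the sleep is an assumption made for illustration: `Sleep` and `sleep_for` run against a simulated millisecond clock held in a `Cell` so the behaviour is deterministic, and this is not the lilos API. The drop guard is left out to keep the example short.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Clone, Copy, Debug, PartialEq)]
enum Level {
    High,
    Low,
    Tristate,
}

/// Toy sleep state machine over a simulated clock (milliseconds).
struct Sleep<'a> {
    clock: &'a Cell<u64>,
    deadline: u64,
}

impl Future for Sleep<'_> {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        // A real timer would register the waker; this toy relies on re-polling.
        if self.clock.get() >= self.deadline {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}

fn sleep_for(clock: &Cell<u64>, ms: u64) -> Sleep<'_> {
    Sleep { clock, deadline: clock.get() + ms }
}

/// The sleep's states become sub-states of ours via `.await`.
async fn toggle(pin: &Cell<Level>, clock: &Cell<u64>) {
    pin.set(Level::High);
    sleep_for(clock, 100).await;
    pin.set(Level::Low);
    sleep_for(clock, 100).await;
    pin.set(Level::Tristate);
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls once; returns `true` on completion.
fn step<F: Future>(fut: Pin<&mut F>) -> bool {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker)).is_ready()
}
```

Stepping this future while the simulated clock is short of the deadline changes nothing on the pin; it only advances once at least 100 ms have elapsed since the previous change.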
This will ensure that a minimum of 100 ms elapses between our changes to the pin. We can’t impose a maximum using this approach, because – as we saw above – our caller could wait months between stepping our state machine, and that’s part of what we’re signing up for by writing this state machine.
Composition and cancellation interact in wonderful ways. Let’s say you’re using `some_state_machine` and you’re suspicious that it might take more than 200 ms. You’d like to impose a timeout on it: it will have 200 ms to make progress, but if it doesn’t complete by the end of that window, it will be cancelled (`drop`-ped).
The easiest way to do this is to use the `select_biased!` macro from the `futures` crate. (It’s called `biased` because it steps the state machines inside it from first to last, and if any complete, all the rest are dropped. This means it’s slightly biased toward completing the earlier ones.)
select_biased!
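To show the idea without pulling in the `futures` crate, the sketch below hand-rolls a two-way biased select and builds a timeout on top of it. This is a swapped-in stand-in for `select_biased!`, not the macro itself: `SelectBiased`, `Either`, `with_timeout`, the body of `some_state_machine`, and the simulated-clock `sleep_for` are all hypothetical.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Toy sleep over a simulated millisecond clock (an assumption, for determinism).
struct Sleep<'a> {
    clock: &'a Cell<u64>,
    deadline: u64,
}

impl Future for Sleep<'_> {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.clock.get() >= self.deadline { Poll::Ready(()) } else { Poll::Pending }
    }
}

fn sleep_for(clock: &Cell<u64>, ms: u64) -> Sleep<'_> {
    Sleep { clock, deadline: clock.get() + ms }
}

enum Either<A, B> {
    First(A),
    Second(B),
}

/// Steps `first`, then `second`; completes with whichever finishes first.
struct SelectBiased<A, B> {
    first: Pin<Box<A>>,
    second: Pin<Box<B>>,
}

impl<A: Future, B: Future> Future for SelectBiased<A, B> {
    type Output = Either<A::Output, B::Output>;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // The bias: `first` always gets the first chance to complete.
        if let Poll::Ready(v) = self.first.as_mut().poll(cx) {
            return Poll::Ready(Either::First(v));
        }
        if let Poll::Ready(v) = self.second.as_mut().poll(cx) {
            return Poll::Ready(Either::Second(v));
        }
        Poll::Pending
    }
}

/// Runs `work` for at most `ms` simulated milliseconds.
async fn with_timeout<F: Future>(work: F, clock: &Cell<u64>, ms: u64) -> Option<F::Output> {
    let race = SelectBiased {
        first: Box::pin(work),
        second: Box::pin(sleep_for(clock, ms)),
    };
    // Once the race resolves, the losing future is dropped, i.e. cancelled.
    match race.await {
        Either::First(out) => Some(out),
        Either::Second(()) => None,
    }
}

/// Hypothetical workload: takes `ms` simulated milliseconds, then yields 42.
async fn some_state_machine(clock: &Cell<u64>, ms: u64) -> u32 {
    sleep_for(clock, ms).await;
    42
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker))
}
```

With a 200 ms budget, a workload that needs 100 ms produces `Some(42)`, while one that needs 300 ms is dropped at the 200 ms mark and the caller gets `None`.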
This is the sort of power we get from the `async fn` ecosystem. Doing this with hand-rolled state machines is probably possible, but would be complex – and we haven’t even talked about borrowing and lifetimes. That’s a bigger topic than will fit in this post, but the short version is: borrowing across `await` points in an `async fn` pretty much Just Does What You’d Expect, but getting it right in a hand-rolled state machine requires `unsafe` and gymnastics.
Summary
From my perspective, this is the fundamental promise of `async fn`: easier, composable, explicit state machines.
If a chunk of code absolutely needs to run to completion without letting anything else run, use a normal `fn`. If a chunk of code doesn’t need to call any `async fn`s, use a normal `fn`. Basically, any function that can be written as a normal `fn` without breaking something, should be. It’s easier.
But if you need to write a state machine, use `async fn`. It’s harder to understand than normal `fn` because of the inversion of control and potential for cancellation, but far easier to understand than the code you might write by hand to do the same thing!
CAUTION: There’s a proposal to make code generic on whether or not it’s being used `async`, so that the same code could produce both a simple function and a `Future`. In this case you’d have to make sure to think about correctness in all possible ways your code could be used. I am suspicious, and I hope after reading this section, you are too.