Composing concurrency in drivers
An example of why I like async for embedded
- I2C: the short version
- The patient: an STM32-derived I2C target
- Strolling down the garden path
- Handling stop conditions
- Checking for errors and other conditions
- Repeated starts
- Wait, why not handle stop conditions in the data loops?
- So what did we just do?
- Summary
- Appendix: the full driver
I recently published an article suggesting a different way of looking at async and await in Rust. In it, I discussed strategies for implementing state machines, and explained why I like async as a tool for building such state machines, even without threads.
In this post I’ll work through an example of why I’m so excited about this technique, by building a real driver for a notoriously tricky bus one piece at a time, using lilos.
I2C: the short version
I2C is a two-wire serial protocol used for connecting chips in a larger circuit. It’s basically ubiquitous in the low-end sensor market, in part because Intel standardized it as SMBus for power management in PCs in the 90s. Unless you’ve printed this article on paper, the device you’re using to read it has at least one – and probably many – I2C buses inside it.
It’s also far more complex than it looks, has a number of design defects that can make it hard to use reliably, and is only “standardized” in the loosest sense of the word. But that’s a topic for another post.
One of the reasons I2C is complex is that it’s not a simple state machine controlled by one party. As a counterpoint, consider asynchronous serial, the thing spoken by “serial ports” or UARTs or whatever. There’s one communication line in each direction, controlled by one party. In the simplest version where there are no flow-control signals, the other party can do whatever they want to their line, and it doesn’t affect your ability to transmit. They definitely can’t do something in the middle of a byte you’re sending that requires you to change your behavior. The transmit state machines in each direction are basically independent, and entirely in the control of one party each.
With I2C, on the other hand, you can totally pop up in the middle of a byte someone else is sending, and do something that requires them to change their behavior. The most common example of this is called “clock stretching,” where a slow device messes with the bus clock to make it switch more slowly so the device can keep up. This is normal in I2C and is mostly handled by hardware.
The state transitions in I2C that are relevant for this post are:
- Imposing a timeout on clock stretching. You probably want to let that slow device be slow, but it might try to be slow forever. If your application has real-time requirements, you can’t wait forever (and almost all embedded applications have real-time requirements).
- Stop conditions. After every byte sent to or received from a target device on the bus, the host can decide it’s done and stop the transaction. This is called a “stop condition.” It can also happen unexpectedly in the middle of things including in the middle of a byte, if the devices you’re working with aren’t strictly spec-compliant – and few are – though this usually gets detected as a bus error, below.
- Bus errors. Certain patterns of signaling on the bus are forbidden by the standard, and most I2C peripherals on microcontrollers detect these and report them as some sort of “bus error.” There’s no great way to handle a bus error in I2C, so the typical approach is to stop talking and wait for a new transaction that hopefully won’t contain errors.
- Arbitration loss. Just to make things more fun, you can have multiple hosts on a bus, and multiple targets may try to respond to the same transaction. This conflict gets broken by a process called arbitration, which relies on an electrical property of the I2C bus (if two drivers try to send a 1 and a 0 simultaneously, the 0 wins). I2C peripherals will tend to detect arbitration failures in hardware by reading back the state of the data line – if it’s 0 when they were trying to make it 1, someone else wins the arbitration.
There’s a lot more that can go wrong with I2C in practice, but that covers the points I’ll be touching on in this post.
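To make the arbitration rule concrete, here is a small host-side simulation of the "0 wins" behavior. It is only a sketch, not driver code: the helper names bus_level and arbitration_lost_at are made up for this example, and it models just two devices sending one byte each.

```rust
/// Level seen on an open-drain line: it reads 0 (`false`) if any device
/// drives 0, and 1 (`true`) only if every device releases it.
fn bus_level(drivers: &[bool]) -> bool {
    drivers.iter().all(|&bit| bit)
}

/// Hypothetical helper: returns the zero-based position (in transmission
/// order) of the bit at which `mine` loses arbitration to `other`, or `None`
/// if `mine` never loses.
fn arbitration_lost_at(mine: u8, other: u8) -> Option<u32> {
    // I2C transmits the most significant bit first.
    for shift in (0..8u32).rev() {
        let my_bit = (mine >> shift) & 1 == 1;
        let their_bit = (other >> shift) & 1 == 1;
        let line = bus_level(&[my_bit, their_bit]);
        // We tried to send a 1 but read back a 0: someone else won.
        if my_bit && !line {
            return Some(7 - shift);
        }
    }
    None
}
```

Note that the device sending the 0 never notices anything: from its point of view the bus did exactly what it asked. Only the device that wanted a 1 can detect the conflict.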
The patient: an STM32-derived I2C target
Most STM32-family microcontrollers produced in the past ten years or so have I2C peripherals that are very similar to one another. The driver that motivated this post was written for the STM32G0 family, but it would work almost identically on most other STM32s.
I’m specifically implementing the target side of I2C (historically, and thoughtlessly, called slave). A microcontroller acting as an I2C target can present whatever interface it wants, including emulating another device. For the purposes of this article I’m going to implement a very, very simple made-up interface.
I’m starting with the target side here because it’s significantly harder than the host side. Targets have to deal with other targets winning arbitration on every bit sent, and all the shenanigans that hosts get up to. Most of these principles apply equally well to an I2C host driver.
I have sanded off some of the arcane pointy corners of the real STM32 driver to avoid having to explain why ST made some design decisions (because I can’t explain them). I’ll present the real driver with all its warts in an appendix.
Strolling down the garden path
Let’s start with I2C’s peaceful garden path: exchanging data with no errors, and without considering either stop conditions or repeated starts. I call this a garden path for two reasons:
- It’s an easy path to walk.
- I2C will totally lead you down the garden path if you let it, which is an English idiom for being deceived.
Most intro-to-I2C tutorials start here. (If you also stop here, you’ll produce a flaky I2C device.)
Using our slightly-simplified STM32 I2C peripheral, we can send data when asked (by a “read” transaction from the host) like this:
async fn send_data(…) -> ! { … }
Here, I2C_IRQ is a Notify, a lilos object for coordinating with interrupt handlers. You can read more about it in the lilos docs, but the short version is: you wait on a Notify by calling Notify::until, passing a closure that checks to see if a specific condition has become true. (This is because spurious wakes are entirely possible – and in fact we’re going to cause some later. Stay tuned.)
It’s most likely declared like this:
static I2C_IRQ: Notify = Notify::new();
…and will be paired with an interrupt service routine that calls Notify::notify on it (omitted here but included in the appendix).
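The idea of waiting on a predicate, rather than on a bare wake, can be sketched with nothing but the standard library. This is not the lilos implementation: until, poll_once, and NoopWake below are hypothetical stand-ins, and unlike a real Notify this version never registers a waker, so it only works because the demo polls it by hand.

```rust
use std::cell::Cell;
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, as an executor would after a wake.
fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}

/// Hypothetical stand-in for a predicate wait: completes only once `cond`
/// returns true, no matter how many times it is polled before that.
async fn until(mut cond: impl FnMut() -> bool) {
    poll_fn(|_cx| if cond() { Poll::Ready(()) } else { Poll::Pending }).await
}

/// Polls the wait three times; the flag is only set before the last poll.
fn demo_spurious_wakes() -> [bool; 3] {
    let flag = Cell::new(false);
    let mut wait = pin!(until(|| flag.get()));
    let first = poll_once(wait.as_mut()).is_ready(); // spurious wake
    let second = poll_once(wait.as_mut()).is_ready(); // spurious wake
    flag.set(true); // the "interrupt" we were actually waiting for
    let third = poll_once(wait.as_mut()).is_ready();
    [first, second, third]
}
```

The first two polls model spurious wakes: the future is polled, re-checks its condition, and goes back to sleep without doing anything.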
You might be surprised that send_data never returns normally (-> !). This is because, as an I2C target, we can’t actually decide when to stop sending data. It’s entirely under control of the host. To reflect that, our send routine is expressed as an infinite loop. (I’ll come back to this later after introducing some more things.)
We can also receive data (during a “write” transaction) using similarly structured code:
async fn receive_data(…) -> ! { … }
Notice that receiving is using the same Notify, I2C_IRQ. You could also choose to have a separate Notify per type of event. However, in this case, using just one will produce smaller code, for reasons I’ll describe later.
receive_data is also an infinite loop. In I2C, a target can actually request that the host stop sending data – by setting a signal called NACK (no-acknowledge) after a byte. However, the host is under no obligation to honor that signal! So, for simplicity, receive_data is infinite too.
We can tie our two little state machines together by wrapping them in another that dispatches based on the direction of the current transaction:
async fn handle_data(…) -> ! { … }
Straightforward – this just checks a flag and chooses which function to call. Like send_data and receive_data, it can’t return normally.
So at this point we have a somewhat functional I2C implementation, but with two significant flaws:
- It can’t stop once a transaction has begun!
- It doesn’t handle errors at any point.
Let’s fix that.
Handling stop conditions
I2C hosts can generate a stop condition at almost any time, and we need to respond to them any time we’re in a transaction. Technically, the standard says they’re only supposed to do it at certain points in the protocol, but in practice, hosts do whatever they want.
Our I2C peripheral detects stop conditions and sets a flag in the status register. It automatically stops any transaction that was ongoing. Our job in the driver is to notice this and go back to listening for new activity.
If this was all we had to do, we could express it as a state machine like this:
async fn stop_condition(…) { … }
That’s not very useful on its own, because it doesn’t do the rest of I2C – the actual data movement bits. Normally, at this point, we’d go through our data-movement state machines above and insert a bunch of checks for stop conditions. This can work, though I’m not sure I’ve ever seen an I2C driver that didn’t forget to check for at least one situation at some point.
We’re going to do something different: we’re going to interleave our data movement state machine with our stop condition monitoring state machine. In this case, I’ll use the select_biased! macro from the futures crate.
Let’s assume we have a lilos task responsible for managing I2C. It would now look like this:
async fn i2c_task(…) -> ! { … }
(The .fuse() bit is a futures-crate requirement; you can ignore it for the time being – I do.)
The select_biased! macro will poll the Future on each of its branches, completing when any one of them completes. In the event that more than one complete simultaneously, it prefers earlier branches to later ones (which is why it’s called biased).
Because the internals of stop_condition and handle_data both register for interrupts, but serve separate subsets of possible I2C interrupts, we want to run them “at the same time,” which is exactly what we’ve achieved here:
- On any I2C interrupt, we will wake up and poll the Future generated by stop_condition. If it has completed (i.e. if we should stop), we’ll stop.
- Next we’ll poll the Future generated by handle_data.
- Then we’ll go back to sleep waiting for more interrupts.
From the perspective of state machines expressed as async fn, select_biased! interleaves the functions. Effectively, we’ve added the check from stop_condition to every await in handle_data, without modifying handle_data.
I think this is really exciting, because it allows for the same separation of concerns that we talk about in software engineering, except applied to state machines. We can express the two state machines – data handling and stop condition checking – in isolation, in ways that are easier to read than if we combined them, and then let the compiler combine them for us.
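Here is a desktop-runnable sketch of that composition, using only the standard library. Everything in it is an assumption made for illustration: FakeBus stands in for the peripheral's registers, until stands in for a predicate wait, and select_biased2 is a hand-rolled two-branch substitute for the futures-crate macro. The point to look for is that receive_data contains no mention of stop conditions at all.

```rust
use std::cell::{Cell, RefCell};
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, as an executor would after a wake.
fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker))
}

/// Made-up peripheral state; a real driver would read hardware registers.
#[derive(Default)]
struct FakeBus {
    stop: Cell<bool>,           // "stop condition detected" flag
    rx_byte: Cell<Option<u8>>,  // byte waiting in the receive register
    received: RefCell<Vec<u8>>, // what the driver has consumed so far
}

/// Hypothetical predicate wait: completes once `cond` returns true.
async fn until(mut cond: impl FnMut() -> bool) {
    poll_fn(|_cx| if cond() { Poll::Ready(()) } else { Poll::Pending }).await
}

/// Data mover: loops forever and knows nothing about stop conditions.
async fn receive_data(bus: &FakeBus) {
    loop {
        until(|| bus.rx_byte.get().is_some()).await;
        if let Some(byte) = bus.rx_byte.take() {
            bus.received.borrow_mut().push(byte);
        }
    }
}

/// Stop watcher: completes once the stop flag is seen, then clears it.
async fn stop_condition(bus: &FakeBus) {
    until(|| bus.stop.get()).await;
    bus.stop.set(false);
}

/// Hand-rolled two-branch biased select: polls `first`, then `second`.
async fn select_biased2(first: impl Future<Output = ()>, second: impl Future<Output = ()>) {
    let mut first = pin!(first);
    let mut second = pin!(second);
    poll_fn(|cx| {
        if first.as_mut().poll(cx).is_ready() {
            return Poll::Ready(());
        }
        second.as_mut().poll(cx)
    })
    .await
}

/// Feeds two bytes and then a stop; reports (bytes, done early?, done after stop?).
fn demo_transaction() -> (Vec<u8>, bool, bool) {
    let bus = FakeBus::default();
    let mut txn = pin!(select_biased2(stop_condition(&bus), receive_data(&bus)));
    let mut done_early = poll_once(txn.as_mut()).is_ready();
    for byte in [0x10, 0x20] {
        bus.rx_byte.set(Some(byte)); // "interrupt": a byte arrived
        done_early |= poll_once(txn.as_mut()).is_ready();
    }
    bus.stop.set(true); // "interrupt": the host sent a stop
    let done_after_stop = poll_once(txn.as_mut()).is_ready();
    let received = bus.received.borrow().clone();
    (received, done_early, done_after_stop)
}
```

When the combined future completes, the infinite receive_data future is simply dropped wherever it happened to be suspended. That is the "cancellation" that lets an infinite loop end.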
To underscore how useful this is, let’s add error checking.
Checking for errors and other conditions
In an I2C target, there are two conditions that should cause your driver to abort the current transaction. I described them above; they are
- Bus error (nonsense twiddling on the bus lines)
- Arbitration lost (someone else is also talking!)
The STM32 I2C peripheral will stop driving the bus in either case, so we just need to make sure our state machine responds correctly. The response is very similar to stop conditions, though we’ll add a bit of monitoring:
static ERROR_COUNT: AtomicU32 = AtomicU32::new(0);
async fn error_condition(…) { … }
We can mingle that into our I2C state machine by adding another branch to select_biased!. In this case I’m going to add it above the stop condition handling, so that if an error occurs at the same time as a stop, we report the error. The whole select_biased! block now reads:
select_biased! { … }
And we’re now handling errors at every await point.
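The effect of branch order can be checked with a small standard-library simulation. As before, this is a sketch under assumptions rather than the real driver: Flags stands in for the status register, select_biased2 is a hand-written substitute for the futures-crate macro, and the error counter is passed in by reference instead of living in a static.

```rust
use std::cell::Cell;
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, as an executor would after a wake.
fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker))
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Error,
    Stopped,
}

/// Made-up status flags standing in for the peripheral's status register.
struct Flags {
    error: Cell<bool>,
    stop: Cell<bool>,
}

/// Hypothetical predicate wait: completes once `cond` returns true.
async fn until(mut cond: impl FnMut() -> bool) {
    poll_fn(|_cx| if cond() { Poll::Ready(()) } else { Poll::Pending }).await
}

/// Completes on an error flag, bumping a counter for later inspection.
async fn error_condition(flags: &Flags, count: &AtomicU32) -> Outcome {
    until(|| flags.error.get()).await;
    count.fetch_add(1, Ordering::Relaxed);
    Outcome::Error
}

async fn stop_condition(flags: &Flags) -> Outcome {
    until(|| flags.stop.get()).await;
    Outcome::Stopped
}

/// Polls `first` before `second`, so the earlier branch wins ties.
async fn select_biased2<T>(first: impl Future<Output = T>, second: impl Future<Output = T>) -> T {
    let mut first = pin!(first);
    let mut second = pin!(second);
    poll_fn(|cx| {
        if let Poll::Ready(value) = first.as_mut().poll(cx) {
            return Poll::Ready(value);
        }
        second.as_mut().poll(cx)
    })
    .await
}

/// Polls the combined future once with the given flags already set.
fn resolve(error: bool, stop: bool) -> (Option<Outcome>, u32) {
    let flags = Flags { error: Cell::new(error), stop: Cell::new(stop) };
    let count = AtomicU32::new(0);
    let mut sel = pin!(select_biased2(error_condition(&flags, &count), stop_condition(&flags)));
    let outcome = match poll_once(sel.as_mut()) {
        Poll::Ready(outcome) => Some(outcome),
        Poll::Pending => None,
    };
    (outcome, count.load(Ordering::Relaxed))
}
```

With both flags set, resolve reports the error and the counter is incremented; with only the stop flag set, it reports the stop and the counter stays at zero. Swapping the two arguments to select_biased2 would flip the tie-break.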
There’s one thing we still need to address, though.
Repeated starts
It’s possible for an I2C host to generate a “repeated start,” or “restart,” without generating a stop condition. This lets it effectively interrupt its own transaction to start a new one.
The STM32 I2C hardware indicates this by setting the “addressed” bit – the same one we used to detect a transaction starting at the top of i2c_task. To handle repeated starts correctly, we need to also check this bit between bytes.
At this point, you can probably guess how I’m going to do that.
async fn restart(…) { … }
And the select_biased! now reads:
select_biased! { … }
Wait, why not handle stop conditions in the data loops?
I pointed out earlier that the data handling loops are infinite unless cancelled (by select_biased!). You may be wondering why we didn’t instead make it a while loop watching for a stop condition.
We could do that! But if you look at the actual, non-simplified driver at the end of this post, the transmit and receive loops actually contain multiple await points (because of oddities in ST’s I2C interrupt behavior). We’d need to check for stop conditions, not just at the top of the loop, but at each of these points.
I think it’s much easier to get this right by separating checking for stop conditions, like I’ve done here.
So what did we just do?
Whenever we use fancy language features to express code in a new way, I think it’s important to pause and consider what the program will do in practice.
On lilos, as in any Rust async runtime, top-level free-running Futures are called tasks. Tasks get polled in response to events. The runtime has no visibility inside the task – the select_biased! and other constructs are just code.
So, with the event handler we just wrote, the i2c_task will run up to the first await, which is waiting for the address interrupt signaling that a transaction has begun, and then suspend (i.e. the Future will return Pending). The runtime can poll i2c_task’s Future all it wants, but each time i2c_task will open its sleepy eyes just a crack, see that the address bit has not been set, and go right back to sleep.
Once we get past that point, things get interesting.
When the code is polled with the address bit set, it will run until it reaches the select_biased!. select_biased! will begin by polling each of its branch futures. In our case, the futures all do roughly the same thing: enable a subset of interrupt conditions in the I2C controller, and then suspend waiting for the interrupt to occur.
If all the branch futures return Pending from poll – which ours likely will, the first time – then the select_biased! itself suspends. On each future poll of our i2c_task, select_biased! will get the opportunity to try its branches again, and this will continue until one completes.
From this point forward, if an I2C interrupt occurs, the pattern of activity is:
- Interrupt handler pokes our Notify.
- lilos executor wakes from sleep and polls the i2c_task future.
- The i2c_task future delegates to select_biased!.
- The error_condition future gets polled.
- The stop_condition future gets polled.
- The restart future gets polled.
- The handle_data future gets polled.
- If none of them resolve, then we stay suspended.
I noted earlier that the Notify type in lilos takes a predicate closure to decide when to wake up, to avoid spurious wakes. The list above shows why. When an I2C IRQ occurs, generally just one of our branch futures will want to respond to it, but all will be polled. From the perspective of all the other branches, the wake is spurious and no work should be done.
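That polling pattern can be observed directly in a simulation. The sketch below uses only the standard library and invents all of its names (Bus, logged, watch, select_biased2); it wraps each branch so that every poll is recorded, then delivers a single "interrupt" that is only meaningful to the data handler.

```rust
use std::cell::{Cell, RefCell};
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, as an executor would after a wake.
fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    fut.poll(&mut Context::from_waker(&waker))
}

/// Made-up status flags standing in for the peripheral.
#[derive(Default)]
struct Bus {
    error: Cell<bool>,
    stop: Cell<bool>,
    restart: Cell<bool>,
    data_ready: Cell<bool>,
    bytes_handled: Cell<u32>,
}

/// Hypothetical predicate wait: completes once `cond` returns true.
async fn until(mut cond: impl FnMut() -> bool) {
    poll_fn(|_cx| if cond() { Poll::Ready(()) } else { Poll::Pending }).await
}

/// Terminal-condition watcher: completes when its flag is set.
async fn watch(flag: &Cell<bool>) {
    until(|| flag.get()).await
}

/// Data handler: consumes "data ready" events forever.
async fn handle_data(bus: &Bus) {
    loop {
        until(|| bus.data_ready.replace(false)).await;
        bus.bytes_handled.set(bus.bytes_handled.get() + 1);
    }
}

/// Wraps a future so that each poll of it is recorded under `name`.
async fn logged<F: Future>(name: &'static str, log: &RefCell<Vec<&'static str>>, fut: F) -> F::Output {
    let mut fut = pin!(fut);
    poll_fn(|cx| {
        log.borrow_mut().push(name);
        fut.as_mut().poll(cx)
    })
    .await
}

/// Hand-rolled two-branch biased select: polls `first`, then `second`.
async fn select_biased2(first: impl Future<Output = ()>, second: impl Future<Output = ()>) {
    let mut first = pin!(first);
    let mut second = pin!(second);
    poll_fn(|cx| {
        if first.as_mut().poll(cx).is_ready() {
            return Poll::Ready(());
        }
        second.as_mut().poll(cx)
    })
    .await
}

/// Returns which branches were polled for one data interrupt, and the work done.
fn demo_wake() -> (Vec<&'static str>, u32) {
    let bus = Bus::default();
    let log: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    // Nesting two-way selects preserves the top-to-bottom priority order.
    let mut txn = pin!(select_biased2(
        logged("error_condition", &log, watch(&bus.error)),
        select_biased2(
            logged("stop_condition", &log, watch(&bus.stop)),
            select_biased2(
                logged("restart", &log, watch(&bus.restart)),
                logged("handle_data", &log, handle_data(&bus)),
            ),
        ),
    ));
    let _ = poll_once(txn.as_mut()); // first poll: every branch suspends
    log.borrow_mut().clear();
    bus.data_ready.set(true); // one interrupt, meant for handle_data only
    let _ = poll_once(txn.as_mut());
    let polled = log.borrow().clone();
    (polled, bus.bytes_handled.get())
}
```

All four branches show up in the log for that single wake, but only handle_data does any work; the other three re-check their flags and suspend again.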
It’s also possible to get legitimately spurious wakeups, of course. Some hardware has a bad habit of generating interrupt requests when no actual event has happened. There are usually software mechanisms as well – lilos actually provides one, lilos::exec::wake_task_by_index, because you need it if you want to implement your own mechanisms like Notify.
In our case, the list of polls above is a pretty accurate depiction of what the compiled code will do. This is kind of a lot of work for responding to an event, so the actual driver – the one I wrote for a project instead of an article – simplifies things. If you revisit the code above, you’ll notice that three of the select branches all do the same thing – nothing, followed by repeating the loop. This suggests that they could be merged, and that’s what I did. The driver has an equivalent of our handle_data, and then a single terminal_condition branch for checking errors, stops, and restarts.
This makes the generated code smaller and a little bit faster, which is always nice when you’re resource-constrained.
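One way to picture the merged branch is as a single predicate over the status flags. The sketch below is a guess at the shape of such a check, not the real driver's code: the bit positions are invented for the example (they are not the STM32 register layout), and the Terminal enum is hypothetical.

```rust
/// Hypothetical status-flag bit positions, chosen for this sketch only.
const STOP: u32 = 1 << 0;
const RESTART: u32 = 1 << 1;
const BUS_ERROR: u32 = 1 << 2;
const ARB_LOST: u32 = 1 << 3;

#[derive(Debug, PartialEq)]
enum Terminal {
    Error,
    Stop,
    Restart,
}

/// One predicate covering every reason to abandon the current transfer.
/// Errors are checked first so they take priority over a simultaneous stop,
/// mirroring the branch order used in the separate-futures version.
fn terminal_condition(status: u32) -> Option<Terminal> {
    if status & (BUS_ERROR | ARB_LOST) != 0 {
        Some(Terminal::Error)
    } else if status & STOP != 0 {
        Some(Terminal::Stop)
    } else if status & RESTART != 0 {
        Some(Terminal::Restart)
    } else {
        None
    }
}
```

A single future waiting on a predicate like this needs one poll per wake instead of three, which is where the size and speed savings would come from.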
Summary
If you’re a person who often has to write drivers, or just driver-curious, I hope this approach has caught your interest. The ability to factor different parts of a state machine in the implementation, and then combine them together, seems very powerful in cases like this, where a need can be described as “decorate every await point with an X.”
There are a lot of opportunities to use async in state machines, and I’m only starting to scratch the surface in my work. Existing tools like the futures crate are written around the “use async to simulate threads” use case (and the std, high-resources use case), which means they don’t always serve me well in my embedded work. select_biased! works in this driver, for instance, but is syntactically awkward and requires the .fuse() boilerplate – which also bloats code size.
I’ll post more on this as I uncover it!
Appendix: the full driver
Here’s the actual STM32 I2C driver code, with all the warts preserved. I’ve only simplified it by removing the parts that were specific to the protocol I was implementing.
use Infallible;
use ;
use Moder;
use ;
use ;
use AtomicArithExt;
use Notify;
use FutureExt;
use stm32_metapac as device;
use interrupt;
// TODO make programmable
const ADDR7: u8 = 0b1100_101;
/// Bus error counter, intended for consumption by a debugger.
static ERR_BUS: AtomicU32 = AtomicU32::new(0);
/// Arbitration lost error counter, intended for consumption by a debugger.
static ERR_ARLO: AtomicU32 = AtomicU32::new(0);
/// Event used for our interrupt.
static EVT: Notify = Notify::new();
/// I2C management task.
pub async
/// Initialization routine, factored out of task.
async !
/// Processes an infinite stream of incoming data until it's cancelled.
///
/// This is required to process an _infinite_ stream because the length of the
/// stream is not under our control. Even after we NACK a byte, the host is free
/// to continue clocking us for all eternity if it wishes. It's not clear that
/// the STM32 I2C block lets us abort a transaction / start behaving as though
/// our address never matched, so to avoid stretching the clock, we have to keep
/// on NACKing bogus bytes.
async !
/// Generates an infinite stream of outgoing data until it's cancelled.
///
/// The rationale for the stream being infinite is the same as for
/// `receive_data` above: the host controls the duration of the transfer, not
/// us. So, we have to generate data forever if they ask us to. In this
/// direction we can't even NACK to express our displeasure. Instead, any
/// outgoing data transfer needs to naturally pad itself with ... something.
async !
async
/// Internal, barely-used error type.
;