Composing concurrency in drivers

An example of why I like async for embedded

I recently published an article suggesting a different way of looking at async and await in Rust. In it, I discussed strategies for implementing state machines, and explained why I like async as a tool for building such state machines, even without threads.

In this post I’ll work through an example of why I’m so excited about this technique, by building a real driver for a notoriously tricky bus one piece at a time, using lilos.

I2C: the short version

I2C is a two-wire serial protocol used for connecting chips in a larger circuit. It’s basically ubiquitous in the low-end sensor market, in part because Intel standardized it as SMBus for power management in PCs in the 90s. Unless you’ve printed this article on paper, the device you’re using to read has at least one – and probably many – I2C buses inside it.

It’s also far more complex than it looks, has a number of design defects that can make it hard to use reliably, and is only “standardized” in the loosest sense of the word. But that’s a topic for another post.

One of the reasons I2C is complex is that it’s not a simple state machine controlled by one party. As a counterpoint, consider asynchronous serial, the thing spoken by “serial ports” or UARTs or whatever. There’s one communication line in each direction, controlled by one party. In the simplest version where there are no flow-control signals, the other party can do whatever they want to their line, and it doesn’t affect your ability to transmit. They definitely can’t do something in the middle of a byte you’re sending that requires you to change your behavior. The transmit state machines in each direction are basically independent, and entirely in the control of one party each.

With I2C, on the other hand, you can totally pop up in the middle of a byte someone else is sending, and do something that requires them to change their behavior. The most common example of this is called “clock stretching,” where a slow device messes with the bus clock to make it switch more slowly so the device can keep up. This is normal in I2C and is mostly handled by hardware.
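The reason another device can interfere mid-byte is electrical: I2C lines are open-drain, so a line only reads high when nobody is pulling it low. Here's a tiny model of that. The line_is_high function is a hypothetical name for illustration, not part of any driver. It shows clock stretching as "the host has released the clock, but the target is still holding it."

```rust
/// Models one open-drain bus line (SCL or SDA).
///
/// Each element of `pulling_low` is true if that device is currently
/// pulling the line low. The line only floats high when every device
/// has released it.
fn line_is_high(pulling_low: &[bool]) -> bool {
    !pulling_low.iter().any(|&p| p)
}

/// Clock stretching: the host (index 0) has released SCL, but a slow
/// target (index 1) keeps holding it low, so the host still sees low.
fn host_sees_clock_high_while_stretched() -> bool {
    let host_pulling = false;
    let target_pulling = true;
    line_is_high(&[host_pulling, target_pulling])
}
```

Any device on the bus can hold a line low, which is why no single party fully controls the state machine.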

The state transitions in I2C that are relevant for this post are:

There’s a lot more that can go wrong with I2C in practice, but that covers the points I’ll be touching on in this post.

The patient: an STM32-derived I2C target

Most STM32-family microcontrollers produced in the past ten years or so have I2C peripherals that are very similar to one another. The driver that motivated this post was written for the STM32G0 family, but it would work almost identically on most other STM32s.

I’m specifically implementing the target side of I2C (historically, and thoughtlessly, called slave). A microcontroller acting as an I2C target can present whatever interface it wants, including emulating another device. For the purposes of this article I’m going to implement a very, very simple made-up interface.

I’m starting with the target side here because it’s significantly harder than the host side. Targets have to deal with other targets winning arbitration on every bit sent, and all the shenanigans that hosts get up to. Most of these principles apply equally well to an I2C host driver.

I have sanded off some of the arcane pointy corners of the real STM32 driver to avoid having to explain why ST made some design decisions (because I can’t explain them). I’ll present the real driver with all its warts in an appendix.

Strolling down the garden path

Let’s start with I2C’s peaceful garden path: exchanging data with no errors, and without considering either stop conditions or repeated starts. I call this a garden path for two reasons:

  1. It’s an easy path to walk.
  2. I2C will totally lead you down the garden path if you let it, which is an English idiom for being deceived.

Most intro-to-I2C tutorials start here. (If you also stop here, you’ll produce a flaky I2C device.)

Using our slightly-simplified STM32 I2C peripheral, we can send data when asked (by a “read” transaction from the host) like this:

async fn send_data(i2c: &I2C) -> ! {
    // Configure the peripheral to let us send a byte at a time.
    i2c.control.modify(|w| {
        w.set_nbytes(1);
        w.set_reload(true);
    });
    // Clear the "addressed" flag, freeing the hardware to start the
    // transaction.
    i2c.flags.modify(|w| {
        w.set_addressed(false);
    });

    // The actual sending of data:
    loop {
        // We'd like to receive an interrupt when the TX register
        // is empty.
        i2c.control.modify(|w| w.set_tx_empty_irq(true));
        // Wait for an interrupt, but only proceed if it's the
        // specific case we're watching for (TX Empty).
        I2C_IRQ.until(|| i2c.status.read().tx_empty())
            // Make sure we turn the IRQ back off if cancelled.
            // More on this later.
            .on_cancel(|| {
                i2c.control.modify(|w| w.set_tx_empty_irq(false));
            })
            .await;

        // Send our next byte. For this demo, our next byte
        // is always:
        let next_byte = 0xAA;
        i2c.txdata.write_value(next_byte);

        // Repeat our initial setup to prepare for another byte.
        i2c.control.modify(|w| {
            w.set_nbytes(1);
            w.set_reload(true);
        });
    }
}

Here, I2C_IRQ is a Notify, a lilos object for coordinating with interrupt handlers. You can read more about it in the lilos docs, but the short version is: you wait on a Notify by calling Notify::until, passing a closure that checks to see if a specific condition has become true. (This is because spurious wakes are entirely possible – and in fact we’re going to cause some later. Stay tuned.)

It’s most likely declared like this:

static I2C_IRQ: Notify = Notify::new();

…and will be paired with an interrupt service routine that calls Notify::notify on it (omitted here but included in the appendix).
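To make the "wait, but only proceed if the condition holds" idea concrete, here's a minimal sketch of a predicate-checked wait written against the standard library only. Until, NoopWake, and poll_once are hypothetical names for this illustration. This is not how lilos implements Notify: a real implementation would also need to remember the waker so the interrupt handler can wake the task, which this sketch leaves out.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a predicate-checked wait: completes once
/// `cond` returns true, and otherwise goes back to sleep.
struct Until<F> {
    cond: F,
}

impl<F: FnMut() -> bool + Unpin> Future for Until<F> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        // Re-check the condition on every poll, so a wake that arrives
        // for some other reason simply results in Pending again.
        if (this.cond)() { Poll::Ready(()) } else { Poll::Pending }
    }
}

/// A waker that does nothing, so we can poll by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, standing in for the executor.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

/// Returns (completed on spurious wake, completed once the flag was set).
fn demo_until() -> (bool, bool) {
    let flag = Cell::new(false); // stands in for a hardware status bit
    let mut fut = Until { cond: || flag.get() };
    let on_spurious_wake = poll_once(&mut fut).is_ready();
    flag.set(true);
    let on_real_event = poll_once(&mut fut).is_ready();
    (on_spurious_wake, on_real_event)
}
```

The first poll models a spurious wake: the predicate is false, so the future stays pending. Only the second poll, after the flag is set, completes.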

You might be surprised that send_data never returns normally (-> !). This is because, as an I2C target, we can’t actually decide when to stop sending data. It’s entirely under control of the host. To reflect that, our send routine is expressed as an infinite loop. (I’ll come back to this later after introducing some more things.)

We can also receive data (during a “write” transaction) using similarly structured code:

async fn receive_data(i2c: &I2C) -> ! {
    // Configure the peripheral to let us receive a byte at a time.
    i2c.control.modify(|w| {
        w.set_nbytes(1);
        w.set_reload(true);
    });
    // Clear the "addressed" flag, freeing the hardware to start the
    // transaction.
    i2c.flags.modify(|w| {
        w.set_addressed(false);
    });

    // The actual receiving of data:
    loop {
        // We'd like to receive an interrupt when the RX register
        // is not empty.
        i2c.control.modify(|w| w.set_rx_not_empty_irq(true));
        // Wait for an interrupt, but only proceed if it's the
        // specific case we're watching for (RX Not Empty).
        I2C_IRQ.until(|| i2c.status.read().rx_not_empty())
            // Make sure we turn the IRQ back off if cancelled.
            // More on this later.
            .on_cancel(|| {
                i2c.control.modify(|w| w.set_rx_not_empty_irq(false));
            })
            .await;

        // Pull the next byte out of the RX register. In this
        // case, we're ignoring it; to implement some behavior
        // you would _not_ ignore it.
        let _byte = i2c.rxdata.read();

        // Repeat our initial setup to prepare for another byte.
        i2c.control.modify(|w| {
            w.set_nbytes(1);
            w.set_reload(true);
        });
    }
}

Notice that receiving is using the same Notify, I2C_IRQ. You could also choose to have a separate Notify per type of event. However, in this case, using just one will produce smaller code, for reasons I’ll describe later.

receive_data is also an infinite loop. In I2C, a target can actually request that the host stop sending data – by setting a signal called NACK (no-acknowledge) after a byte. However, the host is under no obligation to honor that signal! So, for simplicity, receive_data is infinite too.

We can tie our two little state machines together by wrapping them in another that dispatches based on the direction of the current transaction:

async fn handle_data(i2c: &I2C) -> ! {
    match i2c.status.read().direction() {
        Dir::Write => receive_data(i2c).await,
        Dir::Read => send_data(i2c).await,
    }
}

Straightforward – this just checks a flag and chooses which function to call. Like send_data and receive_data, it can’t return normally.

So at this point we have a somewhat functional I2C implementation, but with two significant flaws:

  1. It can’t stop once a transaction has begun!
  2. It doesn’t handle errors at any point.

Let’s fix that.

Handling stop conditions

I2C hosts can generate a stop condition at almost any time, and we need to respond to them any time we’re in a transaction. Technically, the standard says they’re only supposed to do it at certain points in the protocol, but in practice, hosts do whatever they want.

Our I2C peripheral detects stop conditions and sets a flag in the status register. It automatically stops any transaction that was ongoing. Our job in the driver is to notice this and go back to listening for new activity.

If this was all we had to do, we could express it as a state machine like this:

async fn stop_condition(i2c: &I2C) {
    // We'd like an interrupt when the stop condition happens.
    i2c.control.modify(|w| w.set_stop_irq(true));

    // Wait for it:
    I2C_IRQ.until(|| i2c.status.read().stop())
        // Ensure our IRQ gets turned off if we're cancelled:
        .on_cancel(|| i2c.control.modify(|w| w.set_stop_irq(false)))
        .await;

    // All done.
}

That’s not very useful on its own, because it doesn’t do the rest of I2C – the actual data movement bits. Normally, at this point, we’d go through our data-movement state machines above and insert a bunch of checks for stop conditions. This can work, though I’m not sure I’ve ever seen an I2C driver that didn’t forget to check for at least one situation at some point.

We’re going to do something different: we’re going to interleave our data movement state machine with our stop condition monitoring state machine. In this case, I’ll use the select_biased! macro from the futures crate.

Let’s assume we have a lilos task responsible for managing I2C. It would now look like this:

async fn i2c_task(i2c: I2C) -> ! {
    // TODO: setup stuff here (not yet written)

    // Handle transactions forever
    loop {
        // We want an interrupt when the ADDR bit gets set, please
        i2c.control.modify(|w| w.set_addr_irq(true));

        // Wait for it.
        I2C_IRQ.until(|| i2c.status.read().addressed())
            // Disable IRQ if we get cancelled, which isn't likely
            // in this case, but it's nice to be careful.
            .on_cancel(|| i2c.control.modify(|w| w.set_addr_irq(false)))
            .await;

        // Handle the rest of the transaction:
        select_biased! {
            _ = stop_condition(&i2c).fuse() => {
                // The transaction ended! We'll just pop back up to the
                // top of the loop and wait for the next one.
            }
            x = handle_data(&i2c).fuse() => {
                // handle_data can't actually return, so, this
                // block won't happen. select_biased! doesn't have
                // a shorthand for this, but we can prove it's impossible
                // by showing that there are no possible values of x.
                // This is an obscure Rust idiom that isn't critical
                // for understanding this article.
                match x {}
            }
        }
    }
}

(The .fuse() bit is a futures-crate requirement; you can ignore it for the time being – I do.)
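If the empty match looks odd, here's the same idiom in a smaller setting, using the standard library's Infallible type (an enum with no variants). The function name unwrap_never_fails is made up for this sketch.

```rust
use std::convert::Infallible;

/// Extracts the value from a Result whose error type has no values.
///
/// `Infallible` has no variants, so a match on it needs no arms, and
/// the compiler accepts the empty match as producing any type we like.
fn unwrap_never_fails(r: Result<u32, Infallible>) -> u32 {
    match r {
        Ok(v) => v,
        // This arm can never run, and the empty match proves it.
        Err(never) => match never {},
    }
}
```

The same reasoning applies to a future that can't return: there's no value to match on, so a match with no arms covers every case.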

The select_biased! macro will poll the Future on each of its branches, completing when any one of them completes. In the event that more than one complete simultaneously, it prefers earlier branches to later ones (which is why it’s called biased).
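Here's a rough sketch of that behavior for just two branches, written against the standard library only. select2_biased, Either, NoopWake, and poll_once are hypothetical names for this illustration. The real macro in the futures crate is more general and is not implemented this way.

```rust
use std::future::{pending, poll_fn, ready, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Which branch of the two-way select finished.
#[derive(Debug, PartialEq)]
enum Either<A, B> {
    First(A),
    Second(B),
}

/// Hypothetical two-branch biased select: polls `a` before `b` on every
/// wake, so `a` wins whenever both are ready at once.
fn select2_biased<A, B>(
    mut a: A,
    mut b: B,
) -> impl Future<Output = Either<A::Output, B::Output>>
where
    A: Future + Unpin,
    B: Future + Unpin,
{
    poll_fn(move |cx| {
        if let Poll::Ready(v) = Pin::new(&mut a).poll(cx) {
            return Poll::Ready(Either::First(v));
        }
        if let Poll::Ready(v) = Pin::new(&mut b).poll(cx) {
            return Poll::Ready(Either::Second(v));
        }
        // Neither branch finished, so the combined future suspends too.
        Poll::Pending
    })
}

/// A waker that does nothing, so we can poll by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, standing in for the executor.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

/// Both branches are ready at the same time.
fn both_ready() -> Poll<Either<&'static str, &'static str>> {
    let mut fut = Box::pin(select2_biased(ready("error"), ready("stop")));
    poll_once(&mut fut)
}

/// Only the later branch is ready.
fn only_second_ready() -> Poll<Either<(), &'static str>> {
    let mut fut = Box::pin(select2_biased(pending::<()>(), ready("stop")));
    poll_once(&mut fut)
}
```

In both_ready the earlier branch wins even though the later one was also ready, which is the "biased" part. When the select completes, the losing future is dropped along with it.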

Because the internals of stop_condition and handle_data both register for interrupts, but serve separate subsets of possible I2C interrupts, we want to run them “at the same time,” which is exactly what we’ve achieved here.

From the perspective of state machines expressed as async fn, select_biased! interleaves the functions. Effectively, we’ve added the check from stop_condition to every await in handle_data, without modifying handle_data.

I think this is really exciting, because it allows for the same separation of concerns that we talk about in software engineering, except applied to state machines. We can express the two state machines – data handling and stop condition checking – in isolation, in ways that are easier to read than if we combined them, and then let the compiler combine them for us.

To underscore how useful this is, let’s add error checking.

Checking for errors and other conditions

In an I2C target, there are two conditions that should cause your driver to abort the current transaction. I described them above; they are bus errors and arbitration loss.

The STM32 I2C peripheral will stop driving the bus in either case, so we just need to make sure our state machine responds correctly. The response is very similar to stop conditions, though we’ll add a bit of monitoring:

static ERROR_COUNT: AtomicU32 = AtomicU32::new(0);

async fn error_conditions(i2c: &I2C) {
    // We'd like an interrupt on any error.
    i2c.control.modify(|w| w.set_error_irq(true));

    // Wait for it:
    I2C_IRQ.until(|| {
            let r = i2c.status.read();
            r.bus_error() || r.arbitration_lost()
        })
        // Ensure our IRQ gets turned off if we're cancelled:
        .on_cancel(|| i2c.control.modify(|w| w.set_error_irq(false)))
        .await;

    // Note that an error has occurred, for inspection
    // elsewhere.
    ERROR_COUNT.fetch_add(1, Ordering::Relaxed);

    // All done.
}

We can mingle that into our I2C state machine by adding another branch to select_biased!. In this case I’m going to add it above the stop condition handling, so that if an error occurs at the same time as a stop, we report the error. The whole select_biased! block now reads:

        select_biased! {
            _ = error_conditions(&i2c).fuse() => {
                // Uh oh! Handle errors here,
                // and then continue the loop.
            }
            _ = stop_condition(&i2c).fuse() => {
                // The transaction ended! We'll just pop back up to the
                // top of the loop and wait for the next one.
            }
            x = handle_data(&i2c).fuse() => {
                match x {}
            }
        }

And we’re now handling errors at every await point.

There’s one thing we still need to address, though.

Repeated starts

It’s possible for an I2C host to generate a “repeated start,” or “restart,” without generating a stop condition. This lets it effectively interrupt its own transaction to start a new one.

The STM32 I2C hardware indicates this by setting the “addressed” bit – the same one we used to detect a transaction starting at the top of i2c_task. To handle repeated starts correctly, we need to also check this bit between bytes.

At this point, you can probably guess how I’m going to do that.

async fn restart(i2c: &I2C) {
    // We'd like an interrupt on the addr bit.
    i2c.control.modify(|w| w.set_addr_irq(true));

    // Wait for it:
    I2C_IRQ.until(|| i2c.status.read().addressed())
        // Ensure our IRQ gets turned off if we're cancelled:
        .on_cancel(|| i2c.control.modify(|w| w.set_addr_irq(false)))
        .await;

    // All done.
}

And the select_biased! now reads:

        select_biased! {
            _ = error_conditions(&i2c).fuse() => {
                // Uh oh! Handle errors here,
                // and then continue the loop.
            }
            _ = stop_condition(&i2c).fuse() => {
                // The transaction ended! We'll just pop back up to the
                // top of the loop and wait for the next one.
            }
            _ = restart(&i2c).fuse() => {
                // The ADDR bit is set! We need to start anew.
                // We can do this by simply popping back up to the
                // top of our enclosing loop (i.e. by doing nothing
                // here).
            }
            x = handle_data(&i2c).fuse() => {
                match x {}
            }
        }

Wait, why not handle stop conditions in the data loops?

I pointed out earlier that the data handling loops are infinite unless cancelled (by select_biased!). You may be wondering why we didn’t instead make it a while loop watching for a stop condition.

We could do that! But if you look at the actual, non-simplified driver at the end of this post, the transmit and receive loops actually contain multiple await points (because of oddities in ST’s I2C interrupt behavior). We’d need to check for stop conditions, not just at the top of the loop, but at each of these points.

I think it’s much easier to get this right by separating checking for stop conditions, like I’ve done here.
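"Cancelled" here means the losing future is dropped before it finishes. Here's a minimal sketch of how cleanup can ride along with that drop, using a guard value with a Drop impl. IrqGuard, wait_with_cleanup, NoopWake, and poll_once are hypothetical names, and a Cell stands in for an interrupt-enable bit. Unlike a cancel-only hook, this guard runs on every exit from the function, cancelled or not.

```rust
use std::cell::Cell;
use std::future::{pending, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Turns the (pretend) interrupt enable back off when dropped.
struct IrqGuard<'a> {
    enabled: &'a Cell<bool>,
}

impl Drop for IrqGuard<'_> {
    fn drop(&mut self) {
        self.enabled.set(false);
    }
}

/// Enables the interrupt, then waits for an event that never comes.
/// The only way out of this function is to be cancelled.
async fn wait_with_cleanup(enabled: &Cell<bool>) {
    enabled.set(true);
    let _guard = IrqGuard { enabled };
    pending::<()>().await;
}

/// A waker that does nothing, so we can poll by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, standing in for the executor.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

/// Returns (enabled while waiting, enabled after cancellation).
fn demo_cancel() -> (bool, bool) {
    let enabled = Cell::new(false);
    let mut fut = Box::pin(wait_with_cleanup(&enabled));
    let _ = poll_once(&mut fut); // runs up to the await, then suspends
    let while_waiting = enabled.get();
    drop(fut); // cancel: this is what happens to a losing select branch
    let after_cancel = enabled.get();
    (while_waiting, after_cancel)
}
```

Because the cleanup is attached to the future itself, whoever cancels it doesn't need to know which interrupt enables it had turned on.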

So what did we just do?

Whenever we use fancy language features to express code in a new way, I think it’s important to pause and consider what the program will do in practice.

On lilos, as in any Rust async runtime, top-level free-running Futures are called tasks. Tasks get polled in response to events. The runtime has no visibility inside the task – the select_biased! and other constructs are just code.

So, with the event handler we just wrote, the i2c_task will run up to the first await, which is waiting for the address interrupt signaling that a transaction has begun, and then suspend (i.e. the Future will return Pending). The runtime can poll i2c_task’s Future all it wants, but each time i2c_task will open its sleepy eyes just a crack, see that the address bit has not been set, and go right back to sleep.

Once we get past that point, things get interesting.

When the code is polled with the address bit set, it will run until it reaches the select_biased!. select_biased! will begin by polling each of its branch futures. In our case, the futures all do roughly the same thing: enable a subset of interrupt conditions in the I2C controller, and then suspend waiting for the interrupt to occur.

If all the branch futures return Pending from poll – which ours likely will, the first time – then the select_biased! itself suspends. On each later poll of our i2c_task, select_biased! will get the opportunity to try its branches again, and this will continue until one completes.

From this point forward, if an I2C interrupt occurs, the pattern of activity is:

  1. Interrupt handler pokes our Notify.
  2. lilos executor wakes from sleep and polls the i2c_task future.
  3. The i2c_task future delegates to select_biased!.
  4. The error_conditions future gets polled.
  5. The stop_condition future gets polled.
  6. The restart future gets polled.
  7. The handle_data future gets polled.
  8. If none of them resolve, then we stay suspended.

I noted earlier that the Notify type in lilos takes a predicate closure to decide when to wake up, because spurious wakes are possible. The list above shows why. When an I2C IRQ occurs, generally just one of our branch futures will want to respond to it, but all will be polled. From the perspective of all the other branches, the wake is spurious and no work should be done.

It’s also possible to get legitimately spurious wakeups, of course. Some hardware has a bad habit of generating interrupt requests when no actual event has happened. There are usually software mechanisms as well – lilos actually provides one, lilos::exec::wake_task_by_index, because you need it if you want to implement your own mechanisms like Notify.

In our case, the list of polls above is a pretty accurate depiction of what the compiled code will do. This is kind of a lot of work for responding to an event, so the actual driver – the one I wrote for a project instead of an article – simplifies things. If you revisit the code above, you’ll notice that three of the select branches all do the same thing – nothing, followed by repeating the loop. This suggests that they could be merged, and that’s what I did. The driver has an equivalent of our handle_data, and then a single terminal_condition branch for checking errors, stops, and restarts.

This makes the generated code smaller and a little bit faster, which is always nice when you’re resource-constrained.
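As a sketch of what the merged check boils down to, here's the decision logic on its own, using a plain struct in place of the status register. Status, BusFault, and terminal are hypothetical names for this illustration. The appendix shows the version that reads the real register.

```rust
/// Stand-in for the status register flags the terminal check looks at.
#[derive(Clone, Copy, Default)]
struct Status {
    bus_error: bool,
    arbitration_lost: bool,
    stop: bool,
    addressed: bool,
}

/// Placeholder error type for this sketch.
#[derive(Debug, PartialEq)]
struct BusFault;

/// One predicate covering all three terminal conditions.
///
/// Returns None while the transaction should keep going. Errors are
/// checked first, so an error that coincides with a stop is reported
/// as an error, matching the branch order used earlier.
fn terminal(status: Status) -> Option<Result<(), BusFault>> {
    if status.bus_error || status.arbitration_lost {
        Some(Err(BusFault))
    } else if status.stop || status.addressed {
        Some(Ok(()))
    } else {
        None
    }
}
```

With one predicate there's one future to poll per wake instead of three, and the priority between conditions is expressed by the order of the if branches rather than the order of the select branches.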

Summary

If you’re a person who often has to write drivers, or just driver-curious, I hope this approach has caught your interest. The ability to factor different parts of a state machine in the implementation, and then combine them together, seems very powerful in cases like this, where a need can be described as “decorate every await point with an X.”

There are a lot of opportunities to use async in state machines, and I’m only starting to scratch the surface in my work. Existing tools like the futures crate are written around the “use async to simulate threads” use case (and the std, high-resources use case), which means they don’t always serve me well in my embedded work. select_biased! works in this driver, for instance, but is syntactically awkward and requires the .fuse() boilerplate – which also bloats code size.

I’ll post more on this as I uncover it!

Appendix: the full driver

Here’s the actual STM32 I2C driver code, with all the warts preserved. I’ve only simplified it by removing the parts that were specific to the protocol I was implementing.

use core::convert::Infallible;
use core::sync::atomic::{AtomicU32, Ordering};

use device::gpio::vals::Moder;
use device::i2c::vals::{Addmode, Reload, Dir};
use futures::{select_biased, FutureExt as _};
use lilos::atomic::AtomicArithExt;
use lilos::exec::Notify;
use lilos::util::FutureExt;

use stm32_metapac as device;
use device::interrupt;

// TODO make programmable
const ADDR7: u8 = 0b1100_101;

/// Bus error counter, intended for consumption by a debugger.
#[used]
static ERR_BUS: AtomicU32 = AtomicU32::new(0);
/// Arbitration lost error counter, intended for consumption by a debugger.
#[used]
static ERR_ARLO: AtomicU32 = AtomicU32::new(0);

/// Event used for our interrupt.
static EVT: Notify = Notify::new();

/// I2C management task.
pub async fn task(
    rcc: device::rcc::Rcc,
    gpiob: device::gpio::Gpio,
    i2c: device::i2c::I2c,
) -> Infallible {
    init(rcc, i2c, gpiob);

    unsafe {
        cortex_m::peripheral::NVIC::unmask(device::Interrupt::I2C1);
    }

    // Transaction handling loop. We come up to the top of this whenever we're
    // idle, or if we're re-addressed during a transaction.
    loop {
        // Wait until we're addressed.
        i2c.cr1().modify(|w| w.set_addrie(true));
        EVT.until(|| i2c.isr().read().addr()).await;

        // We don't want to clear ADDR immediately, because we need to set up
        // the first byte of the transaction first. Clearing ADDR ends our clock
        // stretch and, for a read operation, starts transmission of our first
        // response byte.

        i2c.cr2().modify(|w| {
            // We're going to move one byte.
            w.set_nbytes(1);
            // It won't be the last one.
            w.set_reload(Reload::NOTCOMPLETED);
        });
        // Clear ADDR flag, ending any clock stretching.
        i2c.icr().write(|w| w.set_addrcf(true));

        // Use select to compose the end-of-transaction monitor with the data
        // handler. If we get a stop condition, error, or repeated start, which
        // we must eventually, the data handler will be cancelled.
        select_biased! {
            result = terminal_condition(i2c).fuse() => {
                match result {
                    Ok(()) => {
                        // stop condition
                    }
                    Err(Error) => {
                        // error
                    }
                }
            }
            never = handle_data(i2c).fuse() => match never {}
        }
    }
}

/// Initialization routine, factored out of task.
fn init(
    rcc: device::rcc::Rcc,
    i2c: device::i2c::I2c,
    gpiob: device::gpio::Gpio,
) {
    // Un-gate clock to our I2C block.
    rcc.apbenr1().modify(|w| w.set_i2c1en(true));
    cortex_m::asm::dsb();

    // Reference Manual 23.4.5

    // The I2C block comes out of reset with PE=0. For space reasons we'll skip
    // the portion of the init flow from the Reference Manual that's intended
    // for making sure you can safely re-initialize. We have no intention of
    // re-initializing (without a reset).
    
    // Leave analog and digital filters in reset configuration (analog filter
    // on, digital filter off). TODO: research and enable the digital filter.

    // TODO: configure SDADEL and SCLDEL after measuring this on scope - this
    // leaves the prescaler at 1x

    gpiob.moder().modify(|w| {
        w.set_moder(6, Moder::ALTERNATE);
        w.set_moder(7, Moder::ALTERNATE);
    });

    // Configure peripheral to respond to our address.
    i2c.oar1().write(|w| {
        w.set_oa1(u16::from(ADDR7 << 1));
        w.set_oa1mode(Addmode::BIT7);
        w.set_oa1en(true);
    });

    i2c.cr1().write(|w| {
        // Byte-level acking
        w.set_sbc(true);
        // Peripheral on
        w.set_pe(true);
    });
}

async fn handle_data(
    i2c: device::i2c::I2c,
) -> ! {
    // Determine the direction of the transfer.
    match i2c.isr().read().dir() {
        Dir::WRITE => receive_data(i2c).await,
        _ => transmit_data(i2c).await,
    }
}

/// Processes an infinite stream of incoming data until it's cancelled.
///
/// This is required to process an _infinite_ stream because the length of the
/// stream is not under our control. Even after we NACK a byte, the host is free
/// to continue clocking us for all eternity if it wishes. It's not clear that
/// the STM32 I2C block lets us abort a transaction / start behaving as though
/// our address never matched, so to avoid stretching the clock, we have to keep
/// on NACKing bogus bytes.
async fn receive_data(i2c: device::i2c::I2c) -> ! {
    let mut nacking = false;

    loop {
        // Ensure we find out about any data that appears.
        i2c.cr1().modify(|w| w.set_rxie(true));

        // Wait for data, being sure to turn our interrupt enable back off if
        // we're cancelled. (The ISR will turn it off if we wake.)
        EVT.until(|| i2c.isr().read().rxne())
            .on_cancel(|| i2c.cr1().modify(|w| w.set_rxie(false)))
            .await;

        // Pull the data out.
        let byte = i2c.rxdr().read().rxdata();

        // As a demonstration, this interface will ACK 0 and NACK any non-zero
        // byte. This is a stand-in for an actually useful device interface.
        nacking |= byte != 0;

        i2c.cr2().modify(|w| {
            // ACK until we've seen a non-zero byte; NACK from then on.
            w.set_nack(nacking);
            // We're going to ACK/NACK after the next byte.
            w.set_nbytes(1);
            // It won't be the last one.
            w.set_reload(Reload::NOTCOMPLETED);
        });
    }
}

/// Generates an infinite stream of outgoing data until it's cancelled.
///
/// The rationale for the stream being infinite is the same as for
/// `receive_data` above: the host controls the duration of the transfer, not
/// us. So, we have to generate data forever if they ask us to. In this
/// direction we can't even NACK to express our displeasure. Instead, any
/// outgoing data transfer needs to naturally pad itself with ... something.
async fn transmit_data(
    i2c: device::i2c::I2c,
) -> ! {
    loop {
        // Ensure we find out when the hardware wants a byte to send.
        i2c.cr1().modify(|w| w.set_txie(true));

        // Wait for events, turning that back off if we're cancelled. (The ISR
        // will turn it off if we wake.)
        EVT.until(|| i2c.isr().read().txis())
            .on_cancel(|| i2c.cr1().modify(|w| w.set_txie(false)))
            .await;

        // Send the next byte. This is a placeholder for reading useful data out
        // of the device, for the purposes of this example.
        let byte = 0xAA;
        i2c.txdr().write(|w| w.set_txdata(byte));

        // Wait for TCR event.
        i2c.cr1().modify(|w| w.set_tcie(true));

        // Wait for events, turning that back off if we're cancelled. (The ISR
        // will turn it off if we wake.)
        EVT.until(|| i2c.isr().read().tcr())
            .on_cancel(|| i2c.cr1().modify(|w| w.set_tcie(false)))
            .await;

        i2c.cr2().modify(|w| {
            // We'll do one byte again.
            w.set_nbytes(1);
            // It won't be the last one.
            w.set_reload(Reload::NOTCOMPLETED);
        });
    }
}

async fn terminal_condition(i2c: device::i2c::I2c) -> Result<(), Error> {
    // Ensure we'll get interrupts on the conditions we're monitoring.
    i2c.cr1().modify(|w| {
        w.set_addrie(true);
        w.set_stopie(true);
        w.set_errie(true);
    });

    // Make sure all our interrupt enables are clear no matter how we leave the
    // routine -- the ISR will have cleared one of them but probably not all. (We
    // likely won't be cancelled because of how the driver is constructed, but
    // it's still nice to do things the right way.)
    scopeguard::defer! {
        i2c.cr1().modify(|w| {
            w.set_addrie(false);
            w.set_stopie(false);
            w.set_errie(false);
        });
    }

    let result = EVT.until(|| {
        let isr = i2c.isr().read();
        let berr = isr.berr();
        let arlo = isr.arlo();
        if berr {
            ERR_BUS.fetch_add_polyfill(1, Ordering::Relaxed);
        }
        if arlo {
            ERR_ARLO.fetch_add_polyfill(1, Ordering::Relaxed);
        }
        if berr || arlo {
            Some(Err(Error))
        } else if isr.stopf() || isr.addr() {
            Some(Ok(()))
        } else {
            None
        }
    }).await;

    // If we've detected our event, clear our flags for next time. Do NOT
    // clear the ADDR flag, as that has side effects! It will be processed in
    // the outer loop.
    i2c.icr().write(|w| {
        w.set_stopcf(true);
        w.set_berrcf(true);
        w.set_arlocf(true);
    });

    result
}

/// Internal, barely-used error type.
#[derive(Copy, Clone, Debug)]
struct Error;

#[interrupt]
fn I2C1() {
    let i2c = device::I2C1;
    let cr1 = i2c.cr1().read();
    let isr = i2c.isr().read();

    let mut bits_to_clear = 0;
    if cr1.txie() && isr.txis() {
        bits_to_clear |= 1 << 1;
    }
    if cr1.rxie() && isr.rxne() {
        bits_to_clear |= 1 << 2;
    }
    if cr1.stopie() && isr.stopf() {
        bits_to_clear |= 1 << 5;
    }
    if cr1.addrie() && isr.addr() {
        bits_to_clear |= 1 << 3;
    }
    if cr1.tcie() && isr.tcr() {
        bits_to_clear |= 1 << 6;
    }
    if cr1.errie() && (isr.arlo() || isr.berr()) {
        bits_to_clear |= 1 << 7;
    }

    if bits_to_clear != 0 {
        i2c.cr1().modify(|w| {
            w.0 &= !bits_to_clear;
        });
        EVT.notify();
    }
}