#rust

From Hubris To Bits

The embedded platform we’ve built for firmware at Oxide is called Hubris. It’s unusual for a microcontroller operating system, and probably the biggest thing that makes it unusual is its use of separately-compiled tasks.

Most firmware applications mash all of their task and OS code together in a common memory space, which is simple and efficient, but can lead to subtle bugs. Hubris instead places bits of code in their own isolated memory areas, where they can run (and crash!) separately. This requires that each bit be compiled as a standalone program.

The CPUs we target don’t have virtual memory, so each of these separate programs has to be laid out at a known place in the address space. This introduces some challenges, and has prevented us from “just” using an off-the-shelf build system.

This post will walk through the process of building a Hubris application from source, from the perspective of the build system, and examine some of these challenges and how we addressed them.

Putting custom firmware on the WASD CODE v2

I have a WASD CODE v2 tenkeyless keyboard, which has been my daily driver for work since about 2017. It’s a great keyboard… mechanically. But its control electronics are fixed-function and don’t quite work the way I want — to say nothing of fancy features like additional key layers.

So I fixed it. Now it runs QMK. (Or some bare-metal custom Rust firmware.) This has let me…

  • Try out QMK without the expense (or waste!) of buying a whole new keyboard.
  • Add features that were missing in the original, like USB n-key rollover (NKRO), 1 kHz polling, backlight fade effects, additional key layers, and key macros.
  • Customize the part of the computer I interact with the most!

You can fix your keyboard, too.

(If you’re from the rest of the world and would like one, contact me! I have to set up each country separately, and I haven’t had time to do all two-hundred-and-some-odd of them.)

If it’s out of stock on either store, join the waitlist and I’ll make more.

A green circuit board nestled between a keyboard’s arrow keys and its insert/delete cluster.

The user manual is pretty basic, but I’ll flesh it out further if the boards sell!

Read on below for more background.

lilos v1.0 released

After five years of development, something like seven art projects, one commercial product, and many changes to the dark corners of the Rust language, I’ve decided lilos is ready for a 1.0 release!

Some parts I’m excited about include:

  • As of this release, the lilos APIs are entirely cancellation-safe.

  • This release contains contributions from five other people, bringing the total number of contributors to seven! (Want to be number eight? Come say hi!)

  • Thanks to one of those contributors, the operating system tests are now running in CI on QEMU!

(For anyone who’s new, lilos is a tiny embedded operating system that uses Rust async to allow complex multitasking on very limited microcontrollers without requiring dynamic memory allocation. Read more about lilos on my project page, where I link to the docs and provide a curated collection of blog posts on the topic.)

See the release notes if you’re curious about what’s changed. If you’ve got firmware written for an earlier version of lilos (particularly the 0.3.x series) and would like to update (you don’t have to!), those release notes will guide you through the process. There have been some breaking API changes, but I promise they’re all improvements.

The server chose violence

I’m continuing to reflect on the past four years with Hubris — April Fool’s Day was, appropriately enough, the fourth anniversary of the first Hubris user program, and today is the fourth anniversary of the first kernel code. (I wrote the user program first to help me understand what the kernel’s API wanted to look like.)

Of all of Hubris’s design decisions, there’s one that gets a “wait what” response more often than any other. It’s also proving to be a critical part of the system’s overall robustness. In this post, I’ll take a look at our 13th and oddest syscall, REPLY_FAULT.

Who killed the network switch?

We found a neat bug in Hubris this week. Like many bugs, it wasn’t a bug when it was originally written — correct code became a bug as other things changed around it.

I thought the bug itself, and the process of finding and fixing it, provided an interesting window into our development process around Hubris. It’s very rare for us to find a bug in the Hubris kernel, mostly because it’s so small. So I jumped at the opportunity to write this one down.

This is a tale of how two features, each useful on its own, can combine to become a bug. Read on for details.

I made a thing to help you make a thing out of a keypad

I do a lot of electronics projects in my spare time, and I tend to try to make reusable parts to save myself effort in the future. Because I have to order ingredients in certain quantities, I often wind up with more than I need for my project.

So I’ve opened a Tindie store, called Overengineered Widget Laboratories. Right now there’s one product in the store, called Keypad:GO. See, I built a sculpture last summer that needed to interact with people through a phone-style keypad. The keypad interface part of it seemed like something other people could use, so I made a few extras. This is a very easy way to interface a keypad or small keyboard to an embedded electronics project, because it handles all the basics for you — matrix scanning, debouncing, key matrix collisions, etc.

It will also help you reverse engineer the keypad’s circuit, because often cheap keypads arrive without good documentation. In the tiny flash of the embedded microcontroller, I’ve packed a setup wizard that will walk you through the process of setting up the keypad of your choice. All you need is a terminal program. This is honestly my favorite part, and I demonstrate it in the video below.

Mutex without lock, Queue without push: cancel safety in lilos

I’m trying to do something kind of unusual with lilos: in addition to almost all of the APIs being safe in the Rust sense, I’m also attempting to create an entire system API that is cancel-safe. I’ve written a lot about Rust’s async feature and its notion of cancellation recently, such as my suggestion for reframing how we think about async/await.

My thoughts on this actually stem from my early work on lilos, where I started beating the drum of cancel-safety back in 2020. My notion of what it means to be cancel-safe has gotten more nuanced since then, and I’ve recently made the latest batch of changes to try to help applications built on lilos be more robust by default.

So, wanna nerd out about async API design and robustness? I know you do.

Getting file/line in await traces

I recently posted about my debugger for async Rust, which can generate what I call “await-traces” for async code that’s suspended and not currently running. I mentioned at the time that it appeared possible to get the source code file name and line number corresponding to the await points, but left that for future work.

This is an update describing that future work.

Composing concurrency in drivers

I recently published an article suggesting a different way of looking at async and await in Rust. In it, I discussed strategies for implementing state machines, and explained why I like async as a tool for building such state machines, even without threads.

In this post I’ll work through an example of why I’m so excited about this technique, by building a real driver for a notoriously tricky bus one piece at a time, using lilos.

How to think about `async`/`await` in Rust

(This is a section of the lilos intro guide that people seemed to like, so to increase its visibility, I’m lifting it up into its own post and expanding it a bit. I hope this is a useful companion piece to the post on async debugging I posted this morning.)

Some documentation of Rust async and await has presented it as a seamless alternative to threads. Just sprinkle these keywords through your code and get concurrency that scales better! I think this is very misleading. An async fn is a different thing from a normal Rust fn, and you need to think about different things to write correct code in each case.

This post presents a different way of looking at async that I think is more useful, and less likely to lead to cancellation-related bugs.
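
As a small illustration of that difference (a minimal sketch of my own, not code from the post): calling an async fn runs none of its body. It only builds a future, and dropping that future is cancellation.

```rust
async fn send_message() -> usize {
    // This body does NOT run when send_message() is called.
    println!("sending...");
    42
}

fn main() {
    // Calling the async fn only constructs a suspended state machine;
    // nothing has been printed yet.
    let fut = send_message();

    // The future is inert data until something polls it. Dropping it here
    // *is* cancellation: "sending..." will never be printed.
    drop(fut);
}
```

A plain fn with the same body would have printed immediately; the async version merely describes work that an executor may, or may not, get to perform.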

Writing a basic `async` debugger

I’m a big fan of Rust’s async feature, which lets you write explicit state machines like straight-line code. One of the operating systems I maintain, lilos, is almost entirely based on async, and I think it’s a killer feature for embedded development.

async is also popular when writing webservers and other network services. My colleagues at Oxide use it quite a bit. Watching them work has underscored one of the current issues with async, however: the debugging story is not great. In particular, answering the question “why isn’t my program currently doing anything” is very hard.

I’ve been quietly tinkering on some tools to improve the situation since 2021, and I’ve recently released a prototype debugger for lilos: lildb. lildb can print await traces for uninstrumented lilos programs, which are like stack traces, but for suspended futures. I wrote this to help me debug my own programs, but I’m publishing it to try and move the discussion on async debugging forward. To that end, this post will walk through what it does, how it derives the information it uses, and areas where we could improve things.

Safely writing code that isn't thread-safe

One of the nice things about the Rust programming language is that it makes it easier to write correct concurrent (e.g. threaded) programs – to the degree that Rust’s slogan has been, at times, “fearless concurrency.”

But I’d like to tell you about the other side of Rust, which I think is under-appreciated. Rust enables you to write programs that are not concurrent. This feature is missing from most other languages, and its absence is a source of much complexity and many bugs.

“But wait,” you might be saying, “of course I can write code that isn’t concurrent in Java or Python or C!”

Can you, though? You can certainly write code that ignores concurrency, and would malfunction if (say) used from multiple threads simultaneously. But that’s not the same thing as writing code that isn’t concurrent – code that simply can’t be used concurrently, by compiler guarantee.

In Rust, you can. Let’s look at why you can do it, and why it’s awesome.
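
As a small taste (a sketch of my own, not from the post), here’s a counter built from Rc and Cell. Neither type is Send or Sync, so the compiler rejects any attempt to hand the counter to another thread:

```rust
use std::cell::Cell;
use std::rc::Rc;

// A deliberately non-thread-safe counter. Rc and Cell are neither Send nor
// Sync, so the compiler guarantees this value never leaves its thread.
#[derive(Clone)]
struct LocalCounter {
    count: Rc<Cell<u64>>,
}

impl LocalCounter {
    fn new() -> Self {
        LocalCounter { count: Rc::new(Cell::new(0)) }
    }

    fn bump(&self) {
        // No atomics, no locks: safe precisely because no other thread can
        // ever observe this value.
        self.count.set(self.count.get() + 1);
    }
}

fn main() {
    let counter = LocalCounter::new();
    counter.bump();

    // This refuses to compile: `Rc<Cell<u64>>` cannot be sent between
    // threads, so the closure is not `Send`.
    //
    // std::thread::spawn(move || counter.bump());

    println!("count = {}", counter.count.get());
}
```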

Why Rust mutexes look like they do

One of the common complaints I hear from systems programmers who try Rust is about mutexes, and specifically about the Rust Mutex API. The complaints usually go something like this:

  • They don’t want the mutex to contain data, just a lock.
  • They don’t want to have to manage a “guard” value that unlocks the mutex on drop – often, more specifically, they just want to call an unlock operation because they feel like that’s more explicit.

These changes would make the Rust mutex API equivalent to the C/Posix mutex API. In one case I’ve seen someone try to use Mutex<()> and trickery to fake it.

There’s a problem with this, though: these two aspects of Mutex’s design are inextricably linked to one another, and to Rust’s broader safety guarantees – changing either or both of them will open the door to subtle bugs and corruption due to data races.
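
For concreteness, here’s roughly how those two linked aspects show up when using std’s Mutex (a minimal sketch, not code from the post):

```rust
use std::sync::Mutex;

// The mutex owns the data it protects: there is no way to reach `balance`
// except by locking.
struct Account {
    balance: Mutex<i64>,
}

fn deposit(account: &Account, amount: i64) {
    // Locking returns a guard that dereferences to the data...
    let mut balance = account.balance.lock().unwrap();
    *balance += amount;
    // ...and the mutex unlocks automatically when the guard goes out of
    // scope, even on an early return or a panic.
}

fn main() {
    let account = Account { balance: Mutex::new(0) };
    deposit(&account, 100);
    println!("balance = {}", account.balance.lock().unwrap());
}
```

The data is unreachable except through the lock, and unlocking is tied to the guard’s scope rather than to programmer discipline; those are exactly the two things the complaints above would give up.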

A C-style mutex API consisting of some bundle of implicitly guarded data, plus lock and unlock functions, isn’t wise in Rust because it allows safe code to easily commit errors that break memory safety and create data races.

Perhaps controversially, I’d argue that this is also true in C. It’s just more obvious in Rust, because Rust rigorously distinguishes between the notion of “safe” code that cannot commit such errors, and “unsafe” code that can commit such errors if it wishes. C does not make this distinction, and as a result, any code using a mutex in C can trivially produce serious, potentially exploitable, bugs.

In the rest of this post I’ll walk through a typical C mutex API, compare with a typical Rust mutex API, and look at what happens if we change the Rust API to resemble C in various ways.

On Hubris And Humility

Last week I gave a talk at the Open Source Firmware Conference about some of the work I’m doing at Oxide Computer, entitled On Hubris and Humility. There is a video of the talk if you’d like to watch it in video form. It came out pretty alright!

The conference version of the talk has a constantly animated background that makes the video hard for some people to watch. OSFC doesn’t appear to be bothering with either captions or transcripts, so my friends who don’t hear as well as I do (or just don’t want to turn their speakers on!) are kind of out of luck.

And so, here’s a transcript with my slides inlined. The words may not exactly match the audio because this is written from my speaker’s notes. And, yes, my slides are all character art. The browser rendering is imperfect.

I’ve also written an epilogue, added after the initial response to the talk.

The First-Mover Allocator Pattern

(I’ve updated this pattern, since a lot has changed since 2020. The recommendations here should be ready for the Rust 2024 edition, and are closer to correct in a post-pointer-provenance world.)

Here’s another useful Rust pattern. Like the Typestate Pattern before it, I wrote this because I haven’t seen the sort of obsessively nerdy writeup that I wanted to read. And, as with the Typestate Pattern, I didn’t invent this — I’m merely documenting and generalizing it.

Let The Compiler Do The Work

(Series Overview)

In this series so far, we’ve taken a C program and converted it into a faster, smaller, and reasonably robust Rust program. The Rust program is a recognizable descendant of the C program, and that was deliberate: my goal was to compare and contrast the two languages for optimized code.

In this bonus section, I’ll walk through how we’d write the program from scratch in Rust. In particular, I’m going to rely on compiler auto-vectorization to produce a program that is shorter, simpler, portable, and significantly faster… and without any unsafe.
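
To give a flavor of the approach (a tiny sketch of my own, not the benchmark code from the series): written with iterators rather than indexing, a loop like this gives the optimizer everything it needs to vectorize in a release build, with no unsafe anywhere.

```rust
// With zipped iterators there are no per-element bounds checks, and the
// compiler is free to auto-vectorize this loop when optimizations are on.
pub fn scale_and_add(dst: &mut [f64], src: &[f64], k: f64) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s * k;
    }
}

fn main() {
    let mut dst = vec![1.0; 8];
    let src = vec![2.0; 8];
    scale_and_add(&mut dst, &src, 0.5);
    println!("{dst:?}");
}
```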

Can it be? Read on…

Making Safe Things From Unsafe Parts

(Series Overview)

In part 4 we took the unsafe code that deals with treating arrays of f64 as arrays of vectors, and we corralled it into a safe API.

In this installment, we’ll look at the remaining reasons why advance is an unsafe fn, and make it safe — not by removing all the unsafe, but by narrowing it down.
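
In miniature, that “narrowing” move looks something like this (a generic sketch of my own, not the series code): the function’s precondition is checked up front, and the unsafe shrinks to a single, well-commented expression.

```rust
fn as_chunks_of_4(xs: &[f64]) -> &[[f64; 4]] {
    // The precondition is enforced here, so callers can't violate it.
    assert!(xs.len() % 4 == 0, "length must be a multiple of 4");
    // SAFETY: the length is a multiple of 4, and `[f64; 4]` has the same
    // layout and alignment as four consecutive `f64`s.
    unsafe { core::slice::from_raw_parts(xs.as_ptr().cast(), xs.len() / 4) }
}

fn main() {
    let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    let chunks = as_chunks_of_4(&data);
    println!("{chunks:?}");
}
```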

This one’s a doozy — the remaining changes to advance are hard to separate, so I’ve packed them all into one section. Now is probably a good time to refill your coffee.

A More Perfect Union

(Series Overview)

In part 3 we found that our use of uninitialized memory was a premature optimization that didn’t actually improve performance. This left us with only one remaining unsafe function, but, boy, is it a doozy.

In this part, I’ll begin the process of corralling its unsafe optimizations into more clearly safe code, by replacing arbitrary pointer casting with a lightweight abstraction.
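
One way to picture a lightweight abstraction of that kind (my own generic sketch, not the series code) is a small union type that becomes the only place where reinterpretation happens; everything else calls a safe function:

```rust
// The union is the single point where bytes get reinterpreted.
#[repr(C)]
union Pun {
    float: f64,
    bits: u64,
}

fn float_bits(x: f64) -> u64 {
    let p = Pun { float: x };
    // SAFETY: both fields are the same size, and every bit pattern is a
    // valid u64, so reading the other field is defined.
    unsafe { p.bits }
}

fn main() {
    println!("{:#018x}", float_bits(1.0));
}
```

(For this particular conversion, std already provides f64::to_bits; the point here is only the shape of the abstraction.)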

Measure What You Optimize

(Series Overview)

In part 2 we introduced Rust references, and this was enough to convert one of our inner functions into safe Rust.

The others are still unsafe. There are several reasons for this. In this, the briefest of sections, we’ll tackle the easiest one: deliberate use of uninitialized memory.
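
To make that concrete (a sketch of my own, not the series code), here’s the uninitialized-buffer trick in miniature next to the plain safe version; for a buffer this small, the difference is usually too small to measure.

```rust
use std::mem::MaybeUninit;

// The "optimization": skip zero-initialization, at the cost of unsafe.
fn filled_uninit() -> [f64; 8] {
    let mut buf = [MaybeUninit::<f64>::uninit(); 8];
    for (i, slot) in buf.iter_mut().enumerate() {
        slot.write(i as f64);
    }
    // SAFETY: every element was written in the loop above.
    unsafe { std::mem::transmute::<_, [f64; 8]>(buf) }
}

// The safe version: initialize first, then overwrite.
fn filled_safe() -> [f64; 8] {
    let mut buf = [0.0; 8];
    for (i, slot) in buf.iter_mut().enumerate() {
        *slot = i as f64;
    }
    buf
}

fn main() {
    assert_eq!(filled_uninit(), filled_safe());
    println!("buffers match");
}
```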

References Available Upon Request

(Series Overview)

In the first part of this tutorial we took an optimized C program and translated it to an equivalent Rust program, complete with all the unsafe weirdness of the original: uninitialized variables, pointer casting and arithmetic, etc.

In this section, we’ll begin using Rust’s features to make the program incrementally more robust, while keeping performance unchanged.

Specifically, we’ll begin by introducing references.
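
A miniature version of that change (a generic sketch of my own, not the actual series code): the raw-pointer form leaves the length and validity contract entirely to the caller, while the reference form states the same contract in the signature and lets the compiler enforce it.

```rust
// Raw-pointer version: the compiler can't see how long the buffer is, so
// the whole contract rests on the caller.
unsafe fn scale_raw(v: *mut f64, n: usize, k: f64) {
    for i in 0..n {
        unsafe { *v.add(i) *= k };
    }
}

// Reference version: same operation, contract encoded in the signature.
fn scale(v: &mut [f64], k: f64) {
    for x in v {
        *x *= k;
    }
}

fn main() {
    let mut data = [1.0, 2.0, 3.0];
    unsafe { scale_raw(data.as_mut_ptr(), data.len(), 2.0) };
    scale(&mut data, 0.5);
    println!("{data:?}");
}
```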

You Can't Write C in Just Any Ol' Language

(Series Overview)

In this part of the series, we’ll take a grungy optimized C program and translate it, fairly literally, into a grungy optimized unsafe Rust program. It’ll get the same results, with the same performance, as the original.

Making really tiny WebAssembly graphics demos

I’ve been studying WebAssembly recently, which has included porting some of my m4vga graphics demos. I started with the Rust and WebAssembly Tutorial, which has you use fancy tools like wasm-pack, wasm-bindgen, webpack, and npm to produce a Rust-powered webpage.

And that’s great! But I want to know how things actually work, and those tools put a lot of code between me and the machine.

In this post, I’ll show how to create a simple web graphics demo using none of those tools — just hand-written Rust, JavaScript, and HTML. There will be no libraries between our code and the platform. It’s the web equivalent of bare metal programming!

The resulting WebAssembly module will be less than 300 bytes. That’s about the same size as the previous paragraph.
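
For a sense of what “no libraries between our code and the platform” means on the Rust side, a module in the same spirit can be as small as this (my own sketch, not the demo from the post; it assumes a cdylib crate built for the wasm32-unknown-unknown target, and pixel_color is just a made-up export):

```rust
// Build as a `cdylib` crate, e.g.:
//   cargo build --target wasm32-unknown-unknown --release
#![no_std]

// One exported function, no wasm-bindgen, no generated JavaScript glue.
#[no_mangle]
pub extern "C" fn pixel_color(x: u32, y: u32) -> u32 {
    // An XOR pattern: a classic tiny-demo effect.
    (x ^ y) & 0xff
}

// A no_std crate must supply its own panic handler; this one just spins.
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
```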

The Typestate Pattern in Rust

The typestate pattern is an API design pattern that encodes information about an object’s run-time state in its compile-time type. In particular, an API using the typestate pattern will have:

  1. Operations on an object (such as methods or functions) that are only available when the object is in certain states,

  2. A way of encoding these states at the type level, such that attempts to use the operations in the wrong state fail to compile,

  3. State transition operations (methods or functions) that change the type-level state of objects in addition to, or instead of, changing run-time dynamic state, such that the operations in the previous state are no longer possible.
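
Here’s a minimal sketch of all three properties together (the Connection type and its Open/Closed states are hypothetical, purely for illustration):

```rust
use std::marker::PhantomData;

// Type-level states: these types exist only at compile time.
struct Closed;
struct Open;

// The connection's current state is part of its type.
struct Connection<State> {
    _state: PhantomData<State>,
}

impl Connection<Closed> {
    fn new() -> Self {
        Connection { _state: PhantomData }
    }

    // Property 3: a state transition consumes the Closed connection and
    // returns an Open one, so the old state's operations go away.
    fn open(self) -> Connection<Open> {
        Connection { _state: PhantomData }
    }
}

impl Connection<Open> {
    // Property 1: this operation is only available while Open.
    fn send(&self, _bytes: &[u8]) { /* ... */ }
}

fn main() {
    let conn = Connection::new();
    // conn.send(b"hello");      // property 2: this line does not compile
    let conn = conn.open();
    conn.send(b"hello");
}
```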

This is useful because:

  • It moves certain types of errors from run-time to compile-time, giving programmers faster feedback.
  • It interacts nicely with IDEs, which can avoid suggesting operations that are illegal in a certain state.
  • It can eliminate run-time checks, making code faster/smaller.

This pattern is so easy in Rust that it’s almost obvious, to the point that you may have already written code that uses it, perhaps without realizing it. Interestingly, it’s very difficult to implement in most other programming languages — most of them fail to satisfy items 2 and/or 3 above.

I haven’t seen a detailed examination of the nuances of this pattern, so here’s my contribution.

Rewriting m4vgalib in Rust

If this isn’t your first time visiting my blog, you may recall that I’ve spent the past several years building an elaborate microcontroller graphics demo using C++.

Over the past few months, I’ve been rewriting it — in Rust.

This is an interesting test case for Rust, because we’re very much in C/C++’s home court here: the demo runs on the bare metal, without an operating system, and is very sensitive to both CPU timing and memory usage.

The results so far? The Rust implementation is simpler, shorter (in lines of code), faster, and smaller (in bytes of Flash) than my heavily-optimized C++ version — and because it’s almost entirely safe code, several types of bugs that I fought regularly, such as race conditions and dangling pointers, are now caught by the compiler.

It’s fantastic. Read on for my notes on the process.

Prefer Rust to C/C++ for new code.

This is a position paper that I originally circulated inside the firmware community at X. I’ve gotten requests for a public link, so I’ve cleaned it up and posted it here. This is, obviously, my personal opinion. Please read the whole thing before sending me angry emails.

tl;dr: C/C++ have enough design flaws, and the alternative tools are in good enough shape, that I do not recommend using C/C++ for new development except in extenuating circumstances. In situations where you actually need the power of C/C++, use Rust instead. In other situations, you shouldn’t have been using C/C++ anyway — use nearly anything else.

m4vga

You can now view these demos in your browser!

m4vga is a technique/library for hacking the STM32F407 to generate high-quality analog color video signals with just a handful of resistors.

I wrote the C++ version between 2012 and 2015, and rewrote it in Rust in 2019 to put my money where my mouth is.

I did this because it was an immense technical challenge. Read on for details, including links to a series of blog posts I wrote examining the code in detail.

Learn Rust the Dangerous Way

LRtDW is a series of articles putting Rust features in context for low-level C programmers who maybe don’t have a formal CS background — the sort of people who work on firmware, game engines, OS kernels, and the like. Basically, people like me.

I’ve added Rust to my toolbelt, and I hope to get you excited enough to do the same.

  1. Why Learn Rust the Dangerous Way? Introduction and ground rules.

  2. You can’t write C in just any ol’ language: translating a grungy optimized C program into grungy optimized unsafe Rust.

  3. References available upon request: how Rust references are different from pointers, how they are the same, and why we care.

  4. Measure what you optimize: taking a hard look at an optimization based on uninitialized memory, and converting it to safe code that’s just as fast.

  5. A more perfect union: considering alternatives to pointer casting, and how to write safe wrappers for unsafe operations.

  6. Making safe things from unsafe parts: finally converting most of the program to safe code, and making sure that the unsafe bits are safe-ish.

  7. Let the compiler do the work: a bonus section that looks at how we’d write the program idiomatically in native Rust, and rely on auto-vectorization to make it fast.

lilos


Now that Hubris has gotten some attention, people sometimes ask me if my personal projects are powered by Hubris.

The answer is: no, in general, they are not. My personal projects use my other operating system, lilos, which predates Hubris and takes a fundamentally different approach. It has dramatically lower resource requirements and allows more styles of concurrency.