Blog

Crash recovery in 256 bytes

(This continues my series of posts on the exhubris tools I’m building, to enable more people to use Hubris in their embedded systems.)

One of Hubris’s strongest features is its ability to handle crashes in drivers and other application logic. It leaves the specific crash handling behavior up to the application programmer through a mechanism called a supervisor.

In this post I’ll look at why I made this decision, how it works in practice, and walk through the exhubris supervisor reference implementation, minisuper. (Spoiler: it’s very small.)

Revisiting Hubris appconfigs

So in my day-job over at Oxide we’ve built this nice embedded operating system called Hubris. If you follow my blog, you’re probably aware of it.

I also build a lot of embedded electronics outside my day-job, and people sometimes ask me (often excitedly!) if they’re using Hubris.

The answer so far is “no.” This is for a variety of reasons, but probably the biggest: it’s actually quite difficult to use Hubris for anything if you don’t want your code to live in the Oxide Hubris repo!

I would like to fix this, to enable other teams to use Hubris without having to coordinate with Oxide (or even publish their source code!). I’m starting by trying to address the needs of a single friendly customer: me.

As of this week I have it working, in a set of tools I call exhubris. It’s not by any means done (or all that pleasant to use). I’m going to write some posts about it, to help me think through the design process, and (more importantly!) to solicit feedback from my readers on where they think things should go.

This first post starts with the part of Hubris most users encounter first: the application configuration file, or appconfig.

From Hubris To Bits

The embedded platform we’ve built for firmware at Oxide is called Hubris. It’s unusual for a microcontroller operating system, and probably the biggest thing that makes it unusual is its use of separately-compiled tasks.

Most firmware applications mash all of their task and OS code together in a common memory space, which is simple and efficient, but can lead to subtle bugs. Hubris instead places bits of code in their own isolated memory areas, where they can run (and crash!) separately. This requires that each bit be compiled as a standalone program.

The CPUs we target don’t have virtual memory, so each of these separate programs has to be laid out at a known place in the address space. This introduces some challenges, and has prevented us from “just” using an off-the-shelf build system.

This post will walk through the process of building a Hubris application from source, from the perspective of the build system, and examine some of these challenges and how we addressed them.

Putting custom firmware on the WASD CODE v2

I have a WASD CODE v2 tenkeyless keyboard, which has been my daily driver for work since about 2017. It’s a great keyboard… mechanically. But its control electronics are fixed-function and don’t quite work the way I want — to say nothing about fancy features like additional key layers.

So I fixed it. Now it runs QMK. (Or some bare-metal custom Rust firmware.) This has let me…

  • Try out QMK without the expense (or waste!) of buying a whole new keyboard.
  • Add features that were missing in the original, like USB n-key rollover (NKRO), 1 kHz polling, backlight fade effects, additional key layers, and key macros.
  • Customize the part of the computer I interact with the most!

You can fix your keyboard, too:

(If you’re from the rest of the world and would like one, contact me! I have to set up each country separately, and I haven’t had time to do all two-hundred-and-some-odd.)

If it’s out of stock on either store, join the waitlist and I’ll make more.

A green circuit board nestled between a keyboard’s arrow keys and its insert/delete cluster.

The user manual is pretty basic, but I’ll flesh it out further if the boards sell!

Read on below for more background.

lilos v1.0 released

After five years of development, something like seven art projects, one commercial product, and many changes to the dark corners of the Rust language, I’ve decided lilos is ready for a 1.0 release!

Some parts I’m excited about include:

  • As of this release, the lilos APIs are entirely cancellation-safe.

  • This release contains contributions from five other people, bringing the total number of contributors to seven! (Want to be number eight? Come say hi!)

  • Thanks to one of those contributors, the operating system tests are now running in CI on QEMU!

(For anyone who’s new, lilos is a tiny embedded operating system that uses Rust async to allow complex multitasking on very limited microcontrollers without requiring dynamic memory allocation. Read more about lilos on my project page, where I link to the docs and provide a curated collection of blog posts on the topic.)

See the release notes if you’re curious about what’s changed. If you’ve got firmware written for an earlier version of lilos (particularly the 0.3.x series) and would like to update (you don’t have to!), those release notes will guide you through the process. There have been some breaking API changes, but I promise they’re all improvements.