The embedded platform we’ve built for firmware at Oxide is called Hubris.
It’s unusual for a microcontroller operating system, and probably the biggest
thing that makes it unusual is its use of separately compiled tasks.
Most firmware applications mash all of their task and OS code together in a
common memory space, which is simple and efficient, but can lead to subtle bugs.
Hubris instead places bits of code in their own isolated memory areas, where
they can run (and crash!) separately. This requires that each bit be compiled as
a standalone program.
The CPUs we target don’t have virtual memory, so each of these separate programs
has to be laid out at a known place in the address space. This introduces some
challenges, and has prevented us from “just” using an off-the-shelf build
system.
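To make the layout problem concrete, here is a minimal sketch of the sort of bookkeeping a build system has to do when every program needs a fixed address. It is not Hubris's actual build code: the `TaskRequest` and `Placement` types, the `place_tasks` function, and the alignment parameter are all hypothetical, and the sketch assumes a single contiguous region filled front to back.

```rust
/// A task image that needs a home in the address space.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
pub struct TaskRequest {
    pub name: &'static str,
    pub size: u32,
}

/// The fixed address range assigned to one task.
#[derive(Debug, Clone, PartialEq)]
pub struct Placement {
    pub name: &'static str,
    pub base: u32,
    pub size: u32,
}

/// Round `addr` up to the next multiple of `align`, which must be a
/// power of two. Returns `None` if the result would overflow `u32`.
fn align_up(addr: u32, align: u32) -> Option<u32> {
    let mask = align - 1;
    addr.checked_add(mask).map(|a| a & !mask)
}

/// Assign each task a non-overlapping, aligned range inside one region,
/// in the order given. Fails if any task does not fit.
pub fn place_tasks(
    region_base: u32,
    region_size: u32,
    align: u32,
    tasks: &[TaskRequest],
) -> Result<Vec<Placement>, String> {
    if !align.is_power_of_two() {
        return Err(format!("alignment {:#x} is not a power of two", align));
    }
    let region_end = region_base
        .checked_add(region_size)
        .ok_or_else(|| "region wraps the address space".to_string())?;

    let mut cursor = region_base;
    let mut placements = Vec::with_capacity(tasks.len());
    for task in tasks {
        let base = align_up(cursor, align)
            .ok_or_else(|| format!("address overflow placing {}", task.name))?;
        let end = base
            .checked_add(task.size)
            .ok_or_else(|| format!("address overflow placing {}", task.name))?;
        if end > region_end {
            return Err(format!("{} does not fit in the region", task.name));
        }
        placements.push(Placement { name: task.name, base, size: task.size });
        cursor = end;
    }
    Ok(placements)
}
```

Even this toy version shows the circular dependency: a task's size is only known after it has been compiled and linked, but linking it requires knowing where it will live.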
This post will walk through the process of building a Hubris application from
source, from the perspective of the build system, and examine some of these
challenges and how we addressed them.
I’m continuing to reflect on the past four years with Hubris — April Fool’s
Day was, appropriately enough, the fourth anniversary of the first Hubris user
program, and today is the fourth anniversary of the first kernel code. (I wrote
the user program first to help me understand what the kernel’s API wanted to
look like.)
Of all of Hubris’s design decisions, there’s one that gets a “wait what”
response more often than any other. It’s also proving to be a critical part of
the system’s overall robustness. In this post, I’ll take a look at our 13th and
oddest syscall, REPLY_FAULT.
We found a neat bug in Hubris this week. Like many bugs, it wasn’t a bug when
it was originally written — correct code became a bug as other things
changed around it.
I thought the bug itself, and the process of finding and fixing it, provided an
interesting window into our development process around Hubris. It’s very rare
for us to find a bug in the Hubris kernel, mostly because it’s so small. So I
jumped at the opportunity to write this one down.
This is a tale of how two features, each useful on its own, can combine to
become a bug. Read on for details.