#embedded

From Hubris To Bits

The embedded platform we’ve built for firmware at Oxide is called Hubris. It’s unusual for a microcontroller operating system, and probably the biggest thing that makes it unusual is its use of separately-compiled tasks.

Most firmware applications mash all of their task and OS code together in a common memory space, which is simple and efficient, but can lead to subtle bugs. Hubris instead places bits of code in their own isolated memory areas, where they can run (and crash!) separately. This requires that each bit be compiled as a standalone program.

The CPUs we target don’t have virtual memory, so each of these separate programs has to be laid out at a known place in the address space. This introduces some challenges, and has prevented us from “just” using an off-the-shelf build system.
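To make the layout problem concrete, here is a minimal sketch of assigning each separately compiled task a fixed, non-overlapping region of flash. It is not the Hubris build system's actual algorithm. The names `TaskRequest`, `Placement`, and `place_tasks` are hypothetical, and the rule that each region is a power of two in size and aligned to that size is an assumption modeled on the ARMv7-M memory protection unit.

```rust
/// Hypothetical description of one task: a name and the flash it needs.
pub struct TaskRequest {
    pub name: &'static str,
    pub size: u32,
}

/// Where a task was placed in the address space.
#[derive(Debug, PartialEq, Eq)]
pub struct Placement {
    pub name: &'static str,
    pub base: u32,
    pub size: u32,
}

/// Assigns each task a fixed region inside `[flash_base, flash_base + flash_size)`.
///
/// Assumption: regions are rounded up to a power of two (minimum 32 bytes)
/// and aligned to their own size, as an ARMv7-M style MPU requires.
/// Returns `None` if the tasks do not fit or an address overflows.
pub fn place_tasks(
    flash_base: u32,
    flash_size: u32,
    tasks: &[TaskRequest],
) -> Option<Vec<Placement>> {
    let end = flash_base.checked_add(flash_size)?;
    let mut cursor = flash_base;
    let mut placements = Vec::with_capacity(tasks.len());

    for task in tasks {
        // Round the requested size up to a legal region size.
        let size = task.size.max(32).checked_next_power_of_two()?;
        // Align the next free address up to a multiple of that size.
        let base = cursor.checked_add(size - 1)? & !(size - 1);
        let next = base.checked_add(size)?;
        if next > end {
            return None;
        }
        placements.push(Placement { name: task.name, base, size });
        cursor = next;
    }
    Some(placements)
}
```

Placing tasks in declaration order like this wastes space in the alignment gaps. A real allocator would probably sort or pack regions, which is part of why an off-the-shelf build system is not enough.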

This post will walk through the process of building a Hubris application from source, from the perspective of the build system, and examine some of these challenges and how we addressed them.

Putting custom firmware on the WASD CODE v2

I have a WASD CODE v2 tenkeyless keyboard, which has been my daily driver for work since about 2017. It’s a great keyboard… mechanically. But its control electronics are fixed-function and don’t quite work the way I want — to say nothing about fancy features like additional key layers.

So I fixed it. Now it runs QMK. (Or some bare-metal custom Rust firmware.) This has let me…

  • Try out QMK without the expense (or waste!) of buying a whole new keyboard.
  • Add features that were missing in the original, like USB n-key rollover (NKRO), 1 kHz polling, backlight fade effects, additional key layers, and key macros.
  • Customize the part of the computer I interact with the most!

You can fix your keyboard, too:

(If you’re from the rest of the world and would like one, contact me! I have to set up each country separately, and I haven’t had time to do all two-hundred-and-some-odd.)

If it’s out of stock on either store, join the waitlist and I’ll make more.

A green circuit board nestled between a keyboard’s arrow keys and its insert/delete cluster.

The user manual is pretty basic, but I’ll flesh it out further if the boards sell!

Read on below for more background.

lilos v1.0 released

After five years of development, something like seven art projects, one commercial product, and many changes to the dark corners of the Rust language, I’ve decided lilos is ready for a 1.0 release!

Some parts I’m excited about include:

  • As of this release, the lilos APIs are entirely cancellation-safe.

  • This release contains contributions from five other people, bringing the total number of contributors to seven! (Want to be number eight? Come say hi!)

  • Thanks to one of those contributors, the operating system tests are now running in CI on QEMU!

(For anyone who’s new, lilos is a tiny embedded operating system that uses Rust async to allow complex multitasking on very limited microcontrollers without requiring dynamic memory allocation. Read more about lilos on my project page, where I link to the docs and provide a curated collection of blog posts on the topic.)

See the release notes if you’re curious about what’s changed. If you’ve got firmware written for an earlier version of lilos (particularly the 0.3.x series) and would like to update (you don’t have to!), those release notes will guide you through the process. There have been some breaking API changes, but I promise they’re all improvements.

The server chose violence

I’m continuing to reflect on the past four years with Hubris — April Fool’s Day was, appropriately enough, the fourth anniversary of the first Hubris user program, and today is the fourth anniversary of the first kernel code. (I wrote the user program first to help me understand what the kernel’s API wanted to look like.)

Of all of Hubris’s design decisions, there’s one that gets a “wait what” response more often than any other. It’s also proving to be a critical part of the system’s overall robustness. In this post, I’ll take a look at our 13th and oddest syscall, REPLY_FAULT.

Who killed the network switch?

We found a neat bug in Hubris this week. Like many bugs, it wasn’t a bug when it was originally written — correct code became a bug as other things changed around it.

I thought the bug itself, and the process of finding and fixing it, provided an interesting window into our development process around Hubris. It’s very rare for us to find a bug in the Hubris kernel, mostly because it’s so small. So I jumped at the opportunity to write this one down.

This is a tale of how two features, each useful on its own, can combine to become a bug. Read on for details.

I made a thing to help you make a thing out of a keypad

I do a lot of electronics projects in my spare time, and I try to make reusable parts to save myself effort in the future. Because I have to order ingredients in certain quantities, I often wind up with more than I need for my project.

So I’ve opened a Tindie store, called Overengineered Widget Laboratories. Right now there’s one product in the store, called Keypad:GO. See, I built a sculpture last summer that needed to interact with people through a phone-style keypad. The keypad interface part of it seemed like something other people could use, so I made a few extras. This is a very easy way to interface a keypad or small keyboard to an embedded electronics project, because it handles all the basics for you — matrix scanning, debouncing, key matrix collisions, etc.

It will also help you reverse engineer the keypad’s circuit, because often cheap keypads arrive without good documentation. In the tiny flash of the embedded microcontroller, I’ve packed a setup wizard that will walk you through the process of setting up the keypad of your choice. All you need is a terminal program. This is honestly my favorite part, and I demonstrate it in the video below.
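For readers unfamiliar with debouncing, one of the basics mentioned above, here is a minimal sketch of a counter-based debouncer for a single key. It is not the Keypad:GO firmware. The `Debouncer` type and its threshold of five samples are assumptions chosen for illustration.

```rust
/// Debounces one key: the reported state changes only after
/// `THRESHOLD` consecutive raw samples disagree with it.
pub struct Debouncer {
    stable: bool, // last debounced state (true = pressed)
    count: u8,    // consecutive samples that differ from `stable`
}

impl Debouncer {
    /// Illustrative value; a real design would tune this to the scan rate.
    pub const THRESHOLD: u8 = 5;

    pub const fn new() -> Self {
        Self { stable: false, count: 0 }
    }

    /// Feeds one raw sample from the key matrix.
    /// Returns `Some(new_state)` only when the debounced state changes.
    pub fn update(&mut self, raw: bool) -> Option<bool> {
        if raw == self.stable {
            // The contact agrees with the stable state, so any bounce is over.
            self.count = 0;
            return None;
        }
        self.count += 1;
        if self.count >= Self::THRESHOLD {
            self.stable = raw;
            self.count = 0;
            Some(raw)
        } else {
            None
        }
    }
}
```

Calling `update` once per matrix scan for each key turns a noisy contact into clean press and release events. A single stray sample resets the counter, so short glitches never reach the application.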

An STM32 WFI bug

I really like the STM32 series of microcontrollers. They’re generally quite reliable, the peripherals are well tested, and more often than not I can just grab one off the shelf and not think about it too much.

However, like every microcontroller, they do contain implementation bugs, so it’s always important to read the “Errata Sheet” (or in ST’s language, “Device Limitations”) when you’re using a part.

I appear to have hit an implementation bug in certain STM32 lines that is not listed in the errata sheet. I can’t find any specific description of this bug on the internet, so I’ve attempted to nail one down. Hopefully this will come up in the search results for someone who hits this in the future and save them some time.

Mutex without lock, Queue without push: cancel safety in lilos

I’m trying to do something kind of unusual with lilos: in addition to almost all the APIs being safe in the Rust sense, I’m also attempting to create an entire system API that is cancel-safe. I’ve written a lot about Rust’s async feature and its notion of cancellation recently, such as my suggestion for reframing how we think about async/await.

My thoughts on this actually stem from my early work on lilos, where I started beating the drum of cancel-safety back in 2020. My notion of what it means to be cancel-safe has gotten more nuanced since then, and I’ve recently made the latest batch of changes to try to help applications built on lilos be more robust by default.

So, wanna nerd out about async API design and robustness? I know you do.

Composing concurrency in drivers

I recently published an article suggesting a different way of looking at async and await in Rust. In it, I discussed strategies for implementing state machines, and explained why I like async as a tool for building such state machines, even without threads.

In this post I’ll work through an example of why I’m so excited about this technique, by building a real driver for a notoriously tricky bus one piece at a time, using lilos.

Rewriting m4vgalib in Rust

If this isn’t your first time visiting my blog, you may recall that I’ve spent the past several years building an elaborate microcontroller graphics demo using C++.

Over the past few months, I’ve been rewriting it — in Rust.

This is an interesting test case for Rust, because we’re very much in C/C++’s home court here: the demo runs on the bare metal, without an operating system, and is very sensitive to both CPU timing and memory usage.

The results so far? The Rust implementation is simpler, shorter (in lines of code), faster, and smaller (in bytes of Flash) than my heavily-optimized C++ version — and because it’s almost entirely safe code, several types of bugs that I fought regularly, such as race conditions and dangling pointers, are now caught by the compiler.

It’s fantastic. Read on for my notes on the process.

Prefer Rust to C/C++ for new code.

This is a position paper that I originally circulated inside the firmware community at X. I’ve gotten requests for a public link, so I’ve cleaned it up and posted it here. This is, obviously, my personal opinion. Please read the whole thing before sending me angry emails.

tl;dr: C/C++ have enough design flaws, and the alternative tools are in good enough shape, that I do not recommend using C/C++ for new development except in extenuating circumstances. In situations where you actually need the power of C/C++, use Rust instead. In other situations, you shouldn’t have been using C/C++ anyway — use nearly anything else.

Racing the Beam

This post is the fourth in a series looking at the design and implementation of my Glitch demo and the m4vgalib code that powers it.

In part three we took a deep dive into the STM32F407’s internal architecture, and looked at how to sustain the high-bandwidth flow that we set up in part two.

Great, so we have pixels streaming from RAM at a predictable rate — but we don’t have enough RAM to hold an entire frame’s worth of 8-bit pixels! What to do?

Why, we generate the pixels as they’re needed, of course! But that’s easier said than done: generate them how, and from what?

In this article, I’ll take a look at m4vgalib’s answer to these questions: the rasterizer.
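To put rough numbers on the RAM problem described above, here is a small sketch of the arithmetic. The 800x600 display mode and the 192 KiB total SRAM figure are assumptions for illustration. Neither is stated in this excerpt, and the names below are hypothetical.

```rust
/// Assumed total SRAM on the microcontroller, in bytes (192 KiB).
pub const ASSUMED_SRAM_BYTES: u32 = 192 * 1024;

/// Bytes needed to store one full frame at the given size and depth.
pub const fn framebuffer_bytes(width: u32, height: u32, bits_per_pixel: u32) -> u32 {
    width * height * bits_per_pixel / 8
}

/// How many complete scanlines of 8-bit pixels fit in the assumed SRAM,
/// ignoring everything else the program needs to keep in RAM.
pub const fn scanlines_that_fit(width: u32) -> u32 {
    ASSUMED_SRAM_BYTES / width
}
```

Under these assumptions, `framebuffer_bytes(800, 600, 8)` is 480,000 bytes against 196,608 bytes of RAM, so fewer than half of the 600 scanlines could be stored even if nothing else used memory. That gap is what pushes the design toward generating pixels as they are needed.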

A Glitch in the Matrix

This post is the third in a series looking at the design and implementation of my Glitch demo and the m4vgalib code that powers it.

In part two, I showed a fast way to push pixels out of an STM32F407 by getting the DMA controller to run at top speed. I described the mode as follows:

It just runs full-tilt, restricted only by the speed of the “memory” [or memory-mapped peripheral] at either side…

But there’s a weakness in this approach, which can introduce jitter and hurt your video quality. I hinted at it in a footnote:

…and traffic on the AHB matrix, which is very important — I’ll come back to this.

Quite a bit of m4vgalib’s design is dedicated to coordinating matrix traffic, while imposing few restrictions on the application. In this article, with a minimum of movie puns, I’ll explain what that means and how I achieved it.

Pushing Pixels

This post is the second in a series looking at the design and implementation of my Glitch demo and the m4vgalib code that powers it.

Updated 2015-06-10: clarifications from reader feedback.

For the first technical part in the series, I’d like to start from the very end: getting the finished pixels out of the microprocessor and off to a display.

Why start from the end? Because it’s where I started in my initial experiments, and because my decisions here had significant effects on the shape of the rest of the system.

Introducing Glitch

Hey, look! I made a little graphics demo!

Introducing swddude

I love the ARM Cortex-M series of microcontrollers. The sheer computational power they pack into a teensy, low-power package is almost embarrassing.

But, many Cortex-M parts are small — 4x4 millimeters small — and don’t have the pins left over for JTAG. For these parts, ARM introduced a new debug interface, called SWD.

Unfortunately, SWD isn’t well-supported by open-source tools. Support is in progress in most of them — including my personal favorite, OpenOCD — but I’ve had bad luck so far.

Anton Staaf was having the same issue, and decided to do something about it. He tricked the cheap, commonly-available FTDI FT232H chip into speaking the line-level SWD protocol. We’ve teamed up and, a week or so later, have something to show for it.

m4vga

You can now view these demos in your browser!

m4vga is a technique/library for hacking the STM32F407 to generate high-quality analog color video signals with just a handful of resistors.

I wrote the C++ version between 2012 and 2015, and rewrote it in Rust in 2019 to put my money where my mouth is.

I did this because it was an immense technical challenge. Read on for details, including links to a series of blog posts I wrote examining the code in detail.

lilos

Now that Hubris has gotten some attention, people sometimes ask me if my personal projects are powered by Hubris.

The answer is: no, in general, they are not. My personal projects use my other operating system, lilos, which predates Hubris and takes a fundamentally different approach. It has dramatically lower resource requirements and allows more styles of concurrency.