- What problem are we trying to solve?
- The pattern
- Variation: uninitialized memory
- FAQ / Appendix
Here’s another useful Rust pattern. Like the Typestate Pattern before it, I wrote this because I haven’t seen the sort of obsessively nerdy writeup that I wanted to read. And, as with the Typestate Pattern, I didn’t invent this — I’m merely documenting and generalizing it.
What problem are we trying to solve?
Your program uses a buffer or other chunk of memory. You would prefer not to allocate this memory from the heap — perhaps because your application doesn’t have a heap, or because you want to get any out-of-memory errors at link time, instead of at run-time. You would also prefer to interact with this memory from safe code.
First attempt: as we would write it in C
You can declare a block of memory, fixed in location at link time and independent from the heap, in Rust using essentially the same technique you’d use in C:
static mut BUFFER: = ;
If you’re targeting an embedded device with limited RAM, and you try to declare more of these than will fit in memory, you’ll get a failure at link time. This is good! Better to discover you’re out of memory where your customers can’t see.
However, access to
static mut is unsafe in Rust. This means every time you
want to interact with this buffer, you’re stuck doing something like this:
The simplest approach will scatter little
unsafe blocks all over the codebase,
making them very difficult to reason about!
unsafe is problematic in part
because it destroys your ability to reason locally about code, so it’s best to
keep all the
unsafe related to a thing in one place.
“Well,” you might say, “no problem, I will extract a function and not repeat the
This has made things worse: you’ve replaced code that is
unsafe with code that
is unsound! Off the top of my head, this function can be used to break two of
Rust’s safety rules:
You can get a reference to
&BUFFERand, while holding it, call
write_buffer— altering the contents of the reference that was supposed to be immutable. This is some serious undefined behavior.
Every call to
write_bufferis a potential data race; nothing prevents you from hammering the buffer from multiple threads (or from prioritized interrupt service routines). Also undefined behavior.
There might be others, but that’s enough for me to be wary!
Second attempt: using module visibility
You can bound these risks by isolating
BUFFER in its own module with accessor
routines, so that only the accessors can see
BUFFER. This way, you can reason
about how many sites in the code might create a
&mut BUFFER or
ensure that they never overlap. (Note that this does not fix the data races.
// Slightly safer if this is the ENTIRE MODULE'S CODE. Adding more code to the // module can break soundness. static mut BUFFER: = ; // End of module, so no other code can directly see `BUFFER`
This approach mostly works for a buffer of simple
u8, as long as you’re happy
with a limited set of encapsulated operations. Each function references the
static mut in a small scope, and references don’t escape. It looks like you
can’t possibly have aliasing references… until you consider threads or ISRs.
If it’s possible for
write to be executed concurrently, we’re back in data
race land. Because nothing in the code prevents this, we’re still unsound: we
have created a bug that is waiting to happen later1.
If you’re still happy with a limited accessor API, you can actually
fix the data races by using
AtomicU8 instead of
u8 — but that winds
up being a different solution. You can declare an array of
AtomicU8 as a
mut) and avoid some headaches — but you can never treat
it as a
[u8], and that can be really limiting.
Breaking the second attempt: but I miss references
Reading and writing single bytes is technically enough I guess but is cramping
your style. You want to be able to get pointers to parts of the buffer, to
process stuff incrementally without keeping track of everything as a
you add this function:
This compiles, but only because you told the compiler not to think too hard
about it. You have created a vending machine that will hand out
references to a data structure willy-nilly. It’s incredibly easy to violate the
aliasing rules with this API:
let slice1 = get_write_slice; let slice2 = get_write_slice; // We are now in undefined behavior land
get_read_slice API that returns
mut) is unsound, because
it doesn’t stop another part of the code from sneaking behind your back and
We need a better approach, something that lets us manipulate references however we want without creating subtle and potentially catastrophic bugs.
This is the problem the First-Mover Allocator Pattern addresses.
The basic problem here, and part of the reason why
static mut is unsafe to
access in the first place, is that we can’t reason at compile-time about how
it will be accessed. It’s a global! Will all operations come from one thread, or
many? Are any of the operations reentrant, deliberately or accidentally?
The idea behind the First-Mover Allocator is to spend a small amount at run-time — one byte of RAM and a few instructions — to let us ignore the problem at compile-time. If that sounds like the opposite of what Rust usually tries to do, you’re right. Getting stuff right at compile time is great, but if you can’t, run-time is the second best place to get it right2.
The pattern works like this:
Ensure there is a single point in the program, both textually in code, and at run-time, where the static structure gets used directly. Manipulate it through references everywhere else.
Use a runtime check to detect hitting that point in code more than once, which would mean it isn’t really a single point.
Here is the simplest case, for the 1kiB buffer we saw in the previous section:
use ; /// Returns an exclusive reference to the static buffer to the first /// caller. There should not be a second caller, but if there is, they /// will panic.
This returns a
&mut to a static-scope buffer. The recipient of that
do with it as they wish — anything they can do in safe code is sound:
get_bufferreturns exactly one
BUFFER. No other references to
BUFFERexist, because the caller simply can’t see it — it’s confined to the function body. Were they to call
get_bufferagain, they would not get a new
&mut. (They would
panic.) Thus we avoid aliasing of the buffer.
&mutitself can’t be shared across threads (though it can be sent between threads if the buffer type is
Send), so we’re not concerned about data races.
The caller could re-borrow all or part of the buffer as non-exclusive references, which could then be shared across threads if the buffer’s type is
Sync— but being
Syncmeans that it prevents data races internally, so we’re still good.
It’s implied by some of the points above, but worth calling out explicitly:
why are the
static declarations inside the function body? They don’t
strictly have to be, but it comes back to that drum I can’t stop beating:
local reasoning. If the
static items were at module scope, and the module
were longer than this one function, then the possible places that could touch
BUFFER and violate our invariants gets larger. Putting them inside the
function means that any access has to appear in the function body, so if you
want to convince yourself that any accesses are done correctly, you don’t have
to read any other code.
There is actually another pattern that can be used to prove
properties of this at compile time. I need to devote a separate article to it,
but the basic idea is handing out tokens to threads that are not
using possession of such a token (i.e. passing it as an argument) as evidence
that the caller is confined to the associated thread. There is a simple
example of using this to detect whether API is being called from interrupt
handlers in my m4vga source.
If you’re writing Rust, and worrying about things like static memory allocation, you’re probably concerned about code size, performance, or both. The First-Mover Allocator moves a check from compile-time to run-time. How much does this hurt?
As a worst case: consider what we’re literally asking the machine to do.
In addition to our static buffer, we’ve allocated a single byte for bookkeeping. Assuming your buffer is of nontrivial size, that’s probably affordable.
get_buffer-style function is called exactly once in a working program. (It could also be called zero times, but in that case, the performance is fantastic, so I won’t consider it further.)
get_bufferperforms one atomic swap of a byte in memory, followed by a conditional branch. This is a handful of instructions.
Because it happens exactly once, the execution time of
get_bufferisn’t a huge concern — but the “atomic swap and conditional branch” code will also be reasonably quick on a normal CPU.
Note that the return value of
get_bufferis an array reference, not a slice. If it returned a slice (
&[u8]) then the caller would lose information about the length of the array, and would likely add a bunch of bounds checks that would be trivially removed when accessing the
staticdirectly. However, an array reference (
&[u8; 1024]) is just a pointer and carries length information, so at worst the caller only loses the link-time knowledge of which absolute address it will be accessing. This can cause lower performance code to be generated, particularly on register-poor architectures like x86, but hasn’t shown up in any of my measurements in practice.
As for the practical case: in my tests, the overhead is easy for the compiler to eliminate.
The allocator function
get_bufferis called in only one place in the code (in a working program), so there’s little reason not to inline it.
Once inlined, the caller is now directly accessing the
static— only in the ways
get_bufferdid — enabling the compiler to optimize access code as though it had been directly accessing the
So we’re left with an atomic swap and conditional branch. In my applications, I can afford this in exchange for greater safety and improved local reasoning.
Variation: uninitialized memory
When you declare a
static like we did above:
static mut BUFFER: = ;
you are declaring an initialized chunk of memory — here, initialized to zero, though you could initialize it to any value you like.
Static initialization isn’t magic, and on embedded systems, particularly bare
metal embedded systems, it’s a process you will need to think about. The normal
way for a
static like this to get initialized is during the runtime startup
phase of execution, which happens before the Rust
main is called. The runtime
r0 crate, but you could provide your own) is responsible for
filling out all initialized memory with predefined patterns, like the zeros we
Normally, this is what you want.
Not all RAM is initialized by this process. Stack is typically left uninitialized, as are heaps used by “real” allocators, and — critically for embedded systems — initialization typically only applies to a single contiguous RAM address range.
But what if your system has other kinds of RAM? In particular, consider the STM32 series, which (like many other microcontrollers) sport a “USB SRAM” that is essentially dedicated to the USB controller. This SRAM is in an entirely different section of address space, and has weird access rules, so we can’t expect the runtime to initialize it. If we want to place a buffer there, we’re on our own.
In the examples that follow, I will assume that the linker script you’re using
places anything assigned
link_section = ".usbsram" to the USB SRAM instead of
The first option is to pretend the problem doesn’t exist.
// la la la everything is fine PLEASE DON'T DO THIS static mut BUFFER: = ; // I know that BUFFER isn't *really* initialized // so I am going to do it here: unsafe // Now everything is OK right? ...right?
The problem here is that you’ve just lied to the compiler — you’ve declared
BUFFER as initialized even though it isn’t really. The compiler works hard on
your behalf, and makes use of all the information you give it to produce better
code. If some of that information is dishonest, the compiler will do the wrong
thing. In this case, the optimizer is very likely to notice that you’re
re-zeroing the buffer and remove your loop. Surprise!
We need to come clean, and admit to the compiler that anything we put in
.usbsram doesn’t really get initialized. It would be nice to be able to
express this at a
link_section level, but we can’t, at least not right now. We
can express it for each static item, though:
static mut BUFFER: = uninit;
There — an explicitly uninitialized static buffer. If you’ve read my Learn
Rust The Dangerous Way series, you’ll be no stranger to the
MaybeUninit type; when the solution to your problem involves
you know you’re doing the fun stuff.
To use this uninitialized buffer, we need a slightly more verbose version of
use MaybeUninit; use ; /// Returns an exclusive reference to the static buffer to the first /// caller. There should not be a second caller, but if there is, they /// will panic.
The first thing you probably noticed, particularly if you haven’t played with
uninitialized arrays before, is that we’ve got a lot of
transmute is like a reciprocating saw: it’s always the wrong tool for the job,
except when it isn’t. In this case we use
Once to go from
&mut MaybeUninit<[u8; 1024]>to
&mut [MaybeUninit<u8>; 1024]— that is, to treat an uninitialized array as an array of uninitialized elements. The memory is still uninitialized, but now we can iterate over its uninitialized parts. (Interestingly, this flavor of
transmuteappears to always be safe — so long as you don’t remove the
Once at the end, to go from
&mut [MaybeUninit<u8>; 1024]to plain ol’
&mut [u8; 1024]. This use of
transmuteis only safe because we laboriously went through and ensured that all the array elements were initialized.
Speaking of laborious initialization, we used explicit pointer writes to
initialize the array. This is important to remember in general — failing to do
this is always undefined behavior — but can cause serious practical problems if
the type in your array has a
Drop impl. Most
Drop impls access the contents
of the object being dropped — say, to drop each field. If you use simple
assignment to fill out uninitialized copies of a type with a
Drop impl, you’re
instructing the compiler to drop each one before writing the new one… and
dropping arbitrary contents of uninitialized memory is sure to be bad sooner or
(In reality, USB SRAM is shared with the USB controller via DMA, so the correct
way to access it is more nuanced and should likely be using
In practice, initialize your buffers. The technique shown here is not a performance win — letting the runtime initialize memory will basically always be faster than doing it yourself. This is more code and more complexity and can introduce bugs if done incorrectly. Use uninitialized buffers when:
Your buffer exists in some RAM that cannot be initialized by the runtime (as in this example).
Your buffer is large, and you know the program will write it before it attempts to read. For example, an incoming packet buffer. In this case, you want to return the
&mut [MaybeUninit<T>; N]and let the packet driver handle initializing it later.
FAQ / Appendix
Why are you calling this an allocator? Because it’s a mechanism that
allocates memory. It happens to only manage one block of memory, and it will
only allocate it once, and cannot
free, but it’s still an allocator.
Why “First-Mover”? This pattern is one possible way to safely manage access to static buffers in Rust. This particular way privileges the first caller at the expense of any later callers. This isn’t the only way to do it, so I think it’s important to put something in the name that distinguishes this from other approaches. The name is an analogy to the phrase “first-mover advantage.”
Shouldn’t this just be a macro? Sure; the
cortex_m crate provides a
singleton! macro that implements one version of this. Assuming you don’t need
control over what happens on the error case, and you don’t need any special
initialization behavior, it will work fine. But it’s easy to paint yourself into
a corner where you really do need to write the code yourself (which is how I got
here). Given that the manual implementation is less than 10 lines of code, it’s
worth understanding how it works — even if you usually use it through a macro.
Can you just use
Option::take? Only in a single-threaded context, because
Option with another variant of the same
done with an atomic swap. This means it can’t replace the atomic
I don’t have to do this in C, why is it necessary in Rust? The fact that a C
compiler doesn’t encourage you to do this doesn’t mean you shouldn’t. It’s
dramatically simpler to declare and use
static buffers in C. It’s also usually
a bad idea. It makes code non-reentrant and makes thread safety difficult. It
has been a common source of bugs and unexpected behaviors in C programs dating
back to the language’s origins (there are several cases of this in the early
Unix kernel and standard libraries). Rust is specifically concerned with
checking thread safety and data races at compile time, so this pattern sets off
alarm bells in the compiler, requiring us to put in extra work to reassure it
— and protect ourselves from our own mistakes!
Okay, so, can I use this pattern in C to make static buffers less
error-prone? Kind of. The basic pattern, where a
static is kept in an
isolated scope and only handed to the first caller who asks, works in C, and can
be valuable. (I’ve used it.) However, because C compilers don’t reason about
pointer aliasing, once the pointer has been returned, it’s easy to break things
by accident. This is why I present the pattern as a Rust pattern — it
works in other languages but is particularly useful in Rust.