- Cleaning up offset_Momentum
In the first part of this tutorial we took an optimized C program and translated it to an equivalent Rust program, complete with all the unsafe weirdness of the original: uninitialized variables, pointer casting and arithmetic, etc.
In this section, we’ll begin using Rust’s features to make the program incrementally more robust, while keeping performance unchanged.
Specifically, we’ll begin by introducing references.
Now that we’ve written some very ugly unsafe Rust code, let’s begin the process of cleaning it up, and talk about why this is useful.
Here’s the source code we obtained at the end of the last part:
We’ll begin our quest at the shortest function,
offset_Momentum. Here is its
current implementation, written in a sort of C-Rust pidgin:
Despite its oddness, this function’s only doing one
unsafe thing, and that’s
accessing memory through raw pointers. We can fix this and get to safe Rust in
Why should this be in safe Rust?
Why do we care? Because:
- It will make the code shorter and clearer, but more importantly,
- It will make the code more robust to certain kinds of bugs.
The easiest way to understand the second point is to think about the various ways that the current implementation — and its C ancestor — can fail.
A function that accepts a pointer argument can be handed a legitimate pointer,
NULL pointers aren’t necessarily an error! Many
functions define specific behavior for
NULL — for example, an optional
out-parameter that is filled in unless it’s
NULL. So as a user, it’s not
always clear where
NULL is okay and where it isn’t. You have to read the docs.
In addition to a
NULL pointer, the function could also be called
with a totally bogus pointer, like
(void *) 3, but that’s a lot harder
to do without devious intent. Nevertheless, we’ll make this kind of error
If the caller passes
offset_Momentum, it will access arbitrary
memory at low addresses, and in the best case, it will crash immediately. In
the worst case, it will silently corrupt some important data and then continue
running like nothing happened.
We could defend against this by checking for
NULL. I’ve worked with some
programmers who do this at the top of every function. This makes the failure
precise and reliable (i.e. calling
abort instead of maybe crashing
eventually), but it has a cost at run-time: the program now includes a bunch of
NULL checks, hurting performance and size.
It also has a cost at development time: we, the programmer, have to remember to insert checks at the appropriate places, and then do so! If we miss one, no tool will warn us — we’ll find out much later in a more embarrassing way.
The alternative to making the function defensive is to study every place
where our program calls the function and consider whether
NULL can appear. In
essence, we make sure that nobody ever uses our function wrong. For compact
programs, this is often doable…at first. It’s a hard property to preserve as
the code evolves, though. And in larger programs it’s a non-starter.
Implied array bounds
Under the hood,
offset_Momentum assumes that the
bodies pointer points to
BODIES_COUNT structs. However, nothing about
ensures this, and nothing about its signature communicates this:
Misusing this function is as easy as…
// in C struct body b = ; ; // accesses arbitrary stack memory
In cases like this, the program might crash, or might not. Or might crash sometimes. And that’s the best case; this sort of bug has caused many a security exploit, including Heartbleed.
This kind of misuse is hard to defend against. Sure, you could ask the caller to
pass in a count, like
fread in C, but at best we’d have to check it, and at
worst, they can pass in the wrong count.
Finding this sort of misuse requires us to review every caller. In this particular program, the function is called once; a real program would be bigger, and harder to review.
Initialized vs. uninitialized memory
Sometimes, when a function receives a pointer argument, it’s okay if the memory
it points to is uninitialized — because the function will fill it in.
fread from C is an example.
In other cases, the function will write and read the memory, which then needs
to be initialized. Our
offset_Momentum is in this category — calling it
on uninitialized memory would be, well, bad.
This fact is not apparent from the type signature, and again, misuse is easy:
struct body b; // Uninitialized ;
The program is unlikely to crash, and in non-trivial cases is unlikely to produce a compiler warning2.
On the nbody program, it actually will produce a warning on
recent versions of gcc with
-Wuninitialized — but the warning works if
the errant callsite and the definition of the function are in the same file,
which tends not to be true in programs larger than this one.
Local vs. global reasoning
Our program doesn’t contain any of the errors I just described. Or does it?
You, the programmer, can check for these errors and convince yourself that the program is correct. But to do so, you need to read both the functions and their callers. In a bigger program with complex dataflow, checking might require you to read the entire program.
This is a case of global reasoning. Global reasoning is hard — you might
decide that you’re willing to pay the cost of some
asserts to avoid having to
reason about the whole program. (I would!)
The alternative is local reasoning, where you can convince yourself that a part of a program is correct by reading that part only. I prefer to be able to use local reasoning, because I have other work to do.
asserts allow us to reason locally3. Types, static checking,
and practices like
const-correctness also help. Rust, and in particular the
safe dialect of Rust, is carefully designed to assist you in reasoning locally
rather than globally.
…as long as they can’t be disabled.
assert can be
disabled in C with
-DNDEBUG, and this isn’t unusual in release builds.
Improving robustness with references
Instead of reading carefully to check for those errors, we can prevent all of
them at compile time, by using Rust’s references. Specifically, we’ll switch
bodies parameter from a raw pointer to an array reference.
What’s a reference?
To the computer, a reference4 is exactly the same as a pointer. But to the programmer (us), a reference is a pointer with some additional rules. For now, the important ones are:
It can’t be null.
It must point to initialized memory.
It doesn’t support pointer arithmetic, so it’s never implicitly a pointer to the start of an array.
These rules are part of the type system, so they’re checked at compile time, and have no run-time cost. (In fact, they can enable compiler optimizations that aren’t possible in C.)
C++ also contains something called a “reference.” While Rust uses the same syntax, Rust references are not the same thing as C++ references.
// ,------------------------- Note 1 // | ,-------------- Note 2 // v v
At Note 1 we’ve switched from a
*mut to a
&mut — that is, we’re
requiring the caller to pass us a reference instead of a pointer. That means the
caller is responsible for ensuring that all the reference rules hold —
as we’ll see, usually for free — and so we have no need to check for
Notice at Note 2 that we’re requiring a reference to an array of exactly
BODIES_COUNT structs. If the caller references an array of fewer, they’ll get
a compile error. (They will also get an error if they pass a larger array,
which is technically unnecessary, but fine for our program. Fixing this would
require a new concept of a
slice, which we’ll do another time.)
Now that we’re guaranteed that
bodies is an array of exactly
elements, we know that it’s safe to access elements with indices
0..BODIES_COUNT, and we can go back to using proper square brackets (Note
3). We are technically asking the language to bounds-check our accesses, but
if the compiler can prove to its satisfaction that we remain in-bounds, it will
not produce any run-time bounds-checking code. Cases like this, where numbers
in a fixed range are used to index a fixed-length array, are the easiest. (When
in doubt, try it and check performance.)
With these changes,
offset_Momentum no longer needs to be marked
isn’t doing any
unsafe actions on the inside, and it doesn’t provide the
caller with the ability to violate memory safety. And so,
The final thing we need to do is update its callsite in
main to pass a
reference instead of a pointer:
Here we fulfill all the requirements of a reference5 without runtime cost:
- Since we’re taking the address of a named
static, the result isn’t
null, by definition.
statics get initialized, so we’re not creating an illegal reference to uninitialized memory.
- The type of the array matches.
There is one requirement of
&mut references that I haven’t mentioned
yet, which is that they must not alias one another — a more aggressive
version of C’s
restrict. We satisfy that requirement here, but in a way that
is fragile. I’ll explain why this is, and how to fix it, later.
All that we’ve done is take information that was already implicit in the program
— information that we, the programmer, had used to convince ourselves
that the program was correct — and put it into the actual code. By not
bottlenecking our types through raw pointers, we preserved the fact that the
parameters are never
NULL, are always the size we expect, and have been
In exchange, we gained the right to use the safe
x[i] indexing syntax without
Without reading the whole program, or even understanding the problem we’re
solving, a reviewer can now look at
offset_Momentum and see, structurally,
that it is free of memory safety bugs and data races.
You can apply this technique to our other two functions without running into complications, so I won’t present the diffs in detail. Here’s the program after this transformation:
Inspecting the machine code
rustc 1.39.0, changing only
output_Energy produces almost identical
machine code. Because the compiler is assured that the
bodies array contains
BODIES_COUNT elements, it becomes somewhat more aggressive about unrolling and
pipelining the computation loop.
Measuring the performance
The measurements from
hyperfine show that performance is
unchanged, and possibly improved a bit:
|Command||Mean [s]||Min [s]||Max [s]||Ratio|
|5.277 ± 0.007||5.258||5.282||1.00x|
|5.123 ± 0.024||5.095||5.161||0.97x|
|5.101 ± 0.005||5.093||5.107||0.97x|
This is the promise of safe Rust: similar performance, but with immunity from a variety of common bugs.
In this part, we used references to make more of the program’s behavior explicit. This helps the reader understand the code with local reasoning, and it helps the compiler generate slightly better code.
In part 3, I’ll look at the program’s use of uninitialized memory.