References Available Upon Request

Learn Rust the Dangerous Way, Part 2

2019-12-02

In the first part of this tutorial we took an optimized C program and translated it to an equivalent Rust program, complete with all the unsafe weirdness of the original: uninitialized variables, pointer casting and arithmetic, etc.

In this section, we’ll begin using Rust’s features to make the program incrementally more robust, while keeping performance unchanged.

Specifically, we’ll begin by introducing references.

Cleaning up `offset_Momentum`

Now that we’ve written some very ugly unsafe Rust code, let’s begin the process of cleaning it up, and talk about why this is useful.

Here’s the source code we obtained at the end of the last part:

nbody-1.rs

We’ll begin our quest at the shortest function, offset_Momentum. Here is its current implementation, written in a sort of C-Rust pidgin:

unsafe fn offset_Momentum(bodies: *mut body){
    for i in 0..BODIES_COUNT {
        for m in 0..3 {
            (*bodies.add(0)).velocity[m] -=
                (*bodies.add(i)).velocity[m]
                    * (*bodies.add(i)).mass
                    / SOLAR_MASS;
        }
    }
}

Despite its oddness, this function’s only doing one unsafe thing, and that’s accessing memory through raw pointers. We can fix this and get to safe Rust in one move.

Why should this be in safe Rust?

Why do we care? Because:

It will make the code shorter and clearer, but more importantly,
It will make the code more robust to certain kinds of bugs.

The easiest way to understand the second point is to think about the various ways that the current implementation — and its C ancestor — can fail.

`NULL` pointers

A function that accepts a pointer argument can be handed a legitimate pointer, or the NULL pointer¹. NULL pointers aren’t necessarily an error! Many functions define specific behavior for NULL — for example, an optional out-parameter that is filled in unless it’s NULL. So as a user, it’s not always clear where NULL is okay and where it isn’t. You have to read the docs.

In addition to a NULL pointer, the function could also be called with a totally bogus pointer, like (void *) 3, but that’s a lot harder to do without devious intent. Nevertheless, we’ll make this kind of error harder, too.

If the caller passes NULL to offset_Momentum, it will access arbitrary memory at low addresses, and in the best case, it will crash immediately. In the worst case, it will silently corrupt some important data and then continue running like nothing happened.

We could defend against this by checking for NULL. I’ve worked with some programmers who do this at the top of every function. This makes the failure precise and reliable (i.e. calling abort instead of maybe crashing eventually), but it has a cost at run-time: the program now includes a bunch of code for NULL checks, hurting performance and size.

It also has a cost at development time: we, the programmer, have to remember to insert checks at the appropriate places, and then do so! If we miss one, no tool will warn us — we’ll find out much later in a more embarrassing way.

The alternative to making the function defensive is to study every place where our program calls the function and consider whether NULL can appear. In essence, we make sure that nobody ever uses our function wrong. For compact programs, this is often doable…at first. It’s a hard property to preserve as the code evolves, though. And in larger programs it’s a non-starter.

Implied array bounds

Under the hood, offset_Momentum assumes that the bodies pointer points to at least BODIES_COUNT structs. However, nothing about offset_Momentum itself ensures this, and nothing about its signature communicates this:

void offset_Momentum(body bodies[]);

Rust:

fn offset_Momentum(bodies: *mut body);

Misusing this function is as easy as…

// in C
struct body b = { 0 };
offset_Momentum(&b);  // accesses arbitrary stack memory

In cases like this, the program might crash, or might not. Or might crash sometimes. And that’s the best case; this sort of bug has caused many a security exploit, including Heartbleed.

This kind of misuse is hard to defend against. Sure, you could ask the caller to pass in a count, like fread in C, but at best we’d have to check it, and at worst, they can pass in the wrong count.

Finding this sort of misuse requires us to review every caller. In this particular program, the function is called once; a real program would be bigger, and harder to review.

Initialized vs. uninitialized memory

Sometimes, when a function receives a pointer argument, it’s okay if the memory it points to is uninitialized — because the function will fill it in. fread from C is an example.

In other cases, the function will write and read the memory, which then needs to be initialized. Our offset_Momentum is in this category — calling it on uninitialized memory would be, well, bad.

This fact is not apparent from the type signature, and again, misuse is easy:

struct body b[BODIES_COUNT];  // Uninitialized
offset_Momentum(&b);

The program is unlikely to crash, and in non-trivial cases is unlikely to produce a compiler warning².

On the nbody program, it actually will produce a warning on recent versions of gcc with -Wuninitialized — but the warning works if the errant callsite and the definition of the function are in the same file, which tends not to be true in programs larger than this one.

Local vs. global reasoning

Our program doesn’t contain any of the errors I just described. Or does it?

You, the programmer, can check for these errors and convince yourself that the program is correct. But to do so, you need to read both the functions and their callers. In a bigger program with complex dataflow, checking might require you to read the entire program.

This is a case of global reasoning. Global reasoning is hard — you might decide that you’re willing to pay the cost of some asserts to avoid having to reason about the whole program. (I would!)

The alternative is local reasoning, where you can convince yourself that a part of a program is correct by reading that part only. I prefer to be able to use local reasoning, because I have other work to do.

asserts allow us to reason locally³. Types, static checking, and practices like const-correctness also help. Rust, and in particular the safe dialect of Rust, is carefully designed to assist you in reasoning locally rather than globally.

…as long as they can’t be disabled. assert can be disabled in C with -DNDEBUG, and this isn’t unusual in release builds.

Improving robustness with references

Instead of reading carefully to check for those errors, we can prevent all of them at compile time, by using Rust’s references. Specifically, we’ll switch the bodies parameter from a raw pointer to an array reference.

What’s a reference?

To the computer, a reference⁴ is exactly the same as a pointer. But to the programmer (us), a reference is a pointer with some additional rules. For now, the important ones are:

It can’t be null.
It must point to initialized memory.
It doesn’t support pointer arithmetic, so it’s never implicitly a pointer to the start of an array.

These rules are part of the type system, so they’re checked at compile time, and have no run-time cost. (In fact, they can enable compiler optimizations that aren’t possible in C.)

⁴

C++ also contains something called a “reference.” While Rust uses the same syntax, Rust references are not the same thing as C++ references.

The code

//                           ,------------------------- Note 1
//                           |          ,-------------- Note 2
//                           v          v
fn offset_Momentum(bodies: &mut [body; BODIES_COUNT]) {
    for i in 0..BODIES_COUNT {
        for m in 0..3 {
            //     v----------------------------------- Note 3
            bodies[0].velocity[m] -=
                bodies[i].velocity[m] * bodies[i].mass / SOLAR_MASS;
        }
    }
}

At Note 1 we’ve switched from a *mut to a &mut — that is, we’re requiring the caller to pass us a reference instead of a pointer. That means the caller is responsible for ensuring that all the reference rules hold — as we’ll see, usually for free — and so we have no need to check for violations.

Notice at Note 2 that we’re requiring a reference to an array of exactly BODIES_COUNT structs. If the caller references an array of fewer, they’ll get a compile error. (They will also get an error if they pass a larger array, which is technically unnecessary, but fine for our program. Fixing this would require a new concept of a slice, which we’ll do another time.)

Now that we’re guaranteed that bodies is an array of exactly BODIES_COUNT elements, we know that it’s safe to access elements with indices 0..BODIES_COUNT, and we can go back to using proper square brackets (Note 3). We are technically asking the language to bounds-check our accesses, but if the compiler can prove to its satisfaction that we remain in-bounds, it will not produce any run-time bounds-checking code. Cases like this, where numbers in a fixed range are used to index a fixed-length array, are the easiest. (When in doubt, try it and check performance.)

With these changes, offset_Momentum no longer needs to be marked unsafe: it isn’t doing any unsafe actions on the inside, and it doesn’t provide the caller with the ability to violate memory safety. And so, offset_Momentum is now safe.

The final thing we need to do is update its callsite in main to pass a reference instead of a pointer:

fn main() {
    unsafe {
        offset_Momentum(&mut solar_Bodies);       // <--- Changed!
        output_Energy(solar_Bodies.as_mut_ptr());
        let c = std::env::args().nth(1).unwrap().parse().unwrap();
        for _ in 0..c { advance(solar_Bodies.as_mut_ptr()) }
        output_Energy(solar_Bodies.as_mut_ptr());
    }
}

Here we fulfill all the requirements of a reference⁵ without runtime cost:

Since we’re taking the address of a named static, the result isn’t null, by definition.
statics get initialized, so we’re not creating an illegal reference to uninitialized memory.
The type of the array matches.

⁵

There is one requirement of &mut references that I haven’t mentioned yet, which is that they must not alias one another — a more aggressive version of C’s restrict. We satisfy that requirement here, but in a way that is fragile. I’ll explain why this is, and how to fix it, later.

Evaluation

All that we’ve done is take information that was already implicit in the program — information that we, the programmer, had used to convince ourselves that the program was correct — and put it into the actual code. By not bottlenecking our types through raw pointers, we preserved the fact that the parameters are never NULL, are always the size we expect, and have been initialized.

In exchange, we gained the right to use the safe x[i] indexing syntax without runtime checks.

Without reading the whole program, or even understanding the problem we’re solving, a reviewer can now look at offset_Momentum and see, structurally, that it is free of memory safety bugs and data races.

You can apply this technique to our other two functions without running into complications, so I won’t present the diffs in detail. Here’s the program after this transformation:

nbody-2.rs

Inspecting the machine code

On rustc 1.39.0, changing only output_Energy produces almost identical machine code. Because the compiler is assured that the bodies array contains BODIES_COUNT elements, it becomes somewhat more aggressive about unrolling and pipelining the computation loop.

Measuring the performance

The measurements from hyperfine show that performance is unchanged, and possibly improved a bit:

Command	Mean [s]	Min [s]	Max [s]	Ratio
`./nbody.clang-8.bench 50000000`	5.277 ± 0.007	5.258	5.282	1.00x
`./nbody-1.bench 50000000`	5.123 ± 0.024	5.095	5.161	0.97x
`./nbody-2.bench 50000000`	5.101 ± 0.005	5.093	5.107	0.97x

This is the promise of safe Rust: similar performance, but with immunity from a variety of common bugs.

Conclusion

In this part, we used references to make more of the program’s behavior explicit. This helps the reader understand the code with local reasoning, and it helps the compiler generate slightly better code.

In part 3, I’ll look at the program’s use of uninitialized memory.

#c #rust #tutorial

Cliffle

Cleaning up offset_Momentum