Getting file/line in await traces
An addendum to my async debugging post
I recently posted about my debugger for async
Rust, which can
generate what I call “await-traces” for async
code that’s suspended and not
currently running. I mentioned at the time that it appeared possible to get the
source code file name and line number corresponding to the await points, but
left that for future work.
This is an update describing that future work.
The structure of an async fn type
I picked through this at some length in the original post, but skipped some
details that are relevant for this one – so let’s revisit the topic. When we
write an async fn
in Rust, the compiler generates a data type behind the
scenes. What does that type look like?
Let’s take a very simple function, something like:
async
This causes the compiler to generate types that – if we were instead writing them by hand – would look like…
;
;
We can’t actually write that code, exactly, because it uses names that are
illegal in Rust code (like {async_fn_env#0}
or an enum variant named 3
), but
the compiler can do whatever it wants, and these are the names it uses
internally. This keeps the names from conflicting with the names in your
program.
When you initially call example(something)
it produces an
example::{async_fn_env#0}
that has variant 0 – containing the Unresumed
struct. This holds the argument you passed in.
When you first poll it, it will call other_operation(argument)
, which produces
its own future type. This gets stashed into the __awaitee
field. (In
practice, it gets polled once first to see if it completes immediately, in which
case Suspend0
is skipped and we move directly to Returned
– but that’s not
important here.)
So, when decoding an await trace, we use these __awaitee
fields – potentially
nested – to follow the frames of the trace.
Mapping to DWARF
The mapping of Rust code to DWARF debug information isn’t spectacularly well
documented, but the implementation is quite good. We had to reverse-engineer it
when writing Humility, and I’ve done it again to write lildb
.
If DWARF were represented as something resembling YAML1, and a lot of it removed for the sake of brevity, our types above would be represented as something like:
- structure_type:
name: "{async_fn_env#0}"
variant_part:
discriminator: (stuff)
member:
name: "__state"
type: (reference to u8)
data_member_location: 5
artificial: true
variants:
- variant:
discriminator_value: 0
member:
name: "0"
type: (reference to Unresumed)
- variant:
discriminator_value: 1
member:
name: "1"
type: (reference to Returned)
... more variants
- structure_type:
name: "Unresumed"
members:
- member:
name: "argument"
type: (reference to u32)
... etc
Yeah, I know, I hate YAML too. It just happened to be convenient here.
The first thing that might surprise you here, if you haven’t worked with Rust
DWARF before, is that the Rust enum
is a DWARF structure_type
! DWARF was
designed by C people, and its notion of enum
is a C enum – it can’t have
fields. Rust takes advantage of existing DWARF support for “variant structs”
added to support Ada, decades ago.
The pattern Rust uses to represent an enum
as a “variant struct” is:
- Each
enum
variant becomes a variant of the struct. - A top-level discriminator member tells you where to look to determine which variant is happening.
- Each variant contains a single member, which is itself a struct. This is true even if the variant only contains one field – it always gets wrapped in a struct.
For a normal Rust enum
, the names of the variants and the names of the
generated structs that contain the field are the same.
For a generated async_fn_env
, the names of the variants are (currently, at
least) decimal integers starting at 0, with the SuspendX
variants beginning at
3 – while the names of the generated structs are as I described in the previous
post: “Unresumed”, “Returned,” etc.
Finding file/line
One of the important roles for any debugging format is mapping parts of the generated machine-readable program back to the input source file(s) that produced them. DWARF identifies places in the source files using what it calls “declaration coordinates,” or “decl coords.”
A declaration coordinate consists of
- A source file name,
- A line number,
- A column number.
Any of those can be missing, because everything is optional in DWARF, just to make the lives of debugger writers more interesting. So, you can have something in a program that is mapped to line 12 of … an unspecified file. DWARF is fine with that. The reasons are lost to the mists of time.
As I noted in my last post on the topic, the big issue with trying to use DWARF to debug async programs is that DWARF has no notion of an async program. Rust is kind of … faking it, by using existing DWARF constructs. (I think we’re going to want DWARF extensions for this in the long run.)
So – how do we get the decl coordinates for an entity that DWARF doesn’t know exists, like an await point?
Through reverse engineering, it turns out that rustc
(at least as of 1.69)
records the decl coordinates for await points! It chose to put them… not on
the enum variant… not on the generated struct type… but on the variant
member. That is, they are here:
- structure_type:
name: "{async_fn_env#0}"
variant_part:
discriminator: (stuff)
member:
name: "__state"
type: (reference to u8)
data_member_location: 5
artificial: true
variants:
- variant:
discriminator_value: 0
member:
name: "0"
type: (reference to Unresumed)
decl_file: omg/lol.rs # <------
decl_line: 1234 # <------
- variant:
discriminator_value: 1
member:
name: "1"
type: (reference to Returned)
decl_file: omg/lol.rs # <------
decl_line: 1500 # <------
... more variants
I noticed this when I was reading a raw DWARF dump of a Rust program. (If you write a debugger, you’ll spend a lot of time reading raw DWARF dumps.) This particularly jumped out at me because Rust is, in general, really bad about including decl coords on structures, enums, variants, and the like. These are one of the few cases where you can find the line where an enum variant is declared. Fortunately, this is the one case I needed!
debugdb
is my crate that lildb
is built on. It’s responsible for parsing
DWARF and turning it into an in-memory database format suitable for program
analysis. debugdb
only represents the portions of DWARF that I’ve opted in,
and I hadn’t included decl coord parsing for members of structs/enums because
Rust generally doesn’t include it. So, after adding that, and a chunk of
somewhat gross code to the debugger, we can have our await traces with a side of
line numbers:
>> tasks
current tick-time is: 1620
task 0:
async fn keybad_fw::serial::task
suspended at src/serial.rs:35 # <--- LINE NUMBERS! YAY!
waiting on: async fn lilos::spsc::{impl#4}::pop<keybad_fw::scanner::KeyEvent>
suspended at /home/cbiffle/proj/lilos/os/src/spsc.rs:313
waiting for data in spsc queue at 0x20001d7c
queue type: lilos::spsc::Queue<keybad_fw::scanner::KeyEvent> <.debug_info+0x0000e5b4>
task 1:
async fn keybad_fw::scanner::task
suspended at src/scanner.rs:134
waiting on: async fn lilos::exec::{impl#6}::next_time
suspended at /home/cbiffle/proj/lilos/os/src/exec.rs:965
waiting on: async fn lilos::exec::sleep_until
suspended at /home/cbiffle/proj/lilos/os/src/exec.rs:815
sleeping until: 1621 (1 ms from now)
This makes the await traces a lot more useful for debugging. Previously,
lildb
would just tell you that the task was suspended at “await point 3” in
the function, leaving you to guess which one the compiler chose to be number 3.
Now, you can direct your editor to the exact spot. (If you have more than one
await
on a line, you’ll still need to do some guessing – rustc
doesn’t
generate column numbers for await points. Usually the rest of the trace should
help you infer the answer from context.)
Another step toward our glorious debuggable async future!