Rust Lifetimes for Safer FFI

jeikabu. Originally published at rendered-obsolete.github.io.

Lifetimes are one of Rust’s marquee features and pivotal to its safety guarantees. My understanding of them felt largely academic until I found a situation doing FFI to native code that warranted further investigation.

Lifetimes 101

If you’re not familiar with Rust lifetimes, the basic concept is that they help the compiler ensure references don’t exist longer than the thing they reference:

// Type contains a reference to an integer with a lifetime of 'a
struct MyStruct<'a>(&'a i32);

#[test]
fn lifetime() {
    let inner = 42;
    // Create an instance of MyStruct with a reference to `inner` and a lifetime of 'a
    let outer = MyStruct(&inner);
}

Here outer must not be allowed to outlive inner, otherwise you would have a “dangling reference”: a pointer to something that no longer exists. In languages with a garbage collector this is generally a non-issue; inner could be kept on the heap for as long as needed. In languages like C/C++, care must be taken (with some help from the compiler) to avoid problems like returning a reference/pointer to the stack, “use after free”, and other errors.
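To make the rule concrete, here is a minimal sketch of my own (the function name `read_through_wrapper` is made up for illustration). The working version keeps `inner` alive longer than `outer`. The commented-out version inverts the scopes, and I'd expect the compiler to reject it because `inner` does not live long enough.

```rust
// Same shape as the struct above: a wrapper around a borrowed integer.
struct MyStruct<'a>(&'a i32);

// `inner` is declared first and outlives `outer`, so this is accepted.
fn read_through_wrapper() -> i32 {
    let inner = 42;
    let value;
    {
        let outer = MyStruct(&inner);
        // Copy the integer out while the borrow is still valid.
        value = *outer.0;
    } // `outer` dropped here; `inner` is still alive
    value
}

// The inverted version below should be rejected, because `outer` would
// hold a reference to `inner` after `inner` has gone out of scope:
//
// fn dangling() -> i32 {
//     let outer;
//     {
//         let inner = 42;
//         outer = MyStruct(&inner);
//     } // `inner` dropped here while still borrowed
//     *outer.0
// }
```

Calling `read_through_wrapper()` returns the value that was read through the borrowed reference.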

Use Case

I’ve been working on a Rust wrapper for a C library, nng. A few months ago I was wrangling runtime statistics.

nng_stats_get() returns a snapshot of runtime statistics as a tree which can be traversed with nng_stat_child() and nng_stat_next(). When no longer needed, nng_stats_free() releases the memory associated with the snapshot, invalidating the entire tree.

A simple Rust wrapper:

// Root of statistics tree
pub struct NngStatRoot {
    node: *mut nng_stat,
}

impl NngStatRoot {
    // Create a snapshot of statistics
    pub fn create() -> Option<NngStatRoot> {
        unsafe {
            // Get snapshot as pointer to memory allocated by C library
            let mut node: *mut nng_stat = std::ptr::null_mut();
            let res = nng_stats_get(&mut node);
            if res == 0 {
                Some(NngStatRoot { node })
            } else {
                None
            }
        }
    }
    // Get first "child" node of tree
    pub fn child(&self) -> Option<NngStatChild> {
        unsafe {
            let node = nng_stat_child(self.node);
            NngStatChild::new(node)
        }
    }
}

// When root goes out of scope free the memory
impl Drop for NngStatRoot {
    fn drop(&mut self) {
        unsafe {
            nng_stats_free(self.node)
        }
    }
}

// A "child"; any non-root node of tree
pub struct NngStatChild {
    node: *mut nng_stat,
}

impl NngStatChild {
    // Create a child
    pub fn new(node: *mut nng_stat) -> Option<NngStatChild> {
        if node.is_null() {
            None
        } else {
            Some(NngStatChild { node })
        }
    }
    // Get sibling of this node
    pub fn next(&self) -> Option<NngStatChild> {
        unsafe {
            let node = nng_stat_next(self.node);
            NngStatChild::new(node)
        }
    }
    // Get first "child" of this node
    pub fn child(&self) -> Option<NngStatChild> {
        unsafe {
            let node = nng_stat_child(self.node);
            NngStatChild::new(node)
        }
    }
}

This can be used as follows:

fn stats() {
    let root = NngStatRoot::create().unwrap();
    if let Some(child) = root.child() {
        if let Some(sibling) = child.next() {
            // Do something
        }
    }
} // root dropped here calling nng_stats_free()

Unfortunately, the following also “works” (it compiles and maybe runs), but results in “undefined behavior”:

fn naughty_code() {
    let mut naughty_child: Option<_> = None;
    {
        let root = NngStatRoot::create().unwrap();
        naughty_child = root.child();
    } // root dropped here calling nng_stats_free()

    if let Some(child) = naughty_child {
        if let Some(naughty_sibling) = child.next() {
            debug!("Oh no!");
        }
    }
}

The problem is naughty_child allows a pointer into the statistics snapshot to outlive root and be accessed after nng_stats_free() is called.

Solution Using Lifetimes

I was pretty sure this was a job for lifetimes.

Once you give a struct a lifetime it “infects” everything it touches:

pub struct NngStatChild<'root> {
    node: *mut nng_stat,
}

impl<'root> NngStatChild<'root> {
    pub fn new(node: *mut nng_stat) -> Option<NngStatChild<'root>> {
        //...
    }
//...

In particular, note impl<'root>. Without that you get:

error[E0261]: use of undeclared lifetime name `'root`
  --> runng/tests/tests/stream_tests.rs:77:18
   |
77 | impl NngStatChild<'root> {
   |                   ^^^^^ undeclared lifetime

After applying the lifetime everywhere you’ll eventually get:

error[E0392]: parameter `'root` is never used
  --> runng/tests/tests/stream_tests.rs:73:24
   |
73 | pub struct NngStatChild<'root> {
   |                         ^^^^^ unused type parameter
   |
   = help: consider removing `'root` or using a marker such as `std::marker::PhantomData`

Lifetime 'root is unused. It cannot be applied to the pointer:

pub struct NngStatChild<'root> {
    // NB: doesn't compile
    node: *'root mut nng_stat,
}

Lifetimes don’t go on pointers, only references (&):

pub struct NngStatChild<'root> {
    node: &'root mut nng_stat,
}

Switching to a reference has two problems:

  1. Requires lots of casting because the native methods take pointers (*)
  2. Need to be more careful with mut on instances of the struct
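Here is a rough sketch of how I read those two problems. It does not use the real nng bindings: `Stat`, `native_value` and `ChildByRef` are all hypothetical stand-ins.

```rust
// Hypothetical stand-in for the C library's `nng_stat` type.
pub struct Stat {
    value: i32,
}

// Hypothetical stand-in for a native function that takes a raw pointer.
unsafe fn native_value(node: *mut Stat) -> i32 {
    unsafe { (*node).value }
}

// The struct holds a mutable reference instead of a raw pointer.
pub struct ChildByRef<'root> {
    node: &'root mut Stat,
}

impl<'root> ChildByRef<'root> {
    // Problem 1: every native call needs a reference-to-pointer cast.
    // Problem 2: reborrowing the `&mut` field needs `&mut self`, so even a
    // read-only accessor forces the caller to hold a mutable binding.
    pub fn value(&mut self) -> i32 {
        let ptr = &mut *self.node as *mut Stat;
        unsafe { native_value(ptr) }
    }
}

pub fn read_through_reference() -> i32 {
    let mut stat = Stat { value: 7 };
    // Note the `mut` on `child`, although nothing is modified.
    let mut child = ChildByRef { node: &mut stat };
    child.value()
}
```

With a raw pointer field, the same accessor could take `&self` and pass `self.node` straight to the native function.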

The helpful compiler message alludes to another solution, std::marker::PhantomData, which allows our struct to “act like” it owns a reference:

use std::marker;

pub struct NngStatChild<'root> {
    node: *mut nng_stat,
    // Zero-sized marker; makes the struct behave as if it borrowed from the root
    _phantom: marker::PhantomData<&'root nng_stat>,
}

impl<'root> NngStatChild<'root> {
    pub fn new(node: *mut nng_stat) -> Option<NngStatChild<'root>> {
        if node.is_null() {
            None
        } else {
            Some(NngStatChild {
                node,
                // Initialize the phantom
                _phantom: marker::PhantomData,
            })
        }
    }
    //...
}

What’s cool is that PhantomData is a zero-sized type. It has no runtime cost (neither CPU nor memory) and exists only at compile time:

pub struct Phantom<'root> {
    _phantom: marker::PhantomData<&'root nng_stat>,
}

#[test]
fn check_size() {
    assert_eq!(0, std::mem::size_of::<Phantom>());
}
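As a further check, this sketch compares a struct holding a raw pointer plus the phantom marker against a bare pointer. `RawNode`, `Child` and the two helper functions are names I made up for the example, with `RawNode` standing in for `nng_stat`.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical opaque stand-in for the C library's `nng_stat`.
pub struct RawNode {
    _private: [u8; 0],
}

// Same layout idea as NngStatChild: raw pointer plus phantom lifetime marker.
pub struct Child<'root> {
    node: *mut RawNode,
    _phantom: PhantomData<&'root RawNode>,
}

// Size of the marker on its own.
pub fn phantom_size() -> usize {
    size_of::<PhantomData<&'static RawNode>>()
}

// True if the marker adds nothing to the size of the wrapper struct.
pub fn child_is_pointer_sized() -> bool {
    size_of::<Child<'static>>() == size_of::<*mut RawNode>()
}
```

So the wrapper stays exactly as large as the raw pointer it wraps; the lifetime information is purely a compile-time concern.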

One place that warrants special attention is our next() method:

impl<'root> NngStatChild<'root> {
    //...

    // NB: The explicit lifetime on the return value is key!
    pub fn next(&self) -> Option<NngStatChild<'root>> {
        unsafe {
            let node = nng_stat_next(self.node);
            NngStatChild::new(node)
        }
    }
}

We need an explicit lifetime here because without it the lifetime elision rules would assign the same lifetime as &self. That would imply that the lifetimes of the siblings are somehow related, but all that matters is the lifetime of the root.
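To see the difference without linking against nng, here is a self-contained sketch that fakes the C side with heap-allocated nodes. Everything in it (`RawNode`, `alloc`, `free_tree`, `Root`, `Child`, `sibling_values`) is a hypothetical stand-in, not the real wrapper. The loop in `sibling_values` replaces `current` with `node.next()` on every iteration. As far as I can tell, that only compiles because `next()` returns `Child<'root>`. With the elided lifetime, the new value would borrow from `node`, which is dropped at the end of each iteration.

```rust
use std::marker::PhantomData;
use std::ptr;

// Hypothetical stand-in for the C library's `nng_stat` node.
struct RawNode {
    value: i32,
    child: *mut RawNode,
    next: *mut RawNode,
}

// Heap-allocate a node and return a raw pointer, as a C library would.
fn alloc(value: i32, child: *mut RawNode, next: *mut RawNode) -> *mut RawNode {
    Box::into_raw(Box::new(RawNode { value, child, next }))
}

// Free a node and everything reachable from it (plays the role of nng_stats_free()).
unsafe fn free_tree(node: *mut RawNode) {
    if node.is_null() {
        return;
    }
    unsafe {
        let boxed = Box::from_raw(node);
        free_tree(boxed.child);
        free_tree(boxed.next);
    }
}

pub struct Root {
    node: *mut RawNode,
}

impl Root {
    // Build a tiny tree: root -> first child (1) -> sibling (2) -> sibling (3).
    pub fn create() -> Root {
        let third = alloc(3, ptr::null_mut(), ptr::null_mut());
        let second = alloc(2, ptr::null_mut(), third);
        let first = alloc(1, ptr::null_mut(), second);
        Root { node: alloc(0, first, ptr::null_mut()) }
    }

    // Elision ties the returned lifetime to the borrow of `self`, i.e. the root.
    pub fn child(&self) -> Option<Child<'_>> {
        Child::new(unsafe { (*self.node).child })
    }
}

impl Drop for Root {
    fn drop(&mut self) {
        unsafe { free_tree(self.node) }
    }
}

pub struct Child<'root> {
    node: *mut RawNode,
    _phantom: PhantomData<&'root RawNode>,
}

impl<'root> Child<'root> {
    fn new(node: *mut RawNode) -> Option<Child<'root>> {
        if node.is_null() {
            None
        } else {
            Some(Child { node, _phantom: PhantomData })
        }
    }

    pub fn value(&self) -> i32 {
        unsafe { (*self.node).value }
    }

    // Explicit 'root: the sibling borrows from the root, not from `self`.
    pub fn next(&self) -> Option<Child<'root>> {
        Child::new(unsafe { (*self.node).next })
    }
}

// Collect the values of the root's children by walking the sibling chain.
pub fn sibling_values() -> Vec<i32> {
    let root = Root::create();
    let mut values = Vec::new();
    let mut current = root.child();
    while let Some(node) = current {
        values.push(node.value());
        // `node` goes away at the end of this iteration, `current` does not.
        current = node.next();
    }
    values
} // root dropped here, freeing the whole tree
```

This is the same shape an iterator implementation ends up with, which is probably why the elided version caused me trouble there.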

Let’s revisit our naughty code:

fn naughty_code() {
    let mut naughty_child: Option<_> = None;
    {
        let root = NngStatRoot::create().unwrap();
        naughty_child = root.child();
    } // root dropped here calling nng_stats_free()

    if let Some(child) = naughty_child {
        if let Some(naughty_sibling) = child.next() {
            debug!("Oh no!");
        }
    }
}

Now when we build it the compiler lets us know we did something bad:

error[E0597]: `root` does not live long enough
  --> runng/tests/tests/stats_tests.rs:37:25
   |
37 |         naughty_child = root.child();
   |                         ^^^^ borrowed value does not live long enough
38 |     } // root dropped here calling nng_stats_free()
   |     - `root` dropped here while still borrowed
39 |
40 |     if let Some(child) = naughty_child {
   |                          ------------- borrow later used here

Crisis averted, thanks compiler!

Fin

If you read this far, Rust might be your cup of tea and you should give it a look, if you haven’t already.

There are still a few things that aren't clear to me. For example, I'm not sure I understand why next() needs an explicit lifetime (it was really a problem when implementing an iterator) but child() does not.

Full source is on GitHub.

Further reading:

Discussion


Excellent write-up! Thanks for this. It's a really good practical example of why lifetimes are so fundamental to Rust and worth getting through the academic understanding. It also demonstrates exactly why Rust wrappers like these are such a good idea and an improvement over using the same API from C.

It's also a good supplement to the example given for PhantomData from the 'nomicon - helpful seeing how it applies to an actual problem.

 

The nomicon is great, such a dense treasure trove of information. I mention it in the end, but I probably should have put it in the beginning.

Glad you enjoyed this. I've found I get a lot out of doing these write-ups as a sort of post-mortem for things I'm working on. Forces me to go back and review things with fresh eyes. In the course of posting this I actually realized and fixed a few things. Namely, I had a lifetime on the "root" as well, but turns out it doesn't seem to be necessary...