Ben Lovy

No More Tears, No More Knots: Arena-Allocated Trees in Rust

Enter The Arena

When programming in Rust, it's not always straightforward to directly translate idioms you already know. One such category is tree-like data structures. These are traditionally built out of Node structs that refer to other Nodes in the tree. To traverse the structure you follow these references, and changing the shape of the tree means changing which nodes each one refers to.

Rust hates that. You'll quickly start running into problems - for instance, iterating over the nodes generally means borrowing the whole structure, and while that borrow is live you'll have a bad time doing anything else with it.

Trees are a fact of life, though, and very much a useful one at that. It doesn't have to hurt! We can use region-based memory management to pretty much forget about it.

The Desert

I'll briefly mention a few of the other methods I've bashed my head against before trying this today.

The simplest is to use unsafe, which allows you to use raw pointers like you would in C. This forfeits a lot of the benefits we get from using safe Rust, as one misuse of unsafe can undermine guarantees across your whole crate. Now part of the code is considered safe only because you, the programmer, have vouched for it, not because the Rust compiler has verified it.

To stick to safe Rust, you can wrap your pointers in Rc<RefCell<T>> types, which are reference-counted smart pointers with interior mutability. When you call Rc::clone(&ptr), you get back a brand new pointer to the same data that can be owned separately from any existing pointer, and when every such Rc has been dropped the data itself gets dropped. This is a simple form of garbage collection via reference counting. The RefCell lets you take mutable borrows of data that isn't declared mutable, enforcing the borrow rules at runtime instead of at compile time. This lets you cheat, and will panic!() if you screw up, so, hooray I guess. You have to go through methods like data.borrow_mut(), but you can then, for example, change the pointer in a next field using an otherwise immutable borrow of the node during your traversal.
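A minimal sketch of that pattern (hypothetical list nodes, not the tree from this post) - rewiring next goes through borrow_mut():

use std::cell::RefCell;
use std::rc::Rc;

// A node that shares ownership of its successor
struct RcNode {
    val: i32,
    next: Option<Rc<RefCell<RcNode>>>,
}

fn main() {
    let second = Rc::new(RefCell::new(RcNode { val: 2, next: None }));
    let first = Rc::new(RefCell::new(RcNode { val: 1, next: None }));
    // The runtime borrow check happens inside borrow_mut()
    first.borrow_mut().next = Some(Rc::clone(&second));
}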

Alternatively you can use Box smart pointers and clone them around, performing a lot of extra work for no reason - this involves deep-copying whole subtrees to make small edits. You do you, but that's not really my thing.
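For reference, a minimal sketch of that shape (hypothetical, not from this post) - deriving Clone here means every clone deep-copies the entire subtree under a node:

// Cloning a BoxNode clones every node beneath it
#[derive(Clone)]
struct BoxNode {
    val: i32,
    children: Vec<Box<BoxNode>>,
}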

You can even use plain references and introduce explicit lifetimes:

struct Node<'a> {
    val: i32,
    next: &'a Node<'a>,
}

Yippee, you're probably sprinkling 'a all over the place now, and there's gonna be a part of you that wants to start getting friendly with 'b, and whoa there. That's gross, and you're solving a much simpler problem than that kind of ceremony requires.

All of these options mean pain, and often compromise. At least in my experience, while you often can get to a successful compile, the code gets unreadable and unmaintainable fast, and should you ever need to make a different design choice you're pretty much back to square one trying to fit it all together. Fighting through these approaches is also the only way I've ever managed to actually produce a segfault in Rust. I was pretty impressed with myself for screwing up that hard, and I wish I had kept better notes about how I got there, but I know it was some nonsense like the above.

The problem is that Rust keeps a close eye on who owns each of your nodes and how long each one lives, but as you build a structure it's not always easy for the compiler to understand what you're trying to do. You end up with inferred lifetimes that are too short or simply wrong for your structure, and no way to efficiently traverse or edit the tree. Convincing the compiler you're right takes a lot of manual work, which sucks.

The Oasis

What if your nodes could all have the SAME lifetime? I mean, they essentially do, right? Sure, some may get created after one another, but for all intents and purposes within this program you just care that they're all owned by your top-level tree structure.

There's a super easy way - pop 'em in a Vec<T>:

#[derive(Debug, Default)]
struct ArenaTree<T> 
where
    T: PartialEq
{
    arena: Vec<Node<T>>,
}

Boom. Tree. It's generic for any type that can be compared with ==, and the lifetime problems are solved. You want a node? Use self.arena[idx]. Instead of storing actual references to other nodes, just give 'em each an index:

#[derive(Debug)]
struct Node<T>
where
    T: PartialEq
{
    idx: usize,
    val: T,
    parent: Option<usize>,
    children: Vec<usize>,
}

In this tree, each node has zero or one parent and zero or more children.
A new node requires an index and a value to store, and starts out unconnected to any other nodes:

impl<T> Node<T>
where
    T: PartialEq
{
    fn new(idx: usize, val: T) -> Self {
        Self {
            idx,
            val,
            parent: None,
            children: vec![],
        }
    }
}


You could go on and store as many indices as you want - it's your graph. This is just the example tree I used for Day 6 of AoC (and why we're here).

This is pretty easy to use. When you want a value, you can just ask for its index:

impl<T> ArenaTree<T>
where
    T: PartialEq
{
    fn node(&mut self, val: T) -> usize {
        // First see if it exists
        for node in &self.arena {
            if node.val == val {
                return node.idx;
            }
        }
        // Otherwise, add new node
        let idx = self.arena.len();
        self.arena.push(Node::new(idx, val));
        idx
    }
}

Whether or not it was there previously, you now have an index for that value in your tree. If it wasn't already there, a new node was allocated with no connections to any existing nodes. It will automatically drop when the ArenaTree goes out of scope, so all your nodes will always live as long as any other and all will clean up at the same time.
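As a quick usage sketch (assuming an ArenaTree<String>), asking for the same value twice hands back the existing index rather than allocating a second node:

let mut tree: ArenaTree<String> = ArenaTree::default();
let a = tree.node("A".into());
let b = tree.node("B".into());
// "A" already exists, so we get the same index back
assert_eq!(a, tree.node("A".into()));
assert_ne!(a, b);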

This snippet shows how easy traversal becomes - you just walk the vector with, e.g., for node in &self.arena. Certain operations become trivial - want the number of nodes? Ask for it:

fn size(&self) -> usize {
    self.arena.len()
}

What about counting how many edges there are? Nothing fancy here either - just count them:

fn edges(&self) -> usize {
    self.arena.iter().fold(0, |acc, node| acc + node.children.len())
}
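Any other whole-tree query follows the same pattern - for example, a hypothetical leaves() method (a sketch, not part of the original solution) that collects the indices of every leaf:

fn leaves(&self) -> Vec<usize> {
    // A leaf is any node with no children
    self.arena
        .iter()
        .filter(|node| node.children.is_empty())
        .map(|node| node.idx)
        .collect()
}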

It's still pretty easy to do your standard recursive data structure stuff, though. You can see how deep a node is:

fn depth(&self, idx: usize) -> usize {
    match self.arena[idx].parent {
        Some(id) => 1 + self.depth(id),
        None => 0,
    }
}

Search for a value from the root, returning its depth:

fn depth_to_target(&self, idx: usize, target: &T) -> Option<usize> {
    // are we here?  If so, Some(0)
    if target == &self.arena[idx].val {
        return Some(0);
    }

    // If not, try all children
    for p in &self.arena[idx].children {
        if let Some(x) = self.depth_to_target(*p, target) {
            return Some(1 + x);
        }
    }
    // If it can't be found, return None
    None
}
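A quick usage sketch (again assuming an ArenaTree<String>, with the links wired up by hand):

let mut tree: ArenaTree<String> = ArenaTree::default();
let root = tree.node("root".into());
let child = tree.node("child".into());
let grandchild = tree.node("grandchild".into());
// Wire up root -> child -> grandchild
tree.arena[root].children.push(child);
tree.arena[child].parent = Some(root);
tree.arena[child].children.push(grandchild);
tree.arena[grandchild].parent = Some(child);

assert_eq!(tree.depth(grandchild), 2);
assert_eq!(tree.depth_to_target(root, &"grandchild".to_string()), Some(2));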

You can of course traverse iteratively as well. This method finds the distance between the parents of two nodes using both iterative and recursive traversal to perform a series of depth-first searches:

fn distance_between(&mut self, from: T, target: T) -> usize {
    // If `from` isn't already in the tree, this adds a new unconnected node -
    // the search below will simply never find the target from it
    let start_node = self.node(from);
    let mut ret = 0;
    // Start traversal
    let mut trav = &self.arena[start_node];
    // Explore all children, then hop up one
    while let Some(inner) = trav.parent {
        if let Some(x) = self.depth_to_target(inner, &target) {
            ret += x;
            break;
        }
        trav = &self.arena[inner];
        ret += 1;
    }
    // don't go all the way to target, just orbit
    ret - 1
}

This repeats a little work on each backtrack, but even at puzzle scale it computes nearly instantly. It's quite concise and readable - not words I'm used to using for Rust trees!

Inserting will depend on the domain, but this application received input as PARENT)CHILD, so my insert looked like this:

// Assumes this lives in an impl block for ArenaTree<String>
fn insert(&mut self, orbit: &str) {
    // Init nodes
    let split = orbit.split(')').collect::<Vec<&str>>();
    let inner = self.node(split[0].to_string());
    let outer = self.node(split[1].to_string());
    // Set orbit
    match self.arena[outer].parent {
        Some(_) => panic!("Attempt to overwrite existing orbit"),
        None => self.arena[outer].parent = Some(inner),
    }
    // Record the child on the parent
    self.arena[inner].children.push(outer);
}
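And a usage sketch of that, fed a made-up three-orbit input:

let input = "COM)B\nB)C\nC)D";
let mut tree: ArenaTree<String> = ArenaTree::default();
for line in input.lines() {
    tree.insert(line);
}
// Four objects, three orbits
assert_eq!(tree.size(), 4);
assert_eq!(tree.edges(), 3);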

To recap, whenever you want to manipulate a given node you just need its index. Indices are handily Copy, so don't worry too much about passing them around. To get a node's index, call tree.node(val). It will always succeed, performing a lookup first and then allocating a node in your tree's arena if the value wasn't already there. Then it's up to you to set the node's fields to the indices it should link to: self.arena[idx].children.push(outer);. You never need to worry about the memory again - your Vec will drop itself when it can. You define the structure of the tree yourself by what indices are stored in each node and what happens when you insert a new one.

Basically, it's a tree like you want it to be, but it's in Rust and you don't even have to fight about it, and it's great.

Here's a playground link to poke and prod at.

cover image by Amanda Flavell on Unsplash

Top comments (33)

Ben Lovy • Edited

You're ahead of me - I've been slacking this year! You're right - T is a stand-in for whatever type you want as long as it implements the PartialEq trait. There is a usage example in the playground link at the bottom of the post, here's a snippet showing some usage where T is String:

let mut tree: ArenaTree<String> = ArenaTree::default();
let hello = tree.node("Hello".into());
let world = tree.node("World".into());
tree.arena[hello].children.push(world);
tree.arena[world].parent = Some(hello);

The values stored in the hello and world variables are just indexes into the arena vector. You can replace String with any type that implements PartialEq:

#[derive(PartialEq)]
struct CoolData {
    data: String
}
// ...
let mut tree: ArenaTree<CoolData> = ArenaTree::default();
let hello_struct = CoolData { data: "Hello".to_string() };
let hello = tree.node(hello_struct);

The rest will work the same. Does this help?

 
Ben Lovy

Rust is definitely a difficult choice especially once you start hitting things like trees. Educational for sure, but you'll hit a lot more friction than other languages for some of the data structures you'll need. Good luck!

David Wickes

This is interesting! The two thoughts I had were:

  • this is not how I solved this problem for AoC.

  • this really highlights that array indexes are just pointers.

I also had a helluva time trying to make a tree in Rust. This is ace!

Ben Lovy

this really highlights that array indexes are just pointers.

Boring as it's been, being forced to drill data structures in C++ has admittedly been pretty helpful. Nothing is complicated.

Willing to share how you solved it? I saw this solution when I first read the problem and didn't spend much time thinking of other strategies.

David Wickes

Willingly.

I used a reverse lookup hash table. Each key was the name of a node, and its value was the name of the parent node. This was taking advantage of the way the nodes only ever had one parent.

The distance between two nodes was just the sum of each node's distance to their first common ancestor. The first common ancestor was calculated by walking one of the nodes all the way up to the root, collecting the node names, then walking the other one up until you find a name that's in the other list.
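Roughly, a sketch of the idea (not my exact code, and assuming String node names):

use std::collections::HashMap;

// parents maps a node's name to its parent's name
fn path_to_root<'a>(parents: &'a HashMap<String, String>, start: &'a str) -> Vec<&'a str> {
    let mut path = vec![start];
    let mut current = start;
    while let Some(parent) = parents.get(current) {
        path.push(parent.as_str());
        current = parent.as_str();
    }
    path
}

fn distance(parents: &HashMap<String, String>, a: &str, b: &str) -> Option<usize> {
    let path_a = path_to_root(parents, a);
    // Walk b upward until we hit something on a's path to the root
    for (steps_b, ancestor) in path_to_root(parents, b).iter().enumerate() {
        if let Some(steps_a) = path_a.iter().position(|n| n == ancestor) {
            return Some(steps_a + steps_b);
        }
    }
    None
}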

I'd like to say I was being clever, but I definitely stumbled across this one.

Ben Lovy • Edited

walk the other one up until you find a name that's in the other list

Oh, but that IS clever. At a glance though, you're stepping through the first list for each check as you walk up the second - is there a more efficient way to perform that check, or am I overestimating the computational hit? Not that I'd expect it to practically matter...

David Wickes

is there a more efficient way to perform that check, or am I overestimating the computational hit?

Yeah, it's definitely not optimal. Repeatedly 'finding' in a list is annoying. If I was trying to optimize...

You could store the 'list' as yet another hash table, keying off the node name with values set to the distance, as there's no need to store that information as an ordered sequence if you save the distance. That way you could query it to find out if a node is present, step forward on the other path when it's not, or add the distance returned from the hash to the distance travelled from the other path. That's a bit neater.

I'm beginning to wonder whether I just try to solve everything with hashes...

Ben Lovy

It's hashes all the way down...

Andrey Orst

Nice code! I had thought about doing trees this way myself, but I feel that this tree is not as flexible as a pointer-based tree. For instance, if I wanted to add one tree as a child of another, I would have to copy the whole tree node by node, updating each index. That's a kinda slow process. In comparison, in a pointer-based tree we just say that this root node is now a child of that tree.

Another example is deleting subtrees. If we were to remove some nodes and actually deallocate the memory, then we'd be removing data from the vector, which in turn will shift all the values in the vector for each node we delete. This is like super unoptimized for a tree.

I think this solution is great for grow-only trees, and not for trees that change frequently.

Ben Lovy

Both great points. As always, you've gotta fit the data structure to the task, and this solves some problems but not all.

For point two, could you just "leak" those nodes, and then de-allocate in bulk as part of some other operation if needed? @mkm mentioned the node leaking issue, but it seems like an easier solve than the problem you describe, where each node is actually moving for each de-allocation.

Andrey Orst

As always, you've gotta fit the data structure to the task, and this solves some problems but not all.

Exactly. There are always some tradeoffs.

For point two, could you just "leak" those nodes, and then de-allocate in bulk as part of some other operation if needed?

I think leaking is an option, but only if the trees are consistently small and short-lived enough that they'll be deallocated relatively soon. It is a leak after all, and leaks are more painful over time as they tend to grow.

Though in my case (not related to AoC) it wasn't an option, because I parse code into a huge tree structure and then do a big amount of node manipulation during reduction, which in some cases also attaches trees that will be removed again by the end. So after all the reductions, the resulting arena would be bloated with so many dead trees that the memory usage would actually be quite big. The tree would die in the end, of course, but that's not workable for the really huge structures I could theoretically parse.

Ben Lovy

Interesting. I think I'm going to keep playing with this code to see if it can be made more practical. It was so much more ergonomic than any alternative for solving the AoC problem, and took such a minimal amount of planning and code, that I feel all of these concerns could be at least mitigated if not totally resolved with a little more logic built in.

Andrey Orst

I slightly doubt it, but kudos if you come up with something. Leaking nodes seems like the only performant way to handle removal, because if we remove items from the vector, we have to update the IDs of all nodes that were allocated after the removed ID in the arena, which can be a huge amount of work if we decide to remove the root's leftmost child. In that sense I think a HashMap is more affordable, though we'd have to sacrifice some performance to hashing and lookups.

Ben Lovy

HashMap

Heh, that was exactly what I had in mind. You're probably right.

update IDs of all nodes that were allocated after removed ID in the arena

Could this be mitigated by not indexing arena directly, and instead searching for node.idx? All edit operations would just need to ensure they're managing the stored indices properly, but to me it looks like you wouldn't ever need to change an index of a node after it's allocated. It won't match the actual index in the arena, but I'm not sure that matters.
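Something like this, maybe (just a sketch, not in the post):

fn get(&self, idx: usize) -> Option<&Node<T>> {
    // Search by the stored idx field instead of trusting the Vec position
    self.arena.iter().find(|node| node.idx == idx)
}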

It's not something I know much about, but I think there are crates designed around sparse data structures as well. There are trade-offs, I assume, but for trees expected to live a long time while growing and shrinking a lot, they could be useful.

bretthancox

I'm in the middle of day 6 of AoC right now and I'm hitting issues with the shortest route between me and Santa. This article is a great reminder that Rust's type system gives me more flexibility than I'm used to. I've been trying to force HashMap and vectors to work, but your solution is so elegant. Thanks for the write-up. A fresh perspective is what I needed.

Ben Lovy

Rust and Clojure are a great pair for these challenges. They both afford a high degree of expressive freedom but from completely different directions.

Andrey Frolov

Hi there, thanks for the post. What if I want to build different trees/subtrees from a common "pool" of nodes? Like two separate list heads pointing to the same node.

Does this data structure fit well for such a case? (Of course, if we wrap all nodes in Rc.)

Ben Lovy

Hi - because all nodes only refer to each other via a Copy index, instead of directly pointing to each other, you should be able to set up such structures without even needing to wrap anything in Rc. The nodes are only accessed via the arena as needed; the only reference to them is just a number that can be passed around without issue.

Andrey Frolov

Thank you for the clarification

Ben Lovy

The caveat would be simultaneous access - you would need to employ a lock of some sort if you expect multiple threads or tasks to mutably borrow the data in a given node.

Marko Mikulicic

This is a nice way to stay memory safe with respect to the general memory, but with respect to your high-level "memory cells" (the node entries in the arena) you're on your own: the compiler won't protect you from accidentally "leaking" nodes in a long-lived tree (where a leak is just a node whose index is no longer referenced by any other node), nor will it prevent two nodes from pointing to the same child (breaking the tree invariant, etc.).

Ben Lovy • Edited

Great points, absolutely. There is some work to be done here for practical use beyond coding challenges.

I'm very much a novice - it sounds to me like point 2 could be resolved by just writing better methods that verify this property on insertion. Would that suffice?

Leaked nodes are a little trickier - you could conceivably write up a little "stop-the-world" style cleanup that sweeps through for unlinked nodes, but it's not immediately clear to me how to build that in to this structure without requiring the consumer to be aware.

Ryan Palo

This is neat! I ended up using a HashMap of nodes and their children's names, but that involves a lot of duplication. I was reading through the Rust docs and saw that they recommended Rc<RefCell<T>> for stuff like this, but it was pretty intimidating.

Ben Lovy

Totally - I've gone the Rc<RefCell<T>> route for other projects, and it does get you there, but it always feels hacky and brittle. In C++, you really just use the Rc-equivalent shared_ptr when you actually want multiple ownership over some data, not to skirt around language semantics when it's convenient for you - this strategy is kinda antithetical to what draws me to Rust in the first place! There are definitely legitimate uses of runtime interior mutability, but I don't feel this is one of them.

u_dev

This is awesome!
One thing I would like to understand is: the ideal lookup time for a tree is O(log n), but that doesn't seem to hold here, since you are iterating over all the nodes in the vector.

for node in &self.arena {
    if node.val == val {
        return node.idx;
    }
}

This seems to take O(n) time. Is this a known tradeoff?

jchimene

Thanks!
Still an issue in
impl<T> ArenaTree<T>
where
    T: PartialEq
{
    fn node(&mut self, val: T) -> usize {
        //first see if it exists
        for node in &self.arena {
            if node.val == val {
                return node.idx;
            }
        }
        // Otherwise, add new node
        let idx = self.arena.len();
        self.arena.push(Node::new(idx, name));
        idx
    }
}
I think you want to s/name/val/

Gerard Klijs

Nice solution to a common rust problem. Thanks for sharing.

Kamil Dydo • Edited

When creating a node you're setting idx to be the current length of the arena Vec. If you then remove the second-to-last element and create another one at the end, the two last elements will have the same index.

Mike Lezhnin

I think it is better to keep the number of nodes in a separate variable, so that you can easily remove nodes from the tree while leaving them inside the vector.

Ben Lovy

Great point - this code conflates ArenaTree::size() with the length of the vector, which are separate concepts. Your implementation is absolutely more correct.

chichid • Edited

You sure have worked around the borrow checker and the "sprinkling" of 'a all over the place, but here is what you didn't mention:

  • You've now made it a lot easier to create memory leaks. At best there will be nodes lying around for longer than they should be. Basically, no borrow checking, but also you now need to manage that memory yourself.
  • Say goodbye to performant thread safety. If you want to do some parallel processing over all the nodes in your tree, your only bet now is to sprinkle Arcs all over the place. To give you an idea: Arc::new for a simple struct may take 200 to 300% more per allocation. Not to mention the runtime cost of atomic reference counting.
  • The amount of copying this will introduce is going to slow down your application if you have a big tree (e.g. a GUI).

I'd go with dealing with the borrow checker when necessary instead of working around it, even though it introduces ugly, verbose lifetimes everywhere.

Felipe Gusmao

Thanks for the article. Loved it.