Kurt

Posted on Feb 22, 2023 • Originally published at getcode.substack.com on Feb 22, 2023

A Nibble of Quadtrees in Rust

#rust #quadtree #datastructure #tree

Nibble: a small piece of food bitten off. In computing: half a byte of information. Every nibble explains a computing science or software engineering idea in five minutes.

That’s right, I just typed quad and tree in the stock image search. Photo by Appic on Unsplash

Quadtrees are a tree data structure in which each non-leaf node has exactly four children. They are related to binary trees and frequently represent properties of two-dimensional space, such as point locations or regions.

All code for this post is here: https://github.com/kurtschelfthout/quadtrees.

The basics of quadtrees

As a data structure, quadtrees are straightforward trees where each branch node has exactly four children.

We can store a variety of information in such a tree, typically related to two-dimensional data, such as the locations of objects in a plane, a summary of a part of an image like its average color, or information about lines or shapes in a region of a drawing. Why are binary trees, quadtrees, and octrees used more often than 5-ary trees or 12-ary trees? Because of the relation to one, two, and three-dimensional space. It takes two line segments to partition a (one-dimensional) line segment, four squares to partition a square, and eight cubes to partition a cube in three-dimensional space.

Like all data structures, quadtrees are used primarily for performance reasons. For example, when simulating a large number of moving objects, naively checking for collisions means comparing the position of each object with every other. If we store the objects' positions in a quadtree, we can limit collision detection to a few regions of interest, greatly reducing the work needed.

In this post, we'll look specifically at region quadtrees. A region quadtree partitions a 2D space into regions - each node in a region quadtree represents a rectangular region. Regions can be split further into four sub-regions, which can be split in turn, until some desired resolution is reached.

Representing images with quadtrees

I'll be using quadtrees to generate lower-resolution versions of an image. The idea is to repeatedly divide the image into four regions. Each region, which can contain many pixels, is approximated by a single value - a "big pixel". I'll take the mean of the RGB values in the region as the color of that big pixel.

As a first attempt at this, I used a complete quadtree - meaning that each level of the tree is fully populated and the tree is perfectly balanced. The leaf nodes represent one pixel each, their parents represent a two-by-two pixel region, and so on. To get a lower-resolution version of the image, we can cut off the tree at the desired level.

Here's an illustration. The original four-by-four image is in the bottom left, and the lowest level of the tree represents it exactly.

Color-challenged people will have to trust me that the colors match up.

Further up in the tree, in the middle level, each region is two by two pixels and is only an approximation of the original. In the top left and bottom right corners, the approximation happens to be exact. At the root level, there is just a single big pixel.

Thanks to the magic of WebAssembly, which compiled my Rust code below to something that runs in the browser, you can play around with quadtrees on a more realistic image:

Playground for complete quadtrees applied to HAL 9000 image

Here's a screenshot of the playground in action. The original image on the left, the image as represented by an adjustable level in the quadtree on the right.

Do these look like those robo-boobs from Austin Powers? Asking for a friend.

That works reasonably well for symmetrical images like the one above, but most images have parts we care less about. It'd be nice if we could focus the quadtree to divide just the interesting parts of the image into smaller regions. To do that, we can no longer have a complete quadtree - interesting regions will gain more levels than uninteresting ones, and the tree will be unbalanced.

Assuming that's fine, how should we determine which region to focus on? The idea is to work iteratively. To start with, we approximate the entire image with a single region of the image's mean color. At each step, we calculate an error value for each leaf region in the quadtree. The error indicates how well the region represents the corresponding region in the original image. I used the mean squared difference between the mean color for that region and the actual colors.

While the error for a particular region is greater than we tolerate, keep subdividing it into four more sub-regions. We'll end up with an approximation of the original image that creates more, smaller regions in interesting parts of the image, and includes fewer, larger regions in less interesting parts.

Here is a demo of region quadtrees in action. You can interactively adjust the desired error and the minimum region length.

Playground for region quadtrees applied to owl image

On the left is the original image and on the right is the image represented by a quadtree. With a higher error tolerance, you'll see fewer and bigger regions, concentrated on busier parts of the image:

Quadtrees laugh at your thousands of years of natural selection for camouflage.

Implementing region quadtrees in Rust

That's it for the demos, let's look at some code for region quadtrees - the second approach.

The representation of a region quadtree is straightforward:

struct Region {
    x: usize,
    y: usize,
    width: usize,
    height: usize,
}

enum RegionQuadTree {
    Leaf(Region, Rgba),
    Branch([Box<RegionQuadTree>; 4]),
}

A quadtree is either a leaf or a branch. A leaf stores the region it represents and its color as an RGBA (Red+Green+Blue+Alpha) value. I chose RGBA simply because that's how HTML's canvas represents a pixel of image data. You can store any information about a region in leaf nodes. A branch only has references to its four children. It usually makes sense to store more information on branches, such as the region they apply to, but I didn't need it here.

The driver for the demo is the following function:

pub fn subdivide_until(&mut self, error_threshold: f32, min_region_length: usize) {
    loop {
        let new_quadtree =
            self.quadtree
                .subdivide(&self.image, error_threshold, min_region_length);
        match new_quadtree {
            Some(qt) => self.quadtree = qt,
            None => break,
        }
    }
}

The function subdivides a quadtree until it reaches a small enough error, or until the length of its regions' sides becomes too small. The latter to avoid continuing indefinitely and underflow problems. The subdivide function has the following signature:

fn subdivide(
    &self,
    image: &Image,
    error_threshold: f32,
    min_region_length: usize,
) -> Option<RegionQuadTree>

It returns a new, subdivided quadtree, or None if either the error threshold or the min region length is met:

fn subdivide(
    &self,
    image: &Image,
    error_threshold: f32,
    min_region_length: usize,
) -> Option<RegionQuadTree> {

    let region = self.region();
    if self.get_error(image) < error_threshold
        || region.height <= min_region_length
        || region.width <= min_region_length
    {
        return None;
    }

    match self {
        RegionQuadTree::Leaf(region, _) => {
            /// ...
        }
        RegionQuadTree::Branch(children) => {
            /// ...
        }
    }
}

The implementation for the cases is straightforward, if verbose.

Subdividing a leaf means splitting it into four regions, taking care not to miss any pixels, and then creating a new branch to hold the four new children:

match self {
    RegionQuadTree::Leaf(region, _) => {
        let half_width_l = (region.width as f64 / 2.0).floor() as usize;
        let half_width_r = (region.width as f64 / 2.0).ceil() as usize;
        let half_height_up = (region.height as f64 / 2.0).floor() as usize;
        let half_height_dwn = (region.height as f64 / 2.0).ceil() as usize;

        let children = [
            Box::new(RegionQuadTree::leaf(
                region.x,
                region.y,
                half_width_l,
                half_height_up,
                image,
            )),
            Box::new(RegionQuadTree::leaf(
                region.x,
                region.y + half_height_up,
                half_width_l,
                half_height_dwn,
                image,
            )),
            Box::new(RegionQuadTree::leaf(
                region.x + half_width_l,
                region.y,
                half_width_r,
                half_height_up,
                image,
            )),
            Box::new(RegionQuadTree::leaf(
                region.x + half_width_l,
                region.y + half_height_up,
                half_width_r,
                half_height_dwn,
                image,
            )),
        ];
        Some(RegionQuadTree::Branch(children))
    }
    RegionQuadTree::Branch(children) => {
        // ...
    }
}

Subdividing a branch means subdividing each of its children. We avoid creating a new tree if all children report they don't need to be subdivided further (by returning None). We also have to put the tree back together in case some children subdivide while others do not.

match self {
    RegionQuadTree::Leaf(region, _) => {
        // ...
    }
    RegionQuadTree::Branch(children) => {
        // call subdivide on all children - zip with existing children to use 
        // as default later.
        let sub_children: Vec<_> = children
            .iter()
            .map(|child| child.subdivide(image, error_threshold, min_region_length))
            .zip(children)
            .collect();

        // all children returned None - avoid returning a superfluous new tree.
        if sub_children.iter().all(|c| c.0.is_none()) {
            return None;
        }

        // replace any None result on children with the "old" child.
        let children = sub_children
            .into_iter()
            .map(|(nc, oc)| Box::new(nc.unwrap_or(*oc.clone())))
            .collect::<Vec<_>>()
            .try_into()
            .unwrap();
        Some(RegionQuadTree::Branch(children))
    }
}

Those are the most important bits - you can look at the rest of the code in the repo.

There's also an implementation of a complete quadtree. The main difference with a region quadtree is that we can figure out the number of nodes beforehand. For example, a complete quadtree with three levels (not counting the root) has 1+4+16+64 nodes. Like with a binary tree, this can be leveraged to store a complete quadtree in an array, and use indices to represent the parent-child structure: children of the node at index i' are stored at index 4*i + 1 to 4*i + 4.

What does this have to do with z-order?

In the last nibble I said Z-order curves are useful to create quadtrees. I haven't used them here, but it's worth exploring the connection. If you squint at the creation of four new leaf nodes in the RegionQuadTree::Leaf case of subdivide above, you'll see that the leaves are created in z-order: top-left, bottom-left, top-right, bottom-right. That's true recursively, as creating new branches keeps the same order as the existing children. It's slightly clearer in a complete quadtree because all its regions have the same size.

If the regions to insert are known, you can avoid overhead while creating the tree if you feed the arguments to the tree creation processor in z-order. It's also possible to parallelize quadtree creation that way.

In conclusion

Wikipedia lists interesting applications of quadtrees, including image representation and processing, collision detection, and spatial indexing for location queries. At its heart, a quadtree is simply an extension of binary trees, with each node having four children instead of two. However, just like binary trees, this simple structure allows many interesting uses.

Thanks for reading! I write one nibble every month. For more, subscribe to my newletter and follow me on Twitter or Mastodon.