DEV Community

Paula Gearon
Paula Gearon

Posted on • Updated on

Shaking the Tree

Part 7, with part 6 back here, or go back to the beginning here.

A New Structure

The examples so far have all demonstrated linked lists. These are useful in some circumstances but are limited in functionality.

The next structure I'm going to consider is a tree.

The elements (or nodes) of a tree are very similar to the elements in the linked list, but instead of having a single reference to the next element in the list, they have multiple references, for each of the children. Let's consider a simple tree with at most 2 children per node. These are called "Binary Trees".
Simple binary tree
Many people will be familiar with the terminology around trees, but I'm going to just run through some of the main terms quickly.

Simple Tree

A tree is made up of nodes, with each node having values (in this example, these are the numbers) and 0 or more children. This is a binary tree, which means that nodes have just a single value, and at most 2 children. These are shown above as arrows. Note that the bottom nodes have 0 children.

We can see a simple example of this with a basic Node class:

class Node {
  constructor(value) {
    this.value = value;
  }

  setLeft(left) {
    this.left = left;
    return this;
  }

  setRight(right) {
    this.right = right;
    return this;
  }
}

The constructor could take the left and right values if they are available, but I will keep this class simple for now.

The tree shown in the above diagram is ordered and follows the simple rule that all numbers under a node to the left will have smaller values, and all numbers under a node to the right will have larger values. The top of the tree (containing the value "4") is called the root.

So now we can build a small tree with the values 1/2/3:

let one = new Node(1);
let two = new Node(2);
let three = new Node(3);
two.setLeft(one);
two.setRight(three);

The result is a small structure with the value of "2" at the root:

Node { value: 2, left: Node { value: 1 }, right: Node { value: 3 } }

You may have noticed that the setLeft and setRight functions return the node itself. This allows chaining of function calls, which can make code a little more compact. This is also why I did not use the standard setter syntax for Javascript, since that should not be returning a value.

Chaining the function calls means that we can skip saving the nodes with a label where we aren't setting the children.

let two = new Node(2).setLeft(new Node(1)).setRight(new Node(3));

This has exactly the same result as above, but is a little quicker to write.

Now let's build the entire tree found in the initial diagram:

let two = new Node(2).setLeft(new Node(1)).setRight(new Node(3));
let six = new Node(6).setLeft(new Node(5)).setRight(new Node(7));
let root = new Node(4).setLeft(two).setRight(six);

It is possible to do all of this in a single line, but that goes beyond making the code more compact and starts making it obtuse, so let's not go there!

The result for all of this is the tree that was shown in the initial diagram:

Node {
  value: 4,
  left:
   Node { value: 2, left: Node { value: 1 }, right: Node { value: 3 } },
  right:
   Node { value: 6, left: Node { value: 5 }, right: Node { value: 7 } } }

Tree Properties

You may notice that any node in the tree can also be considered as the root of a smaller tree. Each of these is called a branch of the tree. A number of common tree operations are defined by doing some task on the node at the root of a tree (which may be a branch of a larger tree) and also doing the same operation to each of the child branches. This is similar to processing linked lists, where an operation involves some task on the current head of the list, and then the operation on the list that follows that element.

An example of this might be to create a string containing all the values of the tree. To do this, a node can get the string for the left branch, add the current value, and then add the string for the right branch.

function treeString(node) {
  if (node === undefined) return null;                                                                                    
  let leftString = treeString(node.left);                                                                                 
  let rightString = treeString(node.right);
  let result = (leftString === null) ? node.value : leftString + ", " + node.value;
  if (rightString !== null) {
    result += ", " + rightString;
  }
  return result;
}

The ternary expression (using ?:) and the final if statement both help determine if commas are necessary for the list.

By testing for an undefined node on the first line, the function can call treeString on each of its children without having to test if they exist first. You may also notice that this function is recursive, and this test for undefined ensures that the recursion stops when it gets to the bottom of each branch.

Calling treeString(root) we can now get the full contents of the tree as a string:
1, 2, 3, 4, 5, 6, 7

Height

The height of a tree is the maximum number of nodes you have to go down to get to a child with no further children. So the height of the entire tree is 3. The height of the branches at nodes "2" and "6" is 2. Nodes at the bottom all have a height of 1 and are also referred to as leaves.

Calculating the height can be done recursively. The height of a node is the larger of the heights of the left and the right, plus one. If there is no node, then it has a height of 0.

function height(node) {
  if (node === undefined) return 0;
  return 1 + Math.max(height(node.left), height(node.right));
}

Trying this with height(root) gives 3

Balance

An additional feature of this particular tree is that it is balanced. That means that the height of each of the child branches never differs by more than 1. Consequently, this tree of 7 nodes has a height of just 3. Trees do not need to be balanced, but they are much more efficient when they are, as the height will be smaller.

An extreme example of an unbalanced tree would be if the left node were always undefined. The root of our 7-node tree would be "1", and the tree would always only have a child to the right, all the way down to "7". This is the equivalent of a linked list, and the height of this unbalanced tree would be 7.

As trees fill up, they may become unbalanced, and this may need to be quantified. Generally, being unbalanced to the left will result in a negative number, and being unbalanced to the right would provide a positive number:

function balance(node) {
  return height(node.left) - height(node.right);
}

Sorting

So far we have shown that a tree can hold the data you put into it, but it holds it in a strange structure. After getting the data back out, it looks just like what could have been stored in a linked list. What benefits does all this complexity provide? The most common advantage is that they can sort data.

The example code building the tree above was a very manual operation, carefully building the tree's structure to demonstrate precisely what we wanted. This isn't particularly useful in a general case since the incoming data is unlikely to be known ahead of time. Instead, the code that inserts data into the tree can be used to implement the sorting rule mentioned earlier: data that is smaller than a node will go to the left of it, and data that is equal or larger will go to the right.

To do this, let's build a Tree class that stores a root, and can insert data.

class Tree {
  constructor(value = null) {
    this.root = value === null ? null : new Node(value);
  }

  add(value) {
    let node = new Node(value);
    if (this.root === null) {
      this.root = node;
    } else {
      this.insertNode(this.root, node);
    }
    return this;
  }
  // ... more ...
}

The add method starts by constructing a node to hold the value. If the tree does not yet have a root, then this node becomes the root. Otherwise, call the insertNode method, passing the root, and the node to add. The insertNode method could be a function that is external to the class, but I expect it to be in the context of the tree when new features are added, so we will put it inside the class:

  insertNode(node, newNode) {
    if (newNode.value < node.value) {
      if (node.left === undefined) {
        node.setLeft(newNode);
      } else {
        this.insertNode(node.left, newNode);
      }
    } else {
      if (node.right === undefined) {
        node.setRight(newNode);
      } else {
        this.insertNode(node.right, newNode);
      }
    }
    return this;
  }

This looks more complex, so let's consider what is happening.

The first if statement checks if the value of the node being inserted is smaller than the value in the current node. If so, it should be inserted to the left, otherwise it will be inserted on the right. If a different type of sorting pattern is desired, then a different operation to the simple less than operator will be used for this test.

The next part is similar to how the head is addressed in the add function, in that if there is no node on the left, then set the left to this node, otherwise call insertNode with the left node as the starting point.

If the value of newNode is equal to or larger than the value of the current node, then do exactly the same thing with the right.

Sorted π

What happens if we try to insert the first few digits of π into the tree?
3.141592

Let's start with the first 3 digits.

let piTree = new Tree(3).add(1).add(4);

Starting with the 3, this will become the root of the tree. Adding the 1 will go to the left of the root, and adding 4 will go to the right. The nodes are shown here, with boxes to indicate the left and the right references.
Tree for 314

Adding the next few digits expands the tree further:

piTree.add(1).add(5).add(9).add(2);

Adding the 1 will go to the left of the 3, and then get to another 1. We've said that all values that are greater than or equal go to the right, so the new node goes to the right of this 1. The 5 starts at the root and goes right to the 4 node, and the new node to placed to the right of this. Then, the 9 will start at the root, and will continually travel right until it reaches the bottom. Finally, the 2 will go first to the left of the root and then go right until the bottom.
Pi tree

This tree is not well balanced. If the tree were fully balanced, the height would be 3, but instead, the height is 4. While the root is balanced, both of the branches are unbalanced with a balance of 2 (the 1 and 4 nodes have 0 height on the left, and a height of 2 on the right). I'm not going to address how to balance this just yet, but it is not something that should be neglected.

Now that the data has been added, how does the tree look?
treeString(tree.root)
1, 1, 2, 3, 3, 4, 5, 5, 6, 9

This is all the data, in a fully sorted form.

Sorting

Sorting data like this is the basis of indexing data. It allows stored data to be found quickly. For instance, if the data being stored were the words in a dictionary, they could be inserted in any order and the tree would sort them into alphabetical order.

If the data is more complex, then elements could be sorted by functions specific to that data type. For instance, student records could be sorted alphabetically by family name, then given name, and finally by student ID. The function for performing sorts would be provided to the tree constructor and used by the insertNode function. (Recall how I wanted that function inside the class? That gives it access to a function with a name like this.comparator).

Once data has been sorted, it becomes efficient to search through it. When we look for a word in a standard printed dictionary, we typically start at a reasonable place and then jump back and forth by progressively smaller jumps until we narrow in on the word we want. A tree structure will do something similar. The root is similar to starting halfway through a dictionary, and each step down the tree is equivalent to jumping back and forth through the book.

This is an efficient approach to finding a word since you will never need to step further than the height of the tree. If the tree containing n elements is kept balanced, then a binary tree will have a height of about log2(n). This means that every time you double the size of the data, the search only gets 1 step longer. For 1000 items, then the tree will only be 10 steps deep. For 2000, it will just be 11 steps down. This makes finding data fast and efficient.

Recap

In this post we went through the basics of building a binary tree, providing examples in Javascript. Once the basic structure was in place, we added code that could keep the tree sorted. This resulted in a useful structure that can sort data but isn't fully efficient since the resulting tree was unbalanced.

Next, we will put this structure on disk, before considering what can be done to keep it balanced.

Discussion (0)