DEV Community

Cover image for Build Binary Tree from Array
Alexey Yuzhakov
Alexey Yuzhakov

Posted on • Updated on

Build Binary Tree from Array

Intro

If you are interested in algorithms, data structures, and building efficient solutions or just preparing for the coding interview, you are aware of LeetCode and similar websites. Here, I will talk about a data structure called Binary Tree and the ways to build it using the array representation. LeetCode has dozens of such problems to practice with this data structure.

Problem

One of the ways to store the tree is to use an array representation. It’s hard for humans to analyze it, but it can be compact and convenient for machine processing. LeetCode is not an exception, so the typical problem describes the input as an array, converts it to the tree representation, and expects you to provide a solution. But if you want to experiment with the algorithm in your favorite IDE, like PyCharm, instead of a built-in LeetCode editor, here is a problem. The conversion of the array to the tree representation is hidden, and you need to implement it on your own.

The tree node is represented as follows:

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
Enter fullscreen mode Exit fullscreen mode

An example of the input is below:

[1, 2, 3, null, 4, null, null, 5, 6, null, 7]
Enter fullscreen mode Exit fullscreen mode

Visual representation of such a tree looks like the following:

Binary tree

We need to create a function that accepts an array as a parameter and returns the root node of the created tree:

def build_tree(arr: list[int]) -> TreeNode | None:
    # TODO: implement
Enter fullscreen mode Exit fullscreen mode

Approach #1

At first glance, it looks like the classic approach with 2i + 1 for the left child index and 2i + 2 for the right one should work well. Here is a visual representation of how to find the indexes for children nodes during the array traversal.

Array traversal

The recursive implementation may look like the following:

def build_tree(arr: list[int], i: int, n: int) -> TreeNode | None:
    root = None
    if i < n and arr[i] is not None:
        root = TreeNode(arr[i])
        root.left = build_tree(arr, 2 * i + 1, n)
        root.right = build_tree(arr, 2 * i + 2, n)
    return root
Enter fullscreen mode Exit fullscreen mode

An example of the call:

arr = [1, 2, 3, None, 4, None, None, 5, 6, None, 7]
root = build_tree(arr, 0, len(arr))
Enter fullscreen mode Exit fullscreen mode

I use None to make Python syntax valid and represent absent nodes. We need to notice our tree is not a “complete binary tree” or “binary heap” where all levels, except possibly the last, are completely filled, and nodes on the last level go from left to right. So it’s ok to have the missed nodes. And here is a gotcha! LeetCode uses the specific array representation. Take a look at the visualization once again:

Binary tree

The node with value 4 has two children: 5 and 6. Let’s enumerate our array with indexes:

[0: 1, 1: 2, 2: 3, 3: None, 4: 4, 5: None, 6: None, 7: 5, 8: 6, 9: None, 10: 7]
Enter fullscreen mode Exit fullscreen mode

The node with value 4 has an index 4. Our formula tells us that the left child of it is calculated as follows: 2*4+1=9 and the right one: 2*4=10. But it’s None and 7 values instead of 4 and 5. The correct array representation to comply with the formula should look like the following:

[1, 2, 3, None, 4, None, None, None, None, 5, 6, None, None, None, None, None, None, None, None, None, 7, None, ...]
Enter fullscreen mode Exit fullscreen mode

As you can see, there are a lot of None values, and the total number of elements is much higher. We are talking about the binary tree, and the tree in the example has 5 levels. So, the number of items to store all the possible values is 1+2+2*2+2*2*2+... = 31. Meanwhile, our tree has 7 values and uses only 11 array items to store them.

To validate the correctness of the tree, we can implement a simple string representation of the class (nesting represents the parent-child relationship):

class TreeNode:
    # ...

    def __repr__(self):
        def traverse(root: TreeNode, level: int) -> str:
            if not root:
                return ''
            prefix = '  ' * level
            return f'{prefix}({root.val})\n' + traverse(root.left, level + 1) + traverse(root.right, level + 1)
        return str.rstrip(traverse(self, 0))
Enter fullscreen mode Exit fullscreen mode

Let’s print out our tree:

    (1)
      (2)
        (4)
          (7)
      (3)
Enter fullscreen mode Exit fullscreen mode

The created tree is incorrect. LeetCode uses a more compact representation, which detects children's node offsets using the other way.

Approach #2

To deal with missed values and compact representation, we can utilize the iterative approach and queues this time:

def build_tree(arr: list[int]) -> TreeNode | None:
    if len(arr) == 0:
        return None

    nodes = []

    val = arr.pop(0)
    root = TreeNode(val)
    nodes.append(root)

    while len(arr) > 0:
        curr = nodes.pop(0)

        left_val = arr.pop(0)
        if left_val is not None:
            curr.left = TreeNode(left_val)
            nodes.append(curr.left)

        if len(arr) > 0:
            right_val = arr.pop(0)
            if right_val is not None:
                curr.right = TreeNode(right_val)
                nodes.append(curr.right)

    return root
Enter fullscreen mode Exit fullscreen mode

I use the list type for illustrative purposes, which can be replaced with queue implementation because we have a lot of pop(0) operations.

Let’s validate the correctness of the new method:

    (1)
      (2)
        (4)
          (5)
            (7)
          (6)
      (3)
Enter fullscreen mode Exit fullscreen mode

Everything is correct. The proposed solution works well on the LeetCode’s way of representing the trees as arrays.

Conclusion

The array representation usually stores "binary heaps" or "complete binary trees." But as you can see, some tweaks help to store any binary tree as an array in a rather compact form. But the re-building of the tree from such an array also becomes trickier, and you need to be careful with indexes for missed values.

The complete source code can be found here.

Top comments (1)

Collapse
 
m__mdy__m profile image
mahdi

Very good, I did the same thing with js visually in the browser
link