DEV Community

William Lewis
William Lewis

Posted on

Priority Queues using Binary Heaps

Many tasks in computing require a data structure which provides access to its elements based on their importance. As an example, consider a rudimentary scheduler that performs tasks according to their stated priority, regardless of when they were added. A run-of-the-mill queue will not do here, since it preserves the order of its elements, releasing them in the same order in which they were enqueued. What we need is a priority queue.

Priority queues present the same interface as (simple) queues: we can check if the queue is empty, push elements onto the "back", peek at the "front" element (without removing it), and pop the front element. Their difference, then, is entirely internal. Specifically, whereas a simple queue merely shunts its elements along, maintaining their arrival order, a priority queue must reorganize its elements each time a new one is added or removed. It turns out there's a really clever (and quick) way to perform this reorganization.

In what follows, we'll implement a priority queue three different ways: the first will be naive, straightforward, but also slow; the second will improve on the first by leveraging an invariant known as the "heap property"; the third will generalize the second by allowing any type of element that can be compared.

Setup

We'll be working in TypeScript, so if you'd like to play along at home you'll need to create an empty directory:

> mkdir priority-queue
> cd priority-queue
Enter fullscreen mode Exit fullscreen mode

and add a file named tsconfig.json with the following content:

{
  "compilerOptions": {
    "strict": false
  }
}
Enter fullscreen mode Exit fullscreen mode

You can also use an application like codesandbox.

A Naive Priority Queue

What's the least work we need to do in order to satisfy the "contract" promised by a priority queue? Let's create a skeleton class definition, and fill in the details as we see fit:

class PriorityQueue {
  private elements: number[] = [];

  isEmpty(): boolean {
    throw new Error('unimplemented');
  }

  push(x: number) {
    throw new Error('unimplemented');
  }

  peek(): number {
    throw new Error('unimplemented');
  }

  pop(): number {
    throw new Error('unimplemented');
  }
}
Enter fullscreen mode Exit fullscreen mode

Note that this first queue only stores numbers. This is a simplification for the time being, and we'll see in part 3 how to relax this constraint.

Let's start simple: all that isEmpty needs to do is check if the elements array has a length of 0:

isEmpty(): boolean {
  return this.elements.length === 0;
}
Enter fullscreen mode Exit fullscreen mode

The implementation of push is more involved, but only slightly so. We need to add the provided element to the end of our elements array, and then sort the array (so that its elements are ordered):

push(x: number) {
  this.elements.push(x);
  this.elements.sort((a, b) => a - b);
}
Enter fullscreen mode Exit fullscreen mode

We've decided to sort our elements in ascending order. Since we're treating the last index as the "back" of our queue (where we push elements) and index 0 as the "front" (where we pop elements), this makes our priority queue a "min priority queue" (because smaller elements have higher priority). Note that we could have just as easily opted for a "max priority queue" by either providing an appropriate ordering function to sort, or using unshift instead of push.

Since index 0 is the "front" of our queue, peek simply needs to return the zeroth element if it exists, and throw an error otherwise:

peek(): number {
  if (this.isEmpty()) {
    throw new Error('attempt to `peek` empty priority queue');
  }

  return this.elements[0];
}
Enter fullscreen mode Exit fullscreen mode

Finally, pop needs to shift the front element off of the queue and return it if it exists. Like, peek, we'll have pop throw an error if the queue is empty:

pop(): number {
  if (this.isEmpty()) {
    throw new Error('attempt to `pop` empty priority queue');
  }

  return this.elements.shift();
}
Enter fullscreen mode Exit fullscreen mode

Let's try it out! Add the following at the end of the same file in which you've written your PriorityQueue class, and run npx ts-node <filename>.

const pq = new PriorityQueue();

console.log(pq.isEmpty()); // => true

pq.push(3);
pq.push(2);
pq.push(10);

console.log(pq.isEmpty()); // => false

console.log(pq.peek()); // => 2
console.log(pq.pop()); // => 2
console.log(pq.pop()); // => 3

pq.push(5);
pq.push(42);

console.log(pq.pop()); // => 5
Enter fullscreen mode Exit fullscreen mode

It works!

Amortization

Re-sorting the elements array after every push is potentially quite costly, and in the next iteration we'll see a really clever way to avoid performing a full sort each time an element is added. However, before doing so, it pays to mention that in some cases, a small variation on the naive strategy is suprisingly quite efficient.

Suppose we know that in the vast majority of cases, our priority queue is used in the following way: first, a large number of elements are pushed (without any intervening peeks or pops); then, a corresponding number of elements are popped.
In this case, we're better off waiting to sort the elements until peek or pop is called. Furthermore, we can keep a flag indicating whether any elements have been added since the last sorting occurred, and only re-sort if the flag is true.
Here's what that might look like:

class PriorityQueue {
  private elements: number[] = [];
  private needsSorting = false;

  isEmpty(): boolean {
    return this.elements.length === 0;
  }

  push(x: number) {
    this.elements.push(x);
    this.needsSorting = true;
  }

  peek(): number {
    if (this.isEmpty()) {
      throw new Error('attempt to `peek` empty queue');
    }

    if (this.needsSorting) {
      this.elements.sort((a, b) => a - b);
      this.needsSorting = false;
    }

    return this.elements[0];
  }

  pop(): number {
    if (this.isEmpty()) {
      throw new Error('attempt to `pop` empty priority queue');
    }

    if (this.needsSorting) {
      this.elements.sort((a, b) => a - b);
      this.needsSorting = false;
    }

    return this.elements.shift();
  }
}
Enter fullscreen mode Exit fullscreen mode

Now, pushing is very cheap, and only the first peek or pop in a sequence of peeks or pops is expensive. That is, in the following test

const pq = new PriorityQueue();

function randomInt(): number {
  return Math.floor(Math.random() * 100);
}

for (let i = 0; i < 1000; i++) {
  pq.push(randomInt());
}

for (let i = 0; i < 1000; i++) {
  console.log(pq.pop());
}
Enter fullscreen mode Exit fullscreen mode

only a single sort is actually performed: the first time that pop is called. This strategy of spreading out over time the cost of an expensive operation is known as "amortization".

Unfortunately, priority queues aren't often used in the way described just now. Instead, elements are often constantly pushed and popped with very few uninterrupted runs of either. To improve our priority queue's performance in this type of scenario, we need a new idea.

Binary Heaps

"Heap" is unfortunately an ambiguous term in computing: it may refer to either a region of program memory, or a kind of data structure. It's the second meaning that concerns us here.

A heap is a binary tree with a certain property, and this property relates to how the elements within the tree are arranged. Recall that in a binary tree, a node may have 0, 1, or 2 children: if it has no children, we often refer to it as a "leaf"; if it has 1 or 2, we'll call it a "parent" node. A binary tree has the "heap property" if every parent node is less than or equal to its child nodes.

Let's consider some examples. The following binary trees all use the same elements: 2, 3, 5, 6, 9, 9.

These two trees are both heaps (that is, they have the heap property).

An example heap

Another example of a heap

They also illustrate some potentially confusing aspects of this property. First, a heap doesn't need to be a "full" binary tree (the right child of the root node only has a single child in both of these cases), but only the last "generation" is allowed to be incomplete. Furthermore, if the last generation is incomplete, the nodes need to be "packed" to the left. At the moment, defining these criteria rigorously is not as important as having a general sense of what they require.

Second, as the two examples illustrate, the same elements can be arranged in a heap in many different ways. Thus, heaps are more "fluid" than sorted arrays: knowing the elements of an array along with the fact that it is sorted is enough information to uniquely identify the array; however, this is not the case for heaps. This fluidity is ultimately the source of the gains in efficiency that they allow for.

The tree below does not have the heap property because 5 is a parent of 3, and 9 is a parent of both 6 and 2.

An example of a non-heap

The following tree is also not a heap, because the final generation isn't "packed" to the left (there is a "hole" indicated by the ghosted edge and vertex).

Another example of a non-heap

Maintaining Order

So heaps are "roughly" ordered trees. How does that help us create a more efficient priority queue? In a moment, we'll swap out our (naive) sorted array for a binary heap. Whereas the array needed to be completely re-sorted each time an element was pushed, a heap only needs to be re-heaped, so to speak. Due to the fluid, rough ordering of heaps, this is a much cheaper operation.

Let's look at how this will work. We need a way to locate the smallest element in a heap, and to restore the heap property after a new element is added or removed. The first of these questions has an easy answer: the smallest element of a heap must be at the root (it can't be the child of any element, since all children are greater than or equal to their parent).

What about restoring order? There are two cases to consider. In the first case, when the root element is removed (e.g. during a pop), we're left with two subtrees. Perhaps surprisingly, we'll move the bottom-most, right-most element into the root position, and then patch things up as needed. How does this patching work? We first compare the new root with its two children; if it's smaller than (or equal to) both, we're done; if it's greater than at least one, we'll swap the root with the smaller of its two children. We then repeat this same process on the subtree that now contains the replaced root.

How does this restore the heap property? Note that all subtrees in a heap have the heap property, so if the new root is no greater than either of its children, the entire tree has the heap property. If the new root is greater than one of its children, swapping the smaller child preserves the heap property in the untouched tree, while reducing the problem size. This restoration process requires at most log(d) iterations, where d is the depth of the tree. This is significantly better than a complete re-sort.

Let's step through an example, using one of the trees above. We begin with a heap:

The initial heap

We then remove the root element:

After removing the root

and replace it with the bottom-most, right-most element:

After replacing the root

We compare the new root with its two children. In this case, it's greater than both of them.

Comparing the new root with its children

As such, we swap the new root and the smaller of its two children:

Swapping the root and the smaller child

We then repeat this process on the left subtree. We compare the root with its two children:

Comparing the swapped root with its children

But since the parent (6) is smaller than both children (9 and 9), we're done. We've successfully restored the heap property:

The heap property restored

Restoring order after adding an element is quite similar, but in reverse. We add an element by attaching it to the bottom-most, right-most node in the tree (starting a new "generation" if the bottom row is full). We then compare this new node and its parent: if the parent is smaller (or equal to) the new node, we're done; otherwise, we swap the new node and its parent, and repeat this process one level up.

As before, the fact that each subtree in a heap is itself a heap

A Compact Representation

Discussion (0)