This article was originally published on my website https://coder.today/fast-priority-queues-in-go-hierarchical-queue-86daf6a7553a
Writing O(1) high performance data structures is a multi-post series:
Hierarchical Queue (this article)
-
Ladder Queue (soon)
A priority queue is an abstract data structure, used to store values (any data) with a priority. You can insert data at any time with any priority, but you can only take out the value with the highest priority.
When to use priority queues ⁉
I confess, I didn’t use many of them in my apps, but there are a few problems perfect for it:
triage — any processing you have based on a score/importance
ordering — when your data has a specific order of processing/analysis, ex: process the DAU KPI before MAU
Event driven simulation — priorities can be timestamps
Graph searching
Load balancing — the priority is the inverse of the load
Baby steps
I started a long last relationship with Go. I didn’t want my training to be in vain so I began to look for high performance data structures to implement (that aren’t already). After a few searches I ended up with this image, snippet from a book
The 2 bottom lines attracted my interest, they are O(1) multi-structure priority queues. First of them is the Hierarchical Heap, which is based on a Hierarchical Queue, so let’s dive in.
Hierarchical Queue
I couldn’t find any open-source code or pseudocode for this structure, if you find any issue please let me know.
“When the priority values are limited to small integers (e.g. digital images often have 8-bit or 12-bit integers as pixel values when they come off the imaging sensor), it is possible to allocate a FIFO queue (bucket) for each possible priority value. An array contains a pointer to each of these buckets, and when enqueueing an element, the correct bucket can be directly indexed using the priority value. Both enqueue and dequeue are O(1) operations.”
Papers used for the algorithm and snippets:
It is a simple structure, very fast but it has a few limitations.
It has only a limited number of priority values (in my implementation is uin8(0–255)). Each value has it’s own queue, so increasing the number you end up with a memory overhead.
Once the highest priority queue is empty, it is removed and the next queue can begin to empty, and it cannot be created again. This means we must fill the queues before we start to dequeue, otherwise our performance will decrease , example: The dequeue process has only 1 priority left (255) and we enqueue other elements. All the new elements will be pushed in the 255 queue (because the 0–254 are empty and removed).
Go code go
Priorities can be uint8 and values interface{} .
I decided to remove the 2nd restriction, I think it made the structure too restrictive for most of the real world scenarios. I managed to keep a very similar performance (≤ 50ns per operation). I cached the highest priority that has values and start to dequeue from there. In some cases, the Dequeue process can be O(1 + k) , where K number of empty queues, but overall is amortized if the priorities are well balanced.
This means that the life of the structure is extended, it can be reused and higher priority values can be added even after the dequeue process started.
In the first iteration I used linked lists as queues (used golang list.List), and the average operation time was 150–200ns. I ended up using a faster structure (thanks to a suggestion of Egon Elbre). It reduced the operation time down to under 50ns (it’s a pretty fast damn list), see collection/deque:
CookieJar - A contestant's toolbox
CookieJar is a small collection of common algorithms, data structures and library extensions that were deemed handy for computing competitions at one point or another.
This toolbox is a work in progress for the time being. It may be lacking, and it may change drastically between commits (although every effort is made not to). You're welcome to use it, but it's your head on the line :)
Installation
To get the package, execute:
go get gopkg.in/karalabe/cookiejar.v2
To import this package, add the following line to your code:
import "gopkg.in/karalabe/cookiejar.v2"
For more details, see the package documentation.
Contents
Algorithms:
Data structures:
Extensions:
Because concurrency is important in Go, the structure has a mutex. It can be used manually or automatically (each function call blocks the mutex).
To be as generic as could be, the queues can store any data type interface{}.
Packing up
The Hierarchical Queue structure is part of “priorityqueue” package, it has 100% test coverage, examples, benchmarks and documentation.
bgadrian / data-structures
Abstract data structures Go packages, built with performance and concurrency in mind to learn Go.
Data structures in Go
I am writing a collection of packages for different data structures in GO.
Why? To learn Go, practice basic structures and learning to code fast concurrent algorithms.
All the packages have 100+% test coverage, benchmark tests and godocs. Tested with go 1.9.
!! Warning This library wasn't used in production (yet). !!
priorityqueue
A collection of performant, concurrent safe, complex abstract data structures used for priority queues.
Priority queue is an abstract data type which is like a regular queue or stack data structure, but where additionally each element has a "priority" associated with it. In a priority queue, an element with high priority is served before an element with low priority.
Hierarchical Queue description
An O(1)/O(1+K) priority queue (very fast) implementation for small integers, that uses an assembly of N simple queues. It is optimized for large amount of data BUT with small value priorities…
Next
The next data structure is the Hierarchical Heap, which is based on the Hierarchical Queue, removing it’s limitations for a small cost of performance.
Fast priority queues in Golang: Hierarchical Heap.
Hierarchical Heap is a very efficient Priority Queue O(log n/k) for large data sets.
Contribution
It (was) my first library written in Go, I never used these algorithm in production, so please correct me if I did something wrong.
Top comments (2)
Cool. How does this compare to Pairing heaps or Fibonacci heaps?
Those heaps have
O(log n)
for delete-min (dequeue), this one has constant time for every operation, but as I said in the article, it has limitations.A Heap has to move some values around after a delete is done, as in here, you only have to popup from a queue.