Jonathan Boudreau

Why Hash Maps are Randomized

I implemented a hash map (hash table) on one of my side projects recently just for fun and learned an interesting tidbit about security that I wasn't aware of. Figured I'd share it!

How Hash Maps Work

At its core, the idea behind a hash map is that you convert a given key (e.g., a string) into an index using a hash function. That index then determines where your data lives in an array. Insert, delete, and read all have constant time complexity (best case 😉). This means that no matter how many elements your hash map holds, the time your lookups take shouldn't really change.

However, since we are compressing an arbitrary key into a single integer, it is possible for two different keys to produce the same index, which is called a collision. The solution is to use another array (a bucket) inside our hash map's array to hold all the entries that collide.
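
To make that concrete, here is a toy hash function for strings (a simplified djb2-style hash; it's just an illustration of how a key becomes an integer, not what real runtimes actually use):

// Toy string hash: fold each character code into a running 32-bit value.
function hash(key) {
    let h = 5381
    for (let i = 0; i < key.length; i++) {
        h = (h * 33 + key.charCodeAt(i)) >>> 0 // keep it a 32-bit unsigned int
    }
    return h
}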

I'll be skipping over some details, but just to give you an idea, this is roughly what a hash map implementation might look like:

class HashMap {
    constructor() {
        this.buckets = new Array(1000) // arbitrary capacity
    }
    set(key, value) {
        // keep the index within the bounds of the backing array
        const index = hash(key) % this.buckets.length
        const bucket = this.buckets[index]
        if (!bucket) {
            // first entry for this index: create the bucket
            this.buckets[index] = [{
                key: key,
                value: value
            }]
        } else {
            // collision or existing key: scan the bucket
            const item = bucket.find(item => item.key === key)
            if (item) {
                item.value = value
            } else {
                bucket.push({
                    key: key,
                    value: value
                })
            }
        }
    }
}
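
For completeness, a get method that could be added to the HashMap class above would look roughly like this, mirroring the same bucket layout:

    get(key) {
        const index = hash(key) % this.buckets.length
        const bucket = this.buckets[index]
        if (!bucket) {
            return undefined
        }
        // on a collision we still have to scan the bucket,
        // but with a good hash function the bucket stays tiny
        const item = bucket.find(item => item.key === key)
        return item ? item.value : undefined
    }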

Denial of Service

Denial of service broadly refers to an attack where the service is brought down either temporarily or permanently.

Quite often this is done by repeatedly triggering a highly expensive operation on the server side.

The Attack

As mentioned above, when collisions occur, more items end up in the same bucket. If this happens enough, lookups in your hash map start to behave like scans over a sequential array of keys and values. In other words, the complexity becomes linear.

Depending on how critical the performance of your hash map is, a relatively small amount of crafted input can tie up a disproportionate share of your server's resources, since every insert and lookup has to walk an ever-growing bucket.
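
To make the attack concrete, here is a sketch of how colliding keys could be found against the unseeded toy hash from earlier. Because the hash is public and has no secret, the attacker can brute-force keys offline until enough of them land in the same bucket (findCollidingKeys is just an illustrative name):

// Brute-force keys that all map to the same bucket index.
// This only works because hash() is known and has no secret seed.
function findCollidingKeys(targetIndex, bucketCount, howMany) {
    const keys = []
    for (let i = 0; keys.length < howMany; i++) {
        const candidate = 'key' + i
        if (hash(candidate) % bucketCount === targetIndex) {
            keys.push(candidate)
        }
    }
    return keys
}

// Every one of these keys lands in bucket 0 of a 1000-slot map,
// so inserting all of them degrades that bucket into a linear scan.
console.log(findCollidingKeys(0, 1000, 5))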

The Solution

For an attacker to cause collisions in our hash map, they need to know ahead of time which keys will collide. Our solution revolves around making it practically impossible for the attacker to guess a combination of keys that would cause enough collisions to hurt the server's performance.

Some hash functions allow a seed to be specified. When the hash map is created, we generate a random seed, which means each hash map instance will map the same key to a different index. If the seed comes from a cryptographically secure random number generator, the attacker has no practical way of guessing it.

Here is the rough idea for the solution:

class HashMap {
    constructor() {
        // per-instance seed, ideally from a cryptographically secure RNG
        this.seed = random()
        this.buckets = new Array(1000) // arbitrary capacity
    }
    set(key, value) {
        // the seed changes how every key maps to an index
        const index = hash(this.seed, key) % this.buckets.length
        const bucket = this.buckets[index]
        if (!bucket) {
            this.buckets[index] = [{
                key: key,
                value: value
            }]
        } else {
            const item = bucket.find(item => item.key === key)
            if (item) {
                item.value = value
            } else {
                bucket.push({
                    key: key,
                    value: value
                })
            }
        }
    }
}
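
Continuing the toy example, a seeded variant might simply mix the seed into the hash's starting state, and the seed itself can come from a cryptographically secure source such as Node's crypto.randomInt (seededHash and the exact mixing below are illustrative, not how production hash maps actually do it):

const crypto = require('crypto')

// Seeded variant of the toy hash: the same key maps to a different
// index for every seed, so an attacker can't precompute collisions.
function seededHash(seed, key) {
    let h = (5381 ^ seed) >>> 0
    for (let i = 0; i < key.length; i++) {
        h = (h * 33 + key.charCodeAt(i)) >>> 0
    }
    return h
}

// One secret seed per hash map instance, from a CSPRNG.
const seed = crypto.randomInt(2 ** 32)
console.log(seededHash(seed, 'hello') % 1000) // a different index on each run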

Top comments (3)

Tobias SN • Edited

I honestly always thought that a hash map was just an alternate version of a dictionary. Thanks for clearing that up!

Jonathan Boudreau

No problem!

Mihail Malo

A HashMap is one possible concrete data structure that implements the Dictionary ADT (abstract data type).