loading...
Cover image for Learn Data Structure and Algorithm in JavaScript | Part 11

Learn Data Structure and Algorithm in JavaScript | Part 11

edisonpebojots profile image Edison Pebojot(πŸ‘¨β€πŸ’») ・13 min read


Prerequisite (βœ‹πŸ˜)

If you're reading this article right now, please considered to read our Part 01: Big-O Notation, Part 02: JavaScript Unique Part, Part 03: JavaScript Numbers to Part 10: Searching and Sorting Algorithm

Table of Contents
πŸ‘‰ Part 01: Big-O Notation(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 02: JavaScript Unique Parts(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 03: JavaScript Numbers(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 04: JavaScript Strings(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 05: JavaScript Array(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 06: JavaScript Object(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 07: JavaScript Memory Management(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 08: Recursion(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 09: Sets(πŸ”₯πŸ”₯)➑️
πŸ‘‰ Part 10: Searching and Sorting(πŸ”₯πŸ”₯)βœ‹

Part 11: Hash Tables (😱 πŸ”₯ πŸ“)

Alt Text

A hash table is a fixed-sized data structure in which the size is defined at the start. This part 11 explains how hash tables work, and the method of generating a unique key. By the end of this part 11, you will understand various hashing techniques and know how to implement a hash table. (πŸ˜ƒ)

Introducing Hash Tables (πŸ“πŸ“)

Alt text

Hash tables are excellent for quick storage and retrieval of data based on key-value pairs. In JavaScript, JavaScript objects work this way by defining a key and value.


Alt Text

Figure 11-1. Simple hash table overview

Figure 11-1 shows each key and its associated item. A hash table contains two functions: put() and get(). put() is used for storing data, while get() is used for retrieving data from the hash table. Both of these functions have a time complexity of O(1)O(1) .

localStorage is an example of a data structure based on a hash table. It is a native JavaScript object. It lets developers persist(preserve) data. Let's see an example of localstorage:

Example:

localStorage.setItem("testKey", "testValue");
//-----------------------------------
console.log(localStorage.getItem("testKey")); // prints "testValue"

Note (πŸ“): click remix to edit. And view app to view the result. To see the localstorage example please see the script.js

Hashing Technique (βž‘οΈπŸ”’)

Alt text

The most important part of a hash table is the hash function. The hash function converts a key into an index that stores the data. The three(3) primary requirements for a good hash function are as follows(see below):

  • Deterministic: Equal keys produce equal values.
  • Efficiency: It should be O(1)O(1) in time.
  • Uniform distribution: It makes the most use of the array. (πŸ˜‰)

The first technique for hashing is to use prime numbers. With prime numbers, a uniform distribution of the index can be guaranteed.

Prime Number Hashing (1οΈβƒ£βž‘οΈπŸ”’)

Remember: Prime numbers in hashing are important. Modulus division using prime numbers yields an array index in a distributed(uniform) manner. Let's see an example below:

We have Prime Numbers: 5,Β 7,Β 4,Β 13,Β 15,Β 175,\space 7,\space 4,\space 13,\space 15,\space 17 and a Modulus Number: 1111

5Β %Β 11=55 \space \% \space 11 = 5

7Β %Β 11=77 \space \% \space 11 = 7

4Β %Β 11=44 \space \% \space 11 = 4

13Β %Β 11=213 \space \% \space 11 = 2

15Β %Β 11=415 \space \% \space 11 = 4

17Β %Β 11=617 \space \% \space 11 = 6

Collisions can be seen with 1515 and 44 yielding the same key because they're not even a prime numbers. This is the list of prime numbers just below:

Alt Text

Handling this collision is discussed later below. As of now, I just want to remind you that such collision exist in hash table. What is important here is that modulus by prime numbers guarantees the best distribution for a fixed size, okay? (😐). A small modulus number such as 44 by a non-prime number which is 66 and 1010 guarantees only a range from 00 to 33 and leads to a large number of collisions (πŸ’₯). Let's see an example below:

We have Non-Prime Numbers: 6,Β 106,\space 10 and a Modulus Number: 44

6Β %Β 4=26 \space \% \space 4 = 2

10Β %Β 4=210 \space \% \space 4 = 2

Remember: This is the first(or basic) hashing technique. Let's see an example hash table below:

Alt Text

Figure 11-2. Hash table of size 11, with all empty elements

This hash table of size 11, with all empty elements is the same as the example JavaScript code below(now look (πŸ‘€)):

Example:

[
    { keys: "", values: "" }, // 0
    { keys: "", values: "" }, // 1
    { keys: "", values: "" }, // 2
    { keys: "", values: "" }, // 3
    { keys: "", values: "" }, // 4
    { keys: "", values: "" }, // 5
    { keys: "", values: "" }, // 6
    { keys: "", values: "" }, // 7
    { keys: "", values: "" }, // 8
    { keys: "", values: "" }, // 9
    { keys: "", values: "" }  // 10
]

In this example below(look first), we will only use 4 index for the sake of simplifying my explanation (πŸ˜…), now in the example below, let’s hash the following key-value pairs:

Example:

[
    { keys: 7, values: "hi" },      // 0
    { keys: 23, values: "hello" },  // 1
    { keys: 43, values: "sunny" },  // 2
    { keys: 31, values: "weather" },// 3
]
// We have index 0, 1, 2, 3. With a total of 4

We have Prime Numbers: 7,Β 23,Β 43,Β 317,\space 23,\space 43,\space 31 and a Modulus Number: 1111

7Β %Β 11=77 \space \% \space 11 = 7
23Β %Β 11=123 \space \% \space 11 = 1
43Β %Β 11=1043 \space \% \space 11 = 10
31Β %Β 11=931 \space \% \space 11 = 9

The resulting hash table is shown in Figure 11-3 below:

Alt Text

Figure 11-3. Hash table after inserting the value pairs

Now let’s hash non-prime number 18 with modulus 11:

Results: 18Β %Β 11=718 \space \% \space 11 = 7

This is a problem because 7 already exists in the index and causes an index collision. With a perfect hashing function, there are no collisions. Therefore, there are strategies for handling collisions needed for hash tables which is: Linear Probing, Quadratic Probing, and Double Hashing. Let's see them below. Let's go! (πŸ”₯πŸ”₯πŸ”₯)

Probing (βž‘οΈβ†˜οΈπŸ”„)

To work around collisions, probing finds the next available index in the array. The linear probing technique resolves conflicts by finding the next available index via incremental trials ( n+1n + 1 ), while quadratic probing uses quadratic functions to generate incremental trials ( n2n^{2} ). Don't confuse them for now, we will see more example below! (πŸ™‰)

Linear Probing (πŸ”’βž‘οΈβž‘οΈπŸ”‘)

Linear probing works by finding the next available index by incrementing one index ( n+1n + 1 ). For example, in the case of 18 hashing to the same key of 7, 18 would be hashed into key 8 because that’s the next empty spot. See the example below:

Alt Text

Figure 11-4. Hash table 1 after using linear probing

When the get( key ) function is used, it has to start at the original hash and then iterate. We will discuss later what get( key ) function is just below.

Note (πŸ“): The main disadvantage of linear probing is it easily creates clusters(A grouping of a similar number), which are bad because they create more data to iterate through.

Quadratic Probing (πŸ”’β†—οΈβ†˜οΈπŸ”‘)

Quadratic probing is a good technique for addressing the cluster issue (😏). Quadratic probing uses perfect squares instead of incrementing by 1, and this helps to evenly distribute the indices, as shown in Figure 11-5 below:

h+(1)2,Β h+(2)2,Β h+(3)2,Β h+(4)2Β =Β h+1,Β h+4,Β h+9,Β h+16 h + (1)^2,\space h + (2)^2,\space h + (3)^2,\space h + (4)^2\space = \space h + 1,\space h + 4,\space h + 9,\space h + 16

Alt Text

Figure 11-5. Linear probing (on top) and quadratic probing (on bottom)

Double Hashing (βž‘οΈπŸ”„πŸ”„β¬…οΈ)

Another great way to distribute the keys is by having a second hashing function that hashes the result from the original. These are the three primary requirements for a good second hash function:

  • Different: It needs to be different
  • Efficiency: It should still be O(1)O(1) in time.
  • Nonzero: It should never evaluate to zero.

A second hashing function is as follows. See below:

hash2(x)Β =Β RΒ βˆ’Β (xΒ %Β R)hash2(x) \space = \space R \space βˆ’ \space (x \space \% \space R)

Here, xx is the result from hashing the first, and RR is less than the size of the hash table. Each hash collision is resolved by the following, where ii is the iteration trial number:

iΒ βˆ—Β hash2(x)i \space * \space hash2(x)

Hash Table Implementation (πŸ”’πŸ”¨)

Alt Text

In this section, you will apply three techniques to the same example. The following are the example key-value pairs that will be used:

[
    { keys: 7, values: "hi" },       // 0
    { keys: 23, values: "hello" },   // 1
    { keys: 37, values: "sunny" },   // 2
    { keys: 47, values: "weather" }, // 3
    { keys: 59, values: "wow" },     // 4
    { keys: 73, values: "forty" },   // 5
    { keys: 89, values: "happy" },   // 6
    { keys: 97, values: "sad" }      // 7
]

Using Linear Probing (πŸ”’βž‘οΈβž‘οΈ)

Let’s start the example with simple linear probing:

Example:

function HashTable(size) {
    this.size = size;
    this.keys = this.initArray(size);
    this.values = this.initArray(size);
    this.limit = 0;
}

HashTable.prototype.initArray = function (size) {
    var array = [];
    for (var i = 0; i < size; i++) {
        array.push(null);
    }
    return array;
}

HashTable.prototype.put = function (key, value) {
    if (this.limit >= this.size) throw 'hash table is full'

    var hashedIndex = this.hash(key);

    // Linear probing
    while (this.keys[hashedIndex] != null) {
        hashedIndex++;

        hashedIndex = hashedIndex % this.size;

    }

    this.keys[hashedIndex] = key;
    this.values[hashedIndex] = value;
    this.limit++;
}

HashTable.prototype.get = function (key) {
    var hashedIndex = this.hash(key);

    while (this.keys[hashedIndex] != key) {
        hashedIndex++;

        hashedIndex = hashedIndex % this.size;

    }
    return this.values[hashedIndex];
}

HashTable.prototype.log = function () {
    return {
        keys: this.keys,
        values: this.values
    };
}

HashTable.prototype.hash = function (key) {
    // Check if int
    if (!Number.isInteger(key)) throw 'must be int';
    return key % this.size;
}

var exampletable = new HashTable(13);
exampletable.put(7, "hi");
exampletable.put(23, "hello");
exampletable.put(37, "sunny");
exampletable.put(47, "weather");
exampletable.put(59, "wow");
exampletable.put(73, "fourty");
exampletable.put(89, "happy");
exampletable.put(97, "sad");
exampletable.log();

Execution:

function HashTable(size) { this.size = size; this.keys = this.initArray(size); this.values = this.initArray(size); this.limit = 0; } HashTable.prototype.initArray = function (size) { var array = []; for (var i = 0; i < size; i++) { array.push(null); } return array; } HashTable.prototype.put = function (key, value) { if (this.limit >= this.size) throw 'hash table is full' var hashedIndex = this.hash(key); // Linear probing while (this.keys[hashedIndex] != null) { hashedIndex++; hashedIndex = hashedIndex % this.size; } this.keys[hashedIndex] = key; this.values[hashedIndex] = value; this.limit++; } HashTable.prototype.get = function (key) { var hashedIndex = this.hash(key); while (this.keys[hashedIndex] != key) { hashedIndex++; hashedIndex = hashedIndex % this.size; } return this.values[hashedIndex]; } HashTable.prototype.hash = function (key) { // Check if int if (!Number.isInteger(key)) throw 'must be int'; return key % this.size; } HashTable.prototype.log = function () { return { keys: this.keys, values: this.values }; } var exampletable = new HashTable(13); exampletable.put(7, "hi"); exampletable.put(23, "hello"); exampletable.put(37, "sunny"); exampletable.put(47, "weather"); exampletable.put(59, "wow"); exampletable.put(73, "fourty"); exampletable.put(89, "happy"); exampletable.put(97, "sad"); exampletable.log(); // You can also use exampletable.get(key) if you want to retrieve specific value but now, we only use exampletable.log()

Here is the result on my computer. Take a look carefully (πŸ‘€):
Alt Text

Note (πŸ“): I advice, get a piece of paper or a white board, and draw them, I also advice to draw the code on the paper instead on your computer. You are more likely to remember them. Okay?(πŸ˜…πŸ‘)

Using Quadratic Probing (πŸ”’β†—οΈβ†˜οΈ)

Now, we will just change the put() and get() methods from the linear probing to use quadratic probing. Let's see the example below:

Example:

HashTable.prototype.put = function(key, value) {
    if (this.limit >= this.size) throw 'hash table is full'

    var hashedIndex = this.hash(key),
        squareIndex = 1;

    // quadratic probing
    while (this.keys[hashedIndex % this.size] != null) {
        hashedIndex += Math.pow(squareIndex, 2);
        squareIndex++;
    }

    this.keys[hashedIndex % this.size] = key;
    this.values[hashedIndex % this.size] = value;
    this.limit++;
}

HashTable.prototype.get = function(key) {

    var hashedIndex = this.hash(key),
        squareIndex = 1;

    while (this.keys[hashedIndex % this.size] != key) {
        hashedIndex += Math.pow(squareIndex, 2);
        hashedIndex = hashedIndex % this.size;
        squareIndex++;
    }
    return this.values[hashedIndex % this.size];
}

Execution:

function HashTable(size) { this.size = size; this.keys = this.initArray(size); this.values = this.initArray(size); this.limit = 0; } HashTable.prototype.put = function (key, value) { if (this.limit >= this.size) throw 'hash table is full' var hashedIndex = this.hash(key), squareIndex = 1; // quadratic probing while (this.keys[hashedIndex % this.size] != null) { hashedIndex += Math.pow(squareIndex, 2); squareIndex++; } this.keys[hashedIndex % this.size] = key; this.values[hashedIndex % this.size] = value; this.limit++; } HashTable.prototype.get = function (key) { var hashedIndex = this.hash(key), squareIndex = 1; while (this.keys[hashedIndex % this.size] != key) { hashedIndex += Math.pow(squareIndex, 2); hashedIndex = hashedIndex % this.size; squareIndex++; } return this.values[hashedIndex % this.size]; } HashTable.prototype.hash = function (key) { // Check if int if (!Number.isInteger(key)) throw 'must be int'; return key % this.size; } HashTable.prototype.initArray = function (size) { var array = []; for (var i = 0; i < size; i++) { array.push(null); } return array; } HashTable.prototype.log = function () { return { keys: this.keys, values: this.values }; } var exampletable = new HashTable(13); exampletable.put(7, "hi"); exampletable.put(23, "hello"); exampletable.put(37, "sunny"); exampletable.put(47, "weather"); exampletable.put(59, "wow"); exampletable.put(73, "fourty"); exampletable.put(89, "happy"); exampletable.put(97, "sad"); exampletable.log(); // You can also use exampletable.get(key) if you want to retrieve specific value but now, we only use exampletable.log()

Here is the result on my computer. Take a look carefully (πŸ‘€):
Alt Text

Note (πŸ“): I advice, get a piece of paper or a white board, and draw them, I also advice to draw the code on the paper instead on your computer. You are more likely to remember them. Okay?(πŸ˜…πŸ‘)

If you look at the output. This result is more uniform than the result from linear probing. (πŸ’‘)

Using Double-Hashing in Linear Probing (βž‘οΈπŸ”„πŸ”„)

Finally, let’s combine double-hashing and linear probing. Recall the common second hash function, hash2(x)Β =Β RΒ βˆ’Β (xΒ %Β R)hash2(x) \space = \space R \space βˆ’ \space (x \space \% \space R) , where xx is the result from hashing the first time, and RR is less than the size of the hash table. Let's look at the example below:

Example:

HashTable.prototype.put = function(key, value) {
    if (this.limit >= this.size) throw 'hash table is full'

    var hashedIndex = this.hash(key);

    while (this.keys[hashedIndex] != null) {
        hashedIndex++;

        hashedIndex = hashedIndex % this.size;

    }
    this.keys[hashedIndex] = key;
    this.values[hashedIndex] = value;
    this.limit++;
}

HashTable.prototype.get = function(key) {
    var hashedIndex = this.hash(key);

    while (this.keys[hashedIndex] != key) {
        hashedIndex++;

        hashedIndex = hashedIndex % this.size;

    }
    return this.values[hashedIndex];
}

HashTable.prototype.hash = function(key) {
    if (!Number.isInteger(key)) throw 'must be int'; // check if int
    return this.secondHash(key);
}

HashTable.prototype.secondHash = function(hashedKey) {
    var R = this.size - 2;
    return R - hashedKey % R;
}

Execution:

function HashTable(size) { this.size = size; this.keys = this.initArray(size); this.values = this.initArray(size); this.limit = 0; } HashTable.prototype.put = function (key, value) { if (this.limit >= this.size) throw 'hash table is full' var hashedIndex = this.hash(key); while (this.keys[hashedIndex] != null) { hashedIndex++; hashedIndex = hashedIndex % this.size; } this.keys[hashedIndex] = key; this.values[hashedIndex] = value; this.limit++; } HashTable.prototype.get = function (key) { var hashedIndex = this.hash(key); while (this.keys[hashedIndex] != key) { hashedIndex++; hashedIndex = hashedIndex % this.size; } return this.values[hashedIndex]; } HashTable.prototype.hash = function (key) { if (!Number.isInteger(key)) throw 'must be int'; // check if int return this.secondHash(key); } HashTable.prototype.secondHash = function (hashedKey) { var R = this.size - 2; return R - hashedKey % R; } HashTable.prototype.initArray = function (size) { var array = []; for (var i = 0; i < size; i++) { array.push(null); } return array; } HashTable.prototype.log = function () { return { keys: this.keys, values: this.values }; } var exampletable = new HashTable(13); exampletable.put(7, "hi"); exampletable.put(23, "hello"); exampletable.put(37, "sunny"); exampletable.put(47, "weather"); exampletable.put(59, "wow"); exampletable.put(73, "fourty"); exampletable.put(89, "happy"); exampletable.put(97, "sad"); exampletable.log(); // You can also use exampletable.get(key) if you want to retrieve specific value but now, we only use exampletable.log()

Here is the result on my computer. Take a look carefully (πŸ‘€):
Alt Text

Note (πŸ“): I advice, get a piece of paper or a white board, and draw them, I also advice to draw the code on the paper instead on your computer. You are more likely to remember them. Okay?(πŸ˜…πŸ‘)

Again, double-hashing results in a more uniform distributed array than the result from linear probing. Both quadratic probing and double-hashing are great techniques to reduce the number of collisions in a hash table. Remember: There are collision algorithms far more advanced than these techniques, but they are beyond the scope of this article. (Learn more)

Summary (πŸ“š)

Alt Text

A hash table is a fixed-sized data structure in which the size is defined. Hash tables are implemented using a hash function to generate an index for the array. A good hash function is deterministic, efficient, and uniformly distributive. Hash collisions should be minimized with a good uniformly distributive hash function, but having some collisions is unavoidable (πŸ˜•). Hash collision-handling techniques include: linear probing, quadratic probing, and double-hashing. The next part explores Stacks and Queues, which are dynamically sized data structures. See you soon! (πŸ˜ƒ)


Up NextπŸ‘‰ Part 12: Stacks and Queues (πŸ”₯πŸ”₯) (August 9-10, 2020)


Alt Text

Discussion

markdown guide
 

So, hash table is a dictionary in Python?

 

Good question, yes, they are called hash map library in python instead of hash tables. (πŸ˜ƒ)