DEV Community

Sandrina Pereira
Sandrina Pereira

Posted on • Updated on

Learning Hash Tables with drawings 🎨

I've been learning about Data Structures and it all seems very rocket science. 🙃 To help me understand it better (learning to learn) I've made a small Zine about one of them - Hash Tables ⚡️

At the bottom, you can check the full written version with resources, but for now, check out the illustrative version!

Feel free to download the printable version. Here's also the original handmade version.

Zine Cover

Intro

How to Handle Collisions

FAQ

... and now the written version...

The Magic of Hash Tables

What it is

The most efficient data structure to 🔍Search, ➕Insert and 🗑Delete.

Does all of it in O(1) complexity on average (i.e. Takes the same time to
 do it, no matter the size of data available)

It's used for Caching, Database Searching, and Sets

How it works

An unordered collection of unique keys is mapped to values through the process of hashing.

Take a key --> Run it in a hash function --> that generates a hash

Portuguese --> ✨ --> 2
French --------> ✨ --> 0
English -------> ✨ --> 3

... and add it to the matched index on the table (also known as "bucket"):

hash value
0 Salut
1 -
2 Olá
3 Hello

Hashing is one-way!

A hash can’t be converted back to its original key.

ex: Hashing "English" returns 3, but with 3 you can't get "English"

A hash function must:

  • 🏎 Be fast to run.
  • 🧩 Uniform distribution.
  • 💥 Resolve hash collisions.

How to handle Collisions?

A collision happens when a key corresponds to a hash (index) already taken by another key.

Open Addressing

When a bucket is already taken, other buckets are examined until an unused bucket is found.

  • 💚 Fast when the number of collisions is small.
  • 🚨 Tables can’t be smaller than the number of keys.

Chaining

A bucket can take multiple keys with the same index.

  • 💚 Good for a large number of collisions.
  • 🚨 Wastage of space (some buckets are never used).

💡FAQ

What is the ideal Load Factor?

Some say it’s around 0.7. This means a table with 70 items should have around 100 buckets. But it’s a balance. The emptier the buckets, the faster it is, you have fewer collisions but more memory is used.

What is Dynamic Resize?

It happens when a table is getting out of balance - it’s too full or too empty. It resizes the table and rehashes the keys.

What’s the best Hash Function?

There isn’t a one-fits-all solution. It’s a trade-off between speed and distribution.

A good hash function is the secret to an efficient hash table.

TIP: When the keys are static use a Perfect Hash Function. * No collisions * No empty buckets *

When should I avoid Hash Tables?

  • Search is not the main operation.
  • Need to iterate over the elements.
  • There are duplicated keys.

🎁 Bonus: Applying Simple Hash Table Concept in Javascript!

😰 When using If statements...

You end in an if hell (or switch hell) when dealing with different scenarios to assign a variable.

if(viewport === 'sm') {
    fontSize = 1.0;
} else if (viewport === 'md') {
    fontSize = 1.2;
} // else if ... else if ...

Enter fullscreen mode Exit fullscreen mode

⚡️ But using key:value mapping...

You get a direct lockup - O(1) complexity - no matter the number of options, with a cleaner code. How cool is that?

const sizes = {
    sm: 1,
    md: 1.2,
    lg: 1.2
};

const fontSize = sizes[viewport];
Enter fullscreen mode Exit fullscreen mode

JS


Resources to get started:


Thank you for reading!

Top comments (3)

Collapse
 
wajeehmisbahkhan profile image
Wajeeh Misbah Khan

Understood it better from this article than I did from my semester course.

Also I believe it is else if not if else after the initial if statement.

Collapse
 
a_sandrina_p profile image
Sandrina Pereira

Wow, that's great feedback, thank you! That if else typo was so 🤦‍♀️, fixed!

Collapse
 
ervikasti profile image
Vikas TIwari

Thanks for making it super simple!