Computers don't know how to read string or other data type indices, only integers, so what we must do is create a hash function or a function that converts these non readable values to some type of fixed size digit. There are many existing famous hash functions out there that are readily available. One thing to note is that a converted key to a hash number needs to be within the length of the array. If we have an array of length 100, the hash number could not be 654 or we'd have to add 554 new slots and that just doesn't make any sense, especially knowing we need consecutive memory slots. One way to be sure your hashed item is within the proper fixed size, you would use the Modulus
% operator and take the newly hashed
value % arr.length it would then always fit within the constraints that we have described. I guess in a sense a hashing function is an extremely valuable algorithm, and outside of these hash tables / maps, hash functions are used widely in security, authentication, and cryptography in programming just to name a few.
Things to not with hashing is that we cannot work backwards. If you do not have the original key you will not be able to get it from the hash. So what makes a good hash function? It must be fast, it should distribute indices uniformly, and it must be always yield the same output with the same input, otherwise how would you be able to find the value? Hash Function. Here is a cool JS hash function that can be used. However if you look at the bottom you'll see its not really IE compatible but at this point who cares, you should not be using IE!!!!!!!
When hashing you could run into collision in where multiple inputs will output the same hashed value. Now whenever collision occurs it will increase the time complexity and at worse case scenario, which you remember is how you want to describe it, would be big O of N which means it would be a linear value as N increases it would take that much more time. So how to deal with these collisions. We'll first off with better hashing algorithms and larger hash tables we can reduce the number of collisions. Two ways of dealing with them are:
- Separate chaining - storing the same indices into an actual data structure such as an array or linked list.
- Linear probing - when a collision occurs we search to find the next empty slot, which would not increase the time complexity for our insert, delete, and search. The caveat here is that you'll never be able to fill more items than the length of the array.
Here is my basic implementation of a Hash Table. Hash Table. I went with about as basic as a hash function as possible, and for collision I went with the separate chaining approach. One thing to note here is that I did not account for duplicate keys. In many languages that have Hash Maps by default, will override a value if the same key value pair is passed in. I think it would've been easy enough to implement but at this point I just wanted to build something out so I had knowledge of it. And in the end it's nothing but an extra conditional or two and I think at this point I'm fine with this.