Data Structures: Hash Tables
Imagine you have a very large array and you need to find a specific value but you do not know its index. You would normally have to iterate through the entire array to find the item you are looking for. This is quite inefficient and becomes a problem if the array is very large. Hash tables solve this problem and allow an item to be found very quickly in an array.
Create a Hashtable
Create an array which has a size suitable to your data requirements. eg We will be adding only a few items therefore this array has a size of 10.
Find the index to insert an element
Items need to be inserted at an index which you will be able to find again easily. If you can apply a formula to a value which will always produce the same index, then you can use that formula to store the value at the given index and use the formula again to get the index to easily find the value.
Fortunatly such formula is actually pretty simple. By hashing a value and getting its integer, you can then apply the following: hashedValue mod size = index
- Every time you hash the same item you will get the same hash value. eg anytime you apply your hash formula to the string "a" you get 102.
- Any number mod the length of the array will always return a integer which fits in the array. eg 102 % 10 = 2
- Add the item to the index of the result. eg at "a" at index 2
Now that "a" has been added to index 2 we can easily find "a" again by reapplying the formula to it. This technique can be applied to check if you already have the item in your "collection" or you can use it as a key value pair. eg hash the key "a" and store it with the value "1".
Since the size of the array is not going to be unlimited, our hash mod length formula may put two items in the same index. eg if hash "k" = 112 then 112 % 10 = 2
This is a collision and there are many ways to handle them.
One common solution is to add each item of collision to a linked list. Each time you search for an item at an index we will need to iterate through the linked list to find what we are looking for, therefore the array should be long enough so that these linked lists do not become too large.