Hashing
Hashing is a technique that is used to assign a key to a specific value so that the value can be accessed easily and efficiently.Some examples of how hashing is used in our lives include:
•In libraries, a book is assigned to a unique number/key that can be used to find information about the book we are looking for, through the library computer system, such as its exact position in the library or the users it has been issued to, etc.
In the example above, the books with all the information it has are the value. The number is the key which makes it easier to identify the books.
Assume that you have a value and you want to assign a key to it to make searching easy. To store the values, you can use a simple array like a data structure where keys (integers) can be used directly as an index to store values. However, in most cases, the keys are actually String.
So How Do You Map a String(key) Into Indexes In an Array?
HASH FUNCTION AND HASH TABLE
This is where hash function comes into play, a hash functions take a string(key) then convert it into some sort of integer(smaller key) which is called a hashcode, the hashcode is then remapped into a smaller number which will be the index of the array, this is done so that the array size won't get too big. Now you can access the element in O(1) time because of the index assigned to the key. The name of the array on which the value is store is called a hash table.The hash function uses some sort of mathematical algorithm to convert the string(key) into hash code.
These are a few important methods/algorithm for constructing a hash functions
1.MID-SQUARE
•Square the string/identifier and
then using an appropriate number of bits from the middle of
the square to
obtain the hash-key.
•If the key is a string, it is
converted to a number.
•Steps:
1. Square the value of the key. (k2)
2. Extract the middle r bits of the result obtained in Step 1
Function : h(k) = s
k = key
s = the hash key obtained by selecting r bits from k2
Mid square example:
•Here the entire key participates in generating the address so that there is a better chance
that different addresses are generated even for keys close to each other
•For example,
1. Square the value of the key. (k2)
2. Extract the middle r bits of the result obtained in Step 1
Function : h(k) = s
k = key
s = the hash key obtained by selecting r bits from k2
Mid square example:
•Here the entire key participates in generating the address so that there is a better chance
that different addresses are generated even for keys close to each other
•For example,
Key
squared value middle part
3121 9740641 406
3122 9746884 468
3123 9753129 531.
2.DIVISION
Steps:
•Divide the string/identifier by using the
modulus operator
• It’s the most simple method of hashing an
integer x.
Function: h(z) = z mod M
z = key
M = the value used to divide the key, usually using a prime
number, the table size or the size of memory used.
DIVISION EXAMPLE:
Suppose the table is to store strings. A very simple hash
function would be to add up ASCII values of all the characters in the string
and take modulo of table size, say 97.
“COBB” would be stored at the location
(
64+3 + 64+15 + 64+2 + 64+2) % 97 = 88
“HIKE” would be stored at the location
(
64+8 + 64+9 + 64+11 + 64+5) % 97 = 2
“PPQQ” would be stored at the location
(
64+16 + 64+16 + 64+17 + 64+17) % 97 = 35
“ABCD” would be stored at the location
(
64+1 + 64+2 + 64+3 + 64+4) % 97 = 76
3.FOLDING
The Folding method works in two
steps :
- Divide the key value into a number of parts where each part has the same number of digits except the last part which may have lesser digits than the other parts.
- Add the individual parts. That obtains the sum of part1 + part2 + ... + part n. The hash value produced by ignoring the last carry, if any.
Folding Example:
Let's start by converting the string's characters into numbers. ASCII is a good candidate for this operation
Now, we arrange the numbers we just obtained into groups of some size. Generally
we choose the group size value based on the size of our array which is 105. Since the
numbers, in which we transformed the characters into, contain from two to three
digits, without loss of generality, we may set the group size to two:
numbers, in which we transformed the characters into, contain from two to three
digits, without loss of generality, we may set the group size to two:
The next step is to concatenate the numbers in each group as if they were strings
and find their sum

Now we must make the final step. Let's check whether the number 348933 may
serve as an index of our array of size 105. Naturally, it exceeds the maximum
allowed value 99999. We may easily overcome this problem by applying the
modulo operator in order to find the final result:
348933 % 10000 = 48933.4.Digit Extraction
•A predefined digit of the given number is considered as the hash address.
•Example:
–Consider x = 14,568
–If we extract the 1st, 3rd, and
5th digit, we will get a hash code of 158.
5.Rotating Hash
Steps:
• Use any hash method (such as division or
mid-square method)• After getting the hash code/address from the hash method, do rotation
• Rotation is performed by shifting the digits to get a new hash address.
Example:
-Given hash address = 20021
-Rotation result: 12002 (fold the digits)
COLLISIONS
When we put objects into a hashtable, it is possible that different objects (bythe equals() method) might have the same hashcode. This is called a collision.
Here is the example of collision:
Two different strings ""Aa" and "BB" have the same key:
"Aa" = 'A' * 31 + 'a' = 2112
"BB" = 'B' * 31 + 'B' = 2112
There general and best ways to handle collisions:
Chaining
Chaining is a method that we can use to overcome collision by using a linked list sothat the key/string with the same hash code/index will be linked together using a linked
list. For this, to work the hash table must store not only the value of the object but also
the original key, so that you can find the key searched inside the linked list.
Code implementation:
void chaining(item, h[]) {
hash_key = hash(item);
trail = NULL, lead =
h[hash_key];
while ( lead ) {
if (
strcmp(lead->item, item) == 0 ) { // DUPLICATE ENTRY }
trail = lead;
lead = lead->next;
}
}
p = malloc new node;
p->item = item;
p->next = NULL;
if ( trail == 0 )
h[hash_key] = p; else trail->next = p;
}
HASHING IMPLEMENTATION ON BLOCKCHAIN
Hashing in blockchain refers to the process of having an
input item of whatever length
reflectingan output item of a fixed length. If
we take the example of blockchain use in
cryptocurrencies, transactions of
varying lengths are run through a given hashing
algorithm, and all give an
output that is of a fixed length. This is regardless of the
length of the input transaction. The output is what we call a hash. A good example is
Bitcoin’s
Secure Hashing Algorithm 256 (commonly shortened to SHA-256). Hashing
using
SHA-256 always gives an output result of a fixed length, which has a 256-bits
length (the output is 32 bytes). This is always the case whether the transaction is just
a single word or a complex transaction with huge amounts of
data. What this means is
that keeping track of a transaction becomes easier
when you can recall/trace the
hash. The size of the hash will depend on the
hash function utilized, but the out using a
particular hashing algorithm will
be of a specific size.



No comments:
Post a Comment