Chapter: Programming and Data Structures : Sorting And Searching

Open Addressing (Closed Hashing)

Definition: The technique of finding the availability of another suitable empty location in the hash table when the calculated hash address is already occupied is known as open Addressing. There are three common collisions resolving strategies 1. Linear probing 2. Quadratic Probing 3. Double hashing

OPEN ADDRESSING (CLOSED HASHING):

Open addressing hashing is an alternating technique for resolving collisions with linked list. In this system if a collision occurs, alternative cells are tried until an empty cell is found.

The cell h0(x), h1(x), h2(x)……. Are tried mod Table_Size with F(0) = 0. The function F is the collision resolution strategy. This

technique is generally used where storage space is large. Arrays are used here as hash tables.

Definition: The technique of finding the availability of another suitable empty location in the hash table when the calculated hash address is already occupied is known as open Addressing. There are three common collisions resolving strategies

1. Linear probing

2. Quadratic Probing

3. Double hashing

Linear probing, in which the interval between probes is fixed (usually 1) Quadratic probing, in which the interval between probes is increased by adding

the successive outputs of a quadratic polynomial to the starting value given by the original hash computation

Double hashing, in which the interval between probes is computed by another hash function

A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array. In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so.

1 Linear probing (using neighboring slots)

Probing is the process of a placing in next available empty position in the hash table. The Linear probing method searches for next suitable position in a linear manner(next by next). Since this method searches for the empty position in a

linear way, this method is referred as linear probing. In linear probing for the ith probe, the

position to be tried is, (h(k) + i) mod

Table_Size, where ‘f’ is a linear function of sequentially in search of an empty cell.

Example for Linear Probing:

Insert the keys 89, 18, 49, 58, 69 into a hash table using in the same hash function as before and the collision resolving strategies. F(i)=i.

size Empty table After 89 After 18 After 49 After 58 After 69 0 49 49 49 1 58 58 2 69 3 4

6 18 18 18 18

7 89 89 89 89 89

Solution: In this e.g initially 89 is insertedatindex8. atThe index

first collision occurs when 49 is inserted. It is put in the next available index

namely ‘0’ whichThekey58 collidesis withopen18,89,.49 afterwards it found an empty cell at the index

Similarly collision 69 is handled. If the table is big enough, a free cell can be

always be found, but the time to do so can get quite large. Even if the table is relatively empty, blocks of occupied cells start forming. This is

known as primary clustering means that any key hashes into the cluster will require several attempts to resolve the collision and then it will add to the cluster.

H(k) = key value mod Table size

The figure shows linear probing example. h(j)=h(k), so the next hash function,h1 is used. A second collision occurs, so h2 is used. Linear probing suffers from primary clustering.

Algorithm for linear probing:

1. Apply hash function on the key value and get the address of the location.

2. If the location is free, then

i) Store the key value at this location, else

ii) Check the remaining locations of the table one after the other till an empty location is reached. Wrap around on the table can be used. When we reach the end of the table, start looking again from the beginning.

iii) Store the key in this empty location.

3.End

Advantages of linear probing:

1. It does not require pointers.

2. It is very simpler to implement.

Disadvantages of linear probing:

1. It forms clusters, which degrades the performance of the hash table for sorting and retrieving data.

2. If any collision occur when the hash table becomes half full, it is difficult to find an empty location in the hash table and hence the insertion process takes a longer time.

2 QUADRATIC PROBING

Better behaviour is usually obtained with quadratic probing, where the secondary hash function depends on the re-hash index: address = h(key) + c i2

• Probe 0thsequence:probe=h(k)modTableSize 1th probe = (h(k) + 1) mod TableSize

2th probe = (h(k) + 4) mod TableSize 3th probe = (h(k) + 9) mod TableSize ith probe = (h(k) + i2) mod TableSize

on the tth re-hash. Quadratic probing does not suffer from primary clustering: keys hashing to the same area are not bad (A more complex function of i may also be used.) Since keys which are mapped to the same value by the primary hash function follow the same sequence of addresses, quadratic probing shows secondary clustering.

However, secondary clustering is not nearly as severe as the clustering shown by linear probes.

However, the collision elements are stored in slots to which other key values map directly, thus the potential for multiple collisions increases as the table becomes full.

3 DOUBLE HASHING:

Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series (i + jd(k)) mod N for j = 0, 1,N–1.… , The secondary hash function d(k) cannot have zero values. The table size N must be a prime to allow probing of all the cells f(i) = i * g(k) where g is a second hash function

• Probesequence:

0th probe = h(k) mod TableSize

1th probe = (h(k) + g(k)) mod TableSize 2th probe = (h(k) + 2*g(k)) mod TableSize 3rd probe = (h(k) + 3*g(k)) mod TableSize ith probe = (h(k) + i*g(k)) mod TableSize

Consider a hash table storing integer keys that handles collision with double hashing.

CLUSTERING

Linear probing is subject to a clustering phenomenon. Re-hashes from one location occupy a block of slots in the table which "grows" towards slots to which other keys hash. This exacerbates the collision problem and the number of re-hashed can become large.

Study Material, Lecturing Notes, Assignment, Reference, Wiki description explanation, brief detail

Programming and Data Structures : Sorting And Searching : Open Addressing (Closed Hashing) |