Click to See Complete Forum and Search --> : Better than 'x*hash(key) mod N' !?


jachto
February 25th, 2010, 02:16 AM
Hi,

when creating random values within an interval [0,N-1] from a specified Key you often see this calculation,

Index = randomUniformOnInterval(Key, 0, 2^b - 1) mod N, where N << (2^b - 1)


I propose to use this calculation instead. It is faster and in some applications memory utilization will be better, see further down.

Index = N * randomUniformOnInterval(Key, 0, 2^b - 1) / 2^b

Where the divition by 2^b can of course be replace with a b-right-bit-shift operation.
The calculation is better understood looking at the formula from where it was derived.

Index = randomUniformOnInterval(Key, 0, 2^b - 1) / (2^b / N)


Hash tables implementations are often speed up by forcing the table size to be a power of two. This to be able to replace the modulo operation with an and-bit operation, i.e.

Index = randomUniformOnInterval(Key, 0, 2^b - 1) & (2^B), where B << b


So with one additional multiplication you can use arbitrary sizes of the hash table using the above proposed index calculation.



Cheers!
/Christofer

jachto
March 9th, 2010, 04:27 AM
Hi,

as no one have had any objections about the proposed calculation of hash indexes,
I updated the wiki page below.
http://en.wikipedia.org/wiki/Hash_function#Variable_range

Cheers!
/Christofer