Better than 'x*hash(key) mod N' !?

**jachto** · February 25th, 2010, 03:16 AM

Hi,

when creating random values within an interval [0,N-1] from a specified Key you often see this calculation,

Index = randomUniformOnInterval(Key, 0, 2^b - 1) mod N, where N << (2^b - 1)

I propose to use this calculation instead. It is faster and in some applications memory utilization will be better, see further down.

Index = N * randomUniformOnInterval(Key, 0, 2^b - 1) / 2^b

Where the divition by 2^b can of course be replace with a b-right-bit-shift operation.
The calculation is better understood looking at the formula from where it was derived.

Index = randomUniformOnInterval(Key, 0, 2^b - 1) / (2^b / N)

Hash tables implementations are often speed up by forcing the table size to be a power of two. This to be able to replace the modulo operation with an and-bit operation, i.e.

Index = randomUniformOnInterval(Key, 0, 2^b - 1) & (2^B), where B << b

So with one additional multiplication you can use arbitrary sizes of the hash table using the above proposed index calculation.

Cheers!
/Christofer

**jachto** · March 9th, 2010, 05:27 AM

Hi,

as no one have had any objections about the proposed calculation of hash indexes,
I updated the wiki page below.
http://en.wikipedia.org/wiki/Hash_fu...Variable_range

Cheers!
/Christofer

Thread: Better than 'x*hash(key) mod N' !?

Thread Tools

Display

Better than 'x*hash(key) mod N' !?

Re: Better than 'x*hash(key) mod N' !?

Posting Permissions