CodeGuru Home VC++ / MFC / C++ .NET / C# Visual Basic VB Forums Developer.com
Results 1 to 11 of 11
  1. #1
    Join Date
    Dec 2000
    Location
    NY
    Posts
    324

    How to choose the best hash function?

    Hi All

    suppose i choose the function

    h(k) = k mod m

    where k is the key and m is the number of elements in table
    i want to hash an ip address which is a long type value.

    what will be a "good" m

    thanx.


  2. #2
    Join Date
    Sep 1999
    Location
    NJ
    Posts
    1,299

    Re: How to choose the best hash function?

    Well, if we can assume that the IP addresses are evenly distributed over the range (which I'm going to assume, since you didn't say otherwise), you're probably best off with the modulo function (remainder after a divide). If your number of buckets is a power of 2, then it becomes just a simple AND operation.

    Truth,
    James
    http://www.NJTheater.com
    http://www.NovelTheory.com
    I don't do it for the points (OK, maybe I do), but rating a post is a good way for me to know if I helped.

  3. #3
    Join Date
    Dec 2000
    Location
    NY
    Posts
    324

    Re: How to choose the best hash function?

    Hi

    what you mean by:
    "If your number of buckets is a power of 2, then it becomes just a simple AND operation."




  4. #4
    Join Date
    Sep 1999
    Location
    NJ
    Posts
    1,299

    Re: How to choose the best hash function?

    "Buckets" is the common term for the elements in the array used in a hash table.

    If you have 4 buckets, you need to reduce the IP to a number between 0-3.
    If you have 16 buckets, you need to reduce the IP to a number between 0-15.

    // To reduce a number to the range of 0-3
    n = IP & 3;

    // To reduce a number to the range of 0-15
    n = IP & 15;



    That only works if the number of buckets is a power of 2 (2, 4, 8, 16, 32, 64 etc)


    Truth,
    James
    http://www.NJTheater.com
    http://www.NovelTheory.com
    I don't do it for the points (OK, maybe I do), but rating a post is a good way for me to know if I helped.

  5. #5
    Join Date
    Oct 2001
    Location
    Norway
    Posts
    265

    Re: How to choose the best hash function?

    Having a number of buckets that is divisible by 2 leads to clustering. The number of buckets should ideally be a prime number.

    "My own view on religion is that of Lucretius. I regard it as a disease born of fear and as a source of untold misery to the human race. I cannot, however, deny that it has made some contributions to civilization. It helped in early days to fix the calendar, and it caused Egyptian priests to chronicle eclipses with such care that in time they became able to predict them. These two services I am prepared to acknowledge, but I do not know of any others." -- Bertrand Russell (From his essay Has Religion Made Useful Contributions to Civilization?, first published in 1930)

  6. #6
    Join Date
    Sep 1999
    Location
    NJ
    Posts
    1,299

    Re: How to choose the best hash function?

    That depends on the nature of the range of items being hashed. IF the numbers are evenly distributed (which I listed as a presumption), then the buckets would be evenly filled.

    Truth,
    James
    http://www.NJTheater.com
    http://www.NovelTheory.com
    I don't do it for the points (OK, maybe I do), but rating a post is a good way for me to know if I helped.

  7. #7
    Join Date
    Feb 2000
    Location
    Montréal
    Posts
    13

    Re: How to choose the best hash function?

    I dont know how many adress you want to stock in your table. With Ip adresses, you can have a lot of 132.211.105.x for what Mod may create "grapes" you want to avoid for your collision management.

    Here a cute method: the "Mid square"

    h(k)= Extract4MiddleNumber(k^2);

    h(23456)= Extract4MiddleNumber(550183936) = 1836
    h(23457)= Extract4MiddleNumber(550230849) = 2308
    h(23458)= Extract4MiddleNumber(550277764) = 2777
    h(23459)= Extract4MiddleNumber(550324681) = 3246

    For a larger range than [1,9999], do k^3

    lslx


  8. #8
    Join Date
    Oct 2001
    Location
    Norway
    Posts
    265

    Re: How to choose the best hash function?

    A good hash function shouldnt have to rely on the keys being evenly distributed. The rule of thumb is to either choose a prime for the number of buckets or a number with no small divisors.
    Choosing an even number of buckets is extremely bad.

    "My own view on religion is that of Lucretius. I regard it as a disease born of fear and as a source of untold misery to the human race. I cannot, however, deny that it has made some contributions to civilization. It helped in early days to fix the calendar, and it caused Egyptian priests to chronicle eclipses with such care that in time they became able to predict them. These two services I am prepared to acknowledge, but I do not know of any others." -- Bertrand Russell (From his essay Has Religion Made Useful Contributions to Civilization?, first published in 1930)

  9. #9
    Join Date
    Sep 1999
    Location
    NJ
    Posts
    1,299

    Re: How to choose the best hash function?

    It shouldn't rely on that fact, but, if a detail about the range is known to be true, then the function shouldn't ignore that. The original poster stated that he's hashing IP addresses. If the address come from visitors to an open WWW site there is every reason to believe that the lower order bits will be evenly distributed.

    And remember, the whole point of a hash table is speed. The "mod to a power of 2" function is probably the fastest possible hash code. The "Middle four of the square" hash presented elsewhere in this thread would be orders of magnitude slower -- for every insert and seek! So, even if the even number of buckets leads to a occasional extra collusion (which I do not grant), the amortized time would still be less.

    Truth,
    James
    http://www.NJTheater.com
    http://www.NovelTheory.com
    I don't do it for the points (OK, maybe I do), but rating a post is a good way for me to know if I helped.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  





Click Here to Expand Forum to Full Width

Featured