>>>>> "MARK" == MARK CALLAGHAN <mdcallag@stripped> writes:
MARK> On Tue, Oct 28, 2008 at 1:54 PM, Sergei Golubchik <serg@stripped> wrote:
>> Hi, MARK!
>> On Oct 28, MARK CALLAGHAN wrote:
>>> Is it best to use a power of 2 size array for hash tables created by
>>> hash_init()? This is for the third parameter to _hash_init() and
>>> hash_init() -- default_array_elements.
>> I guess it depends on your application.
>> And my gut feeling tells me that within certain range of values (which
>> isn't exactly narrow) it doesn't matter what you pick :)
MARK> We use a hash table to collect resource metrics per account. I
MARK> extracted some of the hash function code to determine the max number
MARK> of account names that map to the same bucket using a hash table with
MARK> 500 entries and the results were a lot worse than for 512 entries.
MARK> That is why I ask. There are no comments in include/hash.h and
MARK> existing uses of hash_init() don't all use a power of 2.
The hash in MySQL is not totally dynamic and is based on the number of
entries in the bucket, not the number given to hash_init().
The values for hash_init() is just used to create an initial array to
store values and how much it will grow when it gets full; One value
will take one place in the array.
In other words, you can't affect how the hash works by changing the
numbers to hash_init.
The way the hash works, there shouldn't normally be a big speed
difference between 500 and 512 entries. How the hash works is that if
an entries hashes between 501-511, they will be mapped to a the link
at 0-10; In other words, only 10 entries should be a little bit
slower than the others.