Hi!
I changed now ha_innobase.h so that InnoDB allows
keys up to 7000 bytes in length. The change will be
in MySQL-3.23.44 and 4.0.1.
Regards,
Heikki
http://www.innodb.com
>Steve Meyers wrote:
>
>> > > At a previous job, we tested a 32-bit hash function by running it
>> > > against hundreds of thousands of unique URL's stored in our
>> > > database. We found one collision. A 64-bit hash is billions of
>> > > times better (4 billion, to be exact).
>> >
>> > Good to know. I wonder how many collisions I'd find if I ran it over
>> > every URL listed in the directory www.yahoo.com.
>> >
>> > Which 64 bit hash function did you use? Invent your own, or something
>> > "off the shelf"?
>> >
>>
>> We found a public domain one on the net see
http://www.burtleburtle.net/bob/hash/evahash.html
>for some sample code. It's only a 32-bit hash though. However, that same
page
>appears to have instructions for a 64-bit hash function as well, but I
haven't tried
>it at all. I'd be curious to know how many collisions you find hashing all
the
>URL's in yahoo's database :) I don't know how long that would take, but if
you
>do it I'd like to hear the results.
>>
>> Since the hash function takes a key and an initial value, you could try
running
>it with two different initial values and/or keys. This would give you
effectively
>a 128-bit hash, which you could store across two fields in MySQL. I'm
guessing
>that the 64-bit hash will probably be good enough though.
>
>I am not understanding why having a hash and the full url in the database
would
>not take care of the collisions. Even if you had 10 collisions for a 16
bit hash
>(say), if your query was:
>SELECT ... WHERE hash=thehashvalue AND url='theurl' you would get very fast
lookups
>on the hash and the url comparison would not add much to the query at that
point.
> You could even do a partial index on the url, e.g. "KEY( hash, url(200))".
>
>b.
>
>
>