The odds of a spurious collision with MD5 (128 bits) can be phrased this way:
If you have 9 Trillion different items, there is one chance in 9 Trillion that two of them
have the same MD5.
To phrase it another way, it is more likely to be hit by a meteor while winning the
> -----Original Message-----
> From: Dan Nelson [mailto:dnelson@stripped]
> Sent: Tuesday, November 05, 2013 7:56 AM
> To: Li Li
> Cc: mysql@stripped
> Subject: Re: how to create unique key for long varchar?
> In the last episode (Nov 05), Li Li said:
> > I want to create a table with a long varchar column, maybe it's
> the url.
> > according to dns spec, the url's max length is fixed. but I have
> > to deal with url having long params such as
> > a.html?q=aaaaaaaaaaaaaaaaaaaaaaaaaaaa&fl=bbbbbbbbbbbbbbbbbbbbbbbb....
> > I want the url is unique when inserting it.
> > I googled and found
> > http://stackoverflow.com/questions/6800866/how-to-store-urls-in-mysql
> > this post suggests use md5 of url. But in theory, there will be
> > conflict that two different urls will have the same md5(even it's
> > probablitiy is very small). I want to a absolute correct solution.
> > one method i can come up with is using select ... for update
> > 1. begin transaction
> > 2. select url from tb where md5='xxxx' for update
> > 3. if the url is not exist, insert into this url; else do nothing
> It might be more efficient to optimize for the common case here. The
> assumption is that an md5 (or sha1 or sha2) hash collision is extremely
> unlikely, so you could just insert your new row, and if you get a
> "duplicate entry for primary key" error, then you can "select url from
> tb where md5='xxxx' ", and compare the retreived url with the one you
> want to insert.
> Dan Nelson
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql