I prefer your solution in that it's something like Optimistic Locking.
but the problem is that if I define md5 as unique key and there exists
2 different urls with the same md5. I can't insert the second url
On Tue, Nov 5, 2013 at 11:55 PM, Dan Nelson <dnelson@stripped> wrote:
> In the last episode (Nov 05), Li Li said:
>> I want to create a table with a long varchar column, maybe it's the url.
>> according to dns spec, the url's max length is fixed. but I have
>> to deal with url having long params such as
>> I want the url is unique when inserting it.
>> I googled and found
>> this post suggests use md5 of url. But in theory, there will be
>> conflict that two different urls will have the same md5(even it's
>> probablitiy is very small). I want to a absolute correct solution.
>> one method i can come up with is using select ... for update
>> 1. begin transaction
>> 2. select url from tb where md5='xxxx' for update
>> 3. if the url is not exist, insert into this url; else do nothing
> It might be more efficient to optimize for the common case here. The
> assumption is that an md5 (or sha1 or sha2) hash collision is extremely
> unlikely, so you could just insert your new row, and if you get a "duplicate
> entry for primary key" error, then you can "select url from tb where
> md5='xxxx' ", and compare the retreived url with the one you want to insert.
> Dan Nelson