----- Original Message -----
> From: "Hank" <heskin@stripped>
>
> While running a -rq on a large table, I got the following error:
>
> myisamchk: warning: Duplicate key for record at 54381140 against
> record at 54380810
>
> How do I find which records are duplicated (without doing the typical
> self-join or "having cnt(*)>1" query)? This table has 144 million
> rows, so that's not really feasible.
Given that the error is a duplicate *key*, "select <key> from <table> group by
<key> having count(<key>) > 1" is an index-covered query, and should thus
be perfectly feasible :-)
What I'm not so sure about, is wether the duplicate key will show up correctly in the
index - as that index may be marked corrupt - and so, if it falls back to a full
tablescan, it's indeed going to take a long time. If it does, however, there's no other
option anyway: the only way to do it fast is an index, and that index is untrustworthy.
--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel