You should always have a field that is a unique id for the record (i.e. autoincrement). It
makes it easier for differentiating and
deleting duplicates if needed. This query only works if you have a unique id field and it
will only delete one duplicate record at a
time. So if you have 4 records that are duplicates, you'll need to run it 3 times.
DELETE tablename
FROM tablename,
(SELECT MAX(uid) AS dupid,COUNT(uid) AS dupcnt
FROM tablename
GROUP BY id,url HAVING dupcnt>1)
AS dups
WHERE tablename.uid=dups.dupid;
Here is an example:
uid id url
1 20 google.com
2 25 yahoo.com
3 20 google.com
4 25 yahoo.com
5 20 google.com
The above query would delete uid 4 and 5 on the first run and ui 3 on the second run.
----- Original Message -----
From: "Peter Van Dijck" <petervandijck@stripped>
To: "MYSQL General List" <mysql@stripped>
Sent: Tuesday, September 12, 2006 7:10 PM
Subject: query to find duplicate rows
> Hi all, a though query problem for me...
>
> I have a table with 2 rows that matter: url and id
>
> If url and id are the same in 2 rows, then that's no good (bad data).
>
> I need to find all the rows that are duplicates. I can't think of how
> to approach the sql for this.. any pointers?
>
> Thanks!
> Peter
>
> --
> find videoblogs: http://mefeedia.com
> my blog: http://poorbuthappy.com/ease/
> my job: http://petervandijck.net
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=1
>