Philip Hallstrom wrote:
>>> Hi all, a though query problem for me...
>>>
>>> I have a table with 2 rows that matter: url and id
>>>
>>> If url and id are the same in 2 rows, then that's no good (bad data).
>>>
>>> I need to find all the rows that are duplicates. I can't think of how
>>> to approach the sql for this.. any pointers?
>>>
>>
>> Select COUNT(*) as num_entries, url from table WHERE num_entries>1
>> GROUP BY url
>>
>> Untested, but the concept should work for you.
>
> That would mark rows that have different id's, but the same url...
No it wouldn't, he's grouping by the url.
mysql> create table a(a int, b text);
Query OK, 0 rows affected (0.02 sec)
mysql> insert into a(a, b) values (1, 'google.com');
Query OK, 1 row affected (0.00 sec)
mysql> insert into a(a, b) values (2, 'google.com');
Query OK, 1 row affected (0.00 sec)
mysql> insert into a(a, b) values (3, 'yahoo.com');
Query OK, 1 row affected (0.00 sec)
mysql> select count(*) as num, b from a group by b;
+-----+------------+
| num | b |
+-----+------------+
| 2 | google.com |
| 1 | yahoo.com |
+-----+------------+
2 rows in set (0.00 sec)
The final query should be:
mysql> select count(*) as num, b from a group by b having count(*) > 1;
+-----+------------+
| num | b |
+-----+------------+
| 2 | google.com |
+-----+------------+
1 row in set (0.00 sec)