Vivian - Im just curious as to why I shouldnt trust group by? Is there a
bug or something within Mysql regarding the group by clause?
Walt - you could also create a compound primary key that consists of all
columns, then create a unique constraint on the Primary key..... I've used
this many times in the past, and it works quite well. The only caveat is
the SQL will puke when it tries to load the duplicate records into the
table, I personally would use the DEDUP process of Insert/select with a
group by... But you might want to see what Vivian's response is on the
group by issue.
From: Vivian Wang [mailto:vivianwang@stripped]
Sent: Tuesday, July 30, 2002 2:37 PM
To: walt; David Kramer; mysql@stripped
Subject: Re: removing duplicate records
Don't trust group by.
I am using mysqldump and sort -u on *nix for duplicate.
Try both ways, let me know your result.
At 04:11 PM 7/30/2002 -0400, walt wrote:
>The entire row, not just one or two columns, is a duplicate which makes
>I can key or unique index only a few columns once the data is cleaned up to
>prevent this problem.
>If I create an identical table and include either a key or unique index
>(innodb seems to like the index better) on all the columns and do a
>insert into new_table using select * from old_table
>will mysql quit inserting once a duplicate is hit, or will it keep going
>skip over the duplicates?
>On Tuesday 30 July 2002 03:57 pm, David Kramer wrote:
> > You could always use an insert statement into a second table, when
> > performing the insert use a GROUP BY clause to consalidate your records
> > something along these lines but this isnt 100% accurate, I would need
> > table DDL and business rules behind the DEDUP
> > Insert into tableB
> > (
> > column names, ...
> > )
> > (select
> > column a,
> > column b,
> > max(column c), --or you could use MIN
> > from
> > table A
> > group by
> > column a,
> > column b);
> > **JUST make sure your Identifing column, i.e. the column you use to tell
> > which is a duplicate record or not is included in the group by. Also
> > are the business rules behind the DEDUP(Deduplication)? Are the other
> > values contained in the other columns necessary? If you tell me more
> > what your trying to do and provide some Table DDL I can help you write
> > query. Just let me know!
> > Thanks,
> > DK
> > group by statement
> > -----Original Message-----
> > From: walt [mailto:walt@stripped]
> > Sent: Tuesday, July 30, 2002 12:43 PM
> > To: mysql@stripped
> > Subject: removing duplicate records
> > Does anyone know a good way find and remove duplicate records from a
> > I can create an identical table structure and use a script to pull
> > from the existing table and insert them into the new table if they are
> > duplicates, but I'd rather not do it that way. Here is an example of an
> > script I use for Oracle databases
> > delete from employee a
> > where rowid < (
> > select max(rowid)
> > from employee b
> > where b.COL1 = a.COL1
> > and b.COL2 = a.COL2
> > and b.COL# = a.COL#);
> > sql, query
> > Thanks in advance!
>Before posting, please check:
> http://www.mysql.com/manual.php (the manual)
> http://lists.mysql.com/ (the list archive)
>To request this thread, e-mail <mysql-thread115946@stripped>
>To unsubscribe, e-mail
>Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php