List:General Discussion« Previous MessageNext Message »
From:walt Date:July 30 2002 8:11pm
Subject:Re: removing duplicate records
View as plain text  
Thanks David!

The entire row, not just one or two columns, is a duplicate which makes life 
fun..
:-)

I can key or unique index only a few columns once the data is cleaned up to 
prevent this problem.
 
If I create an identical table and include either a key or unique index 
(innodb seems to like the index better) on all the columns and do a

insert into new_table using select * from old_table

will mysql quit inserting once a duplicate is hit, or will it keep going and 
skip over the duplicates?

walt

On Tuesday 30 July 2002 03:57 pm, David Kramer wrote:
> You could always use an insert statement into a second table, when
> performing the insert use a GROUP BY clause to consalidate your records
>
> something along these lines but this isnt 100% accurate, I would need the
> table DDL and business rules behind the DEDUP
>
> Insert into tableB
> (
> 	column names, ...
>
>
> )
> (select
> 	column a,
> 	column b,
> 	max(column c), --or you could use MIN
> from
> 	table A
>
> group by
> 	column a,
> 	column b);
>
>
>
>
>
> **JUST make sure your Identifing column, i.e. the column you use to tell
> which is a duplicate record or not is included in the group by.  Also what
> are the business rules behind the DEDUP(Deduplication)? Are the other
> values contained in the other columns necessary?  If you tell me more about
> what your trying to do and provide some Table DDL I can help you write this
> query.  Just let me know!
>
> Thanks,
>
> DK
>
>   group by statement
>
> -----Original Message-----
> From: walt [mailto:walt@stripped]
> Sent: Tuesday, July 30, 2002 12:43 PM
> To: mysql@stripped
> Subject: removing duplicate records
>
>
> Does anyone know a good way find and remove duplicate records from a table?
> I can create an identical table structure and use a script to pull records
> from the existing table and insert them into the new table if they are not
> duplicates, but I'd rather not do it that way. Here is an example of an sql
> script  I use for Oracle databases
>
> delete   from employee a
> where    rowid < (
> select  max(rowid)
> from    employee b
> where   b.COL1 = a.COL1
> and     b.COL2 = a.COL2
> and     b.COL# = a.COL#);
>
> sql, query
>
> Thanks in advance!



Thread
removing duplicate recordswalt30 Jul
  • Re: removing duplicate recordsGerald Clark30 Jul
RE: removing duplicate recordsDavid Kramer30 Jul
  • Re: removing duplicate recordswalt31 Jul
    • Re: removing duplicate recordsVivian Wang31 Jul
    • RE: removing duplicate recordsGordon31 Jul
  • getting the maximum valuedestr031 Jul
  • Re: getting the maximum valueGerald Clark31 Jul
RE: removing duplicate recordsDavid Kramer31 Jul
RE: getting the maximum valueChris Kay31 Jul