List:General Discussion« Previous MessageNext Message »
From:mos Date:November 20 2005 1:16am
Subject:Is Load Data Infile or Update is faster?
View as plain text  
I am doing a balance line comparison between the rows of an existing table 
and a text file that has newer data in it. So I'm comparing the values 
field by field to the existing rows in the table. If any of the field 
values are different, I need to update the table with these new values. The 
table has around 25 million rows and usually only 1% of the table needs to 
be compared.

I've found 2 ways to update the table with the new values:

1) I could write the new values to a text file and then use "Load Data 
InFile REPLACE ..." which will replace the existing rows for the rows that 
need changing. The problem of course the REPLACE option means it will look 
up the old row using the primary key/unique key, deletes the row, then adds 
the new row. This is disk intensive.

2) The other option would be to execute an Update for each row that needs 
changing and set the changed columns individually. This means the existing 
row will not have to be deleted and only some of the existing row value(s) 
are changed. The problem is there could be 10,000 to 100,000 rows that need 
changing.

So which option is going to be faster? A Load Data Infile that deletes the 
old row and adds a new one, or thousands of Updates changing only 1 to 6 
values at a time?

TIA

Mike

Thread
Is Load Data Infile or Update is faster?mos20 Nov
  • Re: Is Load Data Infile or Update is faster?Rhino20 Nov
    • Re: Is Load Data Infile or Update is faster?mos20 Nov