List:General Discussion« Previous MessageNext Message »
From:Devananda Date:October 11 2005 6:25am
Subject:Re: Non-linear degradation in bulk loads?
View as plain text  
Jon Frisby wrote:
> Everyone,
> 
> We're trying to do some bulk data loads on several different tables (on
> several different machines, using several different techniques) and
> seeing dramatically worse-than-linear performance.
> 
> We've tried the bulk-INSERT syntax, and the LOAD DATA INFILE syntax.
> We've done ALTER TABLE ... DISABLE KEYS, SET FOREIGN_KEY_CHECKS=0 (where
> appropriate), and so forth.
> 
> The one that is the most immediate concern is a table of the form:
> 
> CREATE TABLE `test` (
>   `email` varchar(255) NOT NULL default '',
>   `when_happened` datetime NOT NULL default '0000-00-00 00:00:00',
>   UNIQUE KEY `email` (`email`),
>   KEY `when_happened` (`when_happened`)
> ) TYPE=InnoDB;
> 
> I'm loading data using LOAD DATA INFILE with chunks containing 3.4m rows
> each (~135MB files).  The first chunk was very quick (about 1.5
> minutes), but the tenth chunk has taken 22.6 hours and is still going.
> (It's been getting progessively slower with each chunk...)
> 
> The database is our main sites database but we've dramatically reduced
> the load on that machine over the past couple months through careful
> optimization of our code.  The box is a dual, dual-core Opteron, 8GB of
> RAM running a 32-bit Linux 2.4 kernel and MySQL 4.0.20 (32-bit of
> course).  We have 1GB allocated to the buffer pool, and our usual 1GB *
> 3 log files.  8 I/O threads.
> 
> Load on the box sits at around 6-7, with a large (>50%) amount of time
> spent in wait state, but actual disk throughput to our software RAID
> array (No longer on a SAN...) is quite low -- 6-9k blocks/s out, 1-6k
> blocks/s in.
> 
> Something *has* to be wrong here, but we're not sure what we've missed.
> We've restored larger data sets from a mysqldump in the past in
> dramatically less time on far inferior hardware. (A superset of this
> same data to a schema which is also a superset, PLUS a bunch of other
> rather large tables -- all in ~8 hours on a 3Ware RAID array on a dual
> Xeon w/ 4GB of RAM)
> 
> We're inclined to believe that this is a configuration problem, as
> opposed to a driver or hardware problem given the non-linear nature of
> the performance degradation.  This implies we're doing something truly
> stupid with our loads.  What could cause this kind of strangeness?
> 
> -JF
> 

Hi Jon,

I experienced this same non-linear degradation during large imports, 
exactly like you are describing, about 18 months ago. I don't remember 
if I found a specific cause, but I am fairly certain that it was related 
to a few issues, and that we did resolve it. I have not seen this happen 
with MyISAM tables, and we were able import our full data by breaking it 
into chunks and waiting between each chunk, so I believe it to be 
related to InnoDB's logs in some way. Since you are already importing 
your data in chunks, try making each chunk a separate transaction, or 
waiting until disk activity slows to load the next chunk. Also, there 
have been major improvements to InnoDB in the later 4.1 releases, so if 
possible, I would suggest upgrading.

Hope that helps!
Devananda vdv
Thread
Non-linear degradation in bulk loads?Jon Frisby11 Oct
  • Re: Non-linear degradation in bulk loads?Manoj11 Oct
  • Re: Non-linear degradation in bulk loads?C.R. Vegelin11 Oct
    • Re: Non-linear degradation in bulk loads?mos11 Oct
  • Re: Non-linear degradation in bulk loads?Devananda11 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby11 Oct
Re: Non-linear degradation in bulk loads?Heikki Tuuri11 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby12 Oct
  • Re: Non-linear degradation in bulk loads?Heikki Tuuri12 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby14 Oct
Re: Non-linear degradation in bulk loads?Heikki Tuuri16 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby17 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby17 Oct
  • Re: Non-linear degradation in bulk loads?Heikki Tuuri17 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby17 Oct
  • Re: Non-linear degradation in bulk loads?Heikki Tuuri17 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby17 Oct
RE: Non-linear degradation in bulk loads?Jon Frisby18 Oct
Re: Non-linear degradation in bulk loads?Heikki Tuuri18 Oct