List:General Discussion« Previous MessageNext Message »
From:Joerg Bruehe Date:October 2 2009 9:53am
Subject:Re: many-inserts go faster the second time
View as plain text  
Hi Bennett, all!


Bennett Haselton wrote:
> At 08:24 AM 9/25/2009, Dan Nelson wrote:
>> In the last episode (Sep 25), Bennett Haselton said:
>> > I have a script that runs several times in the evening, and on each
>> run it
>> > adds several thousand entries to a table.
>> >
>> > On the first run, it adds the entries rather slowly.  But then on all
>> > subsequent runs (usually about a minute or two later), the many
>> inserts go
>> > a lot faster.  This is true regardless of how many entries are added by
>> > each run -- whether the first and second run both add 50,000 or the
>> first
>> > and second run both add 10,000, the first run goes slowly and the
>> second
>> > one goes fast.  But by the following evening, the first run is back to
>> > going slowly again.
>> >
>> > It's as if in the minute or two following the first run of the script,
>> > MySQL catches its breath and realizes, hey, that table is getting a
>> lot of
>> > entries added to it, so it waves some magic dust so that the next
>> time I
>> > add a lot of entries, it goes a lot faster.  (Hope I'm not losing
>> anybody
>> > with the technical terminology here.) Then by the next evening the
>> > optimization parameter has exp^W^W^W^W the fairy dust has worn off.
>>
>> More likely, this is a relatively unused table, and the first batch of
>> inserts pulls most of the index and some of the table data into RAM,
>> which
>> makes for much faster lookups on the next run.  What do top and iostat
>> stats
>> show on both runs?  I'd expect heavy disk usage and little CPU on the
>> first
>> run, and light disk and heavier CPU usage on the second.
> 
> That's interesting, I can look at that next time I try it.  But if
> that's the case, wouldn't the first run go slowly at first, but then
> pick up speed once all of the indexes etc. have been pulled into
> memory?  Because that's not what I'm seeing -- if I insert 50,000 in the
> first run, it's slow all the way through, but then the second 50,000 get
> inserted quickly.

Your "fairy dust" is called "access pattern", evaluated by a LRU or
similar policy.

Don't forget you may have caching on two levels: database and operating
system. Both have their own cache aging mechanisms.
The details about caching and its effects will vary by the table handler
you are using, MyISAM structures and policies definitely from InnoDB ones.

Even if MySQL would not cache data and index pages, they would still
reside in the operating system's file I/O cache, so the next access to
them will be faster than the first one - regardless whether you read
them or modify them.

However, sooner or later they will be removed from all caches because
they are not accessed until the next evening, whereas other pages were
accessed and needed space in RAM.
(Here, I ignore the case of a RAM which is larger than all data accessed
for a day, it is too unlikely.)
In the evening, when your job is run again, this starts anew.


HTH,
Jörg

-- 
Joerg Bruehe,  MySQL Build Team,  Joerg.Bruehe@stripped
Sun Microsystems GmbH,   Komturstraße 18a,   D-12099 Berlin
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Haering     Muenchen: HRB161028

Thread
many-inserts go faster the second timeBennett Haselton25 Sep
Re: many-inserts go faster the second timeDan Nelson25 Sep
  • Re: many-inserts go faster the second timeBennett Haselton25 Sep
Re: many-inserts go faster the second timeJoerg Bruehe2 Oct
  • Re: many-inserts go faster the second timeBennett Haselton2 Oct
Re: many-inserts go faster the second timeWm Mussatto2 Oct
Re: many-inserts go faster the second timeJoerg Bruehe6 Oct