From: Wm Mussatto Date: October 2 2009 8:18pm Subject: Re: many-inserts go faster the second time List-Archive: http://lists.mysql.com/mysql/218933 Message-Id: <2448.72.87.181.254.1254514690.squirrel@secure2.csz.com> MIME-Version: 1.0 Content-Type: multipart/alternative;boundary="----=_20091002131810_29175" ------=_20091002131810_29175 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit On Fri, October 2, 2009 12:28, Bennett Haselton wrote: > At 02:53 AM 10/2/2009, Joerg Bruehe wrote: >>Hi Bennett, all! >> >> >>Bennett Haselton wrote: >> > At 08:24 AM 9/25/2009, Dan Nelson wrote: >> >> In the last episode (Sep 25), Bennett Haselton said: >> >> > I have a script that runs several times in the evening, and on each >> >> run it >> >> > adds several thousand entries to a table. >> >> > >> >> > On the first run, it adds the entries rather slowly. But then >> on all >> >> > subsequent runs (usually about a minute or two later), the many >> >> inserts go >> >> > a lot faster. This is true regardless of how many entries are >> added by >> >> > each run -- whether the first and second run both add 50,000 or the >> >> first >> >> > and second run both add 10,000, the first run goes slowly and the >> >> second >> >> > one goes fast. But by the following evening, the first run is >> back to >> >> > going slowly again. >> >> > >> >> > It's as if in the minute or two following the first run of the >> script, >> >> > MySQL catches its breath and realizes, hey, that table is getting a >> >> lot of >> >> > entries added to it, so it waves some magic dust so that the next >> >> time I >> >> > add a lot of entries, it goes a lot faster. (Hope I'm not losing >> >> anybody >> >> > with the technical terminology here.) Then by the next evening the >> >> > optimization parameter has exp^W^W^W^W the fairy dust has worn off. >> >> >> >> More likely, this is a relatively unused table, and the first batch >> of >> >> inserts pulls most of the index and some of the table data into RAM, >> >> which >> >> makes for much faster lookups on the next run. What do top and >> iostat >> >> stats >> >> show on both runs? I'd expect heavy disk usage and little CPU on the >> >> first >> >> run, and light disk and heavier CPU usage on the second. >> > >> > That's interesting, I can look at that next time I try it. But if >> > that's the case, wouldn't the first run go slowly at first, but then >> > pick up speed once all of the indexes etc. have been pulled into >> > memory? Because that's not what I'm seeing -- if I insert 50,000 >> in the >> > first run, it's slow all the way through, but then the second >> 50,000 get >> > inserted quickly. >> >>Your "fairy dust" is called "access pattern", evaluated by a LRU or >>similar policy. >> >>Don't forget you may have caching on two levels: database and operating >>system. Both have their own cache aging mechanisms. >>The details about caching and its effects will vary by the table handler >>you are using, MyISAM structures and policies definitely from InnoDB >> ones. >> >>Even if MySQL would not cache data and index pages, they would still >>reside in the operating system's file I/O cache, so the next access to >>them will be faster than the first one - regardless whether you read >>them or modify them. >> >>However, sooner or later they will be removed from all caches because >>they are not accessed until the next evening, whereas other pages were >>accessed and needed space in RAM. >>(Here, I ignore the case of a RAM which is larger than all data accessed >>for a day, it is too unlikely.) >>In the evening, when your job is run again, this starts anew. > > I understand that, but here's the problem: > > If the speed increase were just the result of values being recently > accessed, then the speed increase should be a function of the number > of inserts that I've already done. So if I insert 10,000 rows in one > script run and then 10,000 rows in a second script run, and the > second script run is a lot faster, then the first 10,000 inserts were > enough to optimize everything. So that should mean if I do 50,000 > inserts in a single script run, then the first 10,000 inserts should > be enough to speed everything up. > > But that's not what I'm seeing. What I'm seeing is that if I do > 10,000 inserts on the first run and 10,000 on the second, then the > first run is slow and the second run is fast. On the other hand if I > do 50,000 inserts on the first run and 50,000 on the second, then the > entire first run is slow and the entire second run is fast. > > In any case, is there anything I can do to force MySQL to manually > pre-optimize the entire table (even if it takes some time to do so, > reading indexes into memory or whatever), other than kludgy solutions > like doing a dummy insert of several thousand rows and then deleting them? > > -Bennett Maybe not dummies but rather a subset of the first group.  Then do the rest of the first set.  ------=_20091002131810_29175--