From: Hank Date: September 23 2011 8:32pm Subject: Re: Slower performance with LOCK TABLES List-Archive: http://lists.mysql.com/mysql/225823 Message-Id: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hello Johan, Just an update. Using the "load index into cache" statement for the 200 million row indexed "source" table, my correlated update statement ran in 1 hour, 45 minutes to update 144 million rows. A 50% increase in performance! Thank you very much, -Hank On Fri, Sep 23, 2011 at 6:56 AM, Johan De Meersman wro= te: > > > ----- Original Message ----- >> From: "Hank" >> >> (please read my ORIGINAL post with all this information). > > Welcome on the Internet, where everyone will tell you everything you need= to know, except for what you want to know :-) > >> I am trying to find a logical or reasonable explanation WHY this would b= e the >> case, despite the fact that the documentation states otherwise (see: Rig= ht here: >> http://dev.mysql.com/doc/refman/5.5/en/lock-tables-restrictions.html) > > I believe you're misinterpreting that, as is the author from the blog you= originally referenced. > > What it says, is "If you are going to run many operations". You are updat= ing many rows, but you are only doing ONE operation: a single update statem= ent. > > While this explains why you're not seeing benefit, I have to admit that I= 'm at a loss, too, as to why you are experiencing an actual slowdown - the = update statement will lock the tables, too, so it shouldn't really make any= difference at all. > >> But if seeing some SQL will make you happy, here is just one example: >> >> UPDATE dest d straight_join source s set d.seq=3Ds.seq WHERE d.key=3Ds.k= ey; > > See, this is why people ask to see your queries. You never mentioned you = were doing a join in the update :-) > > I'll ignore the join condition in the where clause, as it makes little di= fference in this case; but I do note that you're using a straight join. Is = the optimizer really reading the tables in the wrong order, or is it just s= omething you picked up off a blog without realising the impact? > >> =A0Source is indexed by key+seq (key is primary key, but seq is >> =A0included as a covering index). > > Good practice, that should prevent source from being read from disk, if y= our index is fully in the cache. Do you have an index on dest.key, too? Tha= t might help performance as well if it fits in memory, too, because you'll = only need disk access for flushing writes, then. > >> This query takes about 3.5 hours when I don't use LOCK TABLES, and over = 4 hours when I do use LOCK TABLES. > > Most peculiar. Is the difference in performance consistent in repeated ex= ecutions? > >> And before testing each run, I do restart the server so there is no >> query caching and I also use FLUSH TABLES between each test run. > > That's good, as it will give you the worst-case scenario. However, since = such an update is going to wreck your index cache anyway, you may just as w= ell preload the appropriate indices into it beforehand, if the cache is siz= ed big enough to hold them. That might give a minor performance boost, too,= as the server won't have to go to disk every so often to fetch index block= s. > See http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html for that= . > >> | key_buffer_size =A0 =A0 =A0 =A0 =A0| 402653184 | > > 400MB of key buffer, that's not bad at all. Do a /show table status/ in y= our database, and sum the index sizes. If your key buffer is larger than th= is (and why not scale it for growth a bit?) all your indices will fit, whic= h will save on disk access for index lookups *and* for index-covered querie= s. > > > > -- > Bier met grenadyn > Is als mosterd by den wyn > Sy die't drinkt, is eene kwezel > Hy die't drinkt, is ras een ezel >