Just an update: using the "LOAD INDEX INTO CACHE" statement for the
200-million-row indexed "source" table, my correlated update statement
ran in 1 hour, 45 minutes to update 144 million rows, roughly a 50%
improvement over the previous 3.5-hour run.
Thank you very much,
On Fri, Sep 23, 2011 at 6:56 AM, Johan De Meersman <vegivamp@stripped> wrote:
> ----- Original Message -----
>> From: "Hank" <heskin@stripped>
>> (please read my ORIGINAL post with all this information).
> Welcome on the Internet, where everyone will tell you everything you need to know,
> except for what you want to know :-)
>> I am trying to find a logical or reasonable explanation WHY this would be
>> the case, despite the fact that the documentation states otherwise (see
>> the link in my original post).
> I believe you're misinterpreting that, as is the author of the blog post you
> originally referenced.
> What it says, is "If you are going to run many operations". You are updating many
> rows, but you are only doing ONE operation: a single update statement.
> While this explains why you're not seeing benefit, I have to admit that I'm at a
> loss, too, as to why you are experiencing an actual slowdown - the update statement will
> lock the tables, too, so it shouldn't really make any difference at all.
>> But if seeing some SQL will make you happy, here is just one example:
>> UPDATE dest d straight_join source s set d.seq=s.seq WHERE d.key=s.key;
> See, this is why people ask to see your queries. You never mentioned you were doing a
> join in the update :-)
> I'll ignore the join condition in the where clause, as it makes little difference in
> this case; but I do note that you're using a straight join. Is the optimizer really
> reading the tables in the wrong order, or is it just something you picked up off a blog
> without realising the impact?
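> To check whether the hint is needed, you could compare the plans of an
> equivalent SELECT (on 5.0, EXPLAIN only works on SELECT, so rewrite the
> UPDATE as a join; table and column names below are taken from your query):

```sql
-- Plan the optimizer would choose on its own:
EXPLAIN SELECT d.seq, s.seq
FROM dest d JOIN source s ON d.key = s.key;

-- Plan with the forced read order, as in your UPDATE:
EXPLAIN SELECT d.seq, s.seq
FROM dest d STRAIGHT_JOIN source s ON d.key = s.key;
```

> If both plans show the same table order and index usage, the
> STRAIGHT_JOIN is doing nothing for you and can be dropped.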
>> Source is indexed by key+seq (key is the primary key, and seq is
>> included so the index is covering).
> Good practice; that should prevent source from being read from disk, if your index is
> fully in the cache. Do you have an index on dest.key, too? That might help performance
> as well if it fits in memory, because you'll only need disk access for flushing writes.
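> For example (the index name here is illustrative; backticks because KEY is
> a reserved word):

```sql
-- Hypothetical: a lookup index on dest.key lets the join side be
-- resolved from the key cache instead of by scanning dest.
ALTER TABLE dest ADD INDEX idx_dest_key (`key`);
```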
>> This query takes about 3.5 hours when I don't use LOCK TABLES, and over 4 hours
>> when I do use LOCK TABLES.
> Most peculiar. Is the difference in performance consistent in repeated executions?
>> And before testing each run, I do restart the server so there is no
>> query caching and I also use FLUSH TABLES between each test run.
> That's good, as it will give you the worst-case scenario. However, since such an
> update is going to wreck your index cache anyway, you may just as well preload the
> appropriate indices into it beforehand, if the cache is sized big enough to hold them.
> That might give a minor performance boost, too, as the server won't have to go to disk
> every so often to fetch index blocks.
> See http://dev.mysql.com/doc/refman/5.0/en/index-preloading.html for that.
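> For example, using your table names (IGNORE LEAVES preloads only the
> non-leaf index blocks, useful when the full indexes won't fit in the cache):

```sql
-- Preload the MyISAM indexes of both tables into the key cache:
LOAD INDEX INTO CACHE source, dest;

-- Or, if the cache is too small for the complete indexes:
LOAD INDEX INTO CACHE source IGNORE LEAVES;
```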
>> | key_buffer_size | 402653184 |
> 400MB of key buffer, that's not bad at all. Do a /show table status/ in your
> database, and sum the index sizes. If your key buffer is larger than this (and why not
> scale it for growth a bit?) all your indices will fit, which will save on disk access for
> index lookups *and* for index-covered queries.
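> On 5.0 you can also sum them in one query via information_schema instead
> of adding up the SHOW TABLE STATUS rows by hand:

```sql
-- Total index size in bytes across the current database;
-- compare the result against key_buffer_size (402653184).
SELECT SUM(index_length) AS total_index_bytes
FROM information_schema.TABLES
WHERE table_schema = DATABASE();
```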
> Beer with grenadine
> Is like mustard with wine
> She who drinks it is a prude
> He who drinks it is soon a donkey