List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:October 20 2010 7:41pm
Subject:Re: bzr commit into mysql-next-mr-bugfixing branch (roy.lyseng:3253)
Bug#45174 Bug#50019
View as plain text  
Hello,

Roy Lyseng a écrit, Le 07.10.2010 17:13:
> Hi Guilhem,
> 
> On 02.10.10 17.17, Guilhem Bichot wrote:
>> Hello Roy,
>>
>> Thanks for a clearly commented patch.
> 
> And thanks for a thorough review.
> 
>>
>> Roy Lyseng a écrit, Le 29.09.2010 16:34:
>>> #At file:///home/rl136806/mysql/repo/mysql-work5/ based on
>>> revid:tor.didriksen@stripped
>>>
>>> 3253 Roy Lyseng 2010-09-29
>>> Bug#45174: Incorrectly applied equality propagation caused wrong result
>>> on a query with a materialized semi-join.
>>> Bug#50019: Wrong result for IN-query with materialization.
>> ...
>>> mysql-test/r/subquery_sj_none_jcl7.result
>>> Bug#45174: Incorrectly applied equality propagation caused wrong result
>>> on a query with a materialized semi-join.
>>> Bug#50019: Wrong result for IN-query with materialization.
>>> Results for two new tests added.
>>> Some tests using semijoin materialization show that where clause
>>> has moved from the outer query into the materialized inner query.
>>> This is caused by the changed call to get_first() in
>>> eliminate_item_equal().
>>> Ex: select * from ot where a in(select b from it where b>0);
>>> The clause "b>0" is now evaluated on the inner query materialization.
>>> Performance-wise this is never worse when using MaterializeScan and
>>> usually better for MaterializeLookup. For the latter strategy, the
>>> best possible solution is probably to evaluate the clause in both
>>> queries, this can be subject for a later feature development.
>>
>>> === modified file 'mysql-test/r/subquery_sj_mat.result'
>>> --- a/mysql-test/r/subquery_sj_mat.result 2010-09-20 14:06:02 +0000
>>> +++ b/mysql-test/r/subquery_sj_mat.result 2010-09-29 14:33:39 +0000
>>> @@ -2176,7 +2176,7 @@ create table t3 (a int);
>>> insert into t3 select A.a + 10*B.a from t0 A, t0 B;
>>> explain select * from t3 where a in (select kp1 from t1 where kp1<20);
>>> id select_type table type possible_keys key key_len ref rows Extra
>>> -1 PRIMARY t3 ALL NULL NULL NULL NULL 100 Using where
>>> +1 PRIMARY t3 ALL NULL NULL NULL NULL 100
>>> 1 PRIMARY t1 range kp1 kp1 5 NULL 48 Using where; Using index; 
>>> Materialize
>>
>> This one looks worrying, performance-wise. The disappearing "Using 
>> where" was
>> probably a condition "a<20"; having it attached to t3 allowed to do 
>> less lookups
>> in the materialized table.
>>
>> Looking at "handler%" values in "show status" says that after the 
>> patch, the
>> SELECT used 105 Handler_read_key, instead of 25 before. Other 
>> statistics are
>> unchanged. Handler_write stays at 40: there are 40 row writes into the 
>> tmp table
>> (logical, there are 40 rows having kp1<20 in t1)
>> Thus: after the patch, t3's rows are not filtered anymore with a<20, 
>> so the
>> number of key lookups into the materialized table grows 
>> (handler_read_key).
>> Handler_write is unchanged so nothing has changed for t1. Also 
>> confirmed by the
>> unchanged EXPLAIN row for t1 ("range" and "using where").
>>
>> The commit comment says
>>
>> "Some tests using semijoin materialization show that where clause
>> has moved from the outer query into the materialized inner query."
>>
>> but I think here it's not a move, only a loss.
> 
> As far as I can see, it is a move. The inner Where in the previous case 
> seemed to be just a True predicate. However, index analysis occurs 
> before equality propagation, so the "Using index" clause made the query 
> perform well.

Ooook. Before equality propagation, when we still have the original 
condition kp1<20, we decide to do range access on the inner table... 
After equality propagation the condition is moved outside, but the range 
access fortunately remains.

> I created a new table without the index and reran the query. The result 
> was a materialized table with 1000 rows instead of 40 rows.
> 
> What worries me is that in the existing code, the Where clause is moved 
> into the outer query part unconditionally, meaning that if there is no 
> index on the inner tables, it will be impossible to apply a filter 
> predicate on the inner part. After this fix, a predicate on the outer 
> query will remain on the outer query and a predicate on the inner query 
> will remain on the inner query.

ok, got it. It's indeed a good change from this point of view.

> We could develop a feature to copy the inner query predicate to the 
> outer query (or vice versa) in the case of a materialized semijoin, but 
> now it is at least possible for the user to specify these constraints 
> manually.

right

>>> @@ -2277,7 +2277,7 @@ insert into t3 select A.a + 10*B.a, 'fil
>>> explain select * from t3 where a in (select a from t2) and (a > 5 or 
>>> a < 10);
>>> id select_type table type possible_keys key key_len ref rows Extra
>>> 1 PRIMARY t2 ALL NULL NULL NULL NULL 2 Using where; Materialize; Scan
>>> -1 PRIMARY t3 ref a a 5 test.t2.a 1
>>> +1 PRIMARY t3 ref a a 5 test.t2.a 1 Using where
>>
>> This one is worrying too.
>> Before the patch, the condition "(a>5 or a<10) and a is not null" is 
>> pushed to
>> the tmp table ("t2"), and thus nothing needs to be pushed to t3.
>> After the patch, the condition "a is not null" is pushed to the tmp 
>> table, and
>> the condition "a>5 or a<10" is pushed to t3. We can see that 
>> potentially this
>> means writing more rows to the tmp table, again this is only a loss.
>> This isn't visible in "show status like 'handler%'", because "a>5 or 
>> a<10" is
>> true for all rows of t2 (which are '1' and '2') (so pushing or not 
>> pushing the
>> condition changes nothing); but if I change this condition to "a>1", 
>> then I see
>> the tmp table contains 1 row before the patch and 2 after the patch: 
>> more writes
>> to the tmp table, and thus more reads when scanning it.
...
> 
> I'll still have to analyze this.

Looks like it does not happen in the latest patch. Good!

Thread
bzr commit into mysql-next-mr-bugfixing branch (roy.lyseng:3253) Bug#45174Bug#50019Roy Lyseng29 Sep
  • Re: bzr commit into mysql-next-mr-bugfixing branch (roy.lyseng:3253)Bug#45174 Bug#50019Guilhem Bichot2 Oct
    • Re: bzr commit into mysql-next-mr-bugfixing branch (roy.lyseng:3253)Bug#45174 Bug#50019Roy Lyseng5 Oct
    • Re: bzr commit into mysql-next-mr-bugfixing branch (roy.lyseng:3253)Bug#45174 Bug#50019Roy Lyseng7 Oct
      • Re: bzr commit into mysql-next-mr-bugfixing branch (roy.lyseng:3253)Bug#45174 Bug#50019Guilhem Bichot20 Oct