List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:January 17 2011 2:18pm
Subject:Re: bzr commit into mysql-5.1 branch (ole.john.aske:3534) Bug#58490
View as plain text  
Hello,

Ole John Aske a écrit, Le 11.01.2011 10:00:
> #At file:///net/fimafeng09/export/home/tmp/oleja/mysql/mysql-5.1/ based on
> revid:nirbhay.choubey@stripped
> 
>  3534 Ole John Aske	2011-01-11
>       Fix for bug#58490, 'Incorrect result in multi level OUTER JOIN in combination
> with IS NULL'
>       
>       As this bug is a duplicate of bug#49322, it also include tescases
>       covering this bugreport
>       
>       Qualifying an OUTER JOIN with the condition 'WHERE <column> IS NULL',
>       where <column> is declared as 'NOT NULL' causes the
>       'not_exists_optimize' to be enabled by the optimizer.
>       
>       In evaluate_join_record() the 'not_exists_optimize' caused
>       'NESTED_LOOP_NO_MORE_ROWS' to be returned immediately
>       when a matching row was found.
>       
>       However, as the 'not_exists_optimize' is derived from
>       'JOIN_TAB::select_cond', the usual rules for condition guards
>       also applies for 'not_exist_optimize'. It is therefore incorrect
>       to check 'not_exists_optimize' without ensuring that all guards
>       protecting it is 'open'.

good explanation

>       This fix uses the fact that 'not_exists_optimize' is derived from
>       a 'is_null' condition term in 'tab->select_cond'. Futhrermore,
>       'is_null' will evaluate to 'false' for any 'non-null' rows
>       once all guards protecting the is_null is open.
>       
>       We can use this knowledge as an implicit guard check for the
>       'not_exists_optimize' by moving 'if (...not_exists_optimize)'
>       inside the handling of 'select_cond==false'. It will then
>       not take effect before its guards are open.
>       
>       We also add an assert which requires that a
>       'not_exists_optimize' always comes together with
>       a select_cond. (containing 'is_null').
> 

> === modified file 'mysql-test/r/join_outer.result'
> --- a/mysql-test/r/join_outer.result	2010-10-29 08:23:06 +0000
> +++ b/mysql-test/r/join_outer.result	2011-01-11 09:00:00 +0000
> @@ -1427,4 +1427,146 @@ WHERE t1.f1 = 4 AND t2.f1 IS NOT NULL AN
>  GROUP BY t2.f1, t2.f2;
>  f1	f1	f2
>  DROP TABLE t1,t2;
> +#
> +# Bug#58490: Incorrect result in multi level OUTER JOIN
> +# in combination with IS NULL
> +#
> +CREATE TABLE t1 (i INT NOT NULL);
> +INSERT INTO t1 VALUES (0),    (2),(3),(4);
> +CREATE TABLE t2 (i INT NOT NULL);
> +INSERT INTO t2 VALUES (0),(1),    (3),(4);
> +CREATE TABLE t3 (i INT NOT NULL);
> +INSERT INTO t3 VALUES (0),(1),(2),    (4);
> +CREATE TABLE t4 (i INT NOT NULL);
> +INSERT INTO t4 VALUES (0),(1),(2),(3)   ;
> +SELECT * FROM
> +t1 LEFT JOIN
> +( t2 LEFT JOIN
> +( t3 LEFT JOIN
> +t4
> +ON t4.i = t3.i
> +)
> +ON t3.i = t2.i
> +)
> +ON t2.i = t1.i

I didn't check correctness of all queries' results. I hope you did.

> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc	2010-12-28 23:47:05 +0000
> +++ b/sql/sql_select.cc	2011-01-11 09:00:00 +0000
> @@ -11500,17 +11500,40 @@ evaluate_join_record(JOIN *join, JOIN_TA
>        first_unmatched->found= 1;
>        for (JOIN_TAB *tab= first_unmatched; tab <= join_tab; tab++)
>        {
> -        if (tab->table->reginfo.not_exists_optimize)
> -          return NESTED_LOOP_NO_MORE_ROWS;
>          /* Check all predicates that has just been activated. */
>          /*
>            Actually all predicates non-guarded by first_unmatched->found
>            will be re-evaluated again. It could be fixed, but, probably,
>            it's not worth doing now.
>          */
> +        /*
> +          not_exists_optimize has been created from a
> +          select_cond containing 'is_null'. This 'is_null'
> +          condition is still present on any 'tab' with
> +          'not_exists_optimize'. Furthermore, the usual rules
> +          for condition guards also applies for
> +          'not_exists_optimize' -> When 'is_null==false' we
> +          know all cond. guards are open and we can apply
> +          the 'not_exists_optimize'.
> +        */
> +        DBUG_ASSERT(!(tab->table->reginfo.not_exists_optimize &&
> +                     !tab->select_cond));

This assertion is probably true.
However, imagine we have
   select * from t1 left join t2 on 1 where t2.a is null and t1.a=t2.a;
where t2.a is declared NOT NULL and has an index.
The query above is converted to inner join, which ruins my 
demonstration, but if it were not converted it would be processed like this:
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | 
ref  | rows | Extra                                |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------------------+
|  1 | SIMPLE      | t1    | ALL   | NULL          | NULL | NULL    | 
NULL |    1 | Using where                          |
|  1 | SIMPLE      | t2    | index | NULL          | a    | 4       | 
NULL |    1 | Using where; Using index; Not exists |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------------------+
and the conditions attached to tables (tab->select_cond) would be:
- for t1:
"isnull(`test`.`t1`.`a`)"
- for t2:
"(trigcond_if(found_match(t2), ((`test`.`t2`.`a` = `test`.`t1`.`a`) and 
(trigcond_if(is_not_null_compl(t2), 1, true) and 
trigcond_if(is_not_null_compl(t2), 1, true))), true) and 
trigcond_if(is_not_null_compl(t2), 1, true))"

We can see that nothing in t2's condition is about NOT NULL, though t2 
as "not exists". NOT NULL was moved to t1. The assertion would still 
pass because it is not 0 due to t2.a=t1.a.
What I want to show here is that it's not so straightforward that "this 
'is_null' condition is still present on any 'tab' with 
'not_exists_optimize'": it could have been moved to another earlier 
table by the optimizer, as in t1 above.
But as it seems practically impossible to create a case where the 
assertion fails, let's keep it.

> +
>          if (tab->select_cond && !tab->select_cond->val_int())
>          {
>            /* The condition attached to table tab is false */
> +
> +          if (tab->table->reginfo.not_exists_optimize)
> +          {

This new code follows from "When 'is_null==false' we
know all cond. guards are open and we can apply
the 'not_exists_optimize'."
But what we know is that "tab->select_cond->val_int() is false".
Without much reasoning, tab->select_cond could be made of
   cond1 AND trigcond(tab->found, tab.field IS NULL)
and cond1 could be false, and tab->found could be false, and thus, 
tab->select_cond could be false while IS NULL - and thus 
"not_exists_optimize" - would not be applicable yet. "is_null" would not 
be false yet.
But then:
- cond1 would be _newly_ false; otherwise even the first evaluation of 
tab->select_cond->val_int() at start of evaluate_join_record(), in this 
stack frame or in upper frames for upper tables, would have been false, 
so we wouldn't come into the while() loop above.
- This value change would come from one or more "found" guard having 
changed to "true" (not_null_compl guards are very probably not playing a 
role here, there are for ON conditions).
- Last, cond1 and IS NULL are probably guarded by _the_same_ "found" 
guard, because they are probably pieces of the WHERE and all WHERE 
pieces pertaining to a table probably get the same "found" guard. So if 
cond1 is enabled it's probably true that IS NULL is enabled too, so "not 
exists" is applicable...
Given the number of "probably", I believe it will take a long time 
before we see this new code break (if this happens).
So, ok to push. If possible, please try to explain in a comment why 
tab->select_cond->val_int() being false implies that the _IS_NULL_ piece 
of it is enabled.

For what it's worth, I am not so happy with how not_exists lives a life 
so separate from cond guards and the Item implementing NOT NULL. Makes 
it hard to be 100% sure of when it's applicable. And when later we 
change something in how the optimizer handles NOT NULL, we will have to 
remember to rethink "not exists" :-(


> +            /*
> +              When not_exists_optimize is set: No need to further
> +              explore more records of 'tab' for this partial result.
> +              Any found 'tab' matches are known to evaluate to 'false'.
> +              Returning .._NO_MORE_ROWS will skip rem. 'tab' records.
> +            */
> +            return NESTED_LOOP_NO_MORE_ROWS;
> +          }
> +
>            if (tab == join_tab)
>              found= 0;
>            else
> 
> 
> 
> ------------------------------------------------------------------------
> 
> 


-- 
Mr. Guilhem Bichot <guilhem.bichot@stripped>
Oracle / MySQL / Optimizer team, Lead Software Engineer
Bordeaux, France
www.oracle.com / www.mysql.com
Thread
bzr commit into mysql-5.1 branch (ole.john.aske:3534) Bug#58490Ole John Aske11 Jan
  • Re: bzr commit into mysql-5.1 branch (ole.john.aske:3534) Bug#58490Roy Lyseng11 Jan
  • Re: bzr commit into mysql-5.1 branch (ole.john.aske:3534) Bug#58490Guilhem Bichot17 Jan