List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:August 25 2010 5:50pm
Subject:Re: bzr commit into mysql-5.1-bugteam branch (martin.hansson:3478)
Bug#51070
View as plain text  
Hello Martin,

Martin Hansson a écrit, Le 06.08.2010 15:26:
> #At file:///data0/martin/bzr/bug51070/5.1bt-guilhem/ based on
> revid:bjorn.munch@stripped
> 
>  3478 Martin Hansson	2010-08-06
>       Bug#51070: Query with a NOT IN subquery predicate returns a wrong result set
>       
>       The EXISTS transformation has additional swithces to catch most corner cases
>       that appear when translating an IN predicate (into EXISTS). Guarded conditions
>       are used which are deactivated when a NULL value is seen in the outer
>       expression's row. When the inner query block supplies NULL values, however,
>       they are filtered out because no distinction is made between the guarded
>       conditions; guarded NOT x IS NULL conditions in the HAVING clause that filter
>       out NULL values cannot be de-activated in isolation from those that match
>       values or from the outer expression or NULL's.
>       
>       The above problem is handled by making the guarded conditions remember whether
>       they have rejected a NULL value or not, and index access methods are taking
>       this into account as well. 
>       
>       The bug consisted of 
>       
>       1) Not resetting the property for every nested loop iteration on the inner
>         query's result
>       
>       2) Not propagating the NULL result properly from inner query to IN optimizer.
>       
>       3) A hack that aimed to fix #2 by returning NULL when false was actually
>          the result. This caused failures when #2 was properly fixed.
> 
>     modified:
>       mysql-test/r/subselect4.result
>       mysql-test/t/subselect4.test
>       sql/item_cmpfunc.cc
>       sql/item_subselect.cc

> === modified file 'mysql-test/t/subselect4.test'
> --- a/mysql-test/t/subselect4.test	2010-08-05 10:42:14 +0000
> +++ b/mysql-test/t/subselect4.test	2010-08-06 13:26:42 +0000
> @@ -74,6 +74,67 @@ CREATE VIEW v1 AS SELECT 1 LIKE ( 1 IN (
>  CREATE VIEW v2 AS SELECT 1 LIKE '%' ESCAPE ( 1 IN ( SELECT 1 ) );
>  DROP VIEW v1, v2;
>  
> +--echo # 
> +--echo # Bug#51070: Query with a NOT IN subquery predicate returns a wrong
> +--echo # result set
> +--echo # 
> +CREATE TABLE t1 ( a INT, b INT );
> +INSERT INTO t1 VALUES ( 1, NULL ), ( 2, NULL );
> +
> +CREATE TABLE t2 ( c INT, d INT );
> +INSERT INTO t2 VALUES ( NULL, 3 ), ( NULL, 4 );
> +
> +CREATE TABLE t3 ( e INT, f INT );
> +INSERT INTO t3 VALUES ( NULL, NULL ), ( NULL, NULL );
> +
> +CREATE TABLE t4 ( a INT );
> +INSERT INTO t4 VALUES (1), (2), (3);
> +
> +CREATE TABLE t5 ( a INT );
> +INSERT INTO t5 VALUES (NULL), (2);
> +
> +--replace_column 1 x 3 x 4 x 5 x 6 x 7 x 8 x 9 x 10 x
> +EXPLAIN
> +SELECT * FROM t1 WHERE ( a, b ) NOT IN ( SELECT c, d FROM t2 );
> +SELECT * FROM t1 WHERE ( a, b ) NOT IN ( SELECT c, d FROM t2 );
> +
> +EXPLAIN
> +SELECT * FROM t1 WHERE ( a, b ) NOT IN ( SELECT c, d FROM t2 ) IS NULL;

for the query above you only do EXPLAIN, you don't run the SELECT.

> +SELECT * FROM t1 WHERE ( a, b ) IN ( SELECT c, d FROM t2 ) IS NULL;
> +SELECT * FROM t1 WHERE ( a, b ) NOT IN ( SELECT c, d FROM t2 ) IS UNKNOWN;
> +SELECT * FROM t1 WHERE (( a, b ) NOT IN ( SELECT c, d FROM t2 )) IS UNKNOWN;

as you add (), is it that
( a, b ) NOT IN ( SELECT c, d FROM t2 ) IS UNKNOWN
stands for
( a, b ) NOT IN (( SELECT c, d FROM t2 ) IS UNKNOWN)
? Mmmm no that wouldn't be parsable, (c,d) can't be compared to UNKNOWN.
Then maybe the added () are "just for fun"?

> +
> +SELECT * FROM t1 WHERE 1 = 1 AND ( a, b ) NOT IN ( SELECT c, d FROM t2 );
> +
> +--replace_column 1 x 3 x 4 x 5 x 6 x 7 x 8 x 9 x 10 x
> +EXPLAIN
> +SELECT * FROM t1 WHERE ( a, b ) NOT IN ( SELECT e, f FROM t3 );
> +SELECT * FROM t1 WHERE ( a, b ) NOT IN ( SELECT e, f FROM t3 );
> +
> +--replace_column 1 x 3 x 4 x 5 x 6 x 7 x 8 x 9 x 10 x
> +EXPLAIN
> +SELECT * FROM t2 WHERE ( c, d ) NOT IN ( SELECT a, b FROM t1 );
> +SELECT * FROM t2 WHERE ( c, d ) NOT IN ( SELECT a, b FROM t1 );
> +
> +--replace_column 1 x 3 x 4 x 5 x 6 x 7 x 8 x 9 x 10 x
> +EXPLAIN
> +SELECT * FROM t3 WHERE ( e, f ) NOT IN ( SELECT c, d FROM t2 );
> +SELECT * FROM t3 WHERE ( e, f ) NOT IN ( SELECT c, d FROM t2 );
> +
> +--replace_column 1 x 3 x 4 x 5 x 6 x 7 x 8 x 9 x 10 x
> +EXPLAIN
> +SELECT * FROM t2 WHERE ( c, d ) NOT IN ( SELECT e, f FROM t3 );
> +SELECT * FROM t2 WHERE ( c, d ) NOT IN ( SELECT e, f FROM t3 );
> +
> +SELECT * FROM t1 WHERE ( a, b ) NOT IN 
> +  ( SELECT c, d FROM t2 WHERE c = 1 AND c <> 1 );
> +
> +SELECT * FROM t1 WHERE b NOT IN ( SELECT c FROM t2 WHERE c = 1 );
> +
> +SELECT * FROM t1 WHERE NULL NOT IN ( SELECT c FROM t2 WHERE c = 1 AND c <> 1
> );
> +
> +DROP TABLE t1, t2, t3, t4, t5;
> +
>  
>  --echo #
>  --echo # End of 5.1 tests.
> 
> === modified file 'sql/item_cmpfunc.cc'
> --- a/sql/item_cmpfunc.cc	2010-08-05 10:42:14 +0000
> +++ b/sql/item_cmpfunc.cc	2010-08-06 13:26:42 +0000
> @@ -1751,6 +1751,34 @@ bool Item_in_optimizer::fix_fields(THD *
>  }
>  
>  
> +/**
> +   The implementation of optimized <outer expression> [NOT] IN
> <subquery>
> +   predicates. The implementation works as follows.
> +
> +   For the current value of the outer expression
> +   
> +   - If it contains only NULL values, the original (before rewrite by the
> +     Item_in_subselect rewrite methods) inner subquery is correlated and it

correlated or _not_ correlated?

> +     was previously executed, there is no need to re-execute it, and the
> +     previous return value is returned.
> +
> +   - If it contains NULL values, check if there is a partial match for
> +     the inner query block by evaluating the query. The evaulation will

evaulation

> +     proceed according to special rules set up elsewhere. These rules include:
> +
> +     - The HAVING NOT <column> IS NULL conditions added by the aforementioned

here I would write <inner_column> instead of <column>

> +       rewrite methods will detect whether they evaulated a NULL value and if

evaulated

> +       so, will cause the subquery to evaluate to NULL. See
> +       Item_in_subselect::val_bool().
> +     
> +     - If there is an eligible index for executing the subquery, the special
> +       access method "Full scan on NULL key" is employed which ensures that
> +       the inner query will detect if there are NULL values resulting from the
> +       inner query.
> +
> +   - If it contains no NULL values, the call is forwarded to the inner query
> +     block.
> + */
>  longlong Item_in_optimizer::val_int()
>  {
>    bool tmp;

> @@ -1818,8 +1846,11 @@ longlong Item_in_optimizer::val_int()
>        else 
>        {
>          /* The subquery has to be evaluated */
> -        (void) args[1]->val_bool_result();
> -        null_value= !item_subs->engine->no_rows();
> +        (void) item_subs->val_bool_result();
> +        if (item_subs->engine->no_rows())
> +          null_value= item_subs->null_value;
> +        else
> +          null_value= TRUE;

This is correct but I think we need a long and clear comment to explain 
the logic of this if/else (don't hesitate to write a long example in the 
comment, that's the only way I understand it, and I think future readers 
would benefit from it).

>          if (all_left_cols_null)
>            result_for_null_param= null_value;
>        }

The code itself is ok to push. Tests' results seem ok (I guess they 
would fail without the code fix?).
If QA had a RQG grammar about partial matches, or could quickly write 
one, it would be good to run it... it's hard to be 100% sure about fixes 
to this subselect code.

-- 
Mr. Guilhem Bichot <guilhem.bichot@stripped>
Oracle / MySQL / Optimizer team, Lead Software Engineer
Bordeaux, France
www.oracle.com / www.mysql.com
Thread
Re: bzr commit into mysql-5.1-bugteam branch (martin.hansson:3478)Bug#51070Guilhem Bichot25 Aug
  • Re: bzr commit into mysql-5.1-bugteam branch (martin.hansson:3478)Bug#51070Roy Lyseng25 Aug
  • Re: bzr commit into mysql-5.1-bugteam branch (martin.hansson:3478)Bug#51070Martin Hansson30 Aug
    • Re: bzr commit into mysql-5.1-bugteam branch (martin.hansson:3478)Bug#51070Guilhem Bichot31 Aug
      • Re: bzr commit into mysql-5.1-bugteam branch (martin.hansson:3478)Bug#51070Roy Lyseng31 Aug