List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:May 10 2010 7:58pm
Subject:bzr commit into mysql-next-mr-bugfixing branch (guilhem:3163) Bug#46743
View as plain text  
#At file:///home/mysql_src/bzrrepos/mysql-next-mr-opt-backporting2/ based on revid:oystein.grovlen@stripped

 3163 Guilhem Bichot	2010-05-10
      Fix for BUG#46743 "Azalea processing correlated, aggregate SELECT subqueries incorrectly".
      (Backporting of guilhem@stripped ). See sql_select.cc for details.
     @ mysql-test/r/subselect4.result
        result (before the fix, there would be an error in the "GROUP BY" SELECT not using IFNULL,
        because some rows would be said to match "=NULL").
     @ mysql-test/t/subselect4.test
        test for BUG#46743
     @ sql/sql_select.cc
        Symptom: "=x" condition matched rows when x is NULL, which should
        never be.
        Summary of the problem: likely a failure of 
        http://forge.mysql.com/wiki/MySQL_Internals_Optimizer#Late_NULLs_Filtering
        which didn't eliminate NULL values in some cases.
        Details below.
        
        Consider tables created and filled as in the minimal testcase of
        subselect4.test in this revision.
        Consider three queries:
        1) SELECT (SELECT 1 FROM D WHERE d = c) FROM C ;
        2) SELECT (SELECT 1 FROM D WHERE d = c) FROM C GROUP BY c ;
        3) SELECT (SELECT 1 FROM D WHERE d = IFNULL(c,NULL)) FROM C ;
        C has one row, C.c is NULL, so all 3 queries are equivalent and should
        return NULL (because the subquery gives an empty result): but the
        second query returns "error 1242: Subquery returns more than 1 row",
        which is unexpected.
        It's all about what happens inside the subquery. Let's compare how
        the subquery is optimized and executed in each of those three queries.
        
        For 1) optimizer understands that d=c should eliminate NULL D.d values:
        it sets "null_rejecting" to "true" in add_key_field(). It also sees
        that the condition d=c will be a byproduct of using the "ref"
        index lookup method so does not need to be explicitely evaluated: this
        condition (COND object) is thus removed, this way: make_join_select()
        calls "tmp= make_cond_for_table()" which calls
        make_cond_for_table_from_pred(); around "EQ_FUNC" in that function, a
        zero COND is returned, so "tmp" is 0, thus we come to the "else
         tab->select_cond= sel->cond= NULL" branch. When the subquery
        is executed, join_read_always_key() does not even read keys of D
        because as C.c is NULL, null_rejecting shortcuts.
        
        For 3) because IFNULL is a function, optimizer does not understand
        that d = IFNULL(c,NULL)) should eliminate null D.d values:
        null_rejecting is set to false. The condition is not removed, also
        because IFNULL is a function (both equality members have too different
        types). join_read_always_key() reads the index, searching for NULL
        keys, finds two of them, but they are eliminated by the kept
        condition.
        
        For 2) optimizer does not understand that d=c should eliminate NULL
        D.d values: null_rejecting is false. But the condition is
        eliminated. So join_read_always_key() finds two NULLs and returns them
        (no condition, and null_rejecting=false): error. How did this happen:
         * setting null_rejecting to true requires that both members of d=c
         are Item_field. But as there is GROUP BY, "c" in d=c is an
         Item_outer_ref (built in Item_field::fix_outer_field() around "If an
         outer field is resolved in a grouping select etc"), not an
         Item_field, so null_rejecting is false. 
         * make_cond_for_table() eliminates the COND condition because in d=c,
         test_if_ref() looks for __real__ items: it sees that inside
         Item_outer_ref "c", there is an Item_field, so it sees two Item_field
         on the sides of the equality and thus sees that "ref" will verify the
         condition naturally (see test_if_ref() calling eq_def()).
        
        Overall it looks like there is some assymetry between test_if_ref()
        (compares real_item()) and add_key_field() (compares plain item), the
        former catching a broader set of conditions (thus in some cases the condition
        is eliminated (COND removal) by the former but not "restored" (in the
        form of null_rejecting) by the latter). This assymetry was
        introduced by placing "right_item= right_item->real_item();" in
        test_in_ref() by 
        sp1r-gkodinov/kgeorge@stripped (as a fix for
        BUG 30788; revert this line and the present bug goes away).
        Symmetry is put back by using real_item() in add_key_field() too, which the
        present patch does. It looks logical, because what matters in d=c is
        whether "c" comes from a field, it's ok if "c" goes through some
        intermediate steps like GROUP BY (which makes Item_outer_ref).
        * the bugfix: when computing null_rejecting, use real_item() and not the base item,
        so that we can discover an Item_field inside an Item_outer_ref.
        I use
          if (a)
            null_rejecting= true;
          else
            null_rejecting=false;
        so that the wrong path ("=false") shows up in QA's differential coverage tool which spots lines
        executed only in failing tests (it's on the intranet, contact me for more details).
        * the assertion in add_not_null_conds() must then be updated, because it can now
        meet something else than FIELD_ITEM and must look at the real item. Note that add_not_null_conds()
        is executed in our subquery, but quits the iteration at the "continue;".

    modified:
      mysql-test/r/subselect4.result
      mysql-test/t/subselect4.test
      sql/sql_select.cc
=== modified file 'mysql-test/r/subselect4.result'
--- a/mysql-test/r/subselect4.result	2010-05-07 21:14:09 +0000
+++ b/mysql-test/r/subselect4.result	2010-05-10 19:31:43 +0000
@@ -79,6 +79,51 @@ id	select_type	table	type	possible_keys	
 DROP TABLE t1;
 End of 5.5 tests.
 #
+# BUG#46743 "Azalea processing correlated, aggregate SELECT
+# subqueries incorrectly"
+#
+CREATE TABLE t1 (c int);
+INSERT INTO t1 VALUES (NULL);
+CREATE TABLE t2 (d int , KEY (d));
+INSERT INTO t2 VALUES (NULL),(NULL);
+0 rows in subquery
+SELECT 1 AS RESULT FROM t2,t1 WHERE d = c;
+RESULT
+base query
+SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 ;
+RESULT
+NULL
+EXPLAIN EXTENDED SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 ;
+id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
+1	PRIMARY	t1	system	NULL	NULL	NULL	NULL	1	100.00	
+2	DEPENDENT SUBQUERY	NULL	NULL	NULL	NULL	NULL	NULL	NULL	NULL	Impossible WHERE noticed after reading const tables
+Warnings:
+Note	1276	Field or reference 'test.t1.c' of SELECT #2 was resolved in SELECT #1
+Note	1003	select (select 1 AS `1` from `test`.`t2` where (`test`.`t2`.`d` = '0')) AS `RESULT` from dual
+first equivalent variant
+SELECT (SELECT 1 FROM t2 WHERE d = IFNULL(c,NULL)) AS RESULT FROM t1 GROUP BY c ;
+RESULT
+NULL
+EXPLAIN EXTENDED SELECT (SELECT 1 FROM t2 WHERE d = IFNULL(c,NULL)) AS RESULT FROM t1 GROUP BY c;
+id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
+1	PRIMARY	t1	system	NULL	NULL	NULL	NULL	1	100.00	
+2	DEPENDENT SUBQUERY	NULL	NULL	NULL	NULL	NULL	NULL	NULL	NULL	Impossible WHERE noticed after reading const tables
+Warnings:
+Note	1276	Field or reference 'test.t1.c' of SELECT #2 was resolved in SELECT #1
+Note	1003	select (select 1 AS `1` from `test`.`t2` where (`test`.`t2`.`d` = ifnull('0',NULL))) AS `RESULT` from dual group by '0'
+second equivalent variant
+SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 GROUP BY c ;
+RESULT
+NULL
+EXPLAIN EXTENDED SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 GROUP BY c ;
+id	select_type	table	type	possible_keys	key	key_len	ref	rows	filtered	Extra
+1	PRIMARY	t1	system	NULL	NULL	NULL	NULL	1	100.00	
+2	DEPENDENT SUBQUERY	NULL	NULL	NULL	NULL	NULL	NULL	NULL	NULL	Impossible WHERE noticed after reading const tables
+Warnings:
+Note	1276	Field or reference 'test.t1.c' of SELECT #2 was resolved in SELECT #1
+Note	1003	select (select 1 AS `1` from `test`.`t2` where (`test`.`t2`.`d` = '0')) AS `RESULT` from dual group by '0'
+DROP TABLE t1,t2;
+#
 # BUG#49630: Segfault in select_describe() with double 
 #            nested subquery and materialization
 #

=== modified file 'mysql-test/t/subselect4.test'
--- a/mysql-test/t/subselect4.test	2010-05-07 21:14:09 +0000
+++ b/mysql-test/t/subselect4.test	2010-05-10 19:31:43 +0000
@@ -81,6 +81,34 @@ DROP TABLE t1;
 
 --echo End of 5.5 tests.
 
+
+--echo #
+--echo # BUG#46743 "Azalea processing correlated, aggregate SELECT
+--echo # subqueries incorrectly"
+--echo #
+# To see the bug, one would have to undo both the fix for BUG#46743
+# and the one for BUG#47123.
+CREATE TABLE t1 (c int);
+INSERT INTO t1 VALUES (NULL);
+CREATE TABLE t2 (d int , KEY (d)); # index is needed for bug
+INSERT INTO t2 VALUES (NULL),(NULL); # two rows needed for bug
+# we see that subquery returns 0 rows
+--echo 0 rows in subquery
+SELECT 1 AS RESULT FROM t2,t1 WHERE d = c;
+# so here it ends up as NULL
+--echo base query
+SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 ;
+EXPLAIN EXTENDED SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 ;
+--echo first equivalent variant
+SELECT (SELECT 1 FROM t2 WHERE d = IFNULL(c,NULL)) AS RESULT FROM t1 GROUP BY c ;
+EXPLAIN EXTENDED SELECT (SELECT 1 FROM t2 WHERE d = IFNULL(c,NULL)) AS RESULT FROM t1 GROUP BY c;
+--echo second equivalent variant
+# used to fail with 1242: Subquery returns more than 1 row
+SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 GROUP BY c ;
+EXPLAIN EXTENDED SELECT (SELECT 1 FROM t2 WHERE d = c) AS RESULT FROM t1 GROUP BY c ;
+
+DROP TABLE t1,t2;
+
 --echo #
 --echo # BUG#49630: Segfault in select_describe() with double 
 --echo #            nested subquery and materialization

=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc	2010-05-10 12:40:54 +0000
+++ b/sql/sql_select.cc	2010-05-10 19:31:43 +0000
@@ -5076,10 +5076,16 @@ add_key_field(KEY_FIELD **key_fields,uin
     We use null_rejecting in add_not_null_conds() to add
     'othertbl.field IS NOT NULL' to tab->select_cond.
   */
-  (*key_fields)->null_rejecting= ((cond->functype() == Item_func::EQ_FUNC ||
-                                   cond->functype() == Item_func::MULT_EQUAL_FUNC) &&
-                                  ((*value)->type() == Item::FIELD_ITEM) &&
-                                  ((Item_field*)*value)->field->maybe_null());
+  {
+    Item *real= (*value)->real_item();
+    if (((cond->functype() == Item_func::EQ_FUNC) ||
+         (cond->functype() == Item_func::MULT_EQUAL_FUNC)) &&
+        (real->type() == Item::FIELD_ITEM) &&
+        ((Item_field*)real)->field->maybe_null())
+      (*key_fields)->null_rejecting= true;
+    else
+      (*key_fields)->null_rejecting= false;
+  }
   (*key_fields)->cond_guard= NULL;
   
   (*key_fields)->sj_pred_no= get_semi_join_select_list_index(field);
@@ -8638,8 +8644,9 @@ static void add_not_null_conds(JOIN *joi
         {
           Item *item= tab->ref.items[keypart];
           Item *notnull;
-          DBUG_ASSERT(item->type() == Item::FIELD_ITEM);
-          Item_field *not_null_item= (Item_field*)item;
+          Item *real= item->real_item();
+          DBUG_ASSERT(real->type() == Item::FIELD_ITEM);
+          Item_field *not_null_item= (Item_field*)real;
           JOIN_TAB *referred_tab= not_null_item->field->table->reginfo.join_tab;
           /*
             For UPDATE queries such as:


Attachment: [text/bzr-bundle] bzr/guilhem@mysql.com-20100510193143-s61ilhrb6kict0md.bundle
Thread
bzr commit into mysql-next-mr-bugfixing branch (guilhem:3163) Bug#46743Guilhem Bichot14 May