List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:June 20 2011 2:30pm
Subject:bzr commit into mysql-trunk branch (guilhem.bichot:3386) Bug#12668294
View as plain text  
#At file:///home/mysql_src/bzrrepos_new/mysql-next-mr-opt-backporting/ based on revid:guilhem.bichot@stripped

 3386 Guilhem Bichot	2011-06-20
      Fix for Bug#12668294 - GROUP BY ON EMPTY RESULT GIVES EMPTY ROW INSTEAD OF NULL WHEN MATERIALIZATION ON
      Reminder of rules, when a join has no rows:
      a) "SELECT MIN(field) FROM a join;"
      should return a NULL row.
      b) "SELECT MIN(field) FROM a join GROUP BY field;"
      should return no row.
      Summary of bug: send_row_on_empty_set() missed the case of when GROUP
      BY has been optimized away, treating (b) like (a).
     @ mysql-test/r/subquery_mat.result
        A NULL row was wrongly sent, without the code bugfix.
     @ mysql-test/r/subquery_mat_all.result
        Here there was no bug.
     @ mysql-test/r/subquery_mat_none.result
        Here there was no bug
     @ sql/sql_select.h
        Details of what happened are below.
        
        When semijoin is off and materialization is off.
        ================================================
        In resolve_subquery(), IN->EXISTS is done, subquery remains
        non-correlated.
        Constant tables t1 and t3 are discovered in make_join_statistics().
        Subquery, non-correlated, is evaluated in make_join_select(), here:
           if (const_cond && !const_cond->val_int())
        It is found empty, so the WHERE clause is recognized as impossible,
        JOIN::optimize() terminates right after make_join_select(),
        JOIN::exec() calls return_zero_rows(). send_row_on_empty_set() sees
        that we have GROUP BY (group_list!=NULL) so returns "false" and
        return_zero_rows() correctly sends an empty result.
        
        When semijoin is on and materialization is off.
        ===============================================
        Semijoin transformation is not done because there is a LEFT
        JOIN. IN->EXISTS is done, we are back to the case above.
        
        When semijoin is on and materialization is on.
        ==============================================
        Semijoin transformation is not done because there is a LEFT
        JOIN. Materialization is not done (indeed
        flatten_subqueries() doesn't try to fallback to
        materialization - looks like a bug).
        IN->EXISTS is done, we are back to the case above.
        
        When semijoin is off and materialization is on (the bug).
        =========================================================
        In resolve_subquery(), subquery is scheduled for materialization.
        Constant tables t1 and t3 are discovered in make_join_statistics().
        The (top) query's optimization then progresses while *not* caring for
        the subquery; "not caring" is a requirement of subquery
        materialization: see for example:
        <code>
          else if (cond->const_item() && !cond->is_expensive())
          /*
            DontEvaluateMaterializedSubqueryTooEarly:
            TODO: 
            Excluding all expensive functions is too restritive we should exclude only
            materialized IN subquery predicates because they can't yet be evaluated
            here (they need additional initialization that is done later on).
        
            The proper way to exclude the subqueries would be to walk the cond tree
            and check for materialized subqueries there.
          */
          {
            *cond_value= eval_const_cond(cond) ? Item::COND_TRUE : Item::COND_FALSE;
        </code>
        which postpones evaluation of the subquery because it's
        to-be-materialized. So make_join_select() doesn't evaluate the
        subquery this time. Thus, query's optimization is not shortcut.
        The query's execution, in do_select(), sees that all tables of the
        LEFT JOIN (i.e. t1, t3) are constant, here :
        "  if (join->tables == join->const_tables)"
        so we have 0 or one row in the result, then do_select()
        evaluates the WHERE clause here:
        "    if (!join->conds || join->conds->val_int())"
        which evaluates the subquery. Which is found empty so we go into
            else if (join->send_row_on_empty_set())
        But this time, send_row_on_empty_set() returns "true". It's because
        it thinks there is no GROUP BY. It thinks so because the top query's
        optimization, as it was not shortcut early this time, had reached:
            group_list= remove_const(this, (old_group_list= group_list), conds,
                                     rollup.state == ROLLUP::STATE_NONE,
        			     &simple_group);
        This remove_const(), which knew that the GROUP BY column (field3) is
        from a constant table (t1), was able to optimize away GROUP BY: it
        thus set group_list to NULL. Making send_row_on_empty_set() go wrong:
        then NULL was wrongly sent.
        
        The fix: make send_row_on_empty_set() aware of group_optimized_away, a
        variable introduced in
        sp1r-holyfoot/hf@stripped/hfmain.(none)-20070731054604-56491
        for similar-looking problems.

    modified:
      mysql-test/include/subquery_mat.inc
      mysql-test/r/subquery_mat.result
      mysql-test/r/subquery_mat_all.result
      mysql-test/r/subquery_mat_none.result
      sql/sql_select.h
=== modified file 'mysql-test/include/subquery_mat.inc'
--- a/mysql-test/include/subquery_mat.inc	2011-04-01 11:06:48 +0000
+++ b/mysql-test/include/subquery_mat.inc	2011-06-20 14:29:59 +0000
@@ -882,3 +882,32 @@ DROP TABLE t1, t2;
 DROP VIEW v3;
 
 --echo # End Bug#11852644
+
+--echo
+--echo # Bug#12668294 - GROUP BY ON EMPTY RESULT GIVES EMPTY ROW
+--echo # INSTEAD OF NULL WHEN MATERIALIZATION ON
+--echo
+
+CREATE TABLE t1 (col_int_nokey INT) ENGINE=MEMORY;
+CREATE TABLE t2 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t2 VALUES (8),(7);
+CREATE TABLE t3 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t3 VALUES (7);
+
+SELECT MIN(t3.col_int_nokey),t1.col_int_nokey AS field3
+FROM t3
+     LEFT JOIN t1
+     ON t1.col_int_nokey
+WHERE (194, 200) IN (
+                     SELECT SQ4_alias1.col_int_nokey,
+                     SQ4_alias2.col_int_nokey
+                     FROM t2 AS SQ4_alias1
+                          JOIN
+                          t2 AS SQ4_alias2
+                          ON SQ4_alias2.col_int_nokey = 5
+                    )
+GROUP BY field3 ;
+
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3; 

=== modified file 'mysql-test/r/subquery_mat.result'
--- a/mysql-test/r/subquery_mat.result	2011-04-01 11:06:48 +0000
+++ b/mysql-test/r/subquery_mat.result	2011-06-20 14:29:59 +0000
@@ -1152,4 +1152,30 @@ r
 DROP TABLE t1, t2;
 DROP VIEW v3;
 # End Bug#11852644
+
+# Bug#12668294 - GROUP BY ON EMPTY RESULT GIVES EMPTY ROW
+# INSTEAD OF NULL WHEN MATERIALIZATION ON
+
+CREATE TABLE t1 (col_int_nokey INT) ENGINE=MEMORY;
+CREATE TABLE t2 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t2 VALUES (8),(7);
+CREATE TABLE t3 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t3 VALUES (7);
+SELECT MIN(t3.col_int_nokey),t1.col_int_nokey AS field3
+FROM t3
+LEFT JOIN t1
+ON t1.col_int_nokey
+WHERE (194, 200) IN (
+SELECT SQ4_alias1.col_int_nokey,
+SQ4_alias2.col_int_nokey
+FROM t2 AS SQ4_alias1
+JOIN
+t2 AS SQ4_alias2
+ON SQ4_alias2.col_int_nokey = 5
+)
+GROUP BY field3 ;
+MIN(t3.col_int_nokey)	field3
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
 set optimizer_switch=default;

=== modified file 'mysql-test/r/subquery_mat_all.result'
--- a/mysql-test/r/subquery_mat_all.result	2011-04-01 11:06:48 +0000
+++ b/mysql-test/r/subquery_mat_all.result	2011-06-20 14:29:59 +0000
@@ -1151,4 +1151,30 @@ r
 DROP TABLE t1, t2;
 DROP VIEW v3;
 # End Bug#11852644
+
+# Bug#12668294 - GROUP BY ON EMPTY RESULT GIVES EMPTY ROW
+# INSTEAD OF NULL WHEN MATERIALIZATION ON
+
+CREATE TABLE t1 (col_int_nokey INT) ENGINE=MEMORY;
+CREATE TABLE t2 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t2 VALUES (8),(7);
+CREATE TABLE t3 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t3 VALUES (7);
+SELECT MIN(t3.col_int_nokey),t1.col_int_nokey AS field3
+FROM t3
+LEFT JOIN t1
+ON t1.col_int_nokey
+WHERE (194, 200) IN (
+SELECT SQ4_alias1.col_int_nokey,
+SQ4_alias2.col_int_nokey
+FROM t2 AS SQ4_alias1
+JOIN
+t2 AS SQ4_alias2
+ON SQ4_alias2.col_int_nokey = 5
+)
+GROUP BY field3 ;
+MIN(t3.col_int_nokey)	field3
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
 set optimizer_switch=default;

=== modified file 'mysql-test/r/subquery_mat_none.result'
--- a/mysql-test/r/subquery_mat_none.result	2011-04-01 11:06:48 +0000
+++ b/mysql-test/r/subquery_mat_none.result	2011-06-20 14:29:59 +0000
@@ -1150,4 +1150,30 @@ r
 DROP TABLE t1, t2;
 DROP VIEW v3;
 # End Bug#11852644
+
+# Bug#12668294 - GROUP BY ON EMPTY RESULT GIVES EMPTY ROW
+# INSTEAD OF NULL WHEN MATERIALIZATION ON
+
+CREATE TABLE t1 (col_int_nokey INT) ENGINE=MEMORY;
+CREATE TABLE t2 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t2 VALUES (8),(7);
+CREATE TABLE t3 (col_int_nokey INT) ENGINE=MEMORY;
+INSERT INTO t3 VALUES (7);
+SELECT MIN(t3.col_int_nokey),t1.col_int_nokey AS field3
+FROM t3
+LEFT JOIN t1
+ON t1.col_int_nokey
+WHERE (194, 200) IN (
+SELECT SQ4_alias1.col_int_nokey,
+SQ4_alias2.col_int_nokey
+FROM t2 AS SQ4_alias1
+JOIN
+t2 AS SQ4_alias2
+ON SQ4_alias2.col_int_nokey = 5
+)
+GROUP BY field3 ;
+MIN(t3.col_int_nokey)	field3
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
 set optimizer_switch=default;

=== modified file 'sql/sql_select.h'
--- a/sql/sql_select.h	2011-06-17 12:45:22 +0000
+++ b/sql/sql_select.h	2011-06-20 14:29:59 +0000
@@ -2029,10 +2029,16 @@ public:
   void clear();
   bool save_join_tab();
   bool init_save_join_tab();
+  /**
+     If there is an aggregate function (sum_func_count!=0), and no GROUP BY,
+     and the HAVING clause evaluates to "true", we should send a row even if
+     the join contains no rows. This is what the standard requires.
+   */
   bool send_row_on_empty_set()
   {
     return (do_send_rows && tmp_table_param.sum_func_count != 0 &&
-	    !group_list && select_lex->having_value != Item::COND_FALSE);
+	    !(group_list != NULL || group_optimized_away) &&
+            select_lex->having_value != Item::COND_FALSE);
   }
   bool change_result(select_result *result);
   bool is_top_level_join() const


Attachment: [text/bzr-bundle] bzr/guilhem.bichot@oracle.com-20110620142959-tls0a4ery9p0i9km.bundle
Thread
bzr commit into mysql-trunk branch (guilhem.bichot:3386) Bug#12668294Guilhem Bichot20 Jun
  • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3386) Bug#12668294Roy Lyseng23 Jun
    • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3386) Bug#12668294Guilhem Bichot25 Jun
  • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3386) Bug#12668294Jorgen Loland23 Jun
    • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3386) Bug#12668294Guilhem Bichot25 Jun
Re: bzr commit into mysql-trunk branch (guilhem.bichot:3386) Bug#12668294Roy Lyseng25 Jun