List:Commits« Previous MessageNext Message »
From:timour Date:July 13 2007 12:12pm
Subject:bk commit into 5.2 tree (timour:1.2552)
View as plain text  
Below is the list of changes that have just been committed into a local
5.2 repository of tkatchaounov. When tkatchaounov does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2007-07-13 15:12:34+03:00, timour@stripped +3 -0
  WL#1110: Subquery optimization: materialization
  (also affects WL3741)
  
  - A new execution method for IN is chosen only if none was chosen. Affects PS/SP.
  - Refactored Item_in_subselect::[transformed, converted_to_sj, use_hash_sj) into
    enum Item_in_subselect::exec_method.

  sql/item_subselect.cc@stripped, 2007-07-13 15:12:30+03:00, timour@stripped +22 -16
    Replaced three boolean flags in Item_in_subselect (transformed, converted_to_sj, use_hash_sj)
    with one enum - Item_in_subselect::exec_method.

  sql/item_subselect.h@stripped, 2007-07-13 15:12:30+03:00, timour@stripped +10 -12
    Replaced three boolean flags in Item_in_subselect (transformed, converted_to_sj, use_hash_sj)
    with one enum - Item_in_subselect::exec_method.

  sql/sql_select.cc@stripped, 2007-07-13 15:12:30+03:00, timour@stripped +35 -28
    Replaced three boolean flags in Item_in_subselect (transformed, converted_to_sj, use_hash_sj)
    with one enum - Item_in_subselect::exec_method.
    
    Changed the choice of execution method for IN to work in such a way, that a new execution
    method is chosen only if none was chosen to this point. This is needed to handle SP, where
    a query inside a SP can be prepared and optimized during its first execution under with
    one method of IN execution. If a subsequent optimization/execution is peformed with another
    execution method, then there is no easy way to undo the permanent transformations done
    for the previous execution method, which leads to a crash.  

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	timour
# Host:	zmeiux.home
# Root:	/home/tkatchaounov/mysql/src/5.2-1110

--- 1.550/sql/sql_select.cc	2007-07-09 18:50:25 +03:00
+++ 1.551/sql/sql_select.cc	2007-07-13 15:12:30 +03:00
@@ -520,6 +520,7 @@
   if (!thd->lex->view_prepare_mode)
   {
     Item_subselect *subselect;
+    Item_in_subselect *in_subs= NULL;
     /*
       Are we in a subquery predicate?
       TODO: the block below will be executed for every PS execution without need.
@@ -528,6 +529,9 @@
     {
       bool do_semijoin= !test(thd->variables.optimizer_switch &
                               OPTIMIZER_SWITCH_NO_SEMIJOIN);
+      if (subselect->substype() == Item_subselect::IN_SUBS)
+        in_subs= (Item_in_subselect*)subselect;
+
       /*
         Check if we're in subquery that is a candidate for flattening into a
         semi-join (which is done done in flatten_subqueries()). The
@@ -537,6 +541,7 @@
           3. Subquery does not have GROUP BY or ORDER BY
           4. Subquery does not use aggregate functions or HAVING
           5. Subquery predicate is at the AND-top-level of ON/WHERE clause
+          6. No execution method was already chosen (by a prepared statement).
 
           (*). We are not in a subquery of a single table UPDATE/DELETE that 
                doesn't have a JOIN (TODO: We should handle this at some
@@ -545,17 +550,17 @@
           (**). We're not in a confluent table-less subquery, like
                 "SELECT 1". 
       */
-      if (subselect->substype() == Item_subselect::IN_SUBS &&           // 1
+      if (in_subs &&                                                    // 1
           !select_lex->master_unit()->first_select()->next_select() &&  // 2
           !select_lex->group_list.elements && !order &&                 // 3
           !having && !select_lex->with_sum_func &&                      // 4
           thd->thd_marker &&                                            // 5
           select_lex->outer_select()->join &&                           // (*)
           select_lex->master_unit()->first_select()->leaf_tables &&     // (**) 
-          do_semijoin)
+          do_semijoin &&
+          in_subs->exec_method == Item_in_subselect::NOT_TRANSFORMED)   // 6
       {
         fprintf(stderr, "subq is an sj candidate\n");
-        Item_in_subselect *in_subs= (Item_in_subselect*)subselect;
 
         if (thd->stmt_arena->state != Query_arena::PREPARED)
         {
@@ -584,31 +589,31 @@
       else
       {
         fprintf(stderr, "subq is not an sj candidate\n");
-        Item_in_subselect *in_subs= NULL;
         bool do_materialize= !test(thd->variables.optimizer_switch &
                                    OPTIMIZER_SWITCH_NO_MATERIALIZATION);
         /*
           Check if the subquery predicate can be executed via materialization.
           The required conditions are:
-            1. Subquery predicate is an IN/=ANY subq predicate
-            2. Subquery is a single SELECT (not a UNION)
-            3. Subquery is not a table-less query. In this case there is no
-               point in materializing.
-            4. Subquery predicate is a top-level predicate
-               (this implies it is not negated)
-               TODO: this is a limitation that should be lifeted once we
-               implement correct NULL semantics (WL#3830)
-            5. Subquery is non-correlated
-               TODO:
-               This is an overly restrictive condition. It can be extended to:
-               (Subquery is non-correlated ||
-                Subquery is correlated to any query outer to IN predicate ||
-                (Subquery is correlated to the immediate outer query &&
-                 Subquery !contains {GROUP BY, ORDER BY [LIMIT],
-                 aggregate functions) && subquery predicate is not under "NOT IN"))
+          1. Subquery predicate is an IN/=ANY subq predicate
+          2. Subquery is a single SELECT (not a UNION)
+          3. Subquery is not a table-less query. In this case there is no
+             point in materializing.
+          4. Subquery predicate is a top-level predicate
+             (this implies it is not negated)
+             TODO: this is a limitation that should be lifeted once we
+             implement correct NULL semantics (WL#3830)
+          5. Subquery is non-correlated
+             TODO:
+             This is an overly restrictive condition. It can be extended to:
+             (Subquery is non-correlated ||
+              Subquery is correlated to any query outer to IN predicate ||
+              (Subquery is correlated to the immediate outer query &&
+               Subquery !contains {GROUP BY, ORDER BY [LIMIT],
+               aggregate functions) && subquery predicate is not under "NOT IN"))
+          6. No execution method was already chosen (by a prepared statement).
 
-           (*) The subquery must be part of a SELECT statement. The current
-                condition also excludes multi-table update statements.
+          (*) The subquery must be part of a SELECT statement. The current
+               condition also excludes multi-table update statements.
 
           We have to determine whether we will perform subquery materialization
           before calling the IN=>EXISTS transformation, so that we know whether to
@@ -616,14 +621,15 @@
           Item_in_subselect in an Item_in_optimizer.
         */
         if (do_materialize && 
-            subselect->substype() == Item_subselect::IN_SUBS &&           // 1
+            in_subs  &&                                                   // 1
             !select_lex->master_unit()->first_select()->next_select() &&  // 2
             select_lex->master_unit()->first_select()->leaf_tables &&     // 3
             thd->lex->sql_command == SQLCOM_SELECT)                       // *
         {
-          in_subs= (Item_in_subselect*) subselect;
-          in_subs->use_hash_sj= (in_subs->is_top_level_item() &&          // 4
-                                 !in_subs->is_correlated);                // 5
+          if (in_subs->is_top_level_item() &&                             // 4
+              !in_subs->is_correlated &&                                  // 5
+              in_subs->exec_method == Item_in_subselect::NOT_TRANSFORMED) // 6
+            in_subs->exec_method= Item_in_subselect::MATERIALIZATION;
         }
 
         Item_subselect::trans_res trans_res;
@@ -3101,7 +3107,7 @@
 
   /* 3. Remove the original subquery predicate from the WHERE/ON */
   *(subq_pred->ref_ptr)= new Item_int(1);
-  subq_pred->converted_to_sj= TRUE; // for subsequent executions
+  subq_pred->exec_method= Item_in_subselect::SEMI_JOIN; // for subsequent executions
   /*TODO: also reset the 'with_subselect' there. */
 
   /* n. Adjust the parent_join->tables counter */
@@ -3347,7 +3353,8 @@
           subquery_predicate->substype() == Item_subselect::IN_SUBS)
       {
         Item_in_subselect *in_subs= (Item_in_subselect*) subquery_predicate;
-        if (in_subs->use_hash_sj && in_subs->setup_hash_sj_engine())
+        if (in_subs->exec_method == Item_in_subselect::MATERIALIZATION &&
+            in_subs->setup_hash_sj_engine())
           return TRUE;
       }
     }

--- 1.161/sql/item_subselect.cc	2007-06-30 19:18:40 +03:00
+++ 1.162/sql/item_subselect.cc	2007-07-13 15:12:30 +03:00
@@ -648,8 +648,8 @@
 
 Item_in_subselect::Item_in_subselect(Item * left_exp,
 				     st_select_lex *select_lex):
-  Item_exists_subselect(), left_expr_cache(0), optimizer(0), transformed(0),
-  pushed_cond_guards(NULL), use_hash_sj(0), upper_item(0), converted_to_sj(FALSE)
+  Item_exists_subselect(), left_expr_cache(0), optimizer(0),
+  pushed_cond_guards(NULL), exec_method(NOT_TRANSFORMED), upper_item(0)
 {
   DBUG_ENTER("Item_in_subselect::Item_in_subselect");
   left_expr= left_exp;
@@ -1048,11 +1048,8 @@
     If this IN predicate can be computed via materialization, do not
     perform the IN -> EXISTS transformation.
   */
-  if (use_hash_sj)
-  {
-    transformed= FALSE;
+  if (exec_method == MATERIALIZATION)
     DBUG_RETURN(RES_OK);
-  }
 
   select_lex->uncacheable|= UNCACHEABLE_DEPENDENT;
   if (join->having || select_lex->with_sum_func ||
@@ -1281,11 +1278,8 @@
     If this IN predicate can be computed via materialization, do not
     perform the IN -> EXISTS transformation.
   */
-  if (use_hash_sj)
-  {
-    transformed= FALSE;
+  if (exec_method == MATERIALIZATION)
     DBUG_RETURN(RES_OK);
-  }
 
   /*
     Tranform a (possibly non-correlated) IN subquery into a correlated EXISTS.
@@ -1554,7 +1548,12 @@
   if (result)
     goto err;
 
-  transformed= 1;
+  /*
+    If we didn't choose an execution method up to this point, we choose
+    the IN=>EXISTS transformation.
+  */
+  if (exec_method == NOT_TRANSFORMED)
+    exec_method= IN_TO_EXISTS;
   arena= thd->activate_stmt_arena_if_needed(&backup);
   /*
     Both transformers call fix_fields() only for Items created inside them,
@@ -1587,7 +1586,7 @@
 
 void Item_in_subselect::print(String *str)
 {
-  if (transformed)
+  if (exec_method == IN_TO_EXISTS)
     str->append(STRING_WITH_LEN("<exists>"));
   else
   {
@@ -1603,7 +1602,7 @@
   bool result = 0;
   ref_ptr= ref;
 
-  if (converted_to_sj)
+  if (exec_method == SEMI_JOIN)
     return !( (*ref)= new Item_int(1));
 
   if (thd_arg->lex->view_prepare_mode && left_expr && !left_expr->fixed)
@@ -1727,14 +1726,14 @@
 
 bool Item_in_subselect::is_expensive_processor(uchar *arg)
 {
-  return use_hash_sj;
+  return exec_method == MATERIALIZATION;
 }
 
 
 Item_subselect::trans_res
 Item_allany_subselect::select_transformer(JOIN *join)
 {
-  transformed= 1;
+  exec_method= IN_TO_EXISTS;
   if (upper_item)
     upper_item->show= 1;
   return select_in_like_transformer(join, func);
@@ -1743,7 +1742,7 @@
 
 void Item_allany_subselect::print(String *str)
 {
-  if (transformed)
+  if (exec_method == IN_TO_EXISTS)
     str->append(STRING_WITH_LEN("<exists>"));
   else
   {
@@ -2969,6 +2968,13 @@
       DBUG_RETURN(TRUE);     
     materialize_join->exec();
     is_materialized= TRUE;
+    /*
+      TODO:
+      - unlock all subquery tables as we don't need them. Look at
+        the code for single-value subqueries.
+      - the temp table used for grouping in the subquery can be freed
+        immediately after materialization (yet i's done together with unlocking).
+     */
     /*
       If the subquery returned no rows, there is no need to perform
       lookups for empty subqueries.

--- 1.99/sql/item_subselect.h	2007-06-30 19:18:40 +03:00
+++ 1.100/sql/item_subselect.h	2007-07-13 15:12:30 +03:00
@@ -280,7 +280,6 @@
   Item_in_optimizer *optimizer;
   bool was_null;
   bool abort_on_null;
-  bool transformed;
 
 public:
   /* Used to trigger on/off conditions that were pushed down to subselect */
@@ -289,9 +288,6 @@
   /* Priority of this predicate in the convert-to-semi-join-nest process. */
   int sj_convert_priority;
 
-  /* TRUE <=> This Item was converted to semi-join nest and should be removed */
-  bool converted_to_sj;
-
   /* 
     Location of the subquery predicate. It is either
      - pointer to join nest if the subquery predicate is in the ON expression
@@ -300,12 +296,14 @@
   TABLE_LIST *expr_join_nest;
   Item **ref_ptr;
 
-  /*
-    True if this predicate will be executed via subquery materialization
-    (hash semi-join), false if it should be executed via IN=>EXISTS
-    transformation.
-  */
-  bool use_hash_sj;
+  /* The method chosen to execute the IN predicate.  */
+  enum enum_exec_method {
+    NOT_TRANSFORMED, /* No execution method was chosen for this IN. */
+    SEMI_JOIN,   /* IN was converted to semi-join nest and should be removed. */
+    IN_TO_EXISTS, /* IN was converted to correlated EXISTS. */
+    MATERIALIZATION /* IN will be executed via subquery materialization. */
+  };
+  enum_exec_method exec_method;
 
   bool *get_cond_guard(int i)
   {
@@ -323,8 +321,8 @@
   Item_in_subselect(Item * left_expr, st_select_lex *select_lex);
   Item_in_subselect()
     :Item_exists_subselect(), left_expr_cache(0), optimizer(0),
-    abort_on_null(0), transformed(0), pushed_cond_guards(NULL),
-    use_hash_sj(0), upper_item(0), converted_to_sj(FALSE)
+    abort_on_null(0), pushed_cond_guards(NULL), exec_method(NOT_TRANSFORMED),
+    upper_item(0)
   {}
 
   subs_type substype() { return IN_SUBS; }
Thread
bk commit into 5.2 tree (timour:1.2552)timour13 Jul