List:Commits« Previous MessageNext Message »
From:Oystein Grovlen Date:October 12 2011 12:01pm
Subject:bzr push into mysql-trunk branch (oystein.grovlen:3450 to 3453) WL#5559
View as plain text  
 3453 Oystein Grovlen	2011-10-12
      WL#5559 -  Factor tmp table out of optimizer.
        Part 3: Unify the code for creation of temporary tables
                for GROUP BY/ORDER BY
      
      Changed JOIN::create_intermediate_table() so that it 
      can be used to create both temporary tables, and let it
      return a pointer to the created table, and use it to also
      create JOIN::exec_tmp_table2.
     @ sql/sql_select.cc
        1. Extended JOIN::create_intermediate_table() with parameters
           for characteristics that vary between the two temporary
           tables.
        
        2. Changed JOIN::create_intermediate_table() to return the
           pointer to the created temporary table, instead of 
           assigning it to exec_tmp_table1.
        
        3. Moved code in JOIN::create_intermediate_table() that are
           only relevant to exec_tmp_table1 to JOIN::exec().
        
        4. Moved code in JOIN::create_intermediate_table() related
           to optimization for distinct to a new private function  
           JOIN::optimize_distinct(). (To be called from
           JOIN::exec() after creating the exec_tmp_table1
           for time being, but longer term this should be moved to
           optimization phase.)
        
        5. Call create_intermediate_table() to create 
           exec_tmp_table2.
     @ sql/sql_select.h
        Created new private function JOIN::optimize_distinct() to 
        optimize distinct when used on a subset of tables in query.
        
        Changed signature of JOIN::create_intermediate table to be 
        able to handle creation of both temporary tables used for
        sorting.

    modified:
      sql/sql_select.cc
      sql/sql_select.h
 3452 Oystein Grovlen	2011-10-12
      WL#5559 -  Factor tmp table out of optimizer.
        Part 2: Remove curr_join references from JOIN::exec
      
      Most references in JOIN::exec to member fields are indirect
      through local variable curr_join, which may either refer to
      the original JOIN object or a temporary copy of it.  This
      complicates the code significantly.
      
      In order to simplify the code and ease further refactoring
      of JOIN::exec, this patch removes the need for curr_join by, 
      when a temporary copy is used, executing the code directly 
      on the temporary object.
      
      This is achieved by creating a private function, 
      JOIN::execute(JOIN *parent) to be called from JOIN::exec(),
      and move all code after creation of the first temporary table
      to this function. If a temporary copy is needed, then 
      tmp_join->execute(this) is called. Otherwise,
      this->execute(NULL).
      
      The code makes sure that necessary execution data is copied
      from the temporary JOIN object and back to its parent in order
      to be able to reuse it for later executions as before.
      
      A limited number of example queries has been used to verify that
      there is no performance regression.  Also, using gprof it has been
      verified that the functions previously called from JOIN::exec, 
      is still called the same number of times with the new approach.
     @ sql/sql_select.cc
        Moved core part of JOIN::exec to private function JOIN::execute
        so that query can be executed directly on the current join
        object.  This way, all references to curr_join may be removed.
        
        In case one is executing on the temporary JOIN object, made
        sure to copy data that may be reused on later execution to
        the parent join object.
     @ sql/sql_select.h
        Added private function JOIN::execute that will do the core
        execution of a query.

    modified:
      sql/sql_select.cc
      sql/sql_select.h
 3451 Oystein Grovlen	2011-10-11
      WL#5559 -  Factor tmp table out of optimizer. Part 1.
      
      This patch moves the creation of the table to be used 
      for the intermediate step required by DISTINCT/ORDER BY/
      GROUP BY processing, from optimization phase to 
      execution phase.
        
      The code for creating this tables has been moved to a new 
      function, JOIN::create_intermediate_table(). 
     @ sql/sql_select.cc
        Moved code to create temporary table from JOIN::optimize() to
        a new function JOIN::create_intermediate_table().
        Moved the call of this method to JOIN::exec().
        Some clean-up at end of JOIN::optimize that is possible when
        this code has been moved.
     @ sql/sql_select.h
        Added a new private function JOIN::create_intermediate_table()
        that creates the table to be used for the intermediate step
        required by DISTINCT/ORDER BY/GROUP BY processing.

    modified:
      sql/sql_select.cc
      sql/sql_select.h
 3450 Jorgen Loland	2011-10-10 [merge]
      Merged in WL 5585

    modified:
      mysql-test/suite/opt_trace/r/charset.result
      mysql-test/suite/opt_trace/r/general2_no_prot.result
      mysql-test/suite/opt_trace/r/general2_ps_prot.result
      mysql-test/suite/opt_trace/r/general_no_prot_all.result
      mysql-test/suite/opt_trace/r/general_no_prot_none.result
      mysql-test/suite/opt_trace/r/general_ps_prot_all.result
      mysql-test/suite/opt_trace/r/general_ps_prot_none.result
      mysql-test/suite/opt_trace/r/range_no_prot.result
      mysql-test/suite/opt_trace/r/range_ps_prot.result
      sql/mysqld.cc
      sql/mysqld.h
      sql/opt_explain.cc
      sql/sql_lex.h
      sql/sql_select.cc
      sql/sql_select.h
      sql/sql_union.cc
=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc	2011-10-10 10:36:58 +0000
+++ b/sql/sql_select.cc	2011-10-12 11:08:37 +0000
@@ -1912,7 +1912,6 @@ static int clear_sj_tmp_tables(JOIN *joi
 int
 JOIN::optimize()
 {
-  bool need_distinct;
   ulonglong select_opts_for_readinfo;
   uint no_jbuf_after= UINT_MAX;
 
@@ -2641,150 +2640,11 @@ JOIN::optimize()
   }
 
   tmp_having= having;
-  if (select_options & SELECT_DESCRIBE)
-  {
-    error= 0;
-    DBUG_RETURN(0);
-  }
-  having= 0;
-
-  /*
-    The loose index scan access method guarantees that all grouping or
-    duplicate row elimination (for distinct) is already performed
-    during data retrieval, and that all MIN/MAX functions are already
-    computed for each group. Thus all MIN/MAX functions should be
-    treated as regular functions, and there is no need to perform
-    grouping in the main execution loop.
-    Notice that currently loose index scan is applicable only for
-    single table queries, thus it is sufficient to test only the first
-    join_tab element of the plan for its access method.
-  */
-  need_distinct= TRUE;
-  if (join_tab->is_using_loose_index_scan())
+  if (!(select_options & SELECT_DESCRIBE))
   {
-    tmp_table_param.precomputed_group_by= TRUE;
-    if (join_tab->is_using_agg_loose_index_scan())
-    {
-      need_distinct= FALSE;
-      tmp_table_param.precomputed_group_by= FALSE;
-    }
+    having= NULL;
   }
-
-  /* Create a tmp table if distinct or if the sort is too complicated */
-  if (need_tmp)
-  {
-    DBUG_PRINT("info",("Creating tmp table"));
-    THD_STAGE_INFO(thd, stage_creating_tmp_table);
-
-    init_items_ref_array();
-
-    tmp_table_param.hidden_field_count= (all_fields.elements -
-					 fields_list.elements);
-    ORDER *tmp_group= ((!simple_group && !procedure &&
-                        !(test_flags & TEST_NO_KEY_GROUP)) ? group_list :
-                                                             (ORDER*) 0);
-    /*
-      Pushing LIMIT to the temporary table creation is not applicable
-      when there is ORDER BY or GROUP BY or there is no GROUP BY, but
-      there are aggregate functions, because in all these cases we need
-      all result rows.
-    */
-    ha_rows tmp_rows_limit= ((order == 0 || skip_sort_order) &&
-                             !tmp_group &&
-                             !thd->lex->current_select->with_sum_func) ?
-                            m_select_limit : HA_POS_ERROR;
-
-    if (!(exec_tmp_table1=
-	  create_tmp_table(thd, &tmp_table_param, all_fields,
-                           tmp_group, group_list ? 0 : select_distinct,
-			   group_list && simple_group,
-			   select_options, tmp_rows_limit, "")))
-      DBUG_RETURN(1);
-
-    /*
-      We don't have to store rows in temp table that doesn't match HAVING if:
-      - we are sorting the table and writing complete group rows to the
-        temp table.
-      - We are using DISTINCT without resolving the distinct as a GROUP BY
-        on all columns.
-      
-      If having is not handled here, it will be checked before the row
-      is sent to the client.
-    */    
-    if (tmp_having && 
-	(sort_and_group || (exec_tmp_table1->distinct && !group_list)))
-      having= tmp_having;
-
-    /* if group or order on first table, sort first */
-    if (group_list && simple_group)
-    {
-      DBUG_PRINT("info",("Sorting for group"));
-      THD_STAGE_INFO(thd, stage_sorting_for_group);
-      if (create_sort_index(thd, this, group_list,
-			    HA_POS_ERROR, HA_POS_ERROR, FALSE) ||
-	  alloc_group_fields(this, group_list) ||
-          make_sum_func_list(all_fields, fields_list, 1) ||
-          prepare_sum_aggregators(sum_funcs, need_distinct) ||
-          setup_sum_funcs(thd, sum_funcs))
-      {
-        DBUG_RETURN(1);
-      }
-      group_list=0;
-    }
-    else
-    {
-      if (make_sum_func_list(all_fields, fields_list, 0) ||
-          prepare_sum_aggregators(sum_funcs, need_distinct) ||
-          setup_sum_funcs(thd, sum_funcs))
-      {
-        DBUG_RETURN(1);
-      }
-
-      if (!group_list && ! exec_tmp_table1->distinct && order && simple_order)
-      {
-        DBUG_PRINT("info",("Sorting for order"));
-        THD_STAGE_INFO(thd, stage_sorting_for_order);
-        if (create_sort_index(thd, this, order,
-                              HA_POS_ERROR, HA_POS_ERROR, TRUE))
-        {
-          DBUG_RETURN(1);
-        }
-        order=0;
-      }
-    }
-    
-    /*
-      Optimize distinct when used on some of the tables
-      SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.b=t2.b
-      In this case we can stop scanning t2 when we have found one t1.a
-    */
-
-    if (exec_tmp_table1->distinct)
-    {
-      JOIN_TAB *last_join_tab= join_tab+tables-1;
-      do
-      {
-        if (select_lex->select_list_tables & last_join_tab->table->map)
-          break;
-        last_join_tab->not_used_in_distinct= 1;
-      } while (last_join_tab-- != join_tab);
-      /* Optimize "select distinct b from t1 order by key_part_1 limit #" */
-      if (order && skip_sort_order)
-      {
- 	/* Should always succeed */
-	if (test_if_skip_sort_order(&join_tab[const_tables],
-				    order, unit->select_limit_cnt, 0, 
-                                    &join_tab[const_tables].table->
-                                      keys_in_use_for_order_by))
-	  order=0;
-      }
-    }
-
-    /* If this join belongs to an uncacheable query save the original join */
-    if (select_lex->uncacheable && init_save_join_tab())
-      DBUG_RETURN(-1);                         /* purecov: inspected */
-  }
-
+   
   error= 0;
   DBUG_RETURN(0);
 
@@ -3064,15 +2924,12 @@ JOIN::exec()
   trace_exec.add_select_number(select_lex->select_number);
   Opt_trace_array trace_steps(trace, "steps");
   List<Item> *columns_list= &fields_list;
-  int      tmp_error;
   DBUG_ENTER("JOIN::exec");
 
   DBUG_ASSERT(!(select_options & SELECT_DESCRIBE));
 
   THD_STAGE_INFO(thd, stage_executing);
 
-  const bool has_group_by= this->group;
-
   if (prepare_result())
     DBUG_VOID_RETURN;
 
@@ -3150,31 +3007,94 @@ JOIN::exec()
 			    having);
     DBUG_VOID_RETURN;
   }
+  
+  /*
+    The loose index scan access method guarantees that all grouping or
+    duplicate row elimination (for distinct) is already performed
+    during data retrieval, and that all MIN/MAX functions are already
+    computed for each group. Thus all MIN/MAX functions should be
+    treated as regular functions, and there is no need to perform
+    grouping in the main execution loop.
+    Notice that currently loose index scan is applicable only for
+    single table queries, thus it is sufficient to test only the first
+    join_tab element of the plan for its access method.
+  */
+  if (join_tab && join_tab->is_using_loose_index_scan())
+    tmp_table_param.precomputed_group_by=
+      !join_tab->is_using_agg_loose_index_scan();
+
+  /* Create a tmp table if distinct or if the sort is too complicated */
+  if (need_tmp)
+  {
+    if (!exec_tmp_table1)
+    {
+      /*
+        Create temporary table on first execution of this join.
+        (Will be reused if this is a subquery that is executed several times.)
+      */
+      init_items_ref_array();
+
+      ORDER *tmp_group= 
+        (!simple_group && !procedure && !(test_flags & TEST_NO_KEY_GROUP)) ? 
+        group_list : NULL;
+      
+      tmp_table_param.hidden_field_count= 
+        all_fields.elements - fields_list.elements;
+
+      exec_tmp_table1= create_intermediate_table(&all_fields, tmp_group, 
+                                                 group_list && simple_group);
+      if (!exec_tmp_table1)
+        DBUG_VOID_RETURN;
+
+      if (exec_tmp_table1->distinct)
+        optimize_distinct();
+
+      /* If this join belongs to an uncacheable query save the original join */
+      if (select_lex->uncacheable && init_save_join_tab())
+        DBUG_VOID_RETURN; /* purecov: inspected */
+    }
+
+    if (tmp_join)
+    {
+      /*
+        We are in a non-cacheable subquery. Use the saved join
+        structure after creation of temporary table. 
+	See documentation of tmp_join for details.
+      */
+      tmp_join->execute(this);
+      error= tmp_join->error;
+      DBUG_VOID_RETURN;
+    }
+  }
+
+  execute(NULL);
+
+  DBUG_VOID_RETURN;
+}
 
-  JOIN *curr_join= this;
+void
+JOIN::execute(JOIN *parent)
+{
+  DBUG_ENTER("JOIN::execute");
+  int tmp_error;
   List<Item> *curr_all_fields= &all_fields;
   List<Item> *curr_fields_list= &fields_list;
-  TABLE *curr_tmp_table= 0;
+  TABLE *curr_tmp_table= NULL;
+  JOIN *const main_join= (parent) ? parent : this;
+
+  const bool has_group_by= this->group;
+
   /*
     Initialize examined rows here because the values from all join parts
     must be accumulated in examined_row_count. Hence every join
     iteration must count from zero.
   */
-  curr_join->examined_rows= 0;
+  examined_rows= 0;
 
   /* Create a tmp table if distinct or if the sort is too complicated */
   if (need_tmp)
   {
-    if (tmp_join)
-    {
-      /*
-        We are in a non cacheable sub query. Get the saved join structure
-        after optimization.
-        (curr_join may have been modified during last exection and we need
-        to reset it)
-      */
-      curr_join= tmp_join;
-    }
+    DBUG_ASSERT(exec_tmp_table1);
     curr_tmp_table= exec_tmp_table1;
 
     /* Copy data to the temporary table */
@@ -3186,15 +3106,13 @@ JOIN::exec()
       order.  Exception: LooseScan strategy for semijoin requires
       sorted access even if final result is not to be sorted.
     */
-    if (!curr_join->sort_and_group &&
-        curr_join->const_tables != curr_join->tables && 
-        curr_join->best_positions[curr_join->const_tables].sj_strategy 
-          != SJ_OPT_LOOSE_SCAN)
-      disable_sorted_access(&curr_join->join_tab[curr_join->const_tables]);
-
-    Procedure *save_proc= curr_join->procedure;
-    tmp_error= do_select(curr_join, (List<Item> *) 0, curr_tmp_table, 0);
-    curr_join->procedure= save_proc;
+    if (!sort_and_group && const_tables != tables && 
+        best_positions[const_tables].sj_strategy != SJ_OPT_LOOSE_SCAN)
+      disable_sorted_access(&join_tab[const_tables]);
+
+    Procedure *save_proc= procedure;
+    tmp_error= do_select(this, (List<Item> *) NULL, curr_tmp_table, NULL);
+    procedure= save_proc;
     if (tmp_error)
     {
       error= tmp_error;
@@ -3202,12 +3120,10 @@ JOIN::exec()
     }
     curr_tmp_table->file->info(HA_STATUS_VARIABLE);
     
-    if (curr_join->having)
-      curr_join->having= curr_join->tmp_having= 0; // Allready done
+    if (having)
+      having= tmp_having= NULL; // Already done
     
     /* Change sum_fields reference to calculated fields in tmp_table */
-    if (curr_join != this)
-      curr_join->all_fields= *curr_all_fields;
     if (items1.is_null())
     {
       items1= ref_ptr_array_slice(2);
@@ -3226,30 +3142,29 @@ JOIN::exec()
 				      fields_list.elements, all_fields))
 	  DBUG_VOID_RETURN;
       }
-      if (curr_join != this)
+      if (parent)
       {
-        curr_join->tmp_all_fields1= tmp_all_fields1;
-        curr_join->tmp_fields_list1= tmp_fields_list1;
+        // Copy to parent JOIN for reuse in later executions of subquery
+        parent->items1= items1;
+        parent->tmp_all_fields1= tmp_all_fields1;
+        parent->tmp_fields_list1= tmp_fields_list1;
       }
-      curr_join->items1= items1;
     }
     curr_all_fields= &tmp_all_fields1;
     curr_fields_list= &tmp_fields_list1;
-    curr_join->set_items_ref_array(items1);
+    set_items_ref_array(items1);
     
     if (sort_and_group || curr_tmp_table->group)
     {
-      curr_join->tmp_table_param.field_count+= 
-	curr_join->tmp_table_param.sum_func_count+
-	curr_join->tmp_table_param.func_count;
-      curr_join->tmp_table_param.sum_func_count= 
-	curr_join->tmp_table_param.func_count= 0;
+      tmp_table_param.field_count+= 
+        tmp_table_param.sum_func_count + tmp_table_param.func_count;
+      tmp_table_param.sum_func_count= 0;
+      tmp_table_param.func_count= 0;
     }
     else
     {
-      curr_join->tmp_table_param.field_count+= 
-	curr_join->tmp_table_param.func_count;
-      curr_join->tmp_table_param.func_count= 0;
+      tmp_table_param.field_count+= tmp_table_param.func_count;
+      tmp_table_param.func_count= 0;
     }
     
     // procedure can't be used inside subselect => we do nothing special for it
@@ -3258,9 +3173,9 @@ JOIN::exec()
     
     if (curr_tmp_table->group)
     {						// Already grouped
-      if (!curr_join->order && !curr_join->no_order && !skip_sort_order)
-	curr_join->order= curr_join->group_list;  /* order by group */
-      curr_join->group_list= 0;
+      if (!order && !no_order && !skip_sort_order)
+        order= group_list;  /* order by group */
+      group_list= NULL;
     }
     
     /*
@@ -3271,30 +3186,25 @@ JOIN::exec()
       like SEC_TO_TIME(SUM(...)).
     */
 
-    if ((curr_join->group_list && (!test_if_subpart(curr_join->group_list,
-						   curr_join->order) || 
-				  curr_join->select_distinct)) ||
-	(curr_join->select_distinct &&
-	 curr_join->tmp_table_param.using_indirect_summary_function))
+    if ((group_list && 
+         (!test_if_subpart(group_list, order) || select_distinct)) ||
+        (select_distinct && tmp_table_param.using_indirect_summary_function))
     {					/* Must copy to another table */
       DBUG_PRINT("info",("Creating group table"));
       
       /* Free first data from old join */
-      curr_join->join_free();
-      if (curr_join->make_simple_join(this, curr_tmp_table))
+      join_free();
+      // Set up scan for reading from first temporary table (exec_tmp_table1)
+      if (make_simple_join(main_join, curr_tmp_table))
 	DBUG_VOID_RETURN;
-      calc_group_buffer(curr_join, group_list);
-      count_field_types(select_lex, &curr_join->tmp_table_param,
-			curr_join->tmp_all_fields1,
-			curr_join->select_distinct && !curr_join->group_list);
-      curr_join->tmp_table_param.hidden_field_count= 
-	(curr_join->tmp_all_fields1.elements-
-	 curr_join->tmp_fields_list1.elements);
+      calc_group_buffer(this, group_list);
+      count_field_types(select_lex, &tmp_table_param, tmp_all_fields1,
+                        select_distinct && !group_list);
+      tmp_table_param.hidden_field_count= 
+        tmp_all_fields1.elements - tmp_fields_list1.elements;
       
       
-      if (exec_tmp_table2)
-	curr_tmp_table= exec_tmp_table2;
-      else
+      if (!exec_tmp_table2)
       {
 	/* group data to new table */
 
@@ -3303,74 +3213,72 @@ JOIN::exec()
           functions are precomputed, and should be treated as regular
           functions. See extended comment in JOIN::exec.
         */
-        if (curr_join->join_tab->is_using_loose_index_scan())
-          curr_join->tmp_table_param.precomputed_group_by= TRUE;
+        if (join_tab->is_using_loose_index_scan())
+          tmp_table_param.precomputed_group_by= TRUE;
+
+        tmp_table_param.hidden_field_count= 
+          curr_all_fields->elements - curr_fields_list->elements;
 
-	if (!(curr_tmp_table=
-	      exec_tmp_table2= create_tmp_table(thd,
-						&curr_join->tmp_table_param,
-						*curr_all_fields,
-						(ORDER*) 0,
-						curr_join->select_distinct && 
-						!curr_join->group_list,
-						1, curr_join->select_options,
-						HA_POS_ERROR, "")))
+        if (!(exec_tmp_table2= create_intermediate_table(curr_all_fields,
+                                                         NULL, true)))
 	  DBUG_VOID_RETURN;
-	curr_join->exec_tmp_table2= exec_tmp_table2;
+        if (parent)
+          parent->exec_tmp_table2= exec_tmp_table2;
       }
-      if (curr_join->group_list)
+      curr_tmp_table= exec_tmp_table2;
+
+      if (group_list)
       {
-	if (curr_join->join_tab == join_tab && save_join_tab())
+        if (join_tab == main_join->join_tab && main_join->save_join_tab())
 	{
 	  DBUG_VOID_RETURN;
 	}
 	DBUG_PRINT("info",("Sorting for index"));
 	THD_STAGE_INFO(thd, stage_creating_sort_index);
-	if (create_sort_index(thd, curr_join, curr_join->group_list,
+        if (create_sort_index(thd, this, group_list,
 			      HA_POS_ERROR, HA_POS_ERROR, FALSE) ||
-	    make_group_fields(this, curr_join))
-	{
+            make_group_fields(main_join, this))
 	  DBUG_VOID_RETURN;
-	}
-        sortorder= curr_join->sortorder;
+        if (parent)
+          parent->sortorder= sortorder;
       }
 
       THD_STAGE_INFO(thd, stage_copying_to_group_table);
       DBUG_PRINT("info", ("%s", thd->proc_info));
       tmp_error= -1;
-      if (curr_join != this)
+      if (parent)
       {
-	if (sum_funcs2)
+        if (parent->sum_funcs2)
 	{
-	  curr_join->sum_funcs= sum_funcs2;
-	  curr_join->sum_funcs_end= sum_funcs_end2; 
+	  // Reuse sum_funcs from previous execution of subquery
+          sum_funcs= parent->sum_funcs2;
+          sum_funcs_end= parent->sum_funcs_end2; 
 	}
 	else
 	{
-	  curr_join->alloc_func_list();
-	  sum_funcs2= curr_join->sum_funcs;
-	  sum_funcs_end2= curr_join->sum_funcs_end;
+	  // First execution of this subquery, allocate list of sum_functions
+          alloc_func_list();
+          parent->sum_funcs2= sum_funcs;
+          parent->sum_funcs_end2= sum_funcs_end;
 	}
       }
-      if (curr_join->make_sum_func_list(*curr_all_fields, *curr_fields_list,
-					1, TRUE) ||
-        prepare_sum_aggregators(curr_join->sum_funcs,
-          !curr_join->join_tab->is_using_agg_loose_index_scan()))
+      if (make_sum_func_list(*curr_all_fields, *curr_fields_list, true, true) ||
+          prepare_sum_aggregators(sum_funcs,
+                                  !join_tab->is_using_agg_loose_index_scan()))
         DBUG_VOID_RETURN;
-      curr_join->group_list= 0;
-      if (!curr_join->sort_and_group &&
-          curr_join->const_tables != curr_join->tables)
-        disable_sorted_access(&curr_join->join_tab[curr_join->const_tables]);
-      if (setup_sum_funcs(curr_join->thd, curr_join->sum_funcs) ||
-	  (tmp_error= do_select(curr_join, (List<Item> *) 0, curr_tmp_table,
-				0)))
+      group_list= NULL;
+      if (!sort_and_group && const_tables != tables)
+        disable_sorted_access(&join_tab[const_tables]);
+      if (setup_sum_funcs(thd, sum_funcs) ||
+          (tmp_error= do_select(this, (List<Item> *)NULL, curr_tmp_table, NULL)))
       {
 	error= tmp_error;
 	DBUG_VOID_RETURN;
       }
-      end_read_record(&curr_join->join_tab->read_record);
-      curr_join->const_tables= curr_join->tables; // Mark free for cleanup()
-      curr_join->join_tab[0].table= 0;           // Table is freed
+      end_read_record(&join_tab->read_record);
+      // @todo No tests fail if line below is removed. Remove?
+      const_tables= tables; // Mark free for cleanup()
+      join_tab[0].table= NULL;           // Table is freed
       
       // No sum funcs anymore
       if (items2.is_null())
@@ -3380,110 +3288,103 @@ JOIN::exec()
 				     tmp_fields_list2, tmp_all_fields2, 
 				     fields_list.elements, tmp_all_fields1))
 	  DBUG_VOID_RETURN;
-        if (curr_join != this)
+        if (parent)
         {
-          curr_join->tmp_fields_list2= tmp_fields_list2;
-          curr_join->tmp_all_fields2= tmp_all_fields2;
+	  // Copy to parent JOIN for reuse in later executions of subquery
+          parent->items2= items2;
+          parent->tmp_fields_list2= tmp_fields_list2;
+          parent->tmp_all_fields2= tmp_all_fields2;
         }
       }
-      curr_fields_list= &curr_join->tmp_fields_list2;
-      curr_all_fields= &curr_join->tmp_all_fields2;
-      curr_join->set_items_ref_array(items2);
-      curr_join->tmp_table_param.field_count+= 
-	curr_join->tmp_table_param.sum_func_count;
-      curr_join->tmp_table_param.sum_func_count= 0;
+      curr_fields_list= &tmp_fields_list2;
+      curr_all_fields= &tmp_all_fields2;
+      set_items_ref_array(items2);
+      tmp_table_param.field_count+= tmp_table_param.sum_func_count;
+      tmp_table_param.sum_func_count= 0;
     }
     if (curr_tmp_table->distinct)
-      curr_join->select_distinct=0;		/* Each row is unique */
+      select_distinct= false;               /* Each row is unique */
     
-    curr_join->join_free();			/* Free quick selects */
-    if (curr_join->select_distinct && ! curr_join->group_list)
+    join_free();                        /* Free quick selects */
+    if (select_distinct && !group_list)
     {
       THD_STAGE_INFO(thd, stage_removing_duplicates);
-      if (curr_join->tmp_having)
-	curr_join->tmp_having->update_used_tables();
-      if (remove_duplicates(curr_join, curr_tmp_table,
-			    *curr_fields_list, curr_join->tmp_having))
+      if (tmp_having)
+        tmp_having->update_used_tables();
+      if (remove_duplicates(this, curr_tmp_table, *curr_fields_list, tmp_having))
 	DBUG_VOID_RETURN;
-      curr_join->tmp_having=0;
-      curr_join->select_distinct=0;
+      tmp_having= NULL;
+      select_distinct= false;
     }
     curr_tmp_table->reginfo.lock_type= TL_UNLOCK;
-    if (curr_join->make_simple_join(this, curr_tmp_table))
+    // Set up scan for reading from temporary table
+    if (make_simple_join(main_join, curr_tmp_table))
       DBUG_VOID_RETURN;
-    calc_group_buffer(curr_join, curr_join->group_list);
-    count_field_types(select_lex, &curr_join->tmp_table_param, 
-                      *curr_all_fields, 0);
+    calc_group_buffer(this, group_list);
+    count_field_types(select_lex, &tmp_table_param, *curr_all_fields, false);
     
   }
   if (procedure)
-    count_field_types(select_lex, &curr_join->tmp_table_param, 
-                      *curr_all_fields, 0);
+    count_field_types(select_lex, &tmp_table_param, *curr_all_fields, false);
   
-  if (curr_join->group || curr_join->implicit_grouping ||
-      curr_join->tmp_table_param.sum_func_count ||
+  if (group || implicit_grouping || tmp_table_param.sum_func_count ||
       (procedure && (procedure->flags & PROC_GROUP)))
   {
-    if (make_group_fields(this, curr_join))
-    {
+    if (make_group_fields(main_join, this))
       DBUG_VOID_RETURN;
-    }
     if (items3.is_null())
     {
       if (items0.is_null())
 	init_items_ref_array();
       items3= ref_ptr_array_slice(4);
-      setup_copy_fields(thd, &curr_join->tmp_table_param,
+      setup_copy_fields(thd, &tmp_table_param,
 			items3, tmp_fields_list3, tmp_all_fields3,
 			curr_fields_list->elements, *curr_all_fields);
-      tmp_table_param.save_copy_funcs= curr_join->tmp_table_param.copy_funcs;
-      tmp_table_param.save_copy_field= curr_join->tmp_table_param.copy_field;
-      tmp_table_param.save_copy_field_end=
-	curr_join->tmp_table_param.copy_field_end;
-      if (curr_join != this)
+      if (parent)
       {
-        curr_join->tmp_all_fields3= tmp_all_fields3;
-        curr_join->tmp_fields_list3= tmp_fields_list3;
-      }
-    }
-    else
-    {
-      curr_join->tmp_table_param.copy_funcs= tmp_table_param.save_copy_funcs;
-      curr_join->tmp_table_param.copy_field= tmp_table_param.save_copy_field;
-      curr_join->tmp_table_param.copy_field_end=
-	tmp_table_param.save_copy_field_end;
+        // Copy to parent JOIN for reuse in later executions of subquery
+        parent->tmp_table_param.save_copy_funcs= tmp_table_param.copy_funcs;
+        parent->tmp_table_param.save_copy_field= tmp_table_param.copy_field;
+        parent->tmp_table_param.save_copy_field_end= 
+          tmp_table_param.copy_field_end;
+        parent->tmp_all_fields3= tmp_all_fields3;
+        parent->tmp_fields_list3= tmp_fields_list3;
+      }
+    }
+    else if (parent)
+    {
+      // Reuse data from earlier execution of this subquery. 
+      tmp_table_param.copy_funcs= parent->tmp_table_param.save_copy_funcs;
+      tmp_table_param.copy_field= parent->tmp_table_param.save_copy_field;
+      tmp_table_param.copy_field_end= 
+        parent->tmp_table_param.save_copy_field_end;
     }
     curr_fields_list= &tmp_fields_list3;
     curr_all_fields= &tmp_all_fields3;
-    curr_join->set_items_ref_array(items3);
+    set_items_ref_array(items3);
 
-    if (curr_join->make_sum_func_list(*curr_all_fields, *curr_fields_list,
-				      1, TRUE) || 
-        prepare_sum_aggregators(curr_join->sum_funcs,
-                                !curr_join->join_tab ||
-                                !curr_join->join_tab->
-                                  is_using_agg_loose_index_scan()) ||
-        setup_sum_funcs(curr_join->thd, curr_join->sum_funcs) ||
+    if (make_sum_func_list(*curr_all_fields, *curr_fields_list, true, true) || 
+        prepare_sum_aggregators(sum_funcs,
+                                !join_tab ||
+                                !join_tab-> is_using_agg_loose_index_scan()) ||
+        setup_sum_funcs(thd, sum_funcs) ||
         thd->is_fatal_error)
       DBUG_VOID_RETURN;
   }
-  if (curr_join->group_list || curr_join->order)
+  if (group_list || order)
   {
     DBUG_PRINT("info",("Sorting for send_result_set_metadata"));
     THD_STAGE_INFO(thd, stage_sorting_result);
     /* If we have already done the group, add HAVING to sorted table */
-    if (curr_join->tmp_having && ! curr_join->group_list && 
-	! curr_join->sort_and_group)
+    if (tmp_having && !group_list && !sort_and_group)
     {
       // Some tables may have been const
-      curr_join->tmp_having->update_used_tables();
-      JOIN_TAB *curr_table= &curr_join->join_tab[curr_join->const_tables];
-      table_map used_tables= (curr_join->const_table_map |
-			      curr_table->table->map);
-
-      Item* sort_table_cond= make_cond_for_table(curr_join->tmp_having,
-						 used_tables,
-						 (table_map) 0, 0);
+      tmp_having->update_used_tables();
+      JOIN_TAB *curr_table= &join_tab[const_tables];
+      table_map used_tables= (const_table_map | curr_table->table->map);
+
+      Item* sort_table_cond= make_cond_for_table(tmp_having, used_tables,
+                                                 (table_map) 0, false);
       if (sort_table_cond)
       {
 	if (!curr_table->select)
@@ -3516,9 +3417,8 @@ JOIN::exec()
         */
         if (curr_table->pre_idx_push_cond)
         {
-          sort_table_cond= make_cond_for_table(curr_join->tmp_having,
-                                               used_tables,
-                                               (table_map) 0, 0);
+          sort_table_cond= make_cond_for_table(tmp_having, used_tables,
+                                               (table_map) 0, false);
           if (!sort_table_cond)
             DBUG_VOID_RETURN;
           Item* new_pre_idx_push_cond= 
@@ -3531,17 +3431,15 @@ JOIN::exec()
           curr_table->pre_idx_push_cond= new_pre_idx_push_cond;
 	}
 
-	curr_join->tmp_having= make_cond_for_table(curr_join->tmp_having,
-						   ~ (table_map) 0,
-						   ~used_tables, 0);
-	DBUG_EXECUTE("where",print_where(curr_join->tmp_having,
-                                         "having after sort",
-                                         QT_ORDINARY););
+        tmp_having= make_cond_for_table(tmp_having, ~ (table_map) 0,
+                                        ~used_tables, false);
+        DBUG_EXECUTE("where",
+                     print_where(tmp_having, "having after sort", QT_ORDINARY););
       }
     }
     {
       if (group)
-        curr_join->m_select_limit= HA_POS_ERROR;
+        m_select_limit= HA_POS_ERROR;
       else
       {
         /*
@@ -3552,8 +3450,8 @@ JOIN::exec()
            - as a keyuse attached to the join_tab (ref access),
            - as a semi-join equality attached to materialization semi-join nest.
         */
-        JOIN_TAB *curr_table= &curr_join->join_tab[curr_join->const_tables+1];
-        JOIN_TAB *end_table= &curr_join->join_tab[curr_join->tables];
+        JOIN_TAB *curr_table= &join_tab[const_tables+1];
+        JOIN_TAB *end_table= &join_tab[tables];
         for (; curr_table < end_table ; curr_table++)
         {
           if (curr_table->condition() ||
@@ -3561,12 +3459,12 @@ JOIN::exec()
               curr_table->get_sj_strategy() == SJ_OPT_MATERIALIZE_LOOKUP)
           {
             /* We have to sort all rows */
-            curr_join->m_select_limit= HA_POS_ERROR;
+            m_select_limit= HA_POS_ERROR;
             break;
           }
         }
       }
-      if (curr_join->join_tab == join_tab && save_join_tab())
+      if (join_tab == main_join->join_tab && main_join->save_join_tab())
       {
 	DBUG_VOID_RETURN;
       }
@@ -3580,8 +3478,7 @@ JOIN::exec()
 	OPTION_FOUND_ROWS supersedes LIMIT and is taken into account.
       */
       DBUG_PRINT("info",("Sorting for order by/group by"));
-      ORDER *order_arg=
-        curr_join->group_list ? curr_join->group_list : curr_join->order;
+      ORDER *order_arg= group_list ? group_list : order;
       /*
         filesort_limit:	 Return only this many rows from filesort().
         We can use select_limit_cnt only if we have no group_by and 1 table.
@@ -3591,31 +3488,30 @@ JOIN::exec()
         unit->select_limit_cnt == 1 (we only need one row in the result set)
        */
       const ha_rows filesort_limit_arg=
-        (has_group_by || curr_join->tables > 1)
-        ? curr_join->m_select_limit : unit->select_limit_cnt;
+        (has_group_by || tables > 1) ? m_select_limit : unit->select_limit_cnt;
       const ha_rows select_limit_arg=
         select_options & OPTION_FOUND_ROWS
         ? HA_POS_ERROR : unit->select_limit_cnt;
 
       DBUG_PRINT("info", ("has_group_by %d "
-                          "curr_join->tables %d "
-                          "curr_join->m_select_limit %d "
+                          "tables %d "
+                          "m_select_limit %d "
                           "unit->select_limit_cnt %d",
                           has_group_by,
-                          curr_join->tables,
-                          (int) curr_join->m_select_limit,
+                          tables,
+                          (int) m_select_limit,
                           (int) unit->select_limit_cnt));
 
       if (create_sort_index(thd,
-                            curr_join,
+                            this,
                             order_arg,
                             filesort_limit_arg,
                             select_limit_arg,
-                            curr_join->group_list ? FALSE : TRUE))
+                            !test(group_list)))
 	DBUG_VOID_RETURN;
-      sortorder= curr_join->sortorder;
-      if (curr_join->const_tables != curr_join->tables &&
-          !curr_join->join_tab[curr_join->const_tables].table->sort.io_cache)
+      if (parent)
+        parent->sortorder= sortorder;
+      if (const_tables != tables && !join_tab[const_tables].table->sort.io_cache)
       {
         /*
           If no IO cache exists for the first table then we are using an
@@ -3632,28 +3528,26 @@ JOIN::exec()
     error= thd->is_error();
     DBUG_VOID_RETURN;
   }
-  curr_join->having= curr_join->tmp_having;
-  curr_join->fields= curr_fields_list;
-  curr_join->procedure= procedure;
+  having= tmp_having;
+  fields= curr_fields_list;
 
   THD_STAGE_INFO(thd, stage_sending_data);
   DBUG_PRINT("info", ("%s", thd->proc_info));
-  result->send_result_set_metadata((procedure ? curr_join->procedure_fields_list :
+  result->send_result_set_metadata((procedure ? procedure_fields_list :
                                     *curr_fields_list),
                                    Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF);
-  error= do_select(curr_join, curr_fields_list, NULL, procedure);
-  thd->limit_found_rows= curr_join->send_records;
-  if (curr_join->order &&
-      curr_join->sortorder)
+  error= do_select(this, curr_fields_list, NULL, procedure);
+  thd->limit_found_rows= send_records;
+
+  if (order && sortorder)
   {
     /* Use info provided by filesort. */
-    DBUG_ASSERT(curr_join->tables > curr_join->const_tables);
-    JOIN_TAB *tab= curr_join->join_tab + curr_join->const_tables;
-    thd->limit_found_rows= tab->records;
+    DBUG_ASSERT(tables > const_tables);
+    thd->limit_found_rows= join_tab[const_tables].records;
   }
 
   /* Accumulate the counts from all join iterations of all join parts. */
-  thd->inc_examined_row_count(curr_join->examined_rows);
+  thd->inc_examined_row_count(examined_rows);
   DBUG_PRINT("counts", ("thd->examined_row_count: %lu",
                         (ulong) thd->get_examined_row_count()));
 
@@ -3661,6 +3555,112 @@ JOIN::exec()
 }
 
 
+TABLE*
+JOIN::create_intermediate_table(List<Item> *tmp_table_fields,
+                                ORDER *tmp_table_group, bool save_sum_fields)
+{
+  DBUG_ENTER("JOIN::create_intermediate_table");
+  THD_STAGE_INFO(thd, stage_creating_tmp_table);
+
+  /*
+    Pushing LIMIT to the temporary table creation is not applicable
+    when there is ORDER BY or GROUP BY or there is no GROUP BY, but
+    there are aggregate functions, because in all these cases we need
+    all result rows.
+  */
+  ha_rows tmp_rows_limit= ((order == NULL || skip_sort_order) &&
+                           !tmp_table_group &&
+                           !select_lex->with_sum_func) ?
+    m_select_limit : HA_POS_ERROR;
+
+  TABLE* tab= create_tmp_table(thd, &tmp_table_param, *tmp_table_fields,
+                               tmp_table_group, select_distinct && !group_list,
+                               save_sum_fields, select_options, tmp_rows_limit, 
+                               "");
+  if (!tab)
+    DBUG_RETURN(NULL);
+
+  /*
+    We don't have to store rows in temp table that doesn't match HAVING if:
+    - we are sorting the table and writing complete group rows to the
+      temp table.
+    - We are using DISTINCT without resolving the distinct as a GROUP BY
+      on all columns.
+
+    If having is not handled here, it will be checked before the row
+    is sent to the client.
+  */
+  if (tmp_having &&
+      (sort_and_group || (tab->distinct && !group_list)))
+    having= tmp_having;
+
+  /* if group or order on first table, sort first */
+  if (group_list && simple_group)
+  {
+    DBUG_PRINT("info",("Sorting for group"));
+    THD_STAGE_INFO(thd, stage_sorting_for_group);
+    if (create_sort_index(thd, this, group_list,
+                          HA_POS_ERROR, HA_POS_ERROR, false) ||
+        alloc_group_fields(this, group_list) ||
+        make_sum_func_list(all_fields, fields_list, true) ||
+        prepare_sum_aggregators(sum_funcs,
+                                !join_tab->is_using_agg_loose_index_scan()) ||
+        setup_sum_funcs(thd, sum_funcs))
+    {
+      DBUG_RETURN(NULL);
+    }
+    group_list= NULL;
+  }
+  else
+  {
+    if (make_sum_func_list(all_fields, fields_list, false) ||
+        prepare_sum_aggregators(sum_funcs,
+                                !join_tab->is_using_agg_loose_index_scan()) ||
+        setup_sum_funcs(thd, sum_funcs))
+    {
+      DBUG_RETURN(NULL);
+    }
+
+    if (!group_list && !tab->distinct && order && simple_order)
+    {
+      DBUG_PRINT("info",("Sorting for order"));
+      THD_STAGE_INFO(thd, stage_sorting_for_order);
+      if (create_sort_index(thd, this, order,
+                            HA_POS_ERROR, HA_POS_ERROR, true))
+      {
+        DBUG_RETURN(NULL);
+      }
+      order= NULL;
+    }
+  }
+  DBUG_RETURN(tab);
+}
+
+
+void
+JOIN::optimize_distinct()
+{
+  JOIN_TAB *last_join_tab= join_tab+tables-1;
+  do
+  {
+    if (select_lex->select_list_tables & last_join_tab->table->map)
+      break;
+    last_join_tab->not_used_in_distinct= true;
+  } while (last_join_tab-- != join_tab);
+
+  /* Optimize "select distinct b from t1 order by key_part_1 limit #" */
+  if (order && skip_sort_order)
+  {
+    /* Should always succeed */
+    if (test_if_skip_sort_order(&join_tab[const_tables],
+                                order, unit->select_limit_cnt, false, 
+                                &join_tab[const_tables].table->
+                                keys_in_use_for_order_by))
+      order= NULL;
+  }
+}
+
+
 /**
   Clean up and destroy join object.
 
@@ -16851,7 +16851,7 @@ void setup_tmp_table_column_bitmaps(TABL
   @param param                a description used as input to create the table
   @param fields               list of items that will be used to define
                               column types of the table (also see NOTES)
-  @param group                TODO document
+  @param group                Group key to use for temporary table, NULL if none
   @param distinct             should table rows be distinct
   @param save_sum_fields      see NOTES
   @param select_options

=== modified file 'sql/sql_select.h'
--- a/sql/sql_select.h	2011-10-10 10:36:58 +0000
+++ b/sql/sql_select.h	2011-10-12 11:08:37 +0000
@@ -2162,6 +2162,47 @@ public:
   void drop_unused_derived_keys();
 private:
   /**
+    Execute current query. To be called from @c JOIN::exec.
+
+    If current query is a dependent subquery, this execution is performed on a
+    temporary copy of the original JOIN object in order to be able to restore
+    the original content for re-execution and EXPLAIN. (@note Subqueries may
+    be executed as part of EXPLAIN.) In such cases, execution data that may be
+    reused for later executions will be copied to the original 
+    @c JOIN object (@c parent).
+
+    @param parent Original @c JOIN object when current object is a temporary 
+                  copy. @c NULL, otherwise
+  */
+  void execute(JOIN *parent);
+  
+  /**
+    Create a temporary table to be used for processing DISTINCT/ORDER
+    BY/GROUP BY.
+
+    @note Will modify JOIN object wrt sort/group attributes
+
+    @param tmp_table_fields List of items that will be used to define
+                            column types of the table.
+    @param tmp_table_group  Group key to use for temporary table, NULL if none.
+    @param save_sum_fields  If true, do not replace Item_sum items in 
+                            @c tmp_fields list with Item_field items referring 
+                            to fields in temporary table.
+
+    @returns Pointer to temporary table on success, NULL on failure
+  */
+  TABLE* create_intermediate_table(List<Item> *tmp_table_fields,
+                                   ORDER *tmp_table_group, bool save_sum_fields);
+
+  /**
+    Optimize distinct when used on a subset of the tables.
+
+    E.g.,: SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.b=t2.b
+    In this case we can stop scanning t2 when we have found one t1.a
+  */
+  void optimize_distinct();
+
+  /**
     TRUE if the query contains an aggregate function but has no GROUP
     BY clause. 
   */

No bundle (reason: useless for push emails).
Thread
bzr push into mysql-trunk branch (oystein.grovlen:3450 to 3453) WL#5559Oystein Grovlen12 Oct