3453 Oystein Grovlen 2011-10-12
WL#5559 - Factor tmp table out of optimizer.
Part 3: Unify the code for creation of temporary tables
for GROUP BY/ORDER BY
Changed JOIN::create_intermediate_table() so that it
can be used to create both temporary tables, and let it
return a pointer to the created table, and use it to also
create JOIN::exec_tmp_table2.
@ sql/sql_select.cc
1. Extended JOIN::create_intermediate_table() with parameters
for characteristics that vary between the two temporary
tables.
2. Changed JOIN::create_intermediate_table() to return the
pointer to the created temporary table, instead of
assigning it to exec_tmp_table1.
3. Moved code in JOIN::create_intermediate_table() that are
only relevant to exec_tmp_table1 to JOIN::exec().
4. Moved code in JOIN::create_intermediate_table() related
to optimization for distinct to a new private function
JOIN::optimize_distinct(). (To be called from
JOIN::exec() after creating the exec_tmp_table1
for time being, but longer term this should be moved to
optimization phase.)
5. Call create_intermediate_table() to create
exec_tmp_table2.
@ sql/sql_select.h
Created new private function JOIN::optimize_distinct() to
optimize distinct when used on a subset of tables in query.
Changed signature of JOIN::create_intermediate table to be
able to handle creation of both temporary tables used for
sorting.
modified:
sql/sql_select.cc
sql/sql_select.h
3452 Oystein Grovlen 2011-10-12
WL#5559 - Factor tmp table out of optimizer.
Part 2: Remove curr_join references from JOIN::exec
Most references in JOIN::exec to member fields are indirect
through local variable curr_join, which may either refer to
the original JOIN object or a temporary copy of it. This
complicates the code significantly.
In order to simplify the code and ease further refactoring
of JOIN::exec, this patch removes the need for curr_join by,
when a temporary copy is used, executing the code directly
on the temporary object.
This is achieved by creating a private function,
JOIN::execute(JOIN *parent) to be called from JOIN::exec(),
and move all code after creation of the first temporary table
to this function. If a temporary copy is needed, then
tmp_join->execute(this) is called. Otherwise,
this->execute(NULL).
The code makes sure that necessary execution data is copied
from the temporary JOIN object and back to its parent in order
to be able to reuse it for later executions as before.
A limited number of example queries has been used to verify that
there is no performance regression. Also, using gprof it has been
verified that the functions previously called from JOIN::exec,
is still called the same number of times with the new approach.
@ sql/sql_select.cc
Moved core part of JOIN::exec to private function JOIN::execute
so that query can be executed directly on the current join
object. This way, all references to curr_join may be removed.
In case one is executing on the temporary JOIN object, made
sure to copy data that may be reused on later execution to
the parent join object.
@ sql/sql_select.h
Added private function JOIN::execute that will do the core
execution of a query.
modified:
sql/sql_select.cc
sql/sql_select.h
3451 Oystein Grovlen 2011-10-11
WL#5559 - Factor tmp table out of optimizer. Part 1.
This patch moves the creation of the table to be used
for the intermediate step required by DISTINCT/ORDER BY/
GROUP BY processing, from optimization phase to
execution phase.
The code for creating this tables has been moved to a new
function, JOIN::create_intermediate_table().
@ sql/sql_select.cc
Moved code to create temporary table from JOIN::optimize() to
a new function JOIN::create_intermediate_table().
Moved the call of this method to JOIN::exec().
Some clean-up at end of JOIN::optimize that is possible when
this code has been moved.
@ sql/sql_select.h
Added a new private function JOIN::create_intermediate_table()
that creates the table to be used for the intermediate step
required by DISTINCT/ORDER BY/GROUP BY processing.
modified:
sql/sql_select.cc
sql/sql_select.h
3450 Jorgen Loland 2011-10-10 [merge]
Merged in WL 5585
modified:
mysql-test/suite/opt_trace/r/charset.result
mysql-test/suite/opt_trace/r/general2_no_prot.result
mysql-test/suite/opt_trace/r/general2_ps_prot.result
mysql-test/suite/opt_trace/r/general_no_prot_all.result
mysql-test/suite/opt_trace/r/general_no_prot_none.result
mysql-test/suite/opt_trace/r/general_ps_prot_all.result
mysql-test/suite/opt_trace/r/general_ps_prot_none.result
mysql-test/suite/opt_trace/r/range_no_prot.result
mysql-test/suite/opt_trace/r/range_ps_prot.result
sql/mysqld.cc
sql/mysqld.h
sql/opt_explain.cc
sql/sql_lex.h
sql/sql_select.cc
sql/sql_select.h
sql/sql_union.cc
=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc 2011-10-10 10:36:58 +0000
+++ b/sql/sql_select.cc 2011-10-12 11:08:37 +0000
@@ -1912,7 +1912,6 @@ static int clear_sj_tmp_tables(JOIN *joi
int
JOIN::optimize()
{
- bool need_distinct;
ulonglong select_opts_for_readinfo;
uint no_jbuf_after= UINT_MAX;
@@ -2641,150 +2640,11 @@ JOIN::optimize()
}
tmp_having= having;
- if (select_options & SELECT_DESCRIBE)
- {
- error= 0;
- DBUG_RETURN(0);
- }
- having= 0;
-
- /*
- The loose index scan access method guarantees that all grouping or
- duplicate row elimination (for distinct) is already performed
- during data retrieval, and that all MIN/MAX functions are already
- computed for each group. Thus all MIN/MAX functions should be
- treated as regular functions, and there is no need to perform
- grouping in the main execution loop.
- Notice that currently loose index scan is applicable only for
- single table queries, thus it is sufficient to test only the first
- join_tab element of the plan for its access method.
- */
- need_distinct= TRUE;
- if (join_tab->is_using_loose_index_scan())
+ if (!(select_options & SELECT_DESCRIBE))
{
- tmp_table_param.precomputed_group_by= TRUE;
- if (join_tab->is_using_agg_loose_index_scan())
- {
- need_distinct= FALSE;
- tmp_table_param.precomputed_group_by= FALSE;
- }
+ having= NULL;
}
-
- /* Create a tmp table if distinct or if the sort is too complicated */
- if (need_tmp)
- {
- DBUG_PRINT("info",("Creating tmp table"));
- THD_STAGE_INFO(thd, stage_creating_tmp_table);
-
- init_items_ref_array();
-
- tmp_table_param.hidden_field_count= (all_fields.elements -
- fields_list.elements);
- ORDER *tmp_group= ((!simple_group && !procedure &&
- !(test_flags & TEST_NO_KEY_GROUP)) ? group_list :
- (ORDER*) 0);
- /*
- Pushing LIMIT to the temporary table creation is not applicable
- when there is ORDER BY or GROUP BY or there is no GROUP BY, but
- there are aggregate functions, because in all these cases we need
- all result rows.
- */
- ha_rows tmp_rows_limit= ((order == 0 || skip_sort_order) &&
- !tmp_group &&
- !thd->lex->current_select->with_sum_func) ?
- m_select_limit : HA_POS_ERROR;
-
- if (!(exec_tmp_table1=
- create_tmp_table(thd, &tmp_table_param, all_fields,
- tmp_group, group_list ? 0 : select_distinct,
- group_list && simple_group,
- select_options, tmp_rows_limit, "")))
- DBUG_RETURN(1);
-
- /*
- We don't have to store rows in temp table that doesn't match HAVING if:
- - we are sorting the table and writing complete group rows to the
- temp table.
- - We are using DISTINCT without resolving the distinct as a GROUP BY
- on all columns.
-
- If having is not handled here, it will be checked before the row
- is sent to the client.
- */
- if (tmp_having &&
- (sort_and_group || (exec_tmp_table1->distinct && !group_list)))
- having= tmp_having;
-
- /* if group or order on first table, sort first */
- if (group_list && simple_group)
- {
- DBUG_PRINT("info",("Sorting for group"));
- THD_STAGE_INFO(thd, stage_sorting_for_group);
- if (create_sort_index(thd, this, group_list,
- HA_POS_ERROR, HA_POS_ERROR, FALSE) ||
- alloc_group_fields(this, group_list) ||
- make_sum_func_list(all_fields, fields_list, 1) ||
- prepare_sum_aggregators(sum_funcs, need_distinct) ||
- setup_sum_funcs(thd, sum_funcs))
- {
- DBUG_RETURN(1);
- }
- group_list=0;
- }
- else
- {
- if (make_sum_func_list(all_fields, fields_list, 0) ||
- prepare_sum_aggregators(sum_funcs, need_distinct) ||
- setup_sum_funcs(thd, sum_funcs))
- {
- DBUG_RETURN(1);
- }
-
- if (!group_list && ! exec_tmp_table1->distinct && order && simple_order)
- {
- DBUG_PRINT("info",("Sorting for order"));
- THD_STAGE_INFO(thd, stage_sorting_for_order);
- if (create_sort_index(thd, this, order,
- HA_POS_ERROR, HA_POS_ERROR, TRUE))
- {
- DBUG_RETURN(1);
- }
- order=0;
- }
- }
-
- /*
- Optimize distinct when used on some of the tables
- SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.b=t2.b
- In this case we can stop scanning t2 when we have found one t1.a
- */
-
- if (exec_tmp_table1->distinct)
- {
- JOIN_TAB *last_join_tab= join_tab+tables-1;
- do
- {
- if (select_lex->select_list_tables & last_join_tab->table->map)
- break;
- last_join_tab->not_used_in_distinct= 1;
- } while (last_join_tab-- != join_tab);
- /* Optimize "select distinct b from t1 order by key_part_1 limit #" */
- if (order && skip_sort_order)
- {
- /* Should always succeed */
- if (test_if_skip_sort_order(&join_tab[const_tables],
- order, unit->select_limit_cnt, 0,
- &join_tab[const_tables].table->
- keys_in_use_for_order_by))
- order=0;
- }
- }
-
- /* If this join belongs to an uncacheable query save the original join */
- if (select_lex->uncacheable && init_save_join_tab())
- DBUG_RETURN(-1); /* purecov: inspected */
- }
-
+
error= 0;
DBUG_RETURN(0);
@@ -3064,15 +2924,12 @@ JOIN::exec()
trace_exec.add_select_number(select_lex->select_number);
Opt_trace_array trace_steps(trace, "steps");
List<Item> *columns_list= &fields_list;
- int tmp_error;
DBUG_ENTER("JOIN::exec");
DBUG_ASSERT(!(select_options & SELECT_DESCRIBE));
THD_STAGE_INFO(thd, stage_executing);
- const bool has_group_by= this->group;
-
if (prepare_result())
DBUG_VOID_RETURN;
@@ -3150,31 +3007,94 @@ JOIN::exec()
having);
DBUG_VOID_RETURN;
}
+
+ /*
+ The loose index scan access method guarantees that all grouping or
+ duplicate row elimination (for distinct) is already performed
+ during data retrieval, and that all MIN/MAX functions are already
+ computed for each group. Thus all MIN/MAX functions should be
+ treated as regular functions, and there is no need to perform
+ grouping in the main execution loop.
+ Notice that currently loose index scan is applicable only for
+ single table queries, thus it is sufficient to test only the first
+ join_tab element of the plan for its access method.
+ */
+ if (join_tab && join_tab->is_using_loose_index_scan())
+ tmp_table_param.precomputed_group_by=
+ !join_tab->is_using_agg_loose_index_scan();
+
+ /* Create a tmp table if distinct or if the sort is too complicated */
+ if (need_tmp)
+ {
+ if (!exec_tmp_table1)
+ {
+ /*
+ Create temporary table on first execution of this join.
+ (Will be reused if this is a subquery that is executed several times.)
+ */
+ init_items_ref_array();
+
+ ORDER *tmp_group=
+ (!simple_group && !procedure && !(test_flags & TEST_NO_KEY_GROUP)) ?
+ group_list : NULL;
+
+ tmp_table_param.hidden_field_count=
+ all_fields.elements - fields_list.elements;
+
+ exec_tmp_table1= create_intermediate_table(&all_fields, tmp_group,
+ group_list && simple_group);
+ if (!exec_tmp_table1)
+ DBUG_VOID_RETURN;
+
+ if (exec_tmp_table1->distinct)
+ optimize_distinct();
+
+ /* If this join belongs to an uncacheable query save the original join */
+ if (select_lex->uncacheable && init_save_join_tab())
+ DBUG_VOID_RETURN; /* purecov: inspected */
+ }
+
+ if (tmp_join)
+ {
+ /*
+ We are in a non-cacheable subquery. Use the saved join
+ structure after creation of temporary table.
+ See documentation of tmp_join for details.
+ */
+ tmp_join->execute(this);
+ error= tmp_join->error;
+ DBUG_VOID_RETURN;
+ }
+ }
+
+ execute(NULL);
+
+ DBUG_VOID_RETURN;
+}
- JOIN *curr_join= this;
+void
+JOIN::execute(JOIN *parent)
+{
+ DBUG_ENTER("JOIN::execute");
+ int tmp_error;
List<Item> *curr_all_fields= &all_fields;
List<Item> *curr_fields_list= &fields_list;
- TABLE *curr_tmp_table= 0;
+ TABLE *curr_tmp_table= NULL;
+ JOIN *const main_join= (parent) ? parent : this;
+
+ const bool has_group_by= this->group;
+
/*
Initialize examined rows here because the values from all join parts
must be accumulated in examined_row_count. Hence every join
iteration must count from zero.
*/
- curr_join->examined_rows= 0;
+ examined_rows= 0;
/* Create a tmp table if distinct or if the sort is too complicated */
if (need_tmp)
{
- if (tmp_join)
- {
- /*
- We are in a non cacheable sub query. Get the saved join structure
- after optimization.
- (curr_join may have been modified during last exection and we need
- to reset it)
- */
- curr_join= tmp_join;
- }
+ DBUG_ASSERT(exec_tmp_table1);
curr_tmp_table= exec_tmp_table1;
/* Copy data to the temporary table */
@@ -3186,15 +3106,13 @@ JOIN::exec()
order. Exception: LooseScan strategy for semijoin requires
sorted access even if final result is not to be sorted.
*/
- if (!curr_join->sort_and_group &&
- curr_join->const_tables != curr_join->tables &&
- curr_join->best_positions[curr_join->const_tables].sj_strategy
- != SJ_OPT_LOOSE_SCAN)
- disable_sorted_access(&curr_join->join_tab[curr_join->const_tables]);
-
- Procedure *save_proc= curr_join->procedure;
- tmp_error= do_select(curr_join, (List<Item> *) 0, curr_tmp_table, 0);
- curr_join->procedure= save_proc;
+ if (!sort_and_group && const_tables != tables &&
+ best_positions[const_tables].sj_strategy != SJ_OPT_LOOSE_SCAN)
+ disable_sorted_access(&join_tab[const_tables]);
+
+ Procedure *save_proc= procedure;
+ tmp_error= do_select(this, (List<Item> *) NULL, curr_tmp_table, NULL);
+ procedure= save_proc;
if (tmp_error)
{
error= tmp_error;
@@ -3202,12 +3120,10 @@ JOIN::exec()
}
curr_tmp_table->file->info(HA_STATUS_VARIABLE);
- if (curr_join->having)
- curr_join->having= curr_join->tmp_having= 0; // Allready done
+ if (having)
+ having= tmp_having= NULL; // Already done
/* Change sum_fields reference to calculated fields in tmp_table */
- if (curr_join != this)
- curr_join->all_fields= *curr_all_fields;
if (items1.is_null())
{
items1= ref_ptr_array_slice(2);
@@ -3226,30 +3142,29 @@ JOIN::exec()
fields_list.elements, all_fields))
DBUG_VOID_RETURN;
}
- if (curr_join != this)
+ if (parent)
{
- curr_join->tmp_all_fields1= tmp_all_fields1;
- curr_join->tmp_fields_list1= tmp_fields_list1;
+ // Copy to parent JOIN for reuse in later executions of subquery
+ parent->items1= items1;
+ parent->tmp_all_fields1= tmp_all_fields1;
+ parent->tmp_fields_list1= tmp_fields_list1;
}
- curr_join->items1= items1;
}
curr_all_fields= &tmp_all_fields1;
curr_fields_list= &tmp_fields_list1;
- curr_join->set_items_ref_array(items1);
+ set_items_ref_array(items1);
if (sort_and_group || curr_tmp_table->group)
{
- curr_join->tmp_table_param.field_count+=
- curr_join->tmp_table_param.sum_func_count+
- curr_join->tmp_table_param.func_count;
- curr_join->tmp_table_param.sum_func_count=
- curr_join->tmp_table_param.func_count= 0;
+ tmp_table_param.field_count+=
+ tmp_table_param.sum_func_count + tmp_table_param.func_count;
+ tmp_table_param.sum_func_count= 0;
+ tmp_table_param.func_count= 0;
}
else
{
- curr_join->tmp_table_param.field_count+=
- curr_join->tmp_table_param.func_count;
- curr_join->tmp_table_param.func_count= 0;
+ tmp_table_param.field_count+= tmp_table_param.func_count;
+ tmp_table_param.func_count= 0;
}
// procedure can't be used inside subselect => we do nothing special for it
@@ -3258,9 +3173,9 @@ JOIN::exec()
if (curr_tmp_table->group)
{ // Already grouped
- if (!curr_join->order && !curr_join->no_order && !skip_sort_order)
- curr_join->order= curr_join->group_list; /* order by group */
- curr_join->group_list= 0;
+ if (!order && !no_order && !skip_sort_order)
+ order= group_list; /* order by group */
+ group_list= NULL;
}
/*
@@ -3271,30 +3186,25 @@ JOIN::exec()
like SEC_TO_TIME(SUM(...)).
*/
- if ((curr_join->group_list && (!test_if_subpart(curr_join->group_list,
- curr_join->order) ||
- curr_join->select_distinct)) ||
- (curr_join->select_distinct &&
- curr_join->tmp_table_param.using_indirect_summary_function))
+ if ((group_list &&
+ (!test_if_subpart(group_list, order) || select_distinct)) ||
+ (select_distinct && tmp_table_param.using_indirect_summary_function))
{ /* Must copy to another table */
DBUG_PRINT("info",("Creating group table"));
/* Free first data from old join */
- curr_join->join_free();
- if (curr_join->make_simple_join(this, curr_tmp_table))
+ join_free();
+ // Set up scan for reading from first temporary table (exec_tmp_table1)
+ if (make_simple_join(main_join, curr_tmp_table))
DBUG_VOID_RETURN;
- calc_group_buffer(curr_join, group_list);
- count_field_types(select_lex, &curr_join->tmp_table_param,
- curr_join->tmp_all_fields1,
- curr_join->select_distinct && !curr_join->group_list);
- curr_join->tmp_table_param.hidden_field_count=
- (curr_join->tmp_all_fields1.elements-
- curr_join->tmp_fields_list1.elements);
+ calc_group_buffer(this, group_list);
+ count_field_types(select_lex, &tmp_table_param, tmp_all_fields1,
+ select_distinct && !group_list);
+ tmp_table_param.hidden_field_count=
+ tmp_all_fields1.elements - tmp_fields_list1.elements;
- if (exec_tmp_table2)
- curr_tmp_table= exec_tmp_table2;
- else
+ if (!exec_tmp_table2)
{
/* group data to new table */
@@ -3303,74 +3213,72 @@ JOIN::exec()
functions are precomputed, and should be treated as regular
functions. See extended comment in JOIN::exec.
*/
- if (curr_join->join_tab->is_using_loose_index_scan())
- curr_join->tmp_table_param.precomputed_group_by= TRUE;
+ if (join_tab->is_using_loose_index_scan())
+ tmp_table_param.precomputed_group_by= TRUE;
+
+ tmp_table_param.hidden_field_count=
+ curr_all_fields->elements - curr_fields_list->elements;
- if (!(curr_tmp_table=
- exec_tmp_table2= create_tmp_table(thd,
- &curr_join->tmp_table_param,
- *curr_all_fields,
- (ORDER*) 0,
- curr_join->select_distinct &&
- !curr_join->group_list,
- 1, curr_join->select_options,
- HA_POS_ERROR, "")))
+ if (!(exec_tmp_table2= create_intermediate_table(curr_all_fields,
+ NULL, true)))
DBUG_VOID_RETURN;
- curr_join->exec_tmp_table2= exec_tmp_table2;
+ if (parent)
+ parent->exec_tmp_table2= exec_tmp_table2;
}
- if (curr_join->group_list)
+ curr_tmp_table= exec_tmp_table2;
+
+ if (group_list)
{
- if (curr_join->join_tab == join_tab && save_join_tab())
+ if (join_tab == main_join->join_tab && main_join->save_join_tab())
{
DBUG_VOID_RETURN;
}
DBUG_PRINT("info",("Sorting for index"));
THD_STAGE_INFO(thd, stage_creating_sort_index);
- if (create_sort_index(thd, curr_join, curr_join->group_list,
+ if (create_sort_index(thd, this, group_list,
HA_POS_ERROR, HA_POS_ERROR, FALSE) ||
- make_group_fields(this, curr_join))
- {
+ make_group_fields(main_join, this))
DBUG_VOID_RETURN;
- }
- sortorder= curr_join->sortorder;
+ if (parent)
+ parent->sortorder= sortorder;
}
THD_STAGE_INFO(thd, stage_copying_to_group_table);
DBUG_PRINT("info", ("%s", thd->proc_info));
tmp_error= -1;
- if (curr_join != this)
+ if (parent)
{
- if (sum_funcs2)
+ if (parent->sum_funcs2)
{
- curr_join->sum_funcs= sum_funcs2;
- curr_join->sum_funcs_end= sum_funcs_end2;
+ // Reuse sum_funcs from previous execution of subquery
+ sum_funcs= parent->sum_funcs2;
+ sum_funcs_end= parent->sum_funcs_end2;
}
else
{
- curr_join->alloc_func_list();
- sum_funcs2= curr_join->sum_funcs;
- sum_funcs_end2= curr_join->sum_funcs_end;
+ // First execution of this subquery, allocate list of sum_functions
+ alloc_func_list();
+ parent->sum_funcs2= sum_funcs;
+ parent->sum_funcs_end2= sum_funcs_end;
}
}
- if (curr_join->make_sum_func_list(*curr_all_fields, *curr_fields_list,
- 1, TRUE) ||
- prepare_sum_aggregators(curr_join->sum_funcs,
- !curr_join->join_tab->is_using_agg_loose_index_scan()))
+ if (make_sum_func_list(*curr_all_fields, *curr_fields_list, true, true) ||
+ prepare_sum_aggregators(sum_funcs,
+ !join_tab->is_using_agg_loose_index_scan()))
DBUG_VOID_RETURN;
- curr_join->group_list= 0;
- if (!curr_join->sort_and_group &&
- curr_join->const_tables != curr_join->tables)
- disable_sorted_access(&curr_join->join_tab[curr_join->const_tables]);
- if (setup_sum_funcs(curr_join->thd, curr_join->sum_funcs) ||
- (tmp_error= do_select(curr_join, (List<Item> *) 0, curr_tmp_table,
- 0)))
+ group_list= NULL;
+ if (!sort_and_group && const_tables != tables)
+ disable_sorted_access(&join_tab[const_tables]);
+ if (setup_sum_funcs(thd, sum_funcs) ||
+ (tmp_error= do_select(this, (List<Item> *)NULL, curr_tmp_table, NULL)))
{
error= tmp_error;
DBUG_VOID_RETURN;
}
- end_read_record(&curr_join->join_tab->read_record);
- curr_join->const_tables= curr_join->tables; // Mark free for cleanup()
- curr_join->join_tab[0].table= 0; // Table is freed
+ end_read_record(&join_tab->read_record);
+ // @todo No tests fail if line below is removed. Remove?
+ const_tables= tables; // Mark free for cleanup()
+ join_tab[0].table= NULL; // Table is freed
// No sum funcs anymore
if (items2.is_null())
@@ -3380,110 +3288,103 @@ JOIN::exec()
tmp_fields_list2, tmp_all_fields2,
fields_list.elements, tmp_all_fields1))
DBUG_VOID_RETURN;
- if (curr_join != this)
+ if (parent)
{
- curr_join->tmp_fields_list2= tmp_fields_list2;
- curr_join->tmp_all_fields2= tmp_all_fields2;
+ // Copy to parent JOIN for reuse in later executions of subquery
+ parent->items2= items2;
+ parent->tmp_fields_list2= tmp_fields_list2;
+ parent->tmp_all_fields2= tmp_all_fields2;
}
}
- curr_fields_list= &curr_join->tmp_fields_list2;
- curr_all_fields= &curr_join->tmp_all_fields2;
- curr_join->set_items_ref_array(items2);
- curr_join->tmp_table_param.field_count+=
- curr_join->tmp_table_param.sum_func_count;
- curr_join->tmp_table_param.sum_func_count= 0;
+ curr_fields_list= &tmp_fields_list2;
+ curr_all_fields= &tmp_all_fields2;
+ set_items_ref_array(items2);
+ tmp_table_param.field_count+= tmp_table_param.sum_func_count;
+ tmp_table_param.sum_func_count= 0;
}
if (curr_tmp_table->distinct)
- curr_join->select_distinct=0; /* Each row is unique */
+ select_distinct= false; /* Each row is unique */
- curr_join->join_free(); /* Free quick selects */
- if (curr_join->select_distinct && ! curr_join->group_list)
+ join_free(); /* Free quick selects */
+ if (select_distinct && !group_list)
{
THD_STAGE_INFO(thd, stage_removing_duplicates);
- if (curr_join->tmp_having)
- curr_join->tmp_having->update_used_tables();
- if (remove_duplicates(curr_join, curr_tmp_table,
- *curr_fields_list, curr_join->tmp_having))
+ if (tmp_having)
+ tmp_having->update_used_tables();
+ if (remove_duplicates(this, curr_tmp_table, *curr_fields_list, tmp_having))
DBUG_VOID_RETURN;
- curr_join->tmp_having=0;
- curr_join->select_distinct=0;
+ tmp_having= NULL;
+ select_distinct= false;
}
curr_tmp_table->reginfo.lock_type= TL_UNLOCK;
- if (curr_join->make_simple_join(this, curr_tmp_table))
+ // Set up scan for reading from temporary table
+ if (make_simple_join(main_join, curr_tmp_table))
DBUG_VOID_RETURN;
- calc_group_buffer(curr_join, curr_join->group_list);
- count_field_types(select_lex, &curr_join->tmp_table_param,
- *curr_all_fields, 0);
+ calc_group_buffer(this, group_list);
+ count_field_types(select_lex, &tmp_table_param, *curr_all_fields, false);
}
if (procedure)
- count_field_types(select_lex, &curr_join->tmp_table_param,
- *curr_all_fields, 0);
+ count_field_types(select_lex, &tmp_table_param, *curr_all_fields, false);
- if (curr_join->group || curr_join->implicit_grouping ||
- curr_join->tmp_table_param.sum_func_count ||
+ if (group || implicit_grouping || tmp_table_param.sum_func_count ||
(procedure && (procedure->flags & PROC_GROUP)))
{
- if (make_group_fields(this, curr_join))
- {
+ if (make_group_fields(main_join, this))
DBUG_VOID_RETURN;
- }
if (items3.is_null())
{
if (items0.is_null())
init_items_ref_array();
items3= ref_ptr_array_slice(4);
- setup_copy_fields(thd, &curr_join->tmp_table_param,
+ setup_copy_fields(thd, &tmp_table_param,
items3, tmp_fields_list3, tmp_all_fields3,
curr_fields_list->elements, *curr_all_fields);
- tmp_table_param.save_copy_funcs= curr_join->tmp_table_param.copy_funcs;
- tmp_table_param.save_copy_field= curr_join->tmp_table_param.copy_field;
- tmp_table_param.save_copy_field_end=
- curr_join->tmp_table_param.copy_field_end;
- if (curr_join != this)
+ if (parent)
{
- curr_join->tmp_all_fields3= tmp_all_fields3;
- curr_join->tmp_fields_list3= tmp_fields_list3;
- }
- }
- else
- {
- curr_join->tmp_table_param.copy_funcs= tmp_table_param.save_copy_funcs;
- curr_join->tmp_table_param.copy_field= tmp_table_param.save_copy_field;
- curr_join->tmp_table_param.copy_field_end=
- tmp_table_param.save_copy_field_end;
+ // Copy to parent JOIN for reuse in later executions of subquery
+ parent->tmp_table_param.save_copy_funcs= tmp_table_param.copy_funcs;
+ parent->tmp_table_param.save_copy_field= tmp_table_param.copy_field;
+ parent->tmp_table_param.save_copy_field_end=
+ tmp_table_param.copy_field_end;
+ parent->tmp_all_fields3= tmp_all_fields3;
+ parent->tmp_fields_list3= tmp_fields_list3;
+ }
+ }
+ else if (parent)
+ {
+ // Reuse data from earlier execution of this subquery.
+ tmp_table_param.copy_funcs= parent->tmp_table_param.save_copy_funcs;
+ tmp_table_param.copy_field= parent->tmp_table_param.save_copy_field;
+ tmp_table_param.copy_field_end=
+ parent->tmp_table_param.save_copy_field_end;
}
curr_fields_list= &tmp_fields_list3;
curr_all_fields= &tmp_all_fields3;
- curr_join->set_items_ref_array(items3);
+ set_items_ref_array(items3);
- if (curr_join->make_sum_func_list(*curr_all_fields, *curr_fields_list,
- 1, TRUE) ||
- prepare_sum_aggregators(curr_join->sum_funcs,
- !curr_join->join_tab ||
- !curr_join->join_tab->
- is_using_agg_loose_index_scan()) ||
- setup_sum_funcs(curr_join->thd, curr_join->sum_funcs) ||
+ if (make_sum_func_list(*curr_all_fields, *curr_fields_list, true, true) ||
+ prepare_sum_aggregators(sum_funcs,
+ !join_tab ||
+ !join_tab-> is_using_agg_loose_index_scan()) ||
+ setup_sum_funcs(thd, sum_funcs) ||
thd->is_fatal_error)
DBUG_VOID_RETURN;
}
- if (curr_join->group_list || curr_join->order)
+ if (group_list || order)
{
DBUG_PRINT("info",("Sorting for send_result_set_metadata"));
THD_STAGE_INFO(thd, stage_sorting_result);
/* If we have already done the group, add HAVING to sorted table */
- if (curr_join->tmp_having && ! curr_join->group_list &&
- ! curr_join->sort_and_group)
+ if (tmp_having && !group_list && !sort_and_group)
{
// Some tables may have been const
- curr_join->tmp_having->update_used_tables();
- JOIN_TAB *curr_table= &curr_join->join_tab[curr_join->const_tables];
- table_map used_tables= (curr_join->const_table_map |
- curr_table->table->map);
-
- Item* sort_table_cond= make_cond_for_table(curr_join->tmp_having,
- used_tables,
- (table_map) 0, 0);
+ tmp_having->update_used_tables();
+ JOIN_TAB *curr_table= &join_tab[const_tables];
+ table_map used_tables= (const_table_map | curr_table->table->map);
+
+ Item* sort_table_cond= make_cond_for_table(tmp_having, used_tables,
+ (table_map) 0, false);
if (sort_table_cond)
{
if (!curr_table->select)
@@ -3516,9 +3417,8 @@ JOIN::exec()
*/
if (curr_table->pre_idx_push_cond)
{
- sort_table_cond= make_cond_for_table(curr_join->tmp_having,
- used_tables,
- (table_map) 0, 0);
+ sort_table_cond= make_cond_for_table(tmp_having, used_tables,
+ (table_map) 0, false);
if (!sort_table_cond)
DBUG_VOID_RETURN;
Item* new_pre_idx_push_cond=
@@ -3531,17 +3431,15 @@ JOIN::exec()
curr_table->pre_idx_push_cond= new_pre_idx_push_cond;
}
- curr_join->tmp_having= make_cond_for_table(curr_join->tmp_having,
- ~ (table_map) 0,
- ~used_tables, 0);
- DBUG_EXECUTE("where",print_where(curr_join->tmp_having,
- "having after sort",
- QT_ORDINARY););
+ tmp_having= make_cond_for_table(tmp_having, ~ (table_map) 0,
+ ~used_tables, false);
+ DBUG_EXECUTE("where",
+ print_where(tmp_having, "having after sort", QT_ORDINARY););
}
}
{
if (group)
- curr_join->m_select_limit= HA_POS_ERROR;
+ m_select_limit= HA_POS_ERROR;
else
{
/*
@@ -3552,8 +3450,8 @@ JOIN::exec()
- as a keyuse attached to the join_tab (ref access),
- as a semi-join equality attached to materialization semi-join nest.
*/
- JOIN_TAB *curr_table= &curr_join->join_tab[curr_join->const_tables+1];
- JOIN_TAB *end_table= &curr_join->join_tab[curr_join->tables];
+ JOIN_TAB *curr_table= &join_tab[const_tables+1];
+ JOIN_TAB *end_table= &join_tab[tables];
for (; curr_table < end_table ; curr_table++)
{
if (curr_table->condition() ||
@@ -3561,12 +3459,12 @@ JOIN::exec()
curr_table->get_sj_strategy() == SJ_OPT_MATERIALIZE_LOOKUP)
{
/* We have to sort all rows */
- curr_join->m_select_limit= HA_POS_ERROR;
+ m_select_limit= HA_POS_ERROR;
break;
}
}
}
- if (curr_join->join_tab == join_tab && save_join_tab())
+ if (join_tab == main_join->join_tab && main_join->save_join_tab())
{
DBUG_VOID_RETURN;
}
@@ -3580,8 +3478,7 @@ JOIN::exec()
OPTION_FOUND_ROWS supersedes LIMIT and is taken into account.
*/
DBUG_PRINT("info",("Sorting for order by/group by"));
- ORDER *order_arg=
- curr_join->group_list ? curr_join->group_list : curr_join->order;
+ ORDER *order_arg= group_list ? group_list : order;
/*
filesort_limit: Return only this many rows from filesort().
We can use select_limit_cnt only if we have no group_by and 1 table.
@@ -3591,31 +3488,30 @@ JOIN::exec()
unit->select_limit_cnt == 1 (we only need one row in the result set)
*/
const ha_rows filesort_limit_arg=
- (has_group_by || curr_join->tables > 1)
- ? curr_join->m_select_limit : unit->select_limit_cnt;
+ (has_group_by || tables > 1) ? m_select_limit : unit->select_limit_cnt;
const ha_rows select_limit_arg=
select_options & OPTION_FOUND_ROWS
? HA_POS_ERROR : unit->select_limit_cnt;
DBUG_PRINT("info", ("has_group_by %d "
- "curr_join->tables %d "
- "curr_join->m_select_limit %d "
+ "tables %d "
+ "m_select_limit %d "
"unit->select_limit_cnt %d",
has_group_by,
- curr_join->tables,
- (int) curr_join->m_select_limit,
+ tables,
+ (int) m_select_limit,
(int) unit->select_limit_cnt));
if (create_sort_index(thd,
- curr_join,
+ this,
order_arg,
filesort_limit_arg,
select_limit_arg,
- curr_join->group_list ? FALSE : TRUE))
+ !test(group_list)))
DBUG_VOID_RETURN;
- sortorder= curr_join->sortorder;
- if (curr_join->const_tables != curr_join->tables &&
- !curr_join->join_tab[curr_join->const_tables].table->sort.io_cache)
+ if (parent)
+ parent->sortorder= sortorder;
+ if (const_tables != tables && !join_tab[const_tables].table->sort.io_cache)
{
/*
If no IO cache exists for the first table then we are using an
@@ -3632,28 +3528,26 @@ JOIN::exec()
error= thd->is_error();
DBUG_VOID_RETURN;
}
- curr_join->having= curr_join->tmp_having;
- curr_join->fields= curr_fields_list;
- curr_join->procedure= procedure;
+ having= tmp_having;
+ fields= curr_fields_list;
THD_STAGE_INFO(thd, stage_sending_data);
DBUG_PRINT("info", ("%s", thd->proc_info));
- result->send_result_set_metadata((procedure ? curr_join->procedure_fields_list :
+ result->send_result_set_metadata((procedure ? procedure_fields_list :
*curr_fields_list),
Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF);
- error= do_select(curr_join, curr_fields_list, NULL, procedure);
- thd->limit_found_rows= curr_join->send_records;
- if (curr_join->order &&
- curr_join->sortorder)
+ error= do_select(this, curr_fields_list, NULL, procedure);
+ thd->limit_found_rows= send_records;
+
+ if (order && sortorder)
{
/* Use info provided by filesort. */
- DBUG_ASSERT(curr_join->tables > curr_join->const_tables);
- JOIN_TAB *tab= curr_join->join_tab + curr_join->const_tables;
- thd->limit_found_rows= tab->records;
+ DBUG_ASSERT(tables > const_tables);
+ thd->limit_found_rows= join_tab[const_tables].records;
}
/* Accumulate the counts from all join iterations of all join parts. */
- thd->inc_examined_row_count(curr_join->examined_rows);
+ thd->inc_examined_row_count(examined_rows);
DBUG_PRINT("counts", ("thd->examined_row_count: %lu",
(ulong) thd->get_examined_row_count()));
@@ -3661,6 +3555,112 @@ JOIN::exec()
}
+TABLE*
+JOIN::create_intermediate_table(List<Item> *tmp_table_fields,
+ ORDER *tmp_table_group, bool save_sum_fields)
+{
+ DBUG_ENTER("JOIN::create_intermediate_table");
+ THD_STAGE_INFO(thd, stage_creating_tmp_table);
+
+ /*
+ Pushing LIMIT to the temporary table creation is not applicable
+ when there is ORDER BY or GROUP BY or there is no GROUP BY, but
+ there are aggregate functions, because in all these cases we need
+ all result rows.
+ */
+ ha_rows tmp_rows_limit= ((order == NULL || skip_sort_order) &&
+ !tmp_table_group &&
+ !select_lex->with_sum_func) ?
+ m_select_limit : HA_POS_ERROR;
+
+ TABLE* tab= create_tmp_table(thd, &tmp_table_param, *tmp_table_fields,
+ tmp_table_group, select_distinct && !group_list,
+ save_sum_fields, select_options, tmp_rows_limit,
+ "");
+ if (!tab)
+ DBUG_RETURN(NULL);
+
+ /*
+ We don't have to store rows in temp table that doesn't match HAVING if:
+ - we are sorting the table and writing complete group rows to the
+ temp table.
+ - We are using DISTINCT without resolving the distinct as a GROUP BY
+ on all columns.
+
+ If having is not handled here, it will be checked before the row
+ is sent to the client.
+ */
+ if (tmp_having &&
+ (sort_and_group || (tab->distinct && !group_list)))
+ having= tmp_having;
+
+ /* if group or order on first table, sort first */
+ if (group_list && simple_group)
+ {
+ DBUG_PRINT("info",("Sorting for group"));
+ THD_STAGE_INFO(thd, stage_sorting_for_group);
+ if (create_sort_index(thd, this, group_list,
+ HA_POS_ERROR, HA_POS_ERROR, false) ||
+ alloc_group_fields(this, group_list) ||
+ make_sum_func_list(all_fields, fields_list, true) ||
+ prepare_sum_aggregators(sum_funcs,
+ !join_tab->is_using_agg_loose_index_scan()) ||
+ setup_sum_funcs(thd, sum_funcs))
+ {
+ DBUG_RETURN(NULL);
+ }
+ group_list= NULL;
+ }
+ else
+ {
+ if (make_sum_func_list(all_fields, fields_list, false) ||
+ prepare_sum_aggregators(sum_funcs,
+ !join_tab->is_using_agg_loose_index_scan()) ||
+ setup_sum_funcs(thd, sum_funcs))
+ {
+ DBUG_RETURN(NULL);
+ }
+
+ if (!group_list && !tab->distinct && order && simple_order)
+ {
+ DBUG_PRINT("info",("Sorting for order"));
+ THD_STAGE_INFO(thd, stage_sorting_for_order);
+ if (create_sort_index(thd, this, order,
+ HA_POS_ERROR, HA_POS_ERROR, true))
+ {
+ DBUG_RETURN(NULL);
+ }
+ order= NULL;
+ }
+ }
+ DBUG_RETURN(tab);
+}
+
+
+void
+JOIN::optimize_distinct()
+{
+ JOIN_TAB *last_join_tab= join_tab+tables-1;
+ do
+ {
+ if (select_lex->select_list_tables & last_join_tab->table->map)
+ break;
+ last_join_tab->not_used_in_distinct= true;
+ } while (last_join_tab-- != join_tab);
+
+ /* Optimize "select distinct b from t1 order by key_part_1 limit #" */
+ if (order && skip_sort_order)
+ {
+ /* Should always succeed */
+ if (test_if_skip_sort_order(&join_tab[const_tables],
+ order, unit->select_limit_cnt, false,
+ &join_tab[const_tables].table->
+ keys_in_use_for_order_by))
+ order= NULL;
+ }
+}
+
+
/**
Clean up and destroy join object.
@@ -16851,7 +16851,7 @@ void setup_tmp_table_column_bitmaps(TABL
@param param a description used as input to create the table
@param fields list of items that will be used to define
column types of the table (also see NOTES)
- @param group TODO document
+ @param group Group key to use for temporary table, NULL if none
@param distinct should table rows be distinct
@param save_sum_fields see NOTES
@param select_options
=== modified file 'sql/sql_select.h'
--- a/sql/sql_select.h 2011-10-10 10:36:58 +0000
+++ b/sql/sql_select.h 2011-10-12 11:08:37 +0000
@@ -2162,6 +2162,47 @@ public:
void drop_unused_derived_keys();
private:
/**
+ Execute current query. To be called from @c JOIN::exec.
+
+ If current query is a dependent subquery, this execution is performed on a
+ temporary copy of the original JOIN object in order to be able to restore
+ the original content for re-execution and EXPLAIN. (@note Subqueries may
+ be executed as part of EXPLAIN.) In such cases, execution data that may be
+ reused for later executions will be copied to the original
+ @c JOIN object (@c parent).
+
+ @param parent Original @c JOIN object when current object is a temporary
+ copy. @c NULL, otherwise
+ */
+ void execute(JOIN *parent);
+
+ /**
+ Create a temporary table to be used for processing DISTINCT/ORDER
+ BY/GROUP BY.
+
+ @note Will modify JOIN object wrt sort/group attributes
+
+ @param tmp_table_fields List of items that will be used to define
+ column types of the table.
+ @param tmp_table_group Group key to use for temporary table, NULL if none.
+ @param save_sum_fields If true, do not replace Item_sum items in
+ @c tmp_fields list with Item_field items referring
+ to fields in temporary table.
+
+ @returns Pointer to temporary table on success, NULL on failure
+ */
+ TABLE* create_intermediate_table(List<Item> *tmp_table_fields,
+ ORDER *tmp_table_group, bool save_sum_fields);
+
+ /**
+ Optimize distinct when used on a subset of the tables.
+
+ E.g.,: SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.b=t2.b
+ In this case we can stop scanning t2 when we have found one t1.a
+ */
+ void optimize_distinct();
+
+ /**
TRUE if the query contains an aggregate function but has no GROUP
BY clause.
*/
No bundle (reason: useless for push emails).
| Thread |
|---|
| • bzr push into mysql-trunk branch (oystein.grovlen:3450 to 3453) WL#5559 | Oystein Grovlen | 12 Oct |