From: Roy Lyseng Date: October 25 2011 10:28am Subject: bzr push into mysql-trunk branch (roy.lyseng:3468 to 3469) Bug#12407753 List-Archive: http://lists.mysql.com/commits/141575 X-Bug: 12407753 Message-Id: <20111025102837.B200B203@tyr67.norway.sun.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit 3469 Roy Lyseng 2011-10-25 Bug#12407753: Time to compare a row is missing in cost calculation of semi-join Stage 2: Refactor optimize_wo_join_buffering, best_access_path and others. Refactored how costs are re-calculated for the semi-join strategies, now the intermediate and final calculations are done the same way. Added new calculation functions to class Optimize_table_order. In addition, a few minor inconsistencies around semi-join optimization have been fixed. The following changes were developed by Guilhem Bichot and incorporated in this refactoring patch: Moving cur_sj_inner_tables and emb_sjm_nest to Optimize_table_order, where they naturally belong (emb_sjm_nest added by me). After choosing the final plan, cur_sj_inner_tables should be 0, as our plan normally covers all sj-inner tables. So we add an assert. However there was a bug: backout_nj_sj_state() actually never restored cur_sj_inner_tables, because it was assuming (wrongly) that remaining_tables didn't contain the table-to-remove, whereas it always does. mysql-test/r/opt_trace.bugs_ps_prot_all.result mysql-test/r/opt_trace.general_ps_prot_all.result mysql-test/r/opt_trace.bugs_no_prot_all.result Some changes necessitated by refactoring. Aligned some text with other terminology. mysql-test/r/opt_trace.general_no_prot_all.result For query 'select * from t3 where (a,a,b) in (select * from t1,t2,t4)', (where t3 has one row and is thus a "system" table), plan (t4,t1,t2)+firstmatch was not considered, which was wrong. Here's how it happened. - plan (t2) is tested: cur_sj_inner_tables is set to t1+t2+t4, then plan is pruned; cur_sj_inner_tables remains non-zero because of the bug in backout_nj_sj_state() - plan (t1) is tested: this if() below is not entered because cur_sj_inner_tables is non-zero: if (cur_sj_inner_tables == 0 && // (2) !(remaining_tables & outer_corr_tables) && // (3) (sj_inner_tables == // (4) ((remaining_tables | new_join_tab->table->map) & sj_inner_tables))) { /* Start tracking potential FirstMatch range */ And thus firstmatch won't be tested. sql/sql_const.h Added cost factors for creation of temporary tables. Currently, the old values are still used. Removed unused constant RAID_BLOCK_SIZE. sql/sql_select.cc optimize_semijoin_nests(): Restructured assignments to cost data. Added cost for creation of temporary table (currently zero). best_access_path() is now a member function of Optimize_table_order. Argument join is eliminated as it is passed through object. The "excluded_tables" variable that was sometimes OR'ed to remaining_tables is now a member variable in Optimize_table_order and need not be passed by callers to best_access_path(). choose_table_order(): Exclude unnecessary book-keeping when processing semi-join materialization nests. Call fix_semijoin_strategies() when needed (makes it possible to make fix_semijoin_strategies() a member function of Optimize_table_order). fix_semijoin_strategies() is now a member function of Optimize_table_order. Renamed from fix_semijoin_strategies_for_picked_join_order() because the full function name became extremely long. Rewritten to use the new access path calculation functions. semijoin_fm_ls_access_paths() is renamed from optimize_wo_join_buffering. It will now correctly calculate new access path information for FirstMatch and LooseScan semi-join strategies. and it is modified to be able to make final calculations as well as intermediate. It makes the same join buffering choice in both cases. Notice: Aligned with old strategy for this commit. semijoin_materialize_access_paths() calculates access paths for the MaterializeScan semi-join strategy. It can be used both in intermediate and final stages of table order estimation. semijoin_dupweedout_access_paths() calculates access paths for the DuplicateWeedout semi-join strategy. No corrections are needed for the final stage of table order estimation, so intermediate decisions are kept as final decisions. advance_sj_state() is simplified by using the new access path calculation functions. Some refactoring of variable names done (read_time -> cost, record_count -> rowcount). This cleanup was done since most of the code lines were touched anyway. Also introduced new local variables sj_strategy for simpler logic. After call to greedy_search(), assert that cur_sj_inner_tables is 0. In backout_nj_sj_state(), cur_sj_inner_tables was wrongly backed out: "if ((remaining_tables & emb_sj_nest->sj_inner_tables) == (emb_sj_nest->sj_inner_tables & ~tab->table->map))" was never entered: - if 'tab' is sj-inner, (remaining_tables & emb_sj_nest->sj_inner_tables) contains 'tab', but (emb_sj_nest->sj_inner_tables & ~tab->table->map) doesn't - if 'tab' is not sj-inner, emb_sj_nest is NULL. sql/sql_select.h Member variables cur_sj_inner_tables and emb_sjm_nest moved from class JOIN to class Optimize_table_order. modified: mysql-test/suite/opt_trace/r/bugs_no_prot_all.result mysql-test/suite/opt_trace/r/bugs_ps_prot_all.result mysql-test/suite/opt_trace/r/general_no_prot_all.result mysql-test/suite/opt_trace/r/general_ps_prot_all.result sql/sql_const.h sql/sql_select.cc sql/sql_select.h 3468 Roy Lyseng 2011-10-24 Bug#12664936: Same query executed as where subquery gives different results on in() compare #2 Step 2 of 2 - Bugfix The problem here is that ref access is set up wrongly for a table inside a materialized semi-join nest. The reference is made to a table outside of the semi-join nest, which obviously is not available when performing the materialization. However, best_access_path() has successfully found a ref access pattern based on a table inside the semi-join nest, the problem is only that create_ref_for_key() gets invalid used_tables data when setting up the ref access. Instead of considering all tables that have been traversed already in get_best_combination(), we need to ignore tables that are outside of the semi-join nest. mysql-test/include/subquery_sj.inc Added test for bug#12664936. mysql-test/r/subquery_sj_all.result mysql-test/r/subquery_sj_all_bka.result mysql-test/r/subquery_sj_all_bkaunique.result mysql-test/r/subquery_sj_dupsweed.result mysql-test/r/subquery_sj_dupsweed_bka.result mysql-test/r/subquery_sj_dupsweed_bka_nixbnl.result mysql-test/r/subquery_sj_dupsweed_bkaunique.result mysql-test/r/subquery_sj_firstmatch.result mysql-test/r/subquery_sj_firstmatch_bka.result mysql-test/r/subquery_sj_firstmatch_bka_nixbnl.result mysql-test/r/subquery_sj_firstmatch_bkaunique.result mysql-test/r/subquery_sj_loosescan.result mysql-test/r/subquery_sj_loosescan_bka.result mysql-test/r/subquery_sj_loosescan_bka_nixbnl.result mysql-test/r/subquery_sj_loosescan_bkaunique.result mysql-test/r/subquery_sj_mat.result mysql-test/r/subquery_sj_mat_bka.result mysql-test/r/subquery_sj_mat_bkaunique.result mysql-test/r/subquery_sj_mat_nosj.result mysql-test/r/subquery_sj_none.result mysql-test/r/subquery_sj_none_bka.result mysql-test/r/subquery_sj_none_bka_nixbnl.result mysql-test/r/subquery_sj_none_bkaunique.result Added test results for bug#12664936. mysql-test/r/subquery_sj_all_bka_nixbnl.result mysql-test/r/subquery_sj_mat_bka_nixbnl.result Plan is corrected to not use a reference to an outside column in a materialized semi-join nest. Added test results for bug#12664936. sql/sql_select.cc In JOIN::set_access_methods(), calculated "available tables" as the set of used tables found inside the materialized semi-join nest. Added some cross references to other places where tables from outside the semi-join nest must be excluded. modified: mysql-test/include/subquery_sj.inc mysql-test/r/subquery_sj_all.result mysql-test/r/subquery_sj_all_bka.result mysql-test/r/subquery_sj_all_bka_nixbnl.result mysql-test/r/subquery_sj_all_bkaunique.result mysql-test/r/subquery_sj_dupsweed.result mysql-test/r/subquery_sj_dupsweed_bka.result mysql-test/r/subquery_sj_dupsweed_bka_nixbnl.result mysql-test/r/subquery_sj_dupsweed_bkaunique.result mysql-test/r/subquery_sj_firstmatch.result mysql-test/r/subquery_sj_firstmatch_bka.result mysql-test/r/subquery_sj_firstmatch_bka_nixbnl.result mysql-test/r/subquery_sj_firstmatch_bkaunique.result mysql-test/r/subquery_sj_loosescan.result mysql-test/r/subquery_sj_loosescan_bka.result mysql-test/r/subquery_sj_loosescan_bka_nixbnl.result mysql-test/r/subquery_sj_loosescan_bkaunique.result mysql-test/r/subquery_sj_mat.result mysql-test/r/subquery_sj_mat_bka.result mysql-test/r/subquery_sj_mat_bka_nixbnl.result mysql-test/r/subquery_sj_mat_bkaunique.result mysql-test/r/subquery_sj_mat_nosj.result mysql-test/r/subquery_sj_none.result mysql-test/r/subquery_sj_none_bka.result mysql-test/r/subquery_sj_none_bka_nixbnl.result mysql-test/r/subquery_sj_none_bkaunique.result sql/sql_select.cc === modified file 'mysql-test/suite/opt_trace/r/bugs_no_prot_all.result' --- a/mysql-test/suite/opt_trace/r/bugs_no_prot_all.result 2011-10-19 06:26:28 +0000 +++ b/mysql-test/suite/opt_trace/r/bugs_no_prot_all.result 2011-10-25 10:28:05 +0000 @@ -573,17 +573,16 @@ FROM t1 "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 2.0049, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 2.3049, "rows": 1, "duplicate_tables_left": false, @@ -598,6 +597,13 @@ FROM t1 } ] /* semijoin_strategy_choice */, "chosen": true + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, @@ -1584,8 +1590,7 @@ ON table2 .col_int_key = table1 .col_int "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -1631,13 +1636,13 @@ ON table2 .col_int_key = table1 .col_int } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 11.569, "rows": 6, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 11.533, "rows": 6, "duplicate_tables_left": false, @@ -1707,8 +1712,7 @@ ON table2 .col_int_key = table1 .col_int "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -1754,13 +1758,13 @@ ON table2 .col_int_key = table1 .col_int } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 11.569, "rows": 6, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 11.533, "rows": 6, "duplicate_tables_left": false, @@ -1779,6 +1783,9 @@ ON table2 .col_int_key = table1 .col_int ] /* rest_of_plan */ } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "MaterializeLookup" } ] /* considered_execution_plans */ }, === modified file 'mysql-test/suite/opt_trace/r/bugs_ps_prot_all.result' --- a/mysql-test/suite/opt_trace/r/bugs_ps_prot_all.result 2011-10-19 06:26:28 +0000 +++ b/mysql-test/suite/opt_trace/r/bugs_ps_prot_all.result 2011-10-25 10:28:05 +0000 @@ -573,17 +573,16 @@ FROM t1 "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 2.0049, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 2.3049, "rows": 1, "duplicate_tables_left": false, @@ -598,6 +597,13 @@ FROM t1 } ] /* semijoin_strategy_choice */, "chosen": true + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, @@ -1584,8 +1590,7 @@ ON table2 .col_int_key = table1 .col_int "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -1631,13 +1636,13 @@ ON table2 .col_int_key = table1 .col_int } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 11.569, "rows": 6, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 11.533, "rows": 6, "duplicate_tables_left": false, @@ -1707,8 +1712,7 @@ ON table2 .col_int_key = table1 .col_int "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -1754,13 +1758,13 @@ ON table2 .col_int_key = table1 .col_int } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 11.569, "rows": 6, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 11.533, "rows": 6, "duplicate_tables_left": false, @@ -1779,6 +1783,9 @@ ON table2 .col_int_key = table1 .col_int ] /* rest_of_plan */ } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "MaterializeLookup" } ] /* considered_execution_plans */ }, === modified file 'mysql-test/suite/opt_trace/r/general_no_prot_all.result' --- a/mysql-test/suite/opt_trace/r/general_no_prot_all.result 2011-10-13 13:22:45 +0000 +++ b/mysql-test/suite/opt_trace/r/general_no_prot_all.result 2011-10-25 10:28:05 +0000 @@ -1595,17 +1595,16 @@ explain SELECT c FROM t5 where c+1 in (s "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 1, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 1.3, "rows": 1, "duplicate_tables_left": false, @@ -1620,6 +1619,13 @@ explain SELECT c FROM t5 where c+1 in (s } ] /* semijoin_strategy_choice */, "chosen": true + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, @@ -5210,11 +5216,50 @@ trace "rows_for_plan": 1, "semijoin_strategy_choice": [ { + "strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t1", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 1, + "cost": 2.0017, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + }, + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "using_join_cache": true, + "rows": 1, + "cost": 2.0018, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */, + "cost": 5.2036, + "rows": 1, + "chosen": true + }, + { "strategy": "DuplicatesWeedout", "cost": 5.3036, "rows": 1, - "duplicate_tables_left": true, - "chosen": true + "duplicate_tables_left": false, + "chosen": false } ] /* semijoin_strategy_choice */, "pruned_by_cost": true @@ -5242,6 +5287,9 @@ trace "pruned_by_heuristic": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "DuplicateWeedout" } ] /* considered_execution_plans */ } @@ -5602,11 +5650,50 @@ trace "rows_for_plan": 1, "semijoin_strategy_choice": [ { + "strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t1", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 1, + "cost": 2.0017, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + }, + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "using_join_cache": true, + "rows": 1, + "cost": 2.0018, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */, + "cost": 5.2036, + "rows": 1, + "chosen": true + }, + { "strategy": "DuplicatesWeedout", "cost": 5.3036, "rows": 1, - "duplicate_tables_left": true, - "chosen": true + "duplicate_tables_left": false, + "chosen": false } ] /* semijoin_strategy_choice */, "pruned_by_cost": true @@ -5634,6 +5721,9 @@ trace "pruned_by_heuristic": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "DuplicateWeedout" } ] /* considered_execution_plans */ } @@ -7636,8 +7726,7 @@ select * from t6 where d in (select f1() "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -7655,13 +7744,13 @@ select * from t6 where d in (select f1() } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 3.2213, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 4.0212, "rows": 1, "duplicate_tables_left": false, @@ -7678,27 +7767,31 @@ select * from t6 where d in (select f1() "chosen": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 3, + "cost": 2.0212, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, { - "reconsidering_access_paths_for_semijoin": { - "strategy": "FirstMatch", - "database": "test", - "table": "t2", - "best_access_path": { - "considered_access_paths": [ - { - "access_type": "scan", - "rows": 3, - "cost": 2.0212, - "chosen": true - } - ] /* considered_access_paths */ - } /* best_access_path */ - } /* reconsidering_access_paths_for_semijoin */ - }, - { "attaching_conditions_to_tables": { "original_condition": "((`test`.`t6`.`d` = `f1`()) and (`test`.`t2`.`s` = 'c'))", "attached_conditions_computation": [ @@ -8325,8 +8418,7 @@ select d into res from t6 where d in (se "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -8344,13 +8436,13 @@ select d into res from t6 where d in (se } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 3.2213, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 4.0212, "rows": 1, "duplicate_tables_left": false, @@ -8367,27 +8459,31 @@ select d into res from t6 where d in (se "chosen": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 3, + "cost": 2.0212, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, { - "reconsidering_access_paths_for_semijoin": { - "strategy": "FirstMatch", - "database": "test", - "table": "t2", - "best_access_path": { - "considered_access_paths": [ - { - "access_type": "scan", - "rows": 3, - "cost": 2.0212, - "chosen": true - } - ] /* considered_access_paths */ - } /* best_access_path */ - } /* reconsidering_access_paths_for_semijoin */ - }, - { "attaching_conditions_to_tables": { "original_condition": "((`test`.`t6`.`d` = `f1`()) and (`test`.`t2`.`s` = arg@0))", "attached_conditions_computation": [ @@ -9876,8 +9972,7 @@ select d into res from t6 where d in (se "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -9895,13 +9990,13 @@ select d into res from t6 where d in (se } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 3.2496, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 5.0496, "rows": 1, "duplicate_tables_left": false, @@ -9937,27 +10032,31 @@ select d into res from t6 where d in (se "semijoin_strategy_choice": [ ] /* semijoin_strategy_choice */, "pruned_by_cost": true + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 7, + "cost": 2.0496, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, { - "reconsidering_access_paths_for_semijoin": { - "strategy": "FirstMatch", - "database": "test", - "table": "t2", - "best_access_path": { - "considered_access_paths": [ - { - "access_type": "scan", - "rows": 7, - "cost": 2.0496, - "chosen": true - } - ] /* considered_access_paths */ - } /* best_access_path */ - } /* reconsidering_access_paths_for_semijoin */ - }, - { "attaching_conditions_to_tables": { "original_condition": "((`test`.`t6`.`d` = `f1`()) and (`test`.`t2`.`s` = arg@0))", "attached_conditions_computation": [ === modified file 'mysql-test/suite/opt_trace/r/general_ps_prot_all.result' --- a/mysql-test/suite/opt_trace/r/general_ps_prot_all.result 2011-10-13 13:22:45 +0000 +++ b/mysql-test/suite/opt_trace/r/general_ps_prot_all.result 2011-10-25 10:28:05 +0000 @@ -1575,17 +1575,16 @@ explain SELECT c FROM t5 where c+1 in (s "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 1, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 1.3, "rows": 1, "duplicate_tables_left": false, @@ -1600,6 +1599,13 @@ explain SELECT c FROM t5 where c+1 in (s } ] /* semijoin_strategy_choice */, "chosen": true + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, @@ -5190,11 +5196,50 @@ trace "rows_for_plan": 1, "semijoin_strategy_choice": [ { + "strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t1", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 1, + "cost": 2.0017, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + }, + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "using_join_cache": true, + "rows": 1, + "cost": 2.0018, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */, + "cost": 5.2036, + "rows": 1, + "chosen": true + }, + { "strategy": "DuplicatesWeedout", "cost": 5.3036, "rows": 1, - "duplicate_tables_left": true, - "chosen": true + "duplicate_tables_left": false, + "chosen": false } ] /* semijoin_strategy_choice */, "pruned_by_cost": true @@ -5222,6 +5267,9 @@ trace "pruned_by_heuristic": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "DuplicateWeedout" } ] /* considered_execution_plans */ } @@ -5582,11 +5630,50 @@ trace "rows_for_plan": 1, "semijoin_strategy_choice": [ { + "strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t1", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 1, + "cost": 2.0017, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + }, + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "using_join_cache": true, + "rows": 1, + "cost": 2.0018, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */, + "cost": 5.2036, + "rows": 1, + "chosen": true + }, + { "strategy": "DuplicatesWeedout", "cost": 5.3036, "rows": 1, - "duplicate_tables_left": true, - "chosen": true + "duplicate_tables_left": false, + "chosen": false } ] /* semijoin_strategy_choice */, "pruned_by_cost": true @@ -5614,6 +5701,9 @@ trace "pruned_by_heuristic": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "DuplicateWeedout" } ] /* considered_execution_plans */ } @@ -7608,8 +7698,7 @@ select * from t6 where d in (select f1() "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -7627,13 +7716,13 @@ select * from t6 where d in (select f1() } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 3.2213, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 4.0212, "rows": 1, "duplicate_tables_left": false, @@ -7650,27 +7739,31 @@ select * from t6 where d in (select f1() "chosen": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 3, + "cost": 2.0212, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, { - "reconsidering_access_paths_for_semijoin": { - "strategy": "FirstMatch", - "database": "test", - "table": "t2", - "best_access_path": { - "considered_access_paths": [ - { - "access_type": "scan", - "rows": 3, - "cost": 2.0212, - "chosen": true - } - ] /* considered_access_paths */ - } /* best_access_path */ - } /* reconsidering_access_paths_for_semijoin */ - }, - { "attaching_conditions_to_tables": { "original_condition": "((`test`.`t6`.`d` = `f1`()) and (`test`.`t2`.`s` = 'c'))", "attached_conditions_computation": [ @@ -8287,8 +8380,7 @@ select d into res from t6 where d in (se "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -8306,13 +8398,13 @@ select d into res from t6 where d in (se } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 3.2213, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 4.0212, "rows": 1, "duplicate_tables_left": false, @@ -8329,27 +8421,31 @@ select d into res from t6 where d in (se "chosen": true } ] /* rest_of_plan */ + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 3, + "cost": 2.0212, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, { - "reconsidering_access_paths_for_semijoin": { - "strategy": "FirstMatch", - "database": "test", - "table": "t2", - "best_access_path": { - "considered_access_paths": [ - { - "access_type": "scan", - "rows": 3, - "cost": 2.0212, - "chosen": true - } - ] /* considered_access_paths */ - } /* best_access_path */ - } /* reconsidering_access_paths_for_semijoin */ - }, - { "attaching_conditions_to_tables": { "original_condition": "((`test`.`t6`.`d` = `f1`()) and (`test`.`t2`.`s` = arg@0))", "attached_conditions_computation": [ @@ -9852,8 +9948,7 @@ select d into res from t6 where d in (se "semijoin_strategy_choice": [ { "strategy": "FirstMatch", - "recompute_best_access_paths": { - "cause": "join_buffering_not_possible", + "recalculate_access_paths_and_cost": { "tables": [ { "database": "test", @@ -9871,13 +9966,13 @@ select d into res from t6 where d in (se } /* best_access_path */ } ] /* tables */ - } /* recompute_best_access_paths */, + } /* recalculate_access_paths_and_cost */, "cost": 3.2496, "rows": 1, "chosen": true }, { - "strategy": "MaterializationLookup", + "strategy": "MaterializeLookup", "cost": 5.0496, "rows": 1, "duplicate_tables_left": false, @@ -9913,27 +10008,31 @@ select d into res from t6 where d in (se "semijoin_strategy_choice": [ ] /* semijoin_strategy_choice */, "pruned_by_cost": true + }, + { + "final_semijoin_strategy": "FirstMatch", + "recalculate_access_paths_and_cost": { + "tables": [ + { + "database": "test", + "table": "t2", + "best_access_path": { + "considered_access_paths": [ + { + "access_type": "scan", + "rows": 7, + "cost": 2.0496, + "chosen": true + } + ] /* considered_access_paths */ + } /* best_access_path */ + } + ] /* tables */ + } /* recalculate_access_paths_and_cost */ } ] /* considered_execution_plans */ }, { - "reconsidering_access_paths_for_semijoin": { - "strategy": "FirstMatch", - "database": "test", - "table": "t2", - "best_access_path": { - "considered_access_paths": [ - { - "access_type": "scan", - "rows": 7, - "cost": 2.0496, - "chosen": true - } - ] /* considered_access_paths */ - } /* best_access_path */ - } /* reconsidering_access_paths_for_semijoin */ - }, - { "attaching_conditions_to_tables": { "original_condition": "((`test`.`t6`.`d` = `f1`()) and (`test`.`t2`.`s` = arg@0))", "attached_conditions_computation": [ === modified file 'sql/sql_const.h' --- a/sql/sql_const.h 2011-06-30 15:50:45 +0000 +++ b/sql/sql_const.h 2011-10-25 10:28:05 +0000 @@ -190,11 +190,38 @@ #define MATCHING_ROWS_IN_OTHER_TABLE 10 /* - Subquery materialization-related constants + Constants related to the use of temporary tables in query execution. + Lookup and write operations are currently assumed to be equally costly + (concerns HEAP_TEMPTABLE_ROW_COST and DISK_TEMPTABLE_ROW_COST). */ -#define HEAP_TEMPTABLE_LOOKUP_COST 0.05 -#define DISK_TEMPTABLE_LOOKUP_COST 1.0 -#define RAID_BLOCK_SIZE 1024 + +#define HEAP_TEMPTABLE_CREATE_COST 0.0 +#define HEAP_TEMPTABLE_ROW_COST 0.05 +#define DISK_TEMPTABLE_CREATE_COST 0.0 +#define DISK_TEMPTABLE_ROW_COST 1.0 +#if defined(FUTURE) +/* + Creating a Heap temporary table is by benchmark found to be as costly as + writing 10 rows into the table. +*/ +#define HEAP_TEMPTABLE_CREATE_COST 2.0 +/* + Writing a row to or reading a row from a Heap temporary table is equivalent + to evaluating a row in the join engine. +*/ +#define HEAP_TEMPTABLE_ROW_COST 0.2 +/* + Creating a MyISAM table is 20 times slower than creating a Heap table. +*/ +#define DISK_TEMPTABLE_CREATE_COST 40.0 +/* + Generating MyIsam rows sequentially is 2 times slower than generating + Heap rows, when number of rows is greater than 1000. However, we do not have + benchmarks for very large tables, so setting this factor conservatively to + be 5 times slower (ie the cost is 1.0). +*/ +#define DISK_TEMPTABLE_ROW_COST 1.0 +#endif #define MY_CHARSET_BIN_MB_MAXLEN 1 === modified file 'sql/sql_select.cc' --- a/sql/sql_select.cc 2011-10-24 14:01:11 +0000 +++ b/sql/sql_select.cc 2011-10-25 10:28:05 +0000 @@ -77,10 +77,6 @@ static int sort_keyuse(Key_use *a, Key_u static void set_position(JOIN *join, uint index, JOIN_TAB *table, Key_use *key); static bool create_ref_for_key(JOIN *join, JOIN_TAB *j, Key_use *org_keyuse, table_map used_tables); -static void best_access_path(JOIN *join, JOIN_TAB *s, - table_map remaining_tables, uint idx, - bool disable_jbuf, double record_count, - POSITION *pos, POSITION *loose_scan_pos); C_MODE_START static int join_tab_cmp(const void *, const void* ptr1, const void* ptr2); static int join_tab_cmp_straight(const void *, const void* ptr1, const void* ptr2); @@ -289,17 +285,13 @@ public: join->tables - join->const_tables)), prune_level(thd->variables.optimizer_prune_level), thd(thd), join(join), - cur_embedding_map(0), + cur_embedding_map(0), cur_sj_inner_tables(0), emb_sjm_nest(sjm_nest), excluded_tables(sjm_nest ? join->all_table_map & ~sjm_nest->sj_inner_tables : 0) - { - this->join->emb_sjm_nest= sjm_nest; - } + {} ~Optimize_table_order() - { - join->emb_sjm_nest= NULL; - } + {} /** Entry point to table join order optimization. For further description, see class header and private function headers. @@ -320,6 +312,17 @@ private: */ nested_join_map cur_embedding_map; /** + Bitmap of inner tables of semi-join nests that have a proper subset of + their tables in the current join prefix. That is, of those semi-join + nests that have their tables both in and outside of the join prefix. + */ + table_map cur_sj_inner_tables; + /** + If non-NULL, we are optimizing a materialized semi-join nest. + If NULL, we are optimizing a complete join plan. + */ + TABLE_LIST *const emb_sjm_nest; + /** When calculating a plan for a materialized semi-join nest, best_access_plan() needs to know not only the remaining tables within the semi-join nest, but also all tables outside of this nest, because there may @@ -328,18 +331,14 @@ private: @c excluded_tables tracks these tables. */ const table_map excluded_tables; - /** - @todo: Add remaining Join optimization state members here, - ie emb_sjm_nest, positions, cur_sj_inner_tables. - This is not yet possible because (some of them) are accessed through - best_access_path(), which is currently not part of this class. - */ + void best_access_path(JOIN_TAB *s, table_map remaining_tables, uint idx, + bool disable_jbuf, double record_count, + POSITION *pos, POSITION *loose_scan_pos); bool check_interleaving_with_nj(JOIN_TAB *next_tab); void advance_sj_state(table_map remaining_tables, const JOIN_TAB *tab, uint idx, - double *current_record_count, - double *current_read_time, + double *current_rowcount, double *current_cost, POSITION *loose_scan_pos); void backout_nj_sj_state(const table_map remaining_tables, const JOIN_TAB *tab); @@ -358,10 +357,22 @@ private: uint current_search_depth); void consider_complete_plan(uint idx, double record_count, double read_time, Opt_trace_object *trace_obj); - void optimize_wo_join_buffering(uint first_tab, uint last_tab, - table_map last_remaining_tables, - bool first_alt, uint no_jbuf_before, - double *reopt_rec_count, double *reopt_cost); + bool fix_semijoin_strategies(); + bool semijoin_firstmatch_loosescan_access_paths( + uint first_tab, uint last_tab, table_map remaining_tables, + bool loosescan, bool final, + double *newcount, double *newcost); + void semijoin_mat_scan_access_paths( + uint last_inner_tab, uint last_outer_tab, + table_map remaining_tables, TABLE_LIST *sjm_nest, bool final, + double *newcount, double *newcost); + void semijoin_mat_lookup_access_paths( + uint last_inner, TABLE_LIST *sjm_nest, + double *newcount, double *newcost); + void semijoin_dupsweedout_access_paths( + uint first_tab, uint last_tab, + table_map remaining_tables, + double *newcount, double *newcost); static uint determine_search_depth(uint search_depth, uint table_count); /** @@ -369,8 +380,6 @@ private: join_tab_cmp_embedded_first() as static private member functions here. This is currently not possible because they are fed to my_qsort2(), which requires C linkage. - @todo: Add best_access_path() as private member function within this class. - Currently not possible because it is called from outside this class. */ }; @@ -1686,7 +1695,7 @@ bool might_do_join_buffering(uint join_b It may be possible to consolidate the materialization strategies into one. The choice between the strategies is made by the join optimizer (see - advance_sj_state() and fix_semijoin_strategies_for_picked_join_order()). + advance_sj_state() and fix_semijoin_strategies()). This function sets up all fields/structures/etc needed for execution except for setup/initialization of semi-join materialization which is done in setup_sj_materialization() (todo: can't we move that to here also?) @@ -5744,13 +5753,13 @@ static bool optimize_semijoin_nests(JOIN save it. */ const uint n_tables= my_count_bits(sj_nest->sj_inner_tables); - double subjoin_out_rows, subjoin_read_time; - get_partial_join_cost(join, n_tables, - &subjoin_read_time, &subjoin_out_rows); - sj_nest->nested_join->sjm.materialization_cost - .convert_from_cost(subjoin_read_time); - sj_nest->nested_join->sjm.expected_rowcount= subjoin_out_rows; + double sjm_cost; // Estimated cost of semi-join materialization + double sjm_rowcount; // Estimated row count when executed as join + double distinct_rowcount; // Estimated rowcount after duplicate removal + + get_partial_join_cost(join, n_tables, + &sjm_cost, &sjm_rowcount); List &inner_expr_list= sj_nest->nested_join->sj_inner_exprs; /* @@ -5785,44 +5794,54 @@ static bool optimize_semijoin_nests(JOIN double rows= 1.0; while ((tableno = tm_it.next_bit()) != Table_map_iterator::BITMAP_END) rows *= join->map2table[tableno]->table->quick_condition_rows; - sj_nest->nested_join->sjm.expected_rowcount= - min(sj_nest->nested_join->sjm.expected_rowcount, rows); + distinct_rowcount= min(sjm_rowcount, rows); } - if (!(sj_nest->nested_join->sjm.positions= - (st_position*)join->thd->alloc(sizeof(st_position)*n_tables))) - DBUG_RETURN(true); - - memcpy(sj_nest->nested_join->sjm.positions, - join->best_positions + join->const_tables, - sizeof(st_position) * n_tables); - /* Calculate temporary table parameters and usage costs */ uint rowlen= get_tmp_table_rec_length(inner_expr_list); - double lookup_cost; - if (rowlen * subjoin_out_rows< join->thd->variables.max_heap_table_size) - lookup_cost= HEAP_TEMPTABLE_LOOKUP_COST; - else - lookup_cost= DISK_TEMPTABLE_LOOKUP_COST; + double row_cost; // The cost to write or lookup a row in temp. table + double create_cost; // The cost to create a temporary table + // @todo: Size of temp table should be distinct_rowcount + if (rowlen * sjm_rowcount < join->thd->variables.max_heap_table_size) + { + row_cost= HEAP_TEMPTABLE_ROW_COST; + create_cost= HEAP_TEMPTABLE_CREATE_COST; + } + else + { + row_cost= DISK_TEMPTABLE_ROW_COST; + create_cost= DISK_TEMPTABLE_CREATE_COST; + } /* - Let materialization cost include the cost to write the data into the - temporary table: + Let materialization cost include the cost to create the temporary + table and write the rows into it: */ - sj_nest->nested_join->sjm.materialization_cost - .add_io(subjoin_out_rows, lookup_cost); - + sjm_cost+= create_cost + (sjm_rowcount * row_cost); /* Set the cost to do a full scan of the temptable (will need this to consider doing sjm-scan): */ - sj_nest->nested_join->sjm.scan_cost.zero(); - if (sj_nest->nested_join->sjm.expected_rowcount > 0.0) - sj_nest->nested_join->sjm.scan_cost - .add_io(sj_nest->nested_join->sjm.expected_rowcount, lookup_cost); + double scan_cost= distinct_rowcount * row_cost; - sj_nest->nested_join->sjm.lookup_cost.convert_from_cost(lookup_cost); + /* + Create the semi-join materialization object and populate it + with the selected plan and cost data: + */ + if (!(sj_nest->nested_join->sjm.positions= + (st_position*)join->thd->alloc(sizeof(st_position)*n_tables))) + DBUG_RETURN(true); + + memcpy(sj_nest->nested_join->sjm.positions, + join->best_positions + join->const_tables, + sizeof(st_position) * n_tables); + + sj_nest->nested_join->sjm.expected_rowcount= distinct_rowcount; + sj_nest->nested_join->sjm.materialization_cost + .convert_from_cost(sjm_cost); + sj_nest->nested_join->sjm.scan_cost.convert_from_cost(scan_cost); + sj_nest->nested_join->sjm.lookup_cost.convert_from_cost(row_cost); } } } @@ -7340,7 +7359,8 @@ public: #endif } - void init(JOIN *join, JOIN_TAB *s, table_map remaining_tables) + void init(JOIN_TAB *s, table_map remaining_tables, + table_map cur_sj_inner_tables, bool complete_query) { /* Discover the bound equalities. We need to do this if @@ -7355,15 +7375,15 @@ public: FirstMatch strategy) */ best_loose_scan_cost= DBL_MAX; - if (s->emb_sj_nest && !join->emb_sjm_nest && // (1) + if (s->emb_sj_nest && complete_query && // (1) s->emb_sj_nest->nested_join->sj_inner_exprs.elements < 64 && ((remaining_tables & s->emb_sj_nest->sj_inner_tables) == // (2) s->emb_sj_nest->sj_inner_tables) && // (2) - join->cur_sj_inner_tables == 0 && // (3) + cur_sj_inner_tables == 0 && // (3) !(remaining_tables & s->emb_sj_nest->nested_join->sj_corr_tables) && // (4) (remaining_tables & s->emb_sj_nest->nested_join->sj_depends_on) &&// (5) - join->thd->optimizer_switch_flag(OPTIMIZER_SWITCH_LOOSE_SCAN)) + s->join->thd->optimizer_switch_flag(OPTIMIZER_SWITCH_LOOSE_SCAN)) { /* This table is an LooseScan scan candidate */ bound_sj_equalities= get_bound_sj_equalities(s->emb_sj_nest, @@ -7544,25 +7564,9 @@ public: 'join->positions' of length 'idx'. The chosen access method for 's' and its cost are stored in 'join->positions[idx]'. - Notice also that when calculating a plan for a materialized semijoin nest, - this function must not consider key references between tables inside the - semijoin nest and those outside of it. The way to control this is to add - the set of outside tables to the @c remaining_tables argument. - - @todo: Add this function to class Optimize_table_order. When this is done, - best_access_path() will have access to the excluded_tables member, and - there will no longer be a need to pass that set of tables as part of - remaining_tables. Notice also that excluded_tables can be non-empty only - at those places where it is actually passed as argument. - - @param join pointer to the structure providing all context info - for the query @param s the table to be joined by the function @param thd thread for the connection that submitted the query @param remaining_tables set of tables not included in the partial plan yet. - Additionally, when calculating a plan for a - materialized semijoin nest, this must also contain - all tables outside of the semijoin nest. @param idx the length of the partial plan @param disable_jbuf TRUE<=> Don't use join buffering @param record_count estimate for the number of records returned by the @@ -7572,8 +7576,7 @@ public: DBL_MAX if not possible. */ -static void -best_access_path(JOIN *join, +void Optimize_table_order::best_access_path( JOIN_TAB *s, table_map remaining_tables, uint idx, @@ -7582,7 +7585,6 @@ best_access_path(JOIN *join, POSITION *pos, POSITION *loose_scan_pos) { - THD *const thd= join->thd; Key_use *best_key= NULL; uint best_max_key_part= 0; bool found_constraint= false; @@ -7612,12 +7614,13 @@ best_access_path(JOIN *join, !thd->optimizer_switch_flag(OPTIMIZER_SWITCH_BNL); // 2 Loose_scan_opt loose_scan_opt; - DBUG_ENTER("best_access_path"); + DBUG_ENTER("Optimize_table_order::best_access_path"); Opt_trace_object trace_wrapper(trace, "best_access_path"); Opt_trace_array trace_paths(trace, "considered_access_paths"); - loose_scan_opt.init(join, s, remaining_tables); + loose_scan_opt.init(s, remaining_tables, cur_sj_inner_tables, + emb_sjm_nest == NULL); /* This isn't unlikely at all, but unlikely() cuts 6% CPU time on a 20-table @@ -7669,10 +7672,15 @@ best_access_path(JOIN *join, do /* For each way to access the keypart */ { /* + When calculating a plan for a materialized semijoin nest, + we must not consider key references between tables inside the + semijoin nest and those outside of it. This is handled by adding + excluded_tables to remaining_tables below. + if 1. expression doesn't refer to forward tables 2. we won't get two ref-or-null's */ - if (!(remaining_tables & keyuse->used_tables) && + if (!((remaining_tables | excluded_tables) & keyuse->used_tables) && !(ref_or_null_part && (keyuse->optimize & KEY_OPTIMIZE_REF_OR_NULL))) { @@ -8246,13 +8254,13 @@ bool Optimize_table_order::choose_table_ const bool straight_join= test(join->select_options & SELECT_STRAIGHT_JOIN); table_map join_tables; ///< The tables involved in order selection - if (join->emb_sjm_nest) + if (emb_sjm_nest) { /* We're optimizing semi-join materialization nest, so put the tables from this semi-join as first */ jtab_sort_func= join_tab_cmp_embedded_first; - join_tables= join->emb_sjm_nest->sj_inner_tables; + join_tables= emb_sjm_nest->sj_inner_tables; } else { @@ -8269,8 +8277,7 @@ bool Optimize_table_order::choose_table_ } my_qsort2(join->best_ref + join->const_tables, join->tables - join->const_tables, sizeof(JOIN_TAB*), - jtab_sort_func, (void*)join->emb_sjm_nest); - join->cur_sj_inner_tables= 0; + jtab_sort_func, (void*)emb_sjm_nest); Opt_trace_object wrapper(&join->thd->opt_trace); Opt_trace_array @@ -8284,6 +8291,15 @@ bool Optimize_table_order::choose_table_ DBUG_RETURN(true); } + DBUG_ASSERT(cur_sj_inner_tables == 0); // Terminate without any semijoin nests + + // Remaining part of this function not needed when processing semi-join nests. + if (emb_sjm_nest) + DBUG_RETURN(false); + + // Fix semi-join strategies and perform final cost calculation. + if (fix_semijoin_strategies()) + DBUG_RETURN(true); /* Store the cost of this query into a user variable Don't update last_query_cost for statements that are not "flat joins" : @@ -8545,8 +8561,7 @@ void Optimize_table_order::optimize_stra DBUG_ASSERT(!check_interleaving_with_nj(s)); /* Find the best access method from 's' to the current partial plan */ POSITION loose_scan_pos; - best_access_path(join, s, excluded_tables | join_tables, - idx, FALSE, record_count, + best_access_path(s, join_tables, idx, false, record_count, join->positions + idx, &loose_scan_pos); /* compute the cost of the new plan extended with 's' */ @@ -9080,8 +9095,7 @@ bool Optimize_table_order::best_extensio 1) there are no semijoin nests or 2) we are optimizing a materialized semijoin nest. */ - const bool has_sj= - !(join->select_lex->sj_nests.is_empty() || join->emb_sjm_nest); + const bool has_sj= !(join->select_lex->sj_nests.is_empty() || emb_sjm_nest); /* 'eq_ref_extended' are the 'remaining_tables' which has already been @@ -9120,8 +9134,7 @@ bool Optimize_table_order::best_extensio /* Find the best access method from 's' to the current partial plan */ POSITION loose_scan_pos; - best_access_path(join, s, excluded_tables | remaining_tables, - idx, false, record_count, + best_access_path(s, remaining_tables, idx, false, record_count, position, &loose_scan_pos); /* Compute the cost of extending the plan with 's' */ @@ -9403,8 +9416,7 @@ table_map Optimize_table_order::eq_ref_e if (remaining_tables == 0) DBUG_RETURN(0); - const bool has_sj= - !(join->select_lex->sj_nests.is_empty() || join->emb_sjm_nest); + const bool has_sj= !(join->select_lex->sj_nests.is_empty() || emb_sjm_nest); /* The section below adds 'eq_ref' joinable tables to the QEP in the order @@ -9449,8 +9461,7 @@ table_map Optimize_table_order::eq_ref_e POSITION loose_scan_pos; /* Find the best access method from 's' to the current partial plan */ - best_access_path(join, s, excluded_tables | remaining_tables, - idx, FALSE, record_count, + best_access_path(s, remaining_tables, idx, false, record_count, position, &loose_scan_pos); /* @@ -9719,8 +9730,6 @@ prev_record_reads(JOIN *join, uint idx, /** @brief Fix semi-join strategies for the picked join order - @param join Pointer to JOIN object with picked join order - @return FALSE if success, TRUE if error @details @@ -9758,23 +9767,25 @@ prev_record_reads(JOIN *join, uint idx, join and semi-join order from left to right. */ -static bool fix_semijoin_strategies_for_picked_join_order(JOIN *join) +bool Optimize_table_order::fix_semijoin_strategies() { table_map remaining_tables= 0; table_map handled_tabs= 0; - Opt_trace_context * const trace= &join->thd->opt_trace; - DBUG_ENTER("fix_semijoin_strategies_for_picked_join_order"); + + DBUG_ENTER("Optimize_table_order::fix_semijoin_strategies"); if (join->select_lex->sj_nests.is_empty()) - DBUG_RETURN(FALSE); + DBUG_RETURN(false); + + Opt_trace_context *const trace= &thd->opt_trace; for (uint tableno= join->tables - 1; tableno != join->const_tables - 1; tableno--) { - POSITION *pos= join->best_positions + tableno; - JOIN_TAB *s= pos->table; - TABLE_LIST *emb_sj_nest= s->emb_sj_nest; + POSITION *const pos= join->best_positions + tableno; + JOIN_TAB *const s= pos->table; + TABLE_LIST *const emb_sj_nest= s->emb_sj_nest; uint first; LINT_INIT(first); // Set by every branch except SJ_OPT_NONE which doesn't use it @@ -9789,8 +9800,8 @@ static bool fix_semijoin_strategies_for_ const uint table_count= my_count_bits(emb_sj_nest->sj_inner_tables); Semijoin_mat_exec* sjm_exec; if (!(sjm_exec= new (join->thd->mem_root) - Semijoin_mat_exec(table_count, FALSE))) - DBUG_RETURN(TRUE); + Semijoin_mat_exec(table_count, false))) + DBUG_RETURN(true); emb_sj_nest->sj_mat_exec= sjm_exec; /* This memcpy() copies a partial QEP produced by @@ -9813,134 +9824,72 @@ static bool fix_semijoin_strategies_for_ join->best_positions[first].n_sj_tables= table_count; join->best_positions[first].sj_strategy= SJ_OPT_MATERIALIZE_LOOKUP; + Opt_trace_object trace_final_strategy(trace); + trace_final_strategy.add_alnum("final_semijoin_strategy", + "MaterializeLookup"); DBUG_EXECUTE("opt", print_sjm(emb_sj_nest);); } else if (pos->sj_strategy == SJ_OPT_MATERIALIZE_SCAN) { - POSITION *first_inner= join->best_positions + pos->sjm_scan_last_inner; - TABLE_LIST *mat_sj_nest= first_inner->table->emb_sj_nest; - const uint table_count= my_count_bits(mat_sj_nest->sj_inner_tables); + const uint last_inner= pos->sjm_scan_last_inner; + TABLE_LIST *const sjm_nest= + (join->best_positions + last_inner)->table->emb_sj_nest; + const uint table_count= my_count_bits(sjm_nest->sj_inner_tables); + first= last_inner - table_count + 1; Semijoin_mat_exec* sjm_exec; if (!(sjm_exec= new (join->thd->mem_root) - Semijoin_mat_exec(table_count, TRUE))) - DBUG_RETURN(TRUE); - mat_sj_nest->sj_mat_exec= sjm_exec; - first= pos->sjm_scan_last_inner - table_count + 1; + Semijoin_mat_exec(table_count, true))) + DBUG_RETURN(true); + sjm_nest->sj_mat_exec= sjm_exec; memcpy(join->best_positions + first, // stale semijoin strategy here too - mat_sj_nest->nested_join->sjm.positions, + sjm_nest->nested_join->sjm.positions, sizeof(POSITION) * table_count); join->best_positions[first].sj_strategy= SJ_OPT_MATERIALIZE_SCAN; join->best_positions[first].n_sj_tables= table_count; - /* - Do what advance_sj_state did: re-run best_access_path for every table - in the [last_inner_table + 1; pos..) range - */ - double prefix_rec_count; - /* Get the prefix record count */ - if (first == join->const_tables) - prefix_rec_count= 1.0; - else - prefix_rec_count= join->best_positions[first-1].prefix_record_count; - - /* Add materialization record count*/ - prefix_rec_count *= mat_sj_nest->nested_join->sjm.expected_rowcount; - - table_map rem_tables= remaining_tables; - for (uint i= tableno; i != (first + table_count - 1); i--) - rem_tables |= join->best_positions[i].table->table->map; - - POSITION dummy; - join->cur_sj_inner_tables= 0; - for (uint i= first + table_count; i <= tableno; i++) - { - Opt_trace_object oto0(trace); - Opt_trace_object oto1(trace, - "reconsidering_access_paths_for_semijoin"); - oto1.add_alnum("strategy", "MaterializationScan"). - add_utf8_table(join->best_positions[i].table->table); - best_access_path(join, join->best_positions[i].table, rem_tables, i, FALSE, - prefix_rec_count, join->best_positions + i, &dummy); - prefix_rec_count *= join->best_positions[i].records_read; - rem_tables &= ~join->best_positions[i].table->table->map; - } - DBUG_EXECUTE("opt", print_sjm(mat_sj_nest);); + Opt_trace_object trace_final_strategy(trace); + trace_final_strategy.add_alnum("final_semijoin_strategy", + "MaterializeScan"); + // Recalculate final access paths for this semi-join strategy + double rowcount, cost; + semijoin_mat_scan_access_paths(last_inner, tableno, + remaining_tables, sjm_nest, true, + &rowcount, &cost); + + DBUG_EXECUTE("opt", print_sjm(sjm_nest);); } else if (pos->sj_strategy == SJ_OPT_FIRST_MATCH) { first= pos->first_firstmatch_table; join->best_positions[first].sj_strategy= SJ_OPT_FIRST_MATCH; join->best_positions[first].n_sj_tables= tableno - first + 1; - POSITION dummy; // For loose scan paths - double record_count= (first== join->const_tables)? 1.0: - join->best_positions[tableno - 1].prefix_record_count; - - table_map rem_tables= remaining_tables; - for (uint idx= first; idx <= tableno; idx++) - { - rem_tables |= join->best_positions[idx].table->table->map; - } - /* - Re-run best_access_path to produce best access methods that do not use - join buffering - */ - join->cur_sj_inner_tables= 0; - for (uint idx= first; idx <= tableno; idx++) - { - if (join->best_positions[idx].use_join_buffer) - { - Opt_trace_object oto0(trace); - Opt_trace_object oto1(trace, - "reconsidering_access_paths_for_semijoin"); - oto1.add_alnum("strategy", "FirstMatch"). - add_utf8_table(join->best_positions[idx].table->table); - best_access_path(join, join->best_positions[idx].table, - rem_tables, idx, TRUE /* no jbuf */, - record_count, join->best_positions + idx, &dummy); - } - record_count *= join->best_positions[idx].records_read; - rem_tables &= ~join->best_positions[idx].table->table->map; - } + Opt_trace_object trace_final_strategy(trace); + trace_final_strategy.add_alnum("final_semijoin_strategy", "FirstMatch"); + + // Recalculate final access paths for this semi-join strategy + double rowcount, cost; + (void)semijoin_firstmatch_loosescan_access_paths(first, tableno, + remaining_tables, false, true, + &rowcount, &cost); } else if (pos->sj_strategy == SJ_OPT_LOOSE_SCAN) { first= pos->first_loosescan_table; - POSITION *first_pos= join->best_positions + first; - POSITION loose_scan_pos; // For loose scan paths - double record_count= (first== join->const_tables)? 1.0: - join->best_positions[tableno - 1].prefix_record_count; - - table_map rem_tables= remaining_tables; - for (uint idx= first; idx <= tableno; idx++) - rem_tables |= join->best_positions[idx].table->table->map; - /* - Re-run best_access_path to produce best access methods that do not use - join buffering - */ - join->cur_sj_inner_tables= 0; - for (uint idx= first; idx <= tableno; idx++) - { - if (join->best_positions[idx].use_join_buffer || (idx == first)) - { - Opt_trace_object oto0(trace); - Opt_trace_object oto1(trace, - "reconsidering_access_paths_for_semijoin"); - oto1.add_alnum("strategy", "LooseScan"). - add_utf8_table(join->best_positions[idx].table->table); - best_access_path(join, join->best_positions[idx].table, - rem_tables, idx, TRUE /* no jbuf */, - record_count, join->best_positions + idx, - &loose_scan_pos); - if (idx==first) - join->best_positions[idx]= loose_scan_pos; - } - rem_tables &= ~join->best_positions[idx].table->table->map; - record_count *= join->best_positions[idx].records_read; - } + Opt_trace_object trace_final_strategy(trace); + trace_final_strategy.add_alnum("final_semijoin_strategy", "LooseScan"); + + // Recalculate final access paths for this semi-join strategy + double rowcount, cost; + (void)semijoin_firstmatch_loosescan_access_paths(first, tableno, + remaining_tables, true, true, + &rowcount, &cost); + + POSITION *const first_pos= join->best_positions + first; first_pos->sj_strategy= SJ_OPT_LOOSE_SCAN; - first_pos->n_sj_tables= my_count_bits(first_pos->table->emb_sj_nest->sj_inner_tables); + first_pos->n_sj_tables= + my_count_bits(first_pos->table->emb_sj_nest->sj_inner_tables); } else if (pos->sj_strategy == SJ_OPT_DUPS_WEEDOUT) { @@ -9951,6 +9900,10 @@ static bool fix_semijoin_strategies_for_ first= pos->first_dupsweedout_table; join->best_positions[first].sj_strategy= SJ_OPT_DUPS_WEEDOUT; join->best_positions[first].n_sj_tables= tableno - first + 1; + + Opt_trace_object trace_final_strategy(trace); + trace_final_strategy.add_alnum("final_semijoin_strategy", + "DuplicateWeedout"); } uint i_end= first + join->best_positions[first].n_sj_tables; @@ -9991,13 +9944,11 @@ static bool fix_semijoin_strategies_for_ - create data structures describing ref access methods. */ + bool JOIN::get_best_combination() { DBUG_ENTER("JOIN::get_best_combination"); - if (fix_semijoin_strategies_for_picked_join_order(this)) - DBUG_RETURN(true); - if (!(join_tab= new (thd->mem_root) JOIN_TAB[tables])) DBUG_RETURN(true); @@ -11978,7 +11929,7 @@ static bool setup_join_buffering(JOIN_TA COST_VECT cost; ha_rows rows; uint bufsz= 4096; - JOIN_CACHE *prev_cache=0; + JOIN_CACHE *prev_cache= NULL; const bool bnl_on= join->thd->optimizer_switch_flag(OPTIMIZER_SWITCH_BNL); const bool bka_on= join->thd->optimizer_switch_flag(OPTIMIZER_SWITCH_BKA); const uint tableno= tab - join->join_tab; @@ -12022,6 +11973,7 @@ static bool setup_join_buffering(JOIN_TA if (tab->first_upper != NULL && !tab->first_upper->use_join_cache) goto no_join_cache; + /* Use join cache with FirstMatch semi-join strategy only when semi-join contains only one table. @@ -15329,103 +15281,372 @@ bool Optimize_table_order::check_interle /** - Change access methods not to use join buffering and adjust costs accordingly + Find best access paths for semi-join FirstMatch or LooseScan strategy + and calculate rowcount and cost based on these. - @param first_tab The first tab to do re-optimization for - @param last_tab The last tab to do re-optimization for - @param last_remaining_tables Bitmap of tables that are not in the - [0...last_tab] join prefix - @param first_alt TRUE <=> Use the LooseScan plan for the first_tab - @param no_jbuf_before Don not allow join buffering before this table - @param[out] reopt_rec_count New output record count - DBL_MAX if we could find no plan. - @param[out] reopt_cost New join prefix cost - DBL_MAX if we could find no plan. + @param first_tab The first tab to calculate access paths for, + this is always a semi-join inner table. + @param last_tab The last tab to calculate access paths for, + this is always a semi-join inner table. + @param remaining_tables Bitmap of tables that are not in the + [0...last_tab] join prefix + @param loosescan If true, use LooseScan strategy, otherwise FirstMatch + @param final If true, use and update access path data in + join->best_positions, otherwise use join->positions + and update a local buffer. + @param[out] rowcount New output row count + @param[out] newcost New join prefix cost - @details - Given a join prefix [0; ... first_tab], change the access to the tables - in the [first_tab; last_tab] not to use join buffering. This is needed - because some semi-join strategies cannot be used together with the join - buffering. - In general case the best table order in [first_tab; last_tab] range with - join buffering is different from the best order without join buffering but - we don't try finding a better join order. (TODO ask Igor why did we - chose not to do this in the end. that's actually the difference from the - forking approach) -*/ + @return True if strategy selection successful, false otherwise. -void Optimize_table_order::optimize_wo_join_buffering( - uint first_tab, uint last_tab, - table_map last_remaining_tables, - bool first_alt, uint no_jbuf_before, - double *reopt_rec_count, double *reopt_cost) + @details + Calculate best access paths for the tables of a semi-join FirstMatch or + LooseScan strategy, given the order of tables provided in join->positions + (or join->best_positions when calculating the cost of a final plan). + Calculate estimated cost and rowcount for this plan. + Given a join prefix [0; ... first_tab-1], change the access to the tables + in the range [first_tab; last_tab] according to the constraints set by the + relevant semi-join strategy. Those constraints are: + + - For the LooseScan strategy, join buffering cannot be used for the inner + tables, only for the outer tables. + + - For the FirstMatch strategy, join buffering can be used if there is a + single inner table in the semi-join nest. + + For FirstMatch, the handled range of tables may be a mix of inner tables + and non-dependent outer tables. The first and last table in the handled + range are always inner tables. +*/ + +bool Optimize_table_order::semijoin_firstmatch_loosescan_access_paths( + uint first_tab, uint last_tab, table_map remaining_tables, + bool loosescan, bool final, + double *newcount, double *newcost) { + DBUG_ENTER( + "Optimize_table_order::semijoin_firstmatch_loosescan_access_paths"); double cost, outer_fanout, inner_fanout= 1.0; - table_map reopt_remaining_tables= last_remaining_tables; - Opt_trace_context * const trace= &join->thd->opt_trace; - DBUG_ENTER("Optimize_table_order::optimize_wo_join_buffering"); - - Opt_trace_object trace_recompute(trace, "recompute_best_access_paths"); - trace_recompute.add_alnum("cause", "join_buffering_not_possible"); + Opt_trace_context *const trace= &thd->opt_trace; + Opt_trace_object recalculate(trace, "recalculate_access_paths_and_cost"); Opt_trace_array trace_tables(trace, "tables"); + POSITION *const positions= final ? join->best_positions : join->positions; + if (first_tab > join->const_tables) { - cost= join->positions[first_tab - 1].prefix_cost.total_cost(); - outer_fanout= join->positions[first_tab - 1].prefix_record_count; + cost= positions[first_tab - 1].prefix_cost.total_cost(); + outer_fanout= positions[first_tab - 1].prefix_record_count; } else { - cost= 0.0; + cost= 0.0; outer_fanout= 1.0; } + /* + LooseScan: May use join buffering for all tables after last inner table. + FirstMatch: May use join buffering if there is only one inner table. + */ + const uint table_count= + my_count_bits(positions[first_tab].table->emb_sj_nest->sj_inner_tables); + uint no_jbuf_before; + if (loosescan) + { + for (no_jbuf_before= last_tab; no_jbuf_before > first_tab; no_jbuf_before--) + { + if (positions[no_jbuf_before].table->emb_sj_nest != NULL) + break; // Encountered the last inner table? + } + no_jbuf_before++; + } + else + no_jbuf_before= (table_count > 1) ? last_tab + 1 : first_tab; + + // @todo: Remove this bug preservation + if (!loosescan && !final) + no_jbuf_before= last_tab; + else if (!loosescan && final) + no_jbuf_before= last_tab + 1; + else if (loosescan && !final) + no_jbuf_before= first_tab + table_count; + else if (loosescan && final) + no_jbuf_before= last_tab + 1; for (uint i= first_tab; i <= last_tab; i++) - reopt_remaining_tables |= join->positions[i].table->table->map; + remaining_tables|= positions[i].table->table->map; for (uint i= first_tab; i <= last_tab; i++) { - JOIN_TAB *rs= join->positions[i].table; - POSITION pos, loose_scan_pos; - - if ((i == first_tab && first_alt) || join->positions[i].use_join_buffer) + JOIN_TAB *const tab= positions[i].table; + POSITION regular_pos, loose_scan_pos; + POSITION *const dst_pos= final ? positions + i : ®ular_pos; + POSITION *pos; // Position for later calculations + /* + We always need a new calculation for the first inner table in + the LooseScan strategy. Notice the use of loose_scan_pos. + */ + if ((i == first_tab && loosescan) || positions[i].use_join_buffer) { - /* Find the best access method that would not use join buffering */ Opt_trace_object trace_one_table(trace); - trace_one_table.add_utf8_table(rs->table); - best_access_path(join, rs, reopt_remaining_tables, i, + trace_one_table.add_utf8_table(tab->table); + + // Find the best access method with specified join buffering strategy. + best_access_path(tab, remaining_tables, i, i < no_jbuf_before, inner_fanout * outer_fanout, - &pos, &loose_scan_pos); + dst_pos, &loose_scan_pos); + if (i == first_tab && loosescan) // Use loose scan position + *dst_pos= loose_scan_pos; + pos= dst_pos; } else - pos= join->positions[i]; - - if (i == first_tab && first_alt) - pos= loose_scan_pos; + pos= positions + i; // Use result from prior calculation /* Terminate search if best_access_path found no possible plan. Otherwise we will be getting infinite cost when summing up below. */ - if (pos.read_time == DBL_MAX) + if (pos->read_time == DBL_MAX) { - *reopt_rec_count= DBL_MAX; - *reopt_cost= DBL_MAX; - DBUG_VOID_RETURN; + DBUG_ASSERT(loosescan && !final); + DBUG_RETURN(false); } - reopt_remaining_tables &= ~rs->table->map; - cost += pos.read_time; + remaining_tables&= ~tab->table->map; - if (rs->emb_sj_nest) - inner_fanout *= pos.records_read; + if (tab->emb_sj_nest) + inner_fanout*= pos->records_read; else - outer_fanout *= pos.records_read; + outer_fanout*= pos->records_read; + + cost+= pos->read_time; // @todo: Bug preserving + } + + *newcount= outer_fanout; + *newcost= cost; + + DBUG_RETURN(true); +} + + +/** + Find best access paths for semi-join MaterializeScan strategy + and calculate rowcount and cost based on these. + + @param last_inner_tab The last tab in the set of inner tables + @param last_outer_tab The last tab in the set of outer tables + @param remaining_tables Bitmap of tables that are not in the join prefix + including the inner and outer tables processed here. + @param sjm_nest Pointer to semi-join nest for inner tables + @param final If true, use and update access path data in + join->best_positions, otherwise use join->positions + and update a local buffer. + @param[out] rowcount New output row count + @param[out] newcost New join prefix cost + + @details + Calculate best access paths for the outer tables of the MaterializeScan + semi-join strategy. All outer tables may use join buffering. + The prefix row count is adjusted with the estimated number of rows in + the materialized tables, before taking into consideration the rows + contributed by the outer tables. +*/ + +void Optimize_table_order::semijoin_mat_scan_access_paths( + uint last_inner_tab, uint last_outer_tab, + table_map remaining_tables, TABLE_LIST *sjm_nest, bool final, + double *newcount, double *newcost) +{ + DBUG_ENTER("Optimize_table_order::semijoin_mat_scan_access_paths"); + + Opt_trace_context *const trace= &thd->opt_trace; + Opt_trace_object recalculate(trace, "recalculate_access_paths_and_cost"); + Opt_trace_array trace_tables(trace, "tables"); + double cost, rowcount; + POSITION *const positions= final ? join->best_positions : join->positions; + const uint inner_count= my_count_bits(sjm_nest->sj_inner_tables); + + // Get the prefix cost. + if (last_inner_tab < join->const_tables + inner_count) + { + rowcount= 1.0; + cost= 0.0; + } + else + { + rowcount= positions[last_inner_tab - inner_count].prefix_record_count; + cost= positions[last_inner_tab - inner_count].prefix_cost.total_cost(); } - *reopt_rec_count= outer_fanout; - *reopt_cost= cost; + // Add materialization cost. + cost+= sjm_nest->nested_join->sjm.materialization_cost.total_cost() + + rowcount * sjm_nest->nested_join->sjm.scan_cost.total_cost(); + rowcount*= sjm_nest->nested_join->sjm.expected_rowcount; + + for (uint i= last_inner_tab + 1; i <= last_outer_tab; i++) + remaining_tables|= positions[i].table->table->map; + + // Need to rerun best_access_path as rowcount of join prefix has changed. + for (uint i= last_inner_tab + 1; i <= last_outer_tab; i++) + { + Opt_trace_object trace_one_table(trace); + JOIN_TAB *const tab= positions[i].table; + trace_one_table.add_utf8_table(tab->table); + POSITION regular_pos, dummy; + POSITION *const dst_pos= final ? positions + i : ®ular_pos; + best_access_path(tab, remaining_tables, i, false, + rowcount, dst_pos, &dummy); + remaining_tables&= ~tab->table->map; + rowcount*= dst_pos->records_read; + cost+= dst_pos->read_time; // @todo: Bug preserving + } + + *newcount= rowcount; + *newcost= cost; + + DBUG_VOID_RETURN; +} + + +/** + Find best access paths for semi-join MaterializeLookup strategy. + and calculate rowcount and cost based on these. + + @param last_inner Index of the last inner table + @param sjm_nest Pointer to semi-join nest for inner tables + @param[out] rowcount New output row count + @param[out] newcost New join prefix cost + + @details + All outer tables may use join buffering, so there is no need to recalculate + access paths nor costs for these. + Add cost of materialization and scanning the materialized table to the + costs of accessing the outer tables. +*/ + +void Optimize_table_order::semijoin_mat_lookup_access_paths( + uint last_inner, TABLE_LIST *sjm_nest, + double *newcount, double *newcost) +{ + DBUG_ENTER("Optimize_table_order::semijoin_mat_lookup_access_paths"); + + const uint inner_count= my_count_bits(sjm_nest->sj_inner_tables); + double rowcount, cost; + + if (last_inner < join->const_tables + inner_count) + { + cost= 0.0; + rowcount= 1.0; + } + else + { + cost= join->positions[last_inner-inner_count].prefix_cost.total_cost(); + rowcount= join->positions[last_inner-inner_count].prefix_record_count; + } + + cost+= sjm_nest->nested_join->sjm.materialization_cost.total_cost() + + rowcount * sjm_nest->nested_join->sjm.lookup_cost.total_cost(); + + *newcount= rowcount; + *newcost= cost; + + DBUG_VOID_RETURN; +} + + +/** + Find best access paths for semi-join DuplicateWeedout strategy + and calculate rowcount and cost based on these. + + @param first_tab The first tab to calculate access paths for + @param last_tab The last tab to calculate access paths for + @param remaining_tables Bitmap of tables that are not in the + [0...last_tab] join prefix + @param[out] newcount New output row count + @param[out] newcost New join prefix cost + + @return True if strategy selection successful, false otherwise. + + @details + Notice that new best access paths need not be calculated. + The proper access path information is already in join->positions, + because DuplicateWeedout can handle any join buffering strategy. + The only action performed by this function is to calculate + output rowcount, and an updated cost estimate. + + The cost estimate is based on performing a join over the involved + tables, but we must also add the cost of creating and populating + the temporary table used for duplicate removal, and the cost of + doing lookups against this table. +*/ + +void Optimize_table_order::semijoin_dupsweedout_access_paths( + uint first_tab, uint last_tab, + table_map remaining_tables, + double *newcount, double *newcost) +{ + DBUG_ENTER("Optimize_table_order::semijoin_dupsweedout_access_paths"); + + double cost, rowcount; + double inner_fanout= 1.0; + double outer_fanout= 1.0; + uint rowsize; // Row size of the temporary table + if (first_tab == join->const_tables) + { + cost= 0.0; + rowcount= 1.0; + rowsize= 0; + } + else + { + cost= join->positions[first_tab - 1].prefix_cost.total_cost(); + rowcount= join->positions[first_tab - 1].prefix_record_count; + rowsize= 8; // This is not true but we'll make it so + } + + for (uint j= first_tab; j <= last_tab; j++) + { + const POSITION *const p= join->positions + j; + if (p->table->emb_sj_nest) + { + inner_fanout*= p->records_read; + } + else + { + outer_fanout*= p->records_read; + + rowsize+= p->table->table->file->ref_length; + } + cost+= p->read_time; // @todo bug preserving + } + + /* + Add the cost of temptable use. The table will have outer_fanout rows, + and we will make + - rowcount * outer_fanout writes + - rowcount * inner_fanout * outer_fanout lookups. + We assume here that a lookup and a write has the same cost. + */ + double one_lookup_cost, create_cost; + if (outer_fanout * rowsize > thd->variables.max_heap_table_size) + { + one_lookup_cost= DISK_TEMPTABLE_ROW_COST; + create_cost= DISK_TEMPTABLE_CREATE_COST; + } + else + { + one_lookup_cost= HEAP_TEMPTABLE_ROW_COST; + create_cost= HEAP_TEMPTABLE_CREATE_COST; + } + double write_cost= join->positions[first_tab].prefix_record_count * + outer_fanout * one_lookup_cost; + double full_lookup_cost= join->positions[first_tab].prefix_record_count * + outer_fanout* inner_fanout * one_lookup_cost; + cost+= create_cost + write_cost + full_lookup_cost; + + *newcount= rowcount * outer_fanout; + *newcost= cost; + DBUG_VOID_RETURN; } @@ -15437,8 +15658,8 @@ void Optimize_table_order::optimize_wo_j @param new_join_tab Join tab that we are adding to the join prefix @param idx Index of this join tab (i.e. number of tables in the prefix) - @param[in,out] current_record_count Estimate of #rows in join prefix's output - @param[in,out] current_read_time Cost to execute the join prefix + @param[in,out] current_rowcount Estimate of #rows in join prefix's output + @param[in,out] current_cost Cost to execute the join prefix @param loose_scan_pos A POSITION with LooseScan plan to access table new_join_tab (produced by last best_access_path call) @@ -15472,18 +15693,34 @@ void Optimize_table_order::optimize_wo_j See setup_semijoin_dups_elimination() for a description of what kinds of join prefixes each strategy can handle. + + A note on access path, rowcount and cost estimates: + - best_extension_by_limited_search() performs *initial calculations* + of access paths, rowcount and cost based on the operation being + an inner join or an outer join operation. These estimates are saved + in join->positions. + - advance_sj_state() performs *intermediate calculations* based on the + same table information, but for the supported semi-join strategies. + The access path part of these calculations are not saved anywhere, + but the rowcount and cost of the best semi-join strategy are saved + in join->positions. + - Because the semi-join access path information was not saved previously, + fix_semijoin_strategies() must perform *final calculations* of + access paths, rowcount and cost when saving the selected table order + in join->best_positions. The results of the final calculations will be + the same as the results of the "best" intermediate calculations. */ void Optimize_table_order::advance_sj_state( table_map remaining_tables, const JOIN_TAB *new_join_tab, uint idx, - double *current_record_count, double *current_read_time, + double *current_rowcount, double *current_cost, POSITION *loose_scan_pos) { Opt_trace_context * const trace= &thd->opt_trace; TABLE_LIST *const emb_sj_nest= new_join_tab->emb_sj_nest; POSITION *const pos= join->positions + idx; - + uint sj_strategy= SJ_OPT_NONE; // Initially: No chosen strategy /* Semi-join nests cannot be nested, hence we never need to advance the semi-join state of a materialized semi-join query. @@ -15491,16 +15728,15 @@ void Optimize_table_order::advance_sj_st within a semi-join nest have emb_sj_nest != NULL, which triggers several of the actions inside this function. */ - DBUG_ASSERT(join->emb_sjm_nest == NULL); + DBUG_ASSERT(emb_sjm_nest == NULL); /* Add this table to the join prefix */ remaining_tables &= ~new_join_tab->table->map; DBUG_ENTER("Optimize_table_order::advance_sj_state"); - pos->prefix_cost.convert_from_cost(*current_read_time); - pos->prefix_record_count= *current_record_count; - pos->sj_strategy= SJ_OPT_NONE; + pos->prefix_cost.convert_from_cost(*current_cost); + pos->prefix_record_count= *current_rowcount; Opt_trace_array trace_choices(trace, "semijoin_strategy_choice"); @@ -15560,7 +15796,7 @@ void Optimize_table_order::advance_sj_st are in the join prefix 4. All inner tables are still part of remaining_tables. */ - if (!join->cur_sj_inner_tables && // (2) + if (cur_sj_inner_tables == 0 && // (2) !(remaining_tables & outer_corr_tables) && // (3) (sj_inner_tables == // (4) ((remaining_tables | new_join_tab->table->map) & sj_inner_tables))) @@ -15589,44 +15825,40 @@ void Optimize_table_order::advance_sj_st if (!(pos->firstmatch_need_tables & remaining_tables)) { - /* - Got a complete FirstMatch range. - Calculate correct costs and fanout - */ - double reopt_cost, reopt_rec_count; + // Got a complete FirstMatch range. Calculate access paths and cost + double cost, rowcount; /* We use the same FirstLetterUpcase as in EXPLAIN */ Opt_trace_object trace_one_strategy(trace); trace_one_strategy.add_alnum("strategy", "FirstMatch"); - optimize_wo_join_buffering(pos->first_firstmatch_table, idx, - remaining_tables, FALSE, idx, - &reopt_rec_count, &reopt_cost); - if (reopt_cost < DBL_MAX) - { - /* - We don't yet know what are the other strategies, so pick the - FirstMatch. + (void)semijoin_firstmatch_loosescan_access_paths( + pos->first_firstmatch_table, idx, + remaining_tables, false, false, + &rowcount, &cost); + /* + We don't yet know what are the other strategies, so pick FirstMatch. - We ought to save the alternate POSITIONs produced by - optimize_wo_join_buffering but the problem is that providing save - space uses too much space. Instead, we will re-calculate the - alternate POSITIONs after we've picked the best QEP. - */ - pos->sj_strategy= SJ_OPT_FIRST_MATCH; - *current_read_time= reopt_cost; - *current_record_count= reopt_rec_count; - trace_one_strategy.add("cost", *current_read_time). - add("rows", *current_record_count); - handled_by_fm_or_ls= pos->firstmatch_need_tables; - } - trace_one_strategy.add("chosen", - pos->sj_strategy == SJ_OPT_FIRST_MATCH); + We ought to save the alternate POSITIONs produced by + semijoin_firstmatch_loosescan_access_paths() but the problem is that + providing save space uses too much space. + Instead, we will re-calculate the alternate POSITIONs after we've + picked the best QEP. + */ + sj_strategy= SJ_OPT_FIRST_MATCH; + *current_cost= cost; + *current_rowcount= rowcount; + trace_one_strategy.add("cost", *current_cost). + add("rows", *current_rowcount); + handled_by_fm_or_ls= pos->firstmatch_need_tables; + + trace_one_strategy.add("chosen", true); } } } /* LooseScan Strategy */ + if (thd->optimizer_switch_flag(OPTIMIZER_SWITCH_LOOSE_SCAN)) { - POSITION *first=join->positions+pos->first_loosescan_table; + POSITION *const first= join->positions+pos->first_loosescan_table; /* LooseScan strategy can't handle interleaving between tables from the semi-join that LooseScan is handling and any other tables. @@ -15662,10 +15894,8 @@ void Optimize_table_order::advance_sj_st inner tables and outer correlated tables into the prefix. */ - first=join->positions + pos->first_loosescan_table; - uint n_tables= my_count_bits(first->table->emb_sj_nest->sj_inner_tables); - /* Got a complete LooseScan range. Calculate its cost */ - double reopt_cost, reopt_rec_count; + // Got a complete LooseScan range. Calculate access paths and cost + double cost, rowcount; Opt_trace_object trace_one_strategy(trace); trace_one_strategy.add_alnum("strategy", "LooseScan"); /* @@ -15673,13 +15903,10 @@ void Optimize_table_order::advance_sj_st somewhere but reserving space for all cases would require too much space. We will re-calculate POSITION structures later on. */ - optimize_wo_join_buffering(pos->first_loosescan_table, idx, - remaining_tables, - TRUE, //first_alt - pos->first_loosescan_table + n_tables, - &reopt_rec_count, - &reopt_cost); - if (reopt_cost < DBL_MAX) + if (semijoin_firstmatch_loosescan_access_paths( + pos->first_loosescan_table, idx, + remaining_tables, true, false, + &rowcount, &cost)) { /* We don't yet have any other strategies that could handle this @@ -15688,34 +15915,33 @@ void Optimize_table_order::advance_sj_st the join prefix to be considered) so unconditionally pick the LooseScan. */ - pos->sj_strategy= SJ_OPT_LOOSE_SCAN; - *current_read_time= reopt_cost; - *current_record_count= reopt_rec_count; - trace_one_strategy.add("cost", *current_read_time). - add("rows", *current_record_count); + sj_strategy= SJ_OPT_LOOSE_SCAN; + *current_cost= cost; + *current_rowcount= rowcount; + trace_one_strategy.add("cost", *current_cost). + add("rows", *current_rowcount); handled_by_fm_or_ls= first->table->emb_sj_nest->sj_inner_tables; } - trace_one_strategy.add("chosen", - pos->sj_strategy == SJ_OPT_LOOSE_SCAN); + trace_one_strategy.add("chosen", true); } } /* - Update join->cur_sj_inner_tables (Used by FirstMatch in this function and + Update cur_sj_inner_tables (Used by FirstMatch in this function and LooseScan detector in best_access_path) */ if (emb_sj_nest) { - join->cur_sj_inner_tables |= emb_sj_nest->sj_inner_tables; + cur_sj_inner_tables|= emb_sj_nest->sj_inner_tables; pos->dups_producing_tables |= emb_sj_nest->sj_inner_tables; /* Remove the sj_nest if all of its SJ-inner tables are in cur_table_map */ if (!(remaining_tables & emb_sj_nest->sj_inner_tables)) - join->cur_sj_inner_tables &= ~emb_sj_nest->sj_inner_tables; + cur_sj_inner_tables&= ~emb_sj_nest->sj_inner_tables; } pos->dups_producing_tables &= ~handled_by_fm_or_ls; - /* 4. MaterializeLookup and MaterializeScan strategy handler */ + /* MaterializeLookup and MaterializeScan strategy handler */ const int sjm_strategy= semijoin_order_allows_materialization(join, remaining_tables, new_join_tab, idx); @@ -15744,35 +15970,20 @@ void Optimize_table_order::advance_sj_st emb_sj_nest->sj_inner_tables | emb_sj_nest->nested_join->sj_depends_on; pos->sjm_scan_last_inner= idx; - Opt_trace_object(trace).add_alnum("strategy", "MaterializationScan"). + Opt_trace_object(trace).add_alnum("strategy", "MaterializeScan"). add_alnum("choice", "deferred"); } else if (sjm_strategy == SJ_OPT_MATERIALIZE_LOOKUP) { - COST_VECT prefix_cost; - int first_tab= (int)idx - my_count_bits(emb_sj_nest->sj_inner_tables); - double prefix_rec_count; - if (first_tab < (int)join->const_tables) - { - prefix_cost.zero(); - prefix_rec_count= 1.0; - } - else - { - prefix_cost= join->positions[first_tab].prefix_cost; - prefix_rec_count= join->positions[first_tab].prefix_record_count; - } - - double mat_read_time= prefix_cost.total_cost(); - mat_read_time += - emb_sj_nest->nested_join->sjm.materialization_cost.total_cost() + - prefix_rec_count * emb_sj_nest->nested_join->sjm.lookup_cost.total_cost(); + // Calculate access paths and cost for MaterializeLookup strategy + double cost, rowcount; + semijoin_mat_lookup_access_paths(idx, emb_sj_nest, &rowcount, &cost); Opt_trace_object trace_one_strategy(trace); - trace_one_strategy.add_alnum("strategy", "MaterializationLookup"). - add("cost", mat_read_time).add("rows", prefix_rec_count). + trace_one_strategy.add_alnum("strategy", "MaterializeLookup"). + add("cost", cost).add("rows", rowcount). add("duplicate_tables_left", pos->dups_producing_tables != 0); - if (mat_read_time < *current_read_time || pos->dups_producing_tables) + if (cost < *current_cost || pos->dups_producing_tables) { /* NOTE: When we pick to use SJM[-Scan] we don't memcpy its POSITION @@ -15780,70 +15991,42 @@ void Optimize_table_order::advance_sj_st back when making one step back in join optimization. That's done after the QEP has been chosen. */ - pos->sj_strategy= SJ_OPT_MATERIALIZE_LOOKUP; - *current_read_time= mat_read_time; - *current_record_count= prefix_rec_count; + sj_strategy= SJ_OPT_MATERIALIZE_LOOKUP; + *current_cost= cost; + *current_rowcount= rowcount; pos->dups_producing_tables &= ~emb_sj_nest->sj_inner_tables; } - trace_one_strategy.add("chosen", - pos->sj_strategy == SJ_OPT_MATERIALIZE_LOOKUP); + trace_one_strategy.add("chosen", sj_strategy == SJ_OPT_MATERIALIZE_LOOKUP); } - /* 4.A SJM-Scan second phase check */ + /* MaterializeScan second phase check */ + /* + The optimizer does not support that we have inner tables from more + than one semi-join nest within the table range. + */ + if (pos->sjm_scan_need_tables && + emb_sj_nest != NULL && + emb_sj_nest != + join->positions[pos->sjm_scan_last_inner].table->emb_sj_nest) + pos->sjm_scan_need_tables= 0; + if (pos->sjm_scan_need_tables && /* Have SJM-Scan prefix */ !(pos->sjm_scan_need_tables & remaining_tables)) { - TABLE_LIST *mat_nest= + TABLE_LIST *const sjm_nest= join->positions[pos->sjm_scan_last_inner].table->emb_sj_nest; - const uint table_count= my_count_bits(mat_nest->sj_inner_tables); - double prefix_cost; - double prefix_rec_count; - int first_tab= pos->sjm_scan_last_inner + 1 - table_count; + double cost, rowcount; Opt_trace_object trace_one_strategy(trace); - trace_one_strategy.add_alnum("strategy", "MaterializationScan"); - - /* Get the prefix cost */ - if (first_tab == (int)join->const_tables) - { - prefix_rec_count= 1.0; - prefix_cost= 0.0; - } - else - { - prefix_cost= join->positions[first_tab - 1].prefix_cost.total_cost(); - prefix_rec_count= join->positions[first_tab - 1].prefix_record_count; - } - - /* Add materialization cost */ - prefix_cost+= - mat_nest->nested_join->sjm.materialization_cost.total_cost() + - prefix_rec_count * mat_nest->nested_join->sjm.scan_cost.total_cost(); - prefix_rec_count*= mat_nest->nested_join->sjm.expected_rowcount; - - uint i; - table_map rem_tables= remaining_tables; - for (i= idx; i != (first_tab + table_count - 1); i--) - rem_tables |= join->positions[i].table->table->map; - - POSITION curpos, dummy; - /* Need to re-run best-access-path as we prefix_rec_count has changed */ - { - Opt_trace_object trace_recompute(trace, "recompute_best_access_paths"); - trace_recompute.add_alnum("cause", "costs_of_prefix_changed"); - Opt_trace_array trace_tables(trace, "tables"); + trace_one_strategy.add_alnum("strategy", "MaterializeScan"); - for (i= first_tab + table_count; i <= idx; i++) - { - Opt_trace_object trace_one_table(trace); - trace_one_table.add_utf8_table(join->positions[i].table->table); - best_access_path(join, join->positions[i].table, rem_tables, i, FALSE, - prefix_rec_count, &curpos, &dummy); - prefix_rec_count *= curpos.records_read; - prefix_cost += curpos.read_time; - } - } + semijoin_mat_scan_access_paths(pos->sjm_scan_last_inner, idx, + remaining_tables, sjm_nest, false, + &rowcount, &cost); + trace_one_strategy.add("cost", cost). + add("rows", rowcount). + add("duplicate_tables_left", pos->dups_producing_tables != 0); /* Use the strategy if * it is cheaper then what we've had, or @@ -15852,21 +16035,17 @@ void Optimize_table_order::advance_sj_st comparing cost without semi-join duplicate removal with cost with duplicate removal is not an apples-to-apples comparison. */ - trace_one_strategy.add("cost", prefix_cost). - add("rows", prefix_rec_count). - add("duplicate_tables_left", pos->dups_producing_tables != 0); - if (prefix_cost < *current_read_time || pos->dups_producing_tables) + if (cost < *current_cost || pos->dups_producing_tables) { - pos->sj_strategy= SJ_OPT_MATERIALIZE_SCAN; - *current_read_time= prefix_cost; - *current_record_count= prefix_rec_count; - pos->dups_producing_tables &= ~mat_nest->sj_inner_tables; + sj_strategy= SJ_OPT_MATERIALIZE_SCAN; + *current_cost= cost; + *current_rowcount= rowcount; + pos->dups_producing_tables &= ~sjm_nest->sj_inner_tables; } - trace_one_strategy.add("chosen", - pos->sj_strategy == SJ_OPT_MATERIALIZE_SCAN); + trace_one_strategy.add("chosen", sj_strategy == SJ_OPT_MATERIALIZE_SCAN); } - /* 5. Duplicate Weedout strategy handler */ + /* Duplicate Weedout strategy handler */ { /* Duplicate weedout can be applied after all ON-correlated and @@ -15884,6 +16063,8 @@ void Optimize_table_order::advance_sj_st if (pos->dupsweedout_tables && !(remaining_tables & pos->dupsweedout_tables)) { + Opt_trace_object trace_one_strategy(trace); + trace_one_strategy.add_alnum("strategy", "DuplicatesWeedout"); /* Ok, reached a state where we could put a dups weedout point. Walk back and calculate @@ -15898,63 +16079,9 @@ void Optimize_table_order::advance_sj_st We need to calculate the cost in case #2 also because we need to make choice between this join order and others. */ - uint first_tab= pos->first_dupsweedout_table; - double dups_cost; - double prefix_rec_count; - double sj_inner_fanout= 1.0; - double sj_outer_fanout= 1.0; - uint temptable_rec_size; - if (first_tab == join->const_tables) - { - prefix_rec_count= 1.0; - temptable_rec_size= 0; - dups_cost= 0.0; - } - else - { - dups_cost= join->positions[first_tab - 1].prefix_cost.total_cost(); - prefix_rec_count= join->positions[first_tab - 1].prefix_record_count; - temptable_rec_size= 8; /* This is not true but we'll make it so */ - } - - table_map dups_removed_fanout= 0; - for (uint j= pos->first_dupsweedout_table; j <= idx; j++) - { - POSITION *p= join->positions + j; - dups_cost += p->read_time; - if (p->table->emb_sj_nest) - { - sj_inner_fanout *= p->records_read; - dups_removed_fanout |= p->table->table->map; - } - else - { - sj_outer_fanout *= p->records_read; - temptable_rec_size += p->table->table->file->ref_length; - } - } - - /* - Add the cost of temptable use. The table will have sj_outer_fanout - records, and we will make - - sj_outer_fanout table writes - - sj_inner_fanout*sj_outer_fanout lookups. - - */ - double one_lookup_cost; - if (sj_outer_fanout*temptable_rec_size > - thd->variables.max_heap_table_size) - one_lookup_cost= DISK_TEMPTABLE_LOOKUP_COST; - else - one_lookup_cost= HEAP_TEMPTABLE_LOOKUP_COST; - - double write_cost= join->positions[first_tab].prefix_record_count* - sj_outer_fanout * one_lookup_cost; - double full_lookup_cost= join->positions[first_tab].prefix_record_count* - sj_outer_fanout* sj_inner_fanout * - one_lookup_cost; - dups_cost += write_cost + full_lookup_cost; - + double rowcount, cost; + semijoin_dupsweedout_access_paths(pos->first_dupsweedout_table, idx, + remaining_tables, &rowcount, &cost); /* Use the strategy if * it is cheaper then what we've had, or @@ -15963,22 +16090,26 @@ void Optimize_table_order::advance_sj_st to consider (it needs "the most" tables in the prefix) and we can't leave duplicate-producing tables not handled by any strategy. */ - Opt_trace_object trace_one_strategy(trace); - trace_one_strategy.add_alnum("strategy", "DuplicatesWeedout"). - add("cost", dups_cost). - add("rows", prefix_rec_count * sj_outer_fanout). + trace_one_strategy. + add("cost", cost). + add("rows", rowcount). add("duplicate_tables_left", pos->dups_producing_tables != 0); - if (dups_cost < *current_read_time || pos->dups_producing_tables) + if (cost < *current_cost || pos->dups_producing_tables) { - pos->sj_strategy= SJ_OPT_DUPS_WEEDOUT; - *current_read_time= dups_cost; - *current_record_count= prefix_rec_count * sj_outer_fanout; - pos->dups_producing_tables &= ~dups_removed_fanout; + sj_strategy= SJ_OPT_DUPS_WEEDOUT; + *current_cost= cost; + *current_rowcount= rowcount; + /* + Note, dupsweedout_tables contains inner and outer tables, even though + "dups_producing_tables" are always inner table. Ok for this use. + */ + pos->dups_producing_tables &= ~pos->dupsweedout_tables; } - trace_one_strategy.add("chosen", - pos->sj_strategy == SJ_OPT_DUPS_WEEDOUT); + trace_one_strategy.add("chosen", sj_strategy == SJ_OPT_DUPS_WEEDOUT); } } + pos->sj_strategy= sj_strategy; + DBUG_VOID_RETURN; } @@ -15997,7 +16128,7 @@ void Optimize_table_order::advance_sj_st order and update the nested joins counters and cur_embedding_map. It is ok to call this for the first table in join order (for which check_interleaving_with_nj() has not been called). - - advance_sj_state(): removes the last table from join->cur_sj_inner_tables + - advance_sj_state(): removes the last table from cur_sj_inner_tables bitmap. The algorithm is the reciprocal of check_interleaving_with_nj(), hence @@ -16033,9 +16164,7 @@ void Optimize_table_order::advance_sj_st will however not influence NJ1 since it did not un-cover the last table in NJ2. - @param remaining_tables remaining tables to optimize, assumed to not contain - tab - (@todo but this assumption is violated in practice) + @param remaining_tables remaining tables to optimize, must contain 'tab' @param tab join table to remove, assumed to be the last in current partial join order. */ @@ -16043,6 +16172,8 @@ void Optimize_table_order::advance_sj_st void Optimize_table_order::backout_nj_sj_state(const table_map remaining_tables, const JOIN_TAB *tab) { + DBUG_ASSERT(remaining_tables & tab->table->map); + /* Restore the nested join state */ TABLE_LIST *last_emb= tab->table->pos_in_table_list->embedding; @@ -16069,12 +16200,13 @@ void Optimize_table_order::backout_nj_sj TABLE_LIST *const emb_sj_nest= tab->emb_sj_nest; if (emb_sj_nest) { - /* If we're removing the last SJ-inner table, remove the sj-nest */ + /* + If we're removing the first SJ-inner table, we are not in the SJ-nest + anymore: + */ if ((remaining_tables & emb_sj_nest->sj_inner_tables) == - (emb_sj_nest->sj_inner_tables & ~tab->table->map)) - { - join->cur_sj_inner_tables &= ~emb_sj_nest->sj_inner_tables; - } + emb_sj_nest->sj_inner_tables) + cur_sj_inner_tables&= ~emb_sj_nest->sj_inner_tables; } } === modified file 'sql/sql_select.h' --- a/sql/sql_select.h 2011-10-24 13:25:03 +0000 +++ b/sql/sql_select.h 2011-10-25 10:28:05 +0000 @@ -1802,22 +1802,9 @@ public: POSITION *best_positions; /******* Join optimization state members start *******/ - /* - If non-NULL, we are optimizing a materialized semi-join nest. - If NULL, we are optimizing a complete join plan. - This member is used only within the class Optimize_table_order, and - within class Loose_scan_opt (called from best_access_path()). - */ - TABLE_LIST *emb_sjm_nest; /* Current join optimization state */ POSITION *positions; - /* - Bitmap of inner tables of semi-join nests that have a proper subset of - their tables in the current join prefix. That is, of those semi-join - nests that have their tables both in and outside of the join prefix. - */ - table_map cur_sj_inner_tables; /* We also maintain a stack of join optimization states in * join->positions[] */ /******* Join optimization state members end *******/ No bundle (reason: useless for push emails).