From: Roy Lyseng Date: October 26 2011 1:29pm Subject: bzr push into mysql-trunk branch (roy.lyseng:3469 to 3470) Bug#113106350 List-Archive: http://lists.mysql.com/commits/141585 X-Bug: 113106350 Message-Id: <20111026132954.33E8E203@tyr67.norway.sun.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit 3470 Roy Lyseng 2011-10-26 Bug#113106350: crash in memcpy from join_cache::write_record_data This problem may happen if we combine the Duplicate Weedout semi-join strategy with join buffering, and the outer table is a derived table. The problem is that Duplicate Weedout needs the row id of all involved tables, hence the table is instructed to save the row id. Join buffering needs to save and restore the rowid into a buffer, for that purpose an st_cache_field object is allocated. This object contains a pointer to the row id buffer, but at the time of object creation, the table (being a derived table) has not yet been opened and the buffer pointer is NULL. The fix is to save a reference to the st_cache_field object in the corresponding JOIN_TAB and bind it with the buffer after the derived table has been materialized. mysql-test/t/derived.test Added test case for bug#113106350. mysql-test/r/derived.result Added test case results for bug#113106350. sql/sql_join_cache.cc Save reference to rowid copy object when setting up fields to copy. sql/sql_select.cc Bind the rowid buffer of a TABLE object to a copy object that is used to copy a rowid. Needed when temporary tables are opened as part of execution. sql/sql_select.h Added reference to copy object used to copy rowid in JOIN_TAB. Added a bind_buffer() access function for st_cache_field. modified: mysql-test/r/derived.result mysql-test/t/derived.test sql/sql_join_cache.cc sql/sql_select.cc sql/sql_select.h 3469 Roy Lyseng 2011-10-25 Bug#12407753: Time to compare a row is missing in cost calculation of semi-join Stage 2: Refactor optimize_wo_join_buffering, best_access_path and others. Refactored how costs are re-calculated for the semi-join strategies, now the intermediate and final calculations are done the same way. Added new calculation functions to class Optimize_table_order. In addition, a few minor inconsistencies around semi-join optimization have been fixed. The following changes were developed by Guilhem Bichot and incorporated in this refactoring patch: Moving cur_sj_inner_tables and emb_sjm_nest to Optimize_table_order, where they naturally belong (emb_sjm_nest added by me). After choosing the final plan, cur_sj_inner_tables should be 0, as our plan normally covers all sj-inner tables. So we add an assert. However there was a bug: backout_nj_sj_state() actually never restored cur_sj_inner_tables, because it was assuming (wrongly) that remaining_tables didn't contain the table-to-remove, whereas it always does. mysql-test/r/opt_trace.bugs_ps_prot_all.result mysql-test/r/opt_trace.general_ps_prot_all.result mysql-test/r/opt_trace.bugs_no_prot_all.result Some changes necessitated by refactoring. Aligned some text with other terminology. mysql-test/r/opt_trace.general_no_prot_all.result For query 'select * from t3 where (a,a,b) in (select * from t1,t2,t4)', (where t3 has one row and is thus a "system" table), plan (t4,t1,t2)+firstmatch was not considered, which was wrong. Here's how it happened. - plan (t2) is tested: cur_sj_inner_tables is set to t1+t2+t4, then plan is pruned; cur_sj_inner_tables remains non-zero because of the bug in backout_nj_sj_state() - plan (t1) is tested: this if() below is not entered because cur_sj_inner_tables is non-zero: if (cur_sj_inner_tables == 0 && // (2) !(remaining_tables & outer_corr_tables) && // (3) (sj_inner_tables == // (4) ((remaining_tables | new_join_tab->table->map) & sj_inner_tables))) { /* Start tracking potential FirstMatch range */ And thus firstmatch won't be tested. sql/sql_const.h Added cost factors for creation of temporary tables. Currently, the old values are still used. Removed unused constant RAID_BLOCK_SIZE. sql/sql_select.cc optimize_semijoin_nests(): Restructured assignments to cost data. Added cost for creation of temporary table (currently zero). best_access_path() is now a member function of Optimize_table_order. Argument join is eliminated as it is passed through object. The "excluded_tables" variable that was sometimes OR'ed to remaining_tables is now a member variable in Optimize_table_order and need not be passed by callers to best_access_path(). choose_table_order(): Exclude unnecessary book-keeping when processing semi-join materialization nests. Call fix_semijoin_strategies() when needed (makes it possible to make fix_semijoin_strategies() a member function of Optimize_table_order). fix_semijoin_strategies() is now a member function of Optimize_table_order. Renamed from fix_semijoin_strategies_for_picked_join_order() because the full function name became extremely long. Rewritten to use the new access path calculation functions. semijoin_fm_ls_access_paths() is renamed from optimize_wo_join_buffering. It will now correctly calculate new access path information for FirstMatch and LooseScan semi-join strategies. and it is modified to be able to make final calculations as well as intermediate. It makes the same join buffering choice in both cases. Notice: Aligned with old strategy for this commit. semijoin_materialize_access_paths() calculates access paths for the MaterializeScan semi-join strategy. It can be used both in intermediate and final stages of table order estimation. semijoin_dupweedout_access_paths() calculates access paths for the DuplicateWeedout semi-join strategy. No corrections are needed for the final stage of table order estimation, so intermediate decisions are kept as final decisions. advance_sj_state() is simplified by using the new access path calculation functions. Some refactoring of variable names done (read_time -> cost, record_count -> rowcount). This cleanup was done since most of the code lines were touched anyway. Also introduced new local variables sj_strategy for simpler logic. After call to greedy_search(), assert that cur_sj_inner_tables is 0. In backout_nj_sj_state(), cur_sj_inner_tables was wrongly backed out: "if ((remaining_tables & emb_sj_nest->sj_inner_tables) == (emb_sj_nest->sj_inner_tables & ~tab->table->map))" was never entered: - if 'tab' is sj-inner, (remaining_tables & emb_sj_nest->sj_inner_tables) contains 'tab', but (emb_sj_nest->sj_inner_tables & ~tab->table->map) doesn't - if 'tab' is not sj-inner, emb_sj_nest is NULL. sql/sql_select.h Member variables cur_sj_inner_tables and emb_sjm_nest moved from class JOIN to class Optimize_table_order. modified: mysql-test/suite/opt_trace/r/bugs_no_prot_all.result mysql-test/suite/opt_trace/r/bugs_ps_prot_all.result mysql-test/suite/opt_trace/r/general_no_prot_all.result mysql-test/suite/opt_trace/r/general_ps_prot_all.result sql/sql_const.h sql/sql_select.cc sql/sql_select.h === modified file 'mysql-test/r/derived.result' --- a/mysql-test/r/derived.result 2011-10-19 06:26:28 +0000 +++ b/mysql-test/r/derived.result 2011-10-26 13:23:01 +0000 @@ -1615,3 +1615,23 @@ field1 field2 SET @@SESSION.optimizer_switch= @save_switch; DROP TABLE t1, t2; # +# Bug#13106350: MRR initialization on a derived table caused crash. +# +SET @save_switch= @@optimizer_switch; +SET @@optimizer_switch="firstmatch=off,loosescan=off,materialization=off"; +CREATE TABLE t1 (pk INTEGER PRIMARY KEY, vc VARCHAR(20)); +INSERT INTO t1 VALUES(7, 'seven'), (13, 'thirteen'); +CREATE TABLE t2 (pk INTEGER PRIMARY KEY, vc1 VARCHAR(20), vc2 VARCHAR(20)); +INSERT INTO t2 VALUES(7, 'seven', 's'), (14, 'fourteen', 'f'); +CREATE TABLE t3 (pk INTEGER PRIMARY KEY, vc VARCHAR(20)); +INSERT INTO t3 VALUES(5, 'f'), (6, 's'), (7, 's'); +SELECT derived.vc +FROM (SELECT * FROM t1) AS derived +WHERE derived.vc IN ( +SELECT t2.vc1 +FROM t2 JOIN t3 ON t2.vc2=t3.vc); +vc +seven +SET @@optimizer_switch= @save_switch; +DROP TABLE t1, t2, t3; +# === modified file 'mysql-test/t/derived.test' --- a/mysql-test/t/derived.test 2011-10-05 13:16:38 +0000 +++ b/mysql-test/t/derived.test 2011-10-26 13:23:01 +0000 @@ -977,4 +977,31 @@ SET @@SESSION.optimizer_switch= @save_sw DROP TABLE t1, t2; --echo # +--echo # Bug#13106350: MRR initialization on a derived table caused crash. +--echo # +SET @save_switch= @@optimizer_switch; +SET @@optimizer_switch="firstmatch=off,loosescan=off,materialization=off"; + +CREATE TABLE t1 (pk INTEGER PRIMARY KEY, vc VARCHAR(20)); + +INSERT INTO t1 VALUES(7, 'seven'), (13, 'thirteen'); + +CREATE TABLE t2 (pk INTEGER PRIMARY KEY, vc1 VARCHAR(20), vc2 VARCHAR(20)); + +INSERT INTO t2 VALUES(7, 'seven', 's'), (14, 'fourteen', 'f'); + +CREATE TABLE t3 (pk INTEGER PRIMARY KEY, vc VARCHAR(20)); + +INSERT INTO t3 VALUES(5, 'f'), (6, 's'), (7, 's'); + +SELECT derived.vc +FROM (SELECT * FROM t1) AS derived +WHERE derived.vc IN ( + SELECT t2.vc1 + FROM t2 JOIN t3 ON t2.vc2=t3.vc); + +SET @@optimizer_switch= @save_switch; +DROP TABLE t1, t2, t3; + +--echo # === modified file 'sql/sql_join_cache.cc' --- a/sql/sql_join_cache.cc 2011-08-17 07:32:04 +0000 +++ b/sql/sql_join_cache.cc 2011-10-26 13:23:01 +0000 @@ -364,6 +364,7 @@ void JOIN_CACHE:: create_remaining_field copy->field= 0; copy->referenced_field_no= 0; copy->get_rowid= NULL; + tab->copy_current_rowid= copy; length+= copy->length; data_field_count++; copy++; === modified file 'sql/sql_select.cc' --- a/sql/sql_select.cc 2011-10-25 10:28:05 +0000 +++ b/sql/sql_select.cc 2011-10-26 13:23:01 +0000 @@ -19294,7 +19294,12 @@ sub_select(JOIN *join,JOIN_TAB *join_tab /* Materialize table prior reading it */ if (join_tab->materialize_table && !join_tab->table->pos_in_table_list->materialized) + { error= (*join_tab->materialize_table)(join_tab); + // Bind to the rowid buffer managed by the TABLE object. + if (join_tab->copy_current_rowid) + join_tab->copy_current_rowid->bind_buffer(join_tab->table->file->ref); + } if (!error) error= (*join_tab->read_first_record)(join_tab); === modified file 'sql/sql_select.h' --- a/sql/sql_select.h 2011-10-25 10:28:05 +0000 +++ b/sql/sql_select.h 2011-10-26 13:23:01 +0000 @@ -333,6 +333,8 @@ inline bool sj_is_materialize_strategy(u */ enum quick_type { QS_NONE, QS_RANGE, QS_DYNAMIC_RANGE}; +struct st_cache_field; + typedef struct st_join_table : public Sql_alloc { st_join_table(); @@ -488,9 +490,11 @@ public: /* Used by DuplicateElimination. tab->table->ref must have the rowid - whenever we have a current record. + whenever we have a current record. copy_current_rowid needed because + we cannot bind to the rowid buffer before the table has been opened. */ int keep_current_rowid; + st_cache_field *copy_current_rowid; /* NestedOuterJoins: Bitmap of nested joins this table is part of */ nested_join_map embedding_map; @@ -644,6 +648,7 @@ st_join_table::st_join_table() found_match(FALSE), keep_current_rowid(0), + copy_current_rowid(NULL), embedding_map(0) { /** @@ -691,6 +696,8 @@ typedef struct st_cache_field { /* The remaining structure fields are used as containers for temp values */ uint blob_length; /**< length of the blob to be copied */ uint offset; /**< field offset to be saved in cache buffer */ + + void bind_buffer(uchar *buffer) { str= buffer; } } CACHE_FIELD; No bundle (reason: useless for push emails).