3830 Hemant Kumar 2012-03-22
Making rpl.rpl_report_port test experimental.
3829 Guilhem Bichot 2012-03-22
Fix for BUG#13685026 ASSERTION CUR_SJ_INNER_TABLES == 0 IN
cur_sj_inner_tables was not properly computed when
the search depth is less than the number of tables.
Scenario of the bug follows.
Search depth is 4, we have 5 tables (one outer "outr", and 4 sj-inner
When search depth is >=5, we explore plans by traversing a tree of
plans (of depth 5, because we have 5 tables). As we go from the root (no
tables in the plan) to nodes (add more tables), we modify
cur_sj_inner_tables, to represent the open semijoin nests in this
partial plan; when we reach leaves, we have no open nests. When we go
back towards the root, we enter and leave open nests, so we modify
cur_sj_inner_tables; when we're back to the root (have explored all
plans), we have no open nests, cur_sj_inner_tables is back to 0.
With search depth =4, things work differently.
We walk down the tree with depth 4. We select the best 4-table plan,
which is alias1/4/5/3. We thus decide that alias1 will be the first
table in the complete 5-table not-yet-determined plan.
So we put alias1 as first table (search code for comment "select the
first table in the optimal extension as most promising").
And we then explore, with a depth of 4, the sub-tree rooted at
alias1. We explore plan alias1/alias4/5/3/outr and choose
it, then we explore plan alias1/alias5 and reject it for cost
reasons. This has the following problems:
- cur_sj_inner_tables is 0 when we start exploration of the sub-tree
(this is the first oddity: alias1 is part of the plan, so the semijoin
nest is open, but cur_sj_inner_tables is 0!)
- we add alias5 ("push" it): this adds bits to cur_sj_inner_tables
- we discard the plan, so we "pop" alias5: backout_nj_sj_state()
does *not* rollback cur_sj_inner_tables to 0 because it does so only
when popping the *first* table of the nest (here, alias1); as alias1
is not popped in this algorithm (as it's "forever glued to the plan's
first cell"), cur_sj_inner_tables remains non-zero (second oddity)
- we come to the end of choose_plan(), and the assertion fails.
The chosen solution is:
- notice similarity between POSITION::dups_producing_tables and
- notice that POSITION::dups_producing_tables has no problem because
it is attached to POSITION, and thus needs no "backing out" and is
preserved when we stick alias1 into the first cell of the final plan:
both oddities disappear.
- thus, delete cur_sj_inner_tables and use dups_producing_tables
instead. It makes one less state member to maintain.
This required a few small changes:
- dups_producing_tables was left uninitialized (i.e. random) if the
query had no semijoin; as this patch uses this variable more (for
example in the call to Loose_scan_opt.init(), which runs even if we
have no semijoin nest), we need to have it always set, so we
initialize it to 0 in the constructor of POSITION.
- we now set sj_strategy to SJ_OPT_NONE in this same constructor,
which allows removing three "else join->positions[idx].sj_strategy=
SJ_OPT_NONE" branches from sql_planner.cc
- the assertion on cur_sj_inner_tables==0 is now on
dups_producing_tables of the last POSITION of the plan; it shouldn't
be tested if the plan is partial (where it's legal to have inner
tables not yet handled with some strategy as we haven't seen all
I wondered if cur_embedding_map has the same bug. The answer is that
it used to have it, but that was fixed, see BUG#38795 (search for "Update
the interleaving state after extending the current partial plan with a
new table" in code). Thus, an assertion on this variable is added to
Now that cur_sj_inner_tables is gone, backout_nj_sj_state() does not
do anything related to semijoin, so is renamed to backout_nj_state().
And Loose_scan_opt::init(), which does not really need the map of inner
tables but only needs to know if there are any, has his
cur_sj_inner_tables argument changed to bool.
Actually, dups_producing_tables and cur_sj_inner_tables are not
exactly the same thing. When we have an inside-out plan (possibly
suitable for loosescan, semijoin materialization scan, duplicates
weedout), have passed the inner tables but not yet the outer tables,
cur_sj_inner_tables is 0 (we're in no nest) but
dups_producing_tables!=0 (because the passed inner tables are not yet
handled with some strategy, as we need to see outer tables before
making a decision).
In advance_sj_state(), Firstmatch or Loosescan start a potential range
if they see that cur_sj_inner_tables==0. Let's consider the following
example of inside-out scenario: a query with two nests:
select * from ot1,ot2 where ot1 SJ it1 and (ot1,ot2) SJ it2,
and this order of tables:
it2 is before ot1 and ot2 so this plan starts "inside-out".
Let's assume that advance_sj_state() is at it1.
Firstmatch is chosen for it1 (cur_sj_inner_tables==0). Then
advance_sj_state() comes to ot2, and, in order to handle it2, chooses
some inside-out strategy, for example duplicates weedout. So we have a
dups-weedout range from it2 to ot2, and inside it, a firstmatch range
Then fix_semijoin_strategies(), taking
tables from last to first, will see that ot2 has dups-weedout, and
will clear any strategy in the range [it2,ot2]: firstmatch will be
wiped out (see end of fix_semijoin_strategies()) (I have verified this
with a debugger and a real query). So the end result is
that this firstmatch choice in advance_sj_state() was useless:
implementation is not ready for overlapping ranges of
tables. The same is true of setup_semijoin_dups_elimination() (see how
it "leaps" by lengths of pos->n_sj_tables, i.e. leaps from range's start
to range's end, not setting up any strategy for any contained
After this patch, Firstmatch will not be chosen for it1 in
advance_sj_state(), because firstmatch will test
dups_producing_tables==0 instead of cur_sj_inner_tables==0, and
dups_producing_tables!=0 between inner and outer tables. The useless
choice will not be made. It is a very good thing, because before this
patch, the fact that firstmatch declared that it would handle it1,
whereas it would be cancelled later in fix_semijoin_strategies(), was
sort of "lying" to the further calls to advance_sj_state() for ot2.
During debugging, it was observed that the optimizer trace gives
confusing output when search depth is not infinite. Indeed, searching
is done like this:
a) explore 4-table plans (shown in trace)
b) pick first table of best 4-table plan (not shown in trace)
c) explore 4-table plans rooted at this first table: shown in trace,
but the fact that they start with the first table is not shown.
Thus the output gets confusing: (a) could show plan t1-t2-t3-t4, then
(c) could show plan t2-t3-t4-t5, giving the impression that this plan
was chosen over t1-t2-t3-t4, while in fact it's t1-t2-t3-t4-t5 which
is chosen. This confusion is fortunately removed by printing the "plan
prefix" when we add a table to the partial plan; this will be done in
a separate patch (bug 13799348).
=== modified file 'mysql-test/collections/default.experimental'
--- a/mysql-test/collections/default.experimental 2012-03-22 01:52:59 +0000
+++ b/mysql-test/collections/default.experimental 2012-03-22 11:34:33 +0000
@@ -23,6 +23,7 @@ rpl.rpl_row_sp011 @solaris
rpl.rpl_seconds_behind_master # BUG#11765124 2010-11-24 luis fails sporadically on pb2
rpl.rpl_show_slave_running # BUG#12346048 2011-04-11 sven fails sporadically on pb2
rpl.rpl_parallel_switch_sequential # BUG#13626976 2012-02-07 dlenev Fails sporadically on pb2
+rpl.rpl_report_port # BUG#13812374 2012-03-22 Hemant Fails after the post fix for bug#13812374
sys_vars.max_sp_recursion_depth_func @solaris # Bug#11753919 2010-01-20 alik Several test cases fail on Solaris with error Thread stack overrun
No bundle (reason: useless for push emails).
|• bzr push into mysql-trunk branch (hemant.hk.kumar:3829 to 3830) ||Hemant Kumar||22 Mar|