List:Commits« Previous MessageNext Message »
From:Hemant Kumar Date:March 22 2012 11:36am
Subject:bzr push into mysql-trunk branch (hemant.hk.kumar:3829 to 3830)
View as plain text  
 3830 Hemant Kumar	2012-03-22
      Making rpl.rpl_report_port test experimental.

    modified:
      mysql-test/collections/default.experimental
 3829 Guilhem Bichot	2012-03-22
      Fix for BUG#13685026 ASSERTION CUR_SJ_INNER_TABLES == 0 IN
      OPTIMIZE_TABLE_ORDER::CHOOSE_TABLE_ORDER:
      cur_sj_inner_tables was not properly computed when
      the search depth is less than the number of tables.
     @ sql/sql_planner.cc
        Scenario of the bug follows.
        Search depth is 4, we have 5 tables (one outer "outr", and 4 sj-inner
        "aliasN").
        
        When search depth is >=5, we explore plans by traversing a tree of
        plans (of depth 5, because we have 5 tables). As we go from the root (no
        tables in the plan) to nodes (add more tables), we modify
        cur_sj_inner_tables, to represent the open semijoin nests in this
        partial plan; when we reach leaves, we have no open nests. When we go
        back towards the root, we enter and leave open nests, so we modify
        cur_sj_inner_tables; when we're back to the root (have explored all
        plans), we have no open nests, cur_sj_inner_tables is back to 0.
        
        With search depth =4, things work differently.
        We walk down the tree with depth 4. We select the best 4-table plan,
        which is alias1/4/5/3. We thus decide that alias1 will be the first
        table in the complete 5-table not-yet-determined plan.
        So we put alias1 as first table (search code for comment "select the
        first table in the optimal extension as most promising").
        And we then explore, with a depth of 4, the sub-tree rooted at
        alias1. We explore plan alias1/alias4/5/3/outr and choose
        it, then we explore plan alias1/alias5 and reject it for cost
        reasons. This has the following problems:
        - cur_sj_inner_tables is 0 when we start exploration of the sub-tree
        (this is the first oddity: alias1 is part of the plan, so the semijoin
        nest is open, but cur_sj_inner_tables is 0!)
        - we add alias5 ("push" it): this adds bits to cur_sj_inner_tables
        - we discard the plan, so we "pop" alias5: backout_nj_sj_state()
        does *not* rollback cur_sj_inner_tables to 0 because it does so only
        when popping the *first* table of the nest (here, alias1); as alias1
        is not popped in this algorithm (as it's "forever glued to the plan's
        first cell"), cur_sj_inner_tables remains non-zero (second oddity)
        - we come to the end of choose_plan(), and the assertion fails.
        
        The chosen solution is:
        - notice similarity between POSITION::dups_producing_tables and
        cur_sj_inner_tables
        - notice that POSITION::dups_producing_tables has no problem because
        it is attached to POSITION, and thus needs no "backing out" and is
        preserved when we stick alias1 into the first cell of the final plan:
        both oddities disappear.
        - thus, delete cur_sj_inner_tables and use dups_producing_tables
        instead. It makes one less state member to maintain.
        
        This required a few small changes:
        - dups_producing_tables was left uninitialized (i.e. random) if the
        query had no semijoin; as this patch uses this variable more (for
        example in the call to Loose_scan_opt.init(), which runs even if we
        have no semijoin nest), we need to have it always set, so we
        initialize it to 0 in the constructor of POSITION.
        - we now set sj_strategy to SJ_OPT_NONE in this same constructor,
        which allows removing three "else join->positions[idx].sj_strategy=
        SJ_OPT_NONE" branches from sql_planner.cc
        - the assertion on cur_sj_inner_tables==0 is now on
        dups_producing_tables of the last POSITION of the plan; it shouldn't
        be tested if the plan is partial (where it's legal to have inner
        tables not yet handled with some strategy as we haven't seen all
        tables).
        
        I wondered if cur_embedding_map has the same bug. The answer is that
        it used to have it, but that was fixed, see BUG#38795 (search for "Update
        the interleaving state after extending the current partial plan with a
        new table" in code). Thus, an assertion on this variable is added to
        consider_complete_plan().
        
        Now that cur_sj_inner_tables is gone, backout_nj_sj_state() does not
        do anything related to semijoin, so is renamed to backout_nj_state().
        And Loose_scan_opt::init(), which does not really need the map of inner
        tables but only needs to know if there are any, has his
        cur_sj_inner_tables argument changed to bool.
        
        Actually, dups_producing_tables and cur_sj_inner_tables are not
        exactly the same thing. When we have an inside-out plan (possibly
        suitable for loosescan, semijoin materialization scan, duplicates
        weedout), have passed the inner tables but not yet the outer tables,
        cur_sj_inner_tables is 0 (we're in no nest) but
        dups_producing_tables!=0 (because the passed inner tables are not yet
        handled with some strategy, as we need to see outer tables before
        making a decision).
        In advance_sj_state(), Firstmatch or Loosescan start a potential range
        if they see that cur_sj_inner_tables==0. Let's consider the following
        example of inside-out scenario: a query with two nests:
        select * from ot1,ot2 where ot1 SJ it1 and (ot1,ot2) SJ it2,
        and this order of tables:
        it2
        ot1
        it1
        ot2
        it2 is before ot1 and ot2 so this plan starts "inside-out".
        Let's assume that advance_sj_state() is at it1.
        Firstmatch is chosen for it1 (cur_sj_inner_tables==0). Then
        advance_sj_state() comes to ot2, and, in order to handle it2, chooses
        some inside-out strategy, for example duplicates weedout. So we have a
        dups-weedout range from it2 to ot2, and inside it, a firstmatch range
        (it1).
        Then fix_semijoin_strategies(), taking
        tables from last to first, will see that ot2 has dups-weedout, and
        will clear any strategy in the range [it2,ot2]: firstmatch will be
        wiped out (see end of fix_semijoin_strategies()) (I have verified this
        with a debugger and a real query). So the end result is
        that this firstmatch choice in advance_sj_state() was useless:
        implementation is not ready for overlapping ranges of 
        tables. The same is true of setup_semijoin_dups_elimination() (see how
        it "leaps" by lengths of pos->n_sj_tables, i.e. leaps from range's start
        to range's end, not setting up any strategy for any contained
        subrange).
        After this patch, Firstmatch will not be chosen for it1 in
        advance_sj_state(), because firstmatch will test
        dups_producing_tables==0 instead of cur_sj_inner_tables==0, and
        dups_producing_tables!=0 between inner and outer tables. The useless
        choice will not be made. It is a very good thing, because before this
        patch, the fact that firstmatch declared that it would handle it1,
        whereas it would be cancelled later in fix_semijoin_strategies(), was
        sort of "lying" to the further calls to advance_sj_state() for ot2.
        
        During debugging, it was observed that the optimizer trace gives
        confusing output when search depth is not infinite. Indeed, searching
        is done like this:
        a) explore 4-table plans (shown in trace)
        b) pick first table of best 4-table plan (not shown in trace)
        c) explore 4-table plans rooted at this first table: shown in trace,
        but the fact that they start with the first table is not shown.
        Thus the output gets confusing: (a) could show plan t1-t2-t3-t4, then
        (c) could show plan t2-t3-t4-t5, giving the impression that this plan
        was chosen over t1-t2-t3-t4, while in fact it's t1-t2-t3-t4-t5 which
        is chosen. This confusion is fortunately removed by printing the "plan
        prefix" when we add a table to the partial plan; this will be done in
        a separate patch (bug 13799348).
     @ sql/sql_planner.h
        removed cur_sj_inner_tables

    modified:
      mysql-test/include/subquery_sj.inc
      mysql-test/r/subquery_sj_all.result
      mysql-test/r/subquery_sj_all_bka.result
      mysql-test/r/subquery_sj_all_bka_nixbnl.result
      mysql-test/r/subquery_sj_all_bkaunique.result
      mysql-test/r/subquery_sj_dupsweed.result
      mysql-test/r/subquery_sj_dupsweed_bka.result
      mysql-test/r/subquery_sj_dupsweed_bka_nixbnl.result
      mysql-test/r/subquery_sj_dupsweed_bkaunique.result
      mysql-test/r/subquery_sj_firstmatch.result
      mysql-test/r/subquery_sj_firstmatch_bka.result
      mysql-test/r/subquery_sj_firstmatch_bka_nixbnl.result
      mysql-test/r/subquery_sj_firstmatch_bkaunique.result
      mysql-test/r/subquery_sj_loosescan.result
      mysql-test/r/subquery_sj_loosescan_bka.result
      mysql-test/r/subquery_sj_loosescan_bka_nixbnl.result
      mysql-test/r/subquery_sj_loosescan_bkaunique.result
      mysql-test/r/subquery_sj_mat.result
      mysql-test/r/subquery_sj_mat_bka.result
      mysql-test/r/subquery_sj_mat_bka_nixbnl.result
      mysql-test/r/subquery_sj_mat_bkaunique.result
      mysql-test/r/subquery_sj_mat_nosj.result
      mysql-test/r/subquery_sj_none.result
      mysql-test/r/subquery_sj_none_bka.result
      mysql-test/r/subquery_sj_none_bka_nixbnl.result
      mysql-test/r/subquery_sj_none_bkaunique.result
      sql/sql_planner.cc
      sql/sql_planner.h
      sql/sql_select.h
=== modified file 'mysql-test/collections/default.experimental'
--- a/mysql-test/collections/default.experimental	2012-03-22 01:52:59 +0000
+++ b/mysql-test/collections/default.experimental	2012-03-22 11:34:33 +0000
@@ -23,6 +23,7 @@ rpl.rpl_row_sp011  @solaris
 rpl.rpl_seconds_behind_master            # BUG#11765124 2010-11-24 luis fails sporadically on pb2
 rpl.rpl_show_slave_running               # BUG#12346048 2011-04-11 sven fails sporadically on pb2
 rpl.rpl_parallel_switch_sequential       # BUG#13626976 2012-02-07 dlenev Fails sporadically on pb2
+rpl.rpl_report_port                      # BUG#13812374 2012-03-22 Hemant Fails after the post fix for bug#13812374 
 
 sys_vars.max_sp_recursion_depth_func @solaris # Bug#11753919 2010-01-20 alik Several test cases fail on Solaris with error Thread stack overrun
 

No bundle (reason: useless for push emails).
Thread
bzr push into mysql-trunk branch (hemant.hk.kumar:3829 to 3830) Hemant Kumar22 Mar