List:Commits« Previous MessageNext Message »
From:Ole John Aske Date:February 22 2011 3:44pm
Subject:bzr push into mysql-trunk branch (ole.john.aske:3701 to 3702) Bug#11766256
View as plain text  
 3702 Ole John Aske	2011-02-22
      Addendum to fix for bug#11766256: Added missing include files for MTR test.

    added:
      mysql-test/include/check_qep.inc
      mysql-test/include/expect_qep.inc
 3701 Ole John Aske	2011-02-22
      Fix for Bug#11766256: GREEDY OPTIMIZER PRODUCE STUPID QUERY EXECUTION PLANS
      
      This a a collection of the former 3 patches for this bug:
      
        - Patch1: Fixed several defects in the greedy optimization
          (http://lists.mysql.com/commits/129562)
      
        - Patch2: Maintains order of best_ref[]
          (http://lists.mysql.com/commits/130054)
      
        - Patch3: Refactoring and cleanup related to prev patches.
          (http://lists.mysql.com/commits/130055)
      
      On the request from the reviewers these patches has been collected
      into a single changeset for review.
      
      PATCH1:
      =======
      Fixed several defects in the greedy optimization:
      
      1) The greedy optimizer calculate the 'compare-cost' (CPU-cost)
         for iterating over the partial plan result at each level in
         the query plan as 'record_count / (double) TIME_FOR_COMPARE'
      
         This cost was only used localy for 'best' calculation at each
         level, and *not* accumulated into the total cost for the query plan.
      
         This fix add the 'CPU-cost' of processing 'current_record_count'
         records at each level to 'current_read_time' *before* it is used as
         'accumulated cost' argument to recursive 
         best_extension_by_limited_search() calls. This ensures that the
         cost of a huge join-fanout early in the QEP is correctly
         reflected in the cost of the final QEP.
      
         To get identical cost for a 'best' optimized query and a
         straight_join with the same join order, the same change was also
         applied to optimize_straight_join() and get_partial_join_cost()
      
      2) Furthermore to get equal cost for 'best' optimized query and a
         straight_join we had to subtract the same '0.001' in optimize_straight_join()
         as we already do in best_extension_by_limited_search()
      
      3) When best_extension_by_limited_search() aggregated the 'best' plan a
         plan was 'best' by the check :
      
         'if ((search_depth == 1) || (current_read_time < join->best_read))'
      
         The term '(search_depth == 1' incorrectly caused a new best plan to be
         collected whenever we reached the specified 'search_depth' - Even if
         this partial query plan was more expensive than what we had already
         found.
      
          ... See further comment and pure numbers for this in bugreport.
      
      PATCH2:
      ======
      In order for the greedy optimizers 'prune' logic to quickly find a
      'good' execution plan, and prune the other less promissing plans, best_ref[]
      is sorted by E(#rows) before we start calculating query plans.
      
      However, due to how swap_variables() was used inside
      best_extension_by_limited_search(), best_ref[] quickly
      became 'scrambled', and was no longer sorted when
      multiple partial plans had been evaluated.
      
      This changeset maintains the order of the unevaluated part of
      best_ref[]. Besides reducing time to find the 'best' query plan,
      this also reduces the risk for incorrectly pruning away the optimal 
      query plan.
      
      This fix also include 'TIME_FOR_COMPARE' as part of the 'cost'
      considdered when less promissing query plans:
      
       - This 'compare cost' is included everywhere else in the
         best-calculations.
       - Try to run the extended 'greedy_optimizer' test wo/ and 
         it demonstrate how it fails to prune 'bad' plans -> and likely tiemouts.
      
      PATCH3:
      ======
      Refactoring cleanup related the above patches for this bug:
      
       - Added 'TIME_FOR_COMPARE' to current_read_time where
         it is initially calculated - Removed later adding of
         TIME_FOR_COMPARE whenever it is used.
      
       - Used local variable '*position' instead of 'join->positions + idx'
         a few places.
      
       - Replaced a few references to 'restore_prev_nj_state()'
         with 'backout_nj_sj_state()' in comments.
         (Has previously been renamed)
      
      This patch contains no (intentional) changes of logic.
                                                               
     @ mysql-test/r/greedy_optimizer.result
        Updated resultfile after greedy optimizer changes. 
        Don't care about increased 'cost' as this is a relative number.
        
        However, several 'Total_handler_reads' has decreased, and none
        increased - Which should be a fairly good indication of this fix
        improving the existing query plan.
        
        Also added a lots of new tests which will fail wo/ this fix.
     @ mysql-test/r/join.result
        Accepted udated 'cost'
     @ mysql-test/r/join_cache_jcl1.result
        Accepted new query plan which looks better as ALL is joined in later.
        (Decreased fanout)
     @ mysql-test/r/join_cache_jcl2.result
        Accepted new query plan which looks better as ALL is joined in later.
        (Decreased fanout)
     @ mysql-test/r/join_cache_jcl3.result
        Accepted new query plan which looks better as ALL is joined in later.
        (Decreased fanout)
     @ mysql-test/r/join_cache_jcl4.result
        Accepted new query plan which looks better as ALL is joined in later.
        (Decreased fanout)
     @ mysql-test/r/status.result
        Accepted udated 'cost'
     @ mysql-test/r/subquery_sj_none.result
        ALL/REF -joining tables with most rows later improved the query plan
        (Decreased fanout).
        
        Same for doing REF after EQ_REF in the last query (decreased fanout)

    modified:
      mysql-test/r/greedy_optimizer.result
      mysql-test/r/join.result
      mysql-test/r/join_cache_jcl1.result
      mysql-test/r/join_cache_jcl2.result
      mysql-test/r/join_cache_jcl3.result
      mysql-test/r/join_cache_jcl4.result
      mysql-test/r/status.result
      mysql-test/r/subquery_sj_none.result
      mysql-test/t/greedy_optimizer.test
      sql/sql_select.cc
=== added file 'mysql-test/include/check_qep.inc'
--- a/mysql-test/include/check_qep.inc	1970-01-01 00:00:00 +0000
+++ b/mysql-test/include/check_qep.inc	2011-02-22 15:44:04 +0000
@@ -0,0 +1,50 @@
+# include/check_qep.inc
+#
+# SUMMARY
+#
+#    Designed to be used together with include/check_qep.inc
+#
+#    $query should be assigned a select statement using 
+#    straight_join to force the tables to be joined in most 
+#    optimal order.
+#
+#    expect_qep.inc will then store the estimated 'Last_query_cost'
+#    and total # 'Handler_read%' for this straight_joined query.
+#
+#    We should then assign a non-straight_join'ed version of
+#    the same query to $query and execute it using 
+#    'include/check_qep.inc'. Its estimated cost and
+#    #handler_reads will then be verified against the
+#    previous straight_joined query.
+#
+# USAGE
+#
+#    let $query= <select straight_join optimal statement>;
+#    --source include/expect_qep.inc
+#    let $query= <select statement>;
+#    --source include/check_qep.inc
+#
+# EXAMPLE
+#    t/greedy_optimizer.test
+#
+
+flush status;
+eval EXPLAIN $query;
+eval $query;
+
+let $cost=
+ query_get_value(SHOW STATUS LIKE 'Last_query_cost', Value, 1);
+
+let $reads=
+`select sum(variable_value)
+   from information_schema.session_status
+   where VARIABLE_NAME like 'Handler_read%'`;
+
+#echo Cost: $cost, Handler_reads: $reads;
+
+if ($cost != $best_cost)
+{ echo ### FAILED: Query_cost: $cost, expected: $best_cost ###;
+}
+if ($reads != $best_reads)
+{ echo ### FAILED: Handler_reads: $reads, expected: $best_reads ###;
+}

=== added file 'mysql-test/include/expect_qep.inc'
--- a/mysql-test/include/expect_qep.inc	1970-01-01 00:00:00 +0000
+++ b/mysql-test/include/expect_qep.inc	2011-02-22 15:44:04 +0000
@@ -0,0 +1,44 @@
+# include/expect_qep.inc
+#
+# SUMMARY
+#
+#    Designed to be used together with include/check_qep.inc
+#
+#    $query should be assigned a select statement using 
+#    straight_join to force the tables to be joined in most 
+#    optimal order.
+#
+#    expect_qep.inc will then store the estimated 'Last_query_cost'
+#    and total # 'Handler_read%' for this straight_joined query.
+#
+#    We should then assign a non-straight_join'ed version of
+#    the same query to $query and execute it using 
+#    'include/check_qep.inc'. Its estimated cost and
+#    #handler_reads will then be verified against the
+#    previous straight_joined query.
+#
+# USAGE
+#
+#    let $query= <select straight_join optimal statement>;
+#    --source include/expect_qep.inc
+#    let $query= <select statement>;
+#    --source include/check_qep.inc
+#
+# EXAMPLE
+#    t/greedy_optimizer.test
+#
+
+flush status;
+eval EXPLAIN $query;
+eval $query;
+
+let $best_cost=
+  query_get_value(SHOW STATUS LIKE 'Last_query_cost', Value, 1);
+
+let $best_reads=
+`select sum(variable_value)
+   from information_schema.session_status
+   where VARIABLE_NAME like 'Handler_read%'`;
+
+#echo Expect, cost: $best_cost, Handler_reads: $best_reads;
+

No bundle (reason: useless for push emails).
Thread
bzr push into mysql-trunk branch (ole.john.aske:3701 to 3702) Bug#11766256Ole John Aske22 Feb