List:Commits« Previous MessageNext Message »
From:Olav Sandstaa Date:April 5 2011 11:16am
Subject:Re: bzr commit into mysql-trunk branch (ole.john.aske:3552) Bug#41740
Bug#58225 Bug#59326
View as plain text  
Hi Ole John,

Thanks for making this patch. I think this is a very good idea and a 
good approach for reducing the time the optimizer takes for running 
greedy search when there is a lot of eq_ref joined tables. This is 
clearly shown by the test case you have added. Without the patch my run 
of the greedy_optimizer test timed out after about 5 hours running. With 
the patch it takes about 3 seconds.... Thanks also for adding a lot of 
good documentation to the new code.

The code looks correct and solves the problem. My main concern with it 
is the increased complexity of our join optimizer code. The greedy 
optimizer is already fairly complex. Adding another recursive algorithm 
inside the existing recursive search makes it even more complex. I think 
it would be good to hear what other thinks about this increased 
complexity. If others think it is OK then I am also OK with it.

Besides that I have mostly only minor comments, see inline.

Olav


On 02/ 1/11 10:43 AM, Ole John Aske wrote:
> #At file:///net/fimafeng09/export/home/tmp/oleja/mysql/mysql-trunk/ based on
> revid:ole.john.aske@stripped
>
>   3552 Ole John Aske	2011-02-01
>        Another addendum fix related to bug#59326, and particularly
>        its relatives/duplicates bug#41740 and bug#58225.
>
>        Fixes the problem that complex queries containing a lot of EQ_REF'ed
>        joined tables takes 'forever' to produce a query plan.
>
>
>        The fix depends on the fact that when a table is
>        joined by an unique key there is a 1::1 relation
>        between the rows being joined. Assuming we have multiple
>        such 1::1 (star-)joined relations in a sequence,
>        without other join types inbetween. Then all of
>        these 'eq_ref-joins' will be calculated to return the excact
>        same #rows and having identical 'cost' (or 'read_time').
>
>        This leads to that we can append such a contigous sequence
>        of eq_ref-joins to a partial plan in any order without
>        affecting the total cost of the query plan. Exploring the
>        different permutations of these eq_refs in the 'greedy'
>        optimizations will simply be a waste of precious CPU cycles.
>
>        This chain of eq_refs can further be handled as a single
>        entity wrt. the full 'greedy' exploration of the possible
>        join plans. This will reduce the 'N' in the O(N!) complexity
>        of the full greedy search.
>
>        The function eq_ref_extension_by_limited_search() has been
>        implemented doing as described above. Once an EQ_REF joined
>        table is encountered, it is invoked to append any remaining
>        EQ_REFs to the join plan. This chain of EQ_REFs will then
>        act like a single entity in the further exploration of possible
>        join plans.
>
>        It should be noted that best_extension_by_limited_search()
>        will still explore the other plans starting with non-EQ_REF
>        operations as possibly better alternatives to the EQ_REF-extended
>        query plan.
>
>        Usage of eq_ref_extension_by_limited_search() is
>        controlled by 'optimizer_prune_level=1'.
>
>        This fix also introduce the function plan_is_complete() which
>        contains common functionality between best_extension_by_limited_search()
>        and eq_ref_extension_by_limited_search()
>
>        greedy_optimizer.test has been extensively extended with new testcases
>        containing lots of EQ_REF's.
>
>        No existing test/query plans are expected to change as result of this fix.
>        It should only produce the 'best' plan (a lot!) faster. - Like the
>        testcase for bug#58225 which now completes the QEP in ~0.05s in my
>        DEBUG compiled sandbox (With 'optimizer_search_depth= 62'). This
>        previously took  63min+42.437s :-o
>
>      modified:
>        mysql-test/r/greedy_optimizer.result
>        mysql-test/t/greedy_optimizer.test
>        sql/sql_select.cc
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc	2011-01-31 15:10:55 +0000
> +++ b/sql/sql_select.cc	2011-02-01 09:43:41 +0000
> @@ -90,6 +90,11 @@ static bool best_extension_by_limited_se
>                                                uint idx, double record_count,
>                                                double read_time, uint depth,
>                                                uint prune_level);
> +static table_map eq_ref_extension_by_limited_search(JOIN *join,
> +                                               table_map remaining_tables,
> +                                               uint idx, double record_count,
> +                                               double read_time, uint depth,
> +                                               uint prune_level);
>   static uint determine_search_depth(JOIN* join);
>   C_MODE_START
>   static int join_tab_cmp(const void *dummy, const void* ptr1, const void* ptr2);
> @@ -7877,10 +7882,12 @@ semijoin_order_allows_materialization(co
>       worst-case complexity of this algorithm is<=
>       O(N*N^search_depth/search_depth). When serch_depth>= N, then the
>       complexity of greedy_search is O(N!).
> +    'N' is the number of 'non eq_ref' tables + 'eq_ref groups' which normally
> +    are considerable less than total numbers of tables in the query.
>
>     @par
>       In the future, 'greedy_search' might be extended to support other
> -    implementations of 'best_extension', e.g. some simpler quadratic procedure.
> +    implementations of 'best_extension'.
>
>     @param join             pointer to the structure providing all context info
>                             for the query
> @@ -8035,6 +8042,47 @@ void get_partial_join_cost(JOIN *join, u
>   }
>
>
> +/*

Doxygen comment should start with /**

> +  A completed (partial) plan has been found.
> +
> +  'join' is either a partial QEP with 'search_depth' relations,
> +   or the best complete QEP so far, whichever is smaller.
> +
> +  If this is our 'best' plan explored sofar, we record this
> +  query plan and its cost.

Minor suggestion (feel free to ignore). add  doxygen comment for the 
arguments to the function

> +*/
> +static void
> +plan_is_complete(JOIN      *join,
> +                 uint      idx,
> +                 double    record_count,
> +                 double    read_time)
> +{
> +  /*
> +    We may have to make a temp table, note that this is only a
> +    heuristic since we cannot know for sure at this point.
> +    Hence it may be wrong.
> +  */
> +  if (join->sort_by_table&&
> +      join->sort_by_table !=
> +      join->positions[join->const_tables].table->table)
> +  {
> +    read_time+= record_count;
> +  }
> +
> +  if (read_time<  join->best_read)
> +  {
> +    memcpy((uchar*) join->best_positions, (uchar*) join->positions,
> +            sizeof(POSITION) * (idx + 1));
> +    join->best_read= read_time - 0.001;  // Why?
> +  }
> +  DBUG_EXECUTE("opt", print_plan(join, idx+1,
> +                                 record_count,
> +                                 read_time,
> +                                 read_time,
> +                                 "full_plan"););
> +}
> +
> +
>   /**
>     Find a good, possibly optimal, query execution plan (QEP) by a possibly
>     exhaustive search.

* extension_by_limited_search():

   -the documentation / pseudo code must be updated


> @@ -8182,6 +8230,7 @@ best_extension_by_limited_search(JOIN
>       allowed_tables= join->emb_sjm_nest->sj_inner_tables;
>
>     bool has_sj= !join->select_lex->sj_nests.is_empty();
> +  table_map eq_ref_extended(0);

Suggestion: add a comment for explaining the "eq_ref_extended" variable

>
>     JOIN_TAB *s, **pos;
>     JOIN_TAB *saved_refs[MAX_TABLES];
> @@ -8200,6 +8249,7 @@ best_extension_by_limited_search(JOIN
>
>       if ((remaining_tables&  real_table_bit)&&
>           (allowed_tables&  real_table_bit)&&
> +        !(eq_ref_extended&  real_table_bit)&&
>           !(remaining_tables&  s->dependent)&&
>           (!idx || !check_interleaving_with_nj(s)))
>       {
> @@ -8279,46 +8329,76 @@ best_extension_by_limited_search(JOIN
>
>         if ( (search_depth>  1)&&  (remaining_tables& 
> ~real_table_bit)&  allowed_tables )
>         {
> -        /* Explore more best extensions of plan */
> -        if (best_extension_by_limited_search(join,
> -                                             remaining_tables& 
> ~real_table_bit,
> -                                             idx + 1,
> +        /* Explore more extensions of plan */
> +        do  // Not a real loop, 'break' used to simplifiy logic!

I understand why you have introduced this. Still I think it would be 
better if this "non-real" do loop (and the goto done) could have been 
eliminated. I think it solves a problem now, but this code will be very 
hard to make changes to later (hard to maintain).

> +        {
> +          /*
> +            If possible, use heuristic to avoid a full expansion of partial QEP.
> +            However, it can't be avoided if:
> +              1) Pruning is enabled.
> +              2) and, It is joined by (EQ_)REF key.
> +              3) and, It is a 1::1 relation.
> +          */

I found the above comment confusing. Mayby it was just me who did not 
understand it but I think it could have been made more clear?

> +          if (prune_level == 1&&                              // 1)
> +              position->key != NULL&&                         // 2)
> +              position->records_read<= 1.0)                  // 3)
> +          {
> +            /*
> +              Join in this 'position' is an EQ_REF-joined table, append more
> EQ_REFs.
> +              We do this only for the first EQ_REF we encounter which will then
> +              include other EQ_REFs from 'remaining_tables' and inform about which
> +              tables was 'eq_ref_extended'. These are later 'pruned' as they was
> +              processed here.
> +            */
> +            if (!eq_ref_extended)
> +            {
> +              /* Try an EQ_REF-joined expansion of the partial plan */
> +              eq_ref_extended= real_table_bit |
> +                eq_ref_extension_by_limited_search(
> +                                               join,
> +                                               remaining_tables& 
> ~real_table_bit,
> +                                               idx + 1,
> +                                               current_record_count,
> +                                               current_read_time,
> +                                               search_depth - 1,
> +                                               prune_level);
> +              if (eq_ref_extended == ~(table_map)0)
> +                DBUG_RETURN(TRUE);      // Failed
> +              if (eq_ref_extended == remaining_tables)
> +                goto done;
> +              break;                    // Skip 'best_extension...()
> +            }
> +            else       // Skip, as described above
> +            {
> +              DBUG_EXECUTE("opt", print_plan(join, idx+1,
>                                                current_record_count,
> +                                             read_time,
>                                                current_read_time,
> -                                             search_depth - 1,
> -                                             prune_level))
> -          DBUG_RETURN(TRUE);
> +                                             "pruned_by_eq_ref_heuristic"););
> +              break;                    // Skip 'best_extension...()
> +            }
> +          } // if (prunable...)
> +
> +          /* Fallthrough: Explore more best extensions of plan */
> +          if (best_extension_by_limited_search(join,
> +                                               remaining_tables& 
> ~real_table_bit,
> +                                               idx + 1,
> +                                               current_record_count,
> +                                               current_read_time,
> +                                               search_depth - 1,
> +                                               prune_level))
> +            DBUG_RETURN(TRUE);
> +        } while (false);
>         }
>         else
> -      { /*
> -          'join' is either the best partial QEP with 'search_depth' relations,
> -          or the best complete QEP so far, whichever is smaller.
> -        */
> -        if (join->sort_by_table&&
> -            join->sort_by_table !=
> -            join->positions[join->const_tables].table->table)
> -          /*
> -             We may have to make a temp table, note that this is only a
> -             heuristic since we cannot know for sure at this point.
> -             Hence it may be wrong.
> -          */
> -          current_read_time+= current_record_count;
> -        if (current_read_time<  join->best_read)
> -        {
> -          memcpy((uchar*) join->best_positions, (uchar*) join->positions,
> -                 sizeof(POSITION) * (idx + 1));
> -          join->best_read= current_read_time - 0.001;
> -        }
> -        DBUG_EXECUTE("opt", print_plan(join, idx+1,
> -                                       current_record_count,
> -                                       read_time,
> -                                       current_read_time,
> -                                       "full_plan"););
> +      {
> +        plan_is_complete(join, idx, current_record_count, current_read_time);

Thanks for factorizing this part out into a separate function.

>         }
>         backout_nj_sj_state(remaining_tables, s);
>       }
>     }
>
> +done:
>     // Restore previous #rows sorted best_ref[]
>     memcpy(join->best_ref + idx, saved_refs, sizeof(JOIN_TAB*) *
> (join->tables-idx));
>     DBUG_RETURN(FALSE);
> @@ -8326,6 +8406,281 @@ best_extension_by_limited_search(JOIN
>
>
>   /**
> +  Heuristic utility used by best_extension_by_limited_search().
> +  Adds EQ_REF-joined tables to the partial plan without
> +  extensive 'greedy' cost calculation.
> +
> +    When a table is joined by an unique key there is a
> +    1::1 relation between the rows being joined. Assuming we
> +    have multiple such 1::1 (star-)joined relations in a
> +    sequence, without other join types inbetween. Then all of
> +    these 'eq_ref-joins' will be estimated to return the excact
> +    same #rows and having identical 'cost' (or 'read_time').
> +
> +    This leads to that we can append such a contigous sequence
> +    of eq_ref-joins to a partial plan in any order without
> +    affecting the total cost of the query plan. Exploring the
> +    different permutations of these eq_refs in the 'greedy'
> +    optimizations will simply be a waste of precious CPU cycles.
> +
> +    Once we have appended a single eq_ref-join to a partial
> +    plan, we may use eq_ref_extension_by_limited_search() to search
> +    'remaining_tables' for more eq_refs which will form a contigous
> +    set of eq_refs in the QEP.
> +
> +    Effectively, this chain of eq_refs will be handled as a single
> +    entity wrt. the full 'greedy' exploration of the possible
> +    join plans. This will reduce the 'N' in the O(N!) complexity
> +    of the full greedy search.
> +
> +    The algorithm start by already having a eq_ref joined table
> +    in position[idx-1] when called. It then search for more
> +    eq_ref-joinable 'remaining_tables' which are added directly
> +    to the partial QEP without further cost analysis. The algorithm
> +    continues until it either has constructed a complete plan,
> +    constructed a partial plan with size = search_depth, or could not
> +    find more eq_refs to append.
> +
> +    In the later case the algorithm continues into
> +    'best_extension_by_limited_search' which does a 'greedy'
> +    search for the next table to add - Possibly with later
> +    eq_ref_extensions.
> +
> +    The final optimal plan is stored in 'join->best_positions'. The
> +    corresponding cost of the optimal plan is in 'join->best_read'.

I think you should only use two "spaces" in front of the comments (yes, 
I know the original code has the same formatting).

> +
> +    @code
> +    procedure eq_ref_extension_by_limited_search(
> +      pplan in,             // in, partial plan of tables-joined-so-far
> +      pplan_cost,           // in, cost of pplan
> +      remaining_tables,     // in, set of tables not referenced in pplan
> +      best_plan_so_far,     // in/out, best plan found so far
> +      best_plan_so_far_cost,// in/out, cost of best_plan_so_far
> +      search_depth)         // in, maximum size of the plans being considered
> +    {
> +      if find 'eq_ref' table T from remaining_tables
> +      {
> +        // Calculate the cost of using table T as above
> +        cost = complex-series-of-calculations;
> +
> +        // Add the cost to the cost so far.
> +        pplan_cost+= cost;
> +
> +        if (pplan_cost>= best_plan_so_far_cost)
> +          // pplan_cost already too great, stop search
> +          continue;
> +
> +        pplan= expand pplan by best_access_method;
> +        remaining_tables= remaining_tables - table T;
> +        eq_ref_extension_by_limited_search(pplan, pplan_cost,
> +                                           remaining_tables,
> +                                           best_plan_so_far,
> +                                           best_plan_so_far_cost,
> +                                           search_depth - 1);
> +      }
> +      else
> +      {
> +        best_extension_by_limited_search(pplan, pplan_cost,
> +                                         remaining_tables,
> +                                         best_plan_so_far,
> +                                         best_plan_so_far_cost,
> +                                         search_depth - 1);
> +      }
> +    }
> +    @endcode

Thanks for adding a very good explanation for this new strategy!

> +
> +  @note
> +    The parameter 'search_depth' provides control over the recursion
> +    depth, and thus the size of the resulting optimal plan.
> +
> +  @param join             pointer to the structure providing all context info
> +                          for the query
> +  @param remaining_tables set of tables not included into the partial plan yet
> +  @param idx              length of the partial QEP in 'join->positions';
> +                          since a depth-first search is used, also corresponds
> +                          to the current depth of the search tree;
> +                          also an index in the array 'join->best_ref';
> +  @param record_count     estimate for the number of records returned by the
> +                          best partial plan
> +  @param read_time        the cost of the best partial plan
> +  @param search_depth     maximum depth of the recursion and thus size of the
> +                          found optimal plan
> +                          (0<  search_depth<= join->tables+1).
> +  @param prune_level      pruning heuristics that should be applied during
> +                          optimization
> +                          (values: 0 = EXHAUSTIVE, 1 = PRUNE_BY_TIME_OR_ROWS)
> +
> +  @retval
> +    'table_map'          Map of those tables appended to the EQ_REF-joined sequence
> +  @retval
> +    ~(table_map)0        Fatal error
> +*/
> +
> +static table_map
> +eq_ref_extension_by_limited_search(JOIN      *join,
> +                                   table_map remaining_tables,
> +                                   uint      idx,
> +                                   double    record_count,
> +                                   double    read_time,
> +                                   uint      search_depth,
> +                                   uint      prune_level)
> +{
> +  DBUG_ENTER("eq_ref_extension_by_limited_search");
> +
> +  table_map allowed_tables= ~(table_map)0;
> +  if (join->emb_sjm_nest)
> +    allowed_tables= join->emb_sjm_nest->sj_inner_tables;
> +
> +  bool has_sj= !join->select_lex->sj_nests.is_empty();

Consider adding a "const" to the has_sj variable.

> +
> +  /*
> +    The brackeded section below add 'eq_ref' joinable tables to
> +    the QEP in the order they are found in the 'remaining_tables' set.
> +    See above description for why we can add these without greedy
> +    cost analysis.
> +  */
> +  if (remaining_tables&  allowed_tables)
> +  {
> +    JOIN_TAB *s, **pos;
> +    table_map eq_ref_ext(0);
> +
> +    JOIN_TAB *saved_refs[MAX_TABLES];
> +    memcpy(saved_refs, join->best_ref + idx, sizeof(JOIN_TAB*) *
> (join->tables-idx));
> +
> +    for (pos= join->best_ref + idx ; (s= *pos) ; pos++)
> +    {
> +      /*
> +        Don't move swap inside conditional code: All items
> +        should be swapped to maintain '#rows' ordered tables.
> +        This is critical for early pruning of bad plans.
> +      */
> +      swap_variables(JOIN_TAB*, join->best_ref[idx], *pos);
> +
> +      /*
> +        Consider table for 'eq_ref' heuristic if:
> +          1)      It might use a keyref for best_access_path
> +          2) and, Table remains to be handled.
> +          3) and, It is allowed by emb_sjm_nest restrictions
> +          4) and, It is independent of those not yet in partial plan.
> +          5) and, It passed the interleaving check.
> +      */
> +      table_map real_table_bit= s->table->map;

Consider adding "const" to real_table_bit (I think it makes it easier to 
read when you are told that this variable do not change value later....)

> +      if (s->keyuse&&      // 1)
> +          (remaining_tables&  real_table_bit)&&      // 2)
> +          (allowed_tables&  real_table_bit)&&      // 3)
> +          !(remaining_tables&  s->dependent)&&      // 4)
> +          (!idx || !check_interleaving_with_nj(s)))  // 5)
> +      {
> +        POSITION *position= join->positions + idx;
> +        POSITION loose_scan_pos;
> +
> +        /* Find the best access method from 's' to the current partial plan */
> +        best_access_path(join, s, remaining_tables, idx, FALSE, record_count,
> +                         join->positions + idx,&loose_scan_pos);
> +
> +        /* EQ_REF prune logic is based on that all joins
> +           in the ref_extension has the same #rows and cost.
> +           ->  The total cost of the QEP is independent of the order
> +           of joins within this 'ref_extension'.
> +           Expand QEP with all 'identical' REFs in
> +          'join->positions' order.
> +        */
> +        if (position->key&&
> +            position->read_time    == (position-1)->read_time&&
> +            position->records_read == (position-1)->records_read)
> +        {
> +          double current_record_count, current_read_time;
> +
> +          /* Add the cost of extending the plan with 's' */
> +          current_record_count= record_count * position->records_read;
> +          current_read_time=    read_time
> +                                + position->read_time
> +                                + current_record_count / (double)TIME_FOR_COMPARE;
> +
> +          if (has_sj)
> +          {
> +            /*
> +              Even if there are no semijoins, advance_sj_state() has a
> +              significant cost (takes 9% of time in a 20-table plan search),
> +              hence the if() above, which is also more efficient than the
> +              same if() inside advance_sj_state() would be.
> +            */
> +            advance_sj_state(join, remaining_tables, s, idx,
> +&current_record_count,&current_read_time,
> +&loose_scan_pos);
> +          }
> +          else
> +            join->positions[idx].sj_strategy= SJ_OPT_NONE;
> +
> +          /* Expand only partial plans with lower cost than the best QEP so far */
> +          if (current_read_time>= join->best_read)
> +          {
> +            DBUG_EXECUTE("opt", print_plan(join, idx+1,
> +                                           current_record_count,
> +                                           read_time,
> +                                           current_read_time,
> +                                           "prune_by_cost"););
> +            backout_nj_sj_state(remaining_tables, s);
> +            continue;
> +          }
> +
> +          eq_ref_ext= real_table_bit;
> +          if ( (search_depth>  1)&&  (remaining_tables& 
> ~real_table_bit)&  allowed_tables )
> +          {
> +            DBUG_EXECUTE("opt", print_plan(join, idx + 1,
> +                                           current_record_count,
> +                                           read_time,
> +                                           current_read_time,
> +                                           "EQ_REF_extension"););
> +
> +            /* Recursively EQ_REF-extend the current partial plan */
> +            eq_ref_ext|=
> +                eq_ref_extension_by_limited_search(join,
> +                                                   remaining_tables& 
> ~real_table_bit,
> +                                                   idx + 1,
> +                                                   current_record_count,
> +                                                   current_read_time,
> +                                                   (search_depth>  1)
> +                                                      ? search_depth - 1 : 0,
> +                                                   prune_level);
> +          }
> +          else
> +          {
> +            plan_is_complete(join, idx, current_record_count, current_read_time);
> +          }
> +          backout_nj_sj_state(remaining_tables, s);
> +          memcpy(join->best_ref + idx, saved_refs, sizeof(JOIN_TAB*) *
> (join->tables-idx));
> +          DBUG_RETURN(eq_ref_ext);
> +        } // if ('is eq_ref')
> +
> +        backout_nj_sj_state(remaining_tables, s);
> +      } // check_interleaving_with_nj()
> +    } // for (...)
> +
> +    memcpy(join->best_ref + idx, saved_refs, sizeof(JOIN_TAB*) *
> (join->tables-idx));
> +    /*
> +      'eq_ref' heuristc didn't find a table to be appended to
> +      the query plan. We need to use the greedy search
> +      for finding the next table to be added.
> +    */
> +    DBUG_ASSERT(!eq_ref_ext);
> +    if (best_extension_by_limited_search(join,
> +                                         remaining_tables,
> +                                         idx,
> +                                         record_count,
> +                                         read_time,
> +                                         search_depth,
> +                                         prune_level))
> +      DBUG_RETURN(~(table_map)0);
> +
> +    DBUG_RETURN(eq_ref_ext);

Question: is it necessary to have the call to 
best_extension_by_limited_search() from this function? I think it would 
be easier to follow the logic if this was only recursively called from 
best_extension_by_limited_search().

> +  }
> +
> +  DBUG_RETURN(0);
> +}
> +
> +
> +/**
>     Find how much space the prevous read not const tables takes in cache.
>   */
>

Other suggestions:

   -add a comment to both extension_by_limited_search() and 
eq_ref_extension_by_limited_search() that if a change/bug fix is done to 
either of these also consider if it is relevant for the other.

>
>


Thread
bzr commit into mysql-trunk branch (ole.john.aske:3552) Bug#41740 Bug#58225Bug#59326Ole John Aske1 Feb
  • Re: bzr commit into mysql-trunk branch (ole.john.aske:3552) Bug#41740Bug#58225 Bug#59326Olav Sandstaa5 Apr
  • Re: bzr commit into mysql-trunk branch (ole.john.aske:3552) Bug#41740Bug#58225 Bug#59326Evgeny Potemkin28 Apr
    • Re: bzr commit into mysql-trunk branch (ole.john.aske:3552) Bug#41740Bug#58225 Bug#59326Ole John Aske28 Apr