List:Commits« Previous MessageNext Message »
From:Roy Lyseng Date:May 23 2011 10:38am
Subject:Re: bzr commit into mysql-trunk branch (roy.lyseng:3371) Bug#12316645
View as plain text  
On 23.05.11 11.28, Øystein Grøvlen wrote:
> Hi Roy,
>
> You can find my comments for the code in-line. I will get back to you
> with comments for test changes later.
>
> On 12/05/2011 17:26, Roy Lyseng wrote:
>  > #At file:///home/rl136806/mysql/repo/mysql-work0/ based on
> revid:roy.lyseng@stripped
>  >
>  > 3371 Roy Lyseng 2011-05-12
>  > Bug#12316645: Wrong cost calculation with optimizer_join_cache_level settings
>  >
>  > best_access_path() did not take properly into account the setting of
>  > optimizer_join_cache_level when calculating the access path that gives
>  > the smallest cost to access a given table.
>  >
>  > There are three changes applied in this bug fix:
>  > - best_access_path() properly accounts for join cache level 0, ie
>  > do not use join caching.
>  > - best_access_path() does account for join cache level greater than 3
>  > when a table is outer-joined to the current join prefix.
>  > - best_access_path() does account for join cache level greater than 3
>  > when a table is semi-joined to the current join prefix.
>  >
>  > Static properties for setting join buffering has been centralized to
>  > a new function set_join_buffer_properties().
>  >
>  > Further on, setting of join cache level is changed.
>  > check_join_cache_usage() is renamed to setup_join_cache().
>  >
>  > Cost calculation for joins with join buffering is better than it used
>  > to be, but it is still not fair. Sometimes, cost is calculated using
>  > join buffering, but the decision has to be reverted later.
>  > Cost calculation for BKA join strategy is also not considered properly
>  > (but this patch changes nothing in this area).
>
> ...
>
>  > === modified file 'sql/sql_select.cc'
>  > --- a/sql/sql_select.cc 2011-05-10 06:35:59 +0000
>  > +++ b/sql/sql_select.cc 2011-05-12 15:25:46 +0000
>
> ...
>
>  > @@ -10676,12 +10729,28 @@ void revise_cache_usage(JOIN_TAB *join_t
>  > @param icp_other_tables_ok[out] TRUE if condition pushdown supports
>  > other tables presence
>  >
>  > + @return false if successful, true if error.
>  > + Currently, allocation errors for join cache objects are ignored,
>  > + and regular execution is chosen silently.
>  > +
>  > @details
>  > The function finds out whether the table 'tab' can be joined using a join
>  > buffer. This check is performed after the best execution plan for 'join'
>  > has been chosen. If the function decides that a join buffer can be employed
>  > then it selects the most appropriate join cache object that contains this
>  > join buffer.
>  > + If it has already been decided to not use join buffering for this table,
>  > + no action is taken.
>  > +
>  > + Often it is already decided that join buffering will be used earlier in
>  > + the optimization process, and this will also ensure that the most correct
>  > + cost for the operation is calculated, and hence the probability of
>  > + choosing an optimal join plan is higher. However, some join buffering
>  > + decisions cannot currently be taken before this stage, hence we need this
>  > + function to decide the most accurate join buffering strategy. But
>  > + long-term it is the goal that join buffering strategy is decided
>  > + when the plan is selected.
>  > +
>
> ØG: Maybe you could create a @todo tag for the latest sentence.

Indeed.
>
>  > The result of the check and the type of the the join buffer to be used
>  > depend on:
>  > - the access method to access rows of the joined table
>  > @@ -10711,6 +10780,12 @@ void revise_cache_usage(JOIN_TAB *join_t
>  > If the function creates a join cache object it tries to initialize it. The
>  > failure to do this results in an invocation of the function that destructs
>  > the created object.
>  > +
>  > + Bitmap describing the chosen cache's properties is assigned to
>  > + tab->use_join_cache:
>  > + 1) the algorithm (JOIN_CACHE::ALG_NONE, JOIN_CACHE::ALG_BNL,
>  > + JOIN_CACHE::ALG_BKA, JOIN_CACHE::ALG_BKA_UNIQUE)
>  > + 2) the buffer's type (JOIN_CACHE::NON_INCREMENTAL_BUFFER or not).
>  >
>  > @note
>  > An inner table of a nested outer join or a nested semi-join can be currently
>  > @@ -10748,18 +10823,11 @@ void revise_cache_usage(JOIN_TAB *join_t
>  > }
>  > #endif
>  >
>  > - @return
>  > - Bitmap describing the chosen cache's properties:
>  > - 1) the algorithm (JOIN_CACHE::ALG_NONE, JOIN_CACHE::ALG_BNL,
>  > - JOIN_CACHE::ALG_BKA, JOIN_CACHE::ALG_BKA_UNIQUE)
>  > - 2) the buffer's type (JOIN_CACHE::NON_INCREMENTAL_BUFFER or not).
>  > */
>  >
>  > -static
>  > -uint check_join_cache_usage(JOIN_TAB *tab,
>  > - JOIN *join, ulonglong options,
>  > - uint no_jbuf_after,
>  > - bool *icp_other_tables_ok)
>  > +static bool setup_join_cache(JOIN_TAB *tab, JOIN *join,
>  > + ulonglong options, uint no_jbuf_after,
>  > + bool *icp_other_tables_ok)
>  > {
>  > uint flags;
>  > COST_VECT cost;
>  > @@ -10772,10 +10840,12 @@ uint check_join_cache_usage(JOIN_TAB *ta
>  > uint i= tab-join->join_tab;
>  > const uint tab_sj_strategy= tab->get_sj_strategy();
>  > *icp_other_tables_ok= TRUE;
>  > -
>  > - if (cache_level == 0 || i == join->const_tables)
>  > - return JOIN_CACHE::ALG_NONE;
>  >
>  > + if (cache_level == 0 || i == join->const_tables)
>  > + {
>  > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
>  > + return false;
>  > + }
>  > if (options& SELECT_NO_JOIN_CACHE)
>  > goto no_join_cache;
>  > /*
>  > @@ -10784,80 +10854,81 @@ uint check_join_cache_usage(JOIN_TAB *ta
>  > */
>  > if (tab->use_quick == QS_DYNAMIC_RANGE)
>  > goto no_join_cache;
>  > -
>  > +
>  > + /* No join buffering if prevented by no_jbuf_after */
>  > + if (i> no_jbuf_after)
>  > + {
>  > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
>  > + goto no_join_cache;
>  > + }
>  > + /* Non-linked join buffers can't guarantee one match */
>  > + if (force_unlinked_cache&&
>  > + tab->is_inner_table_of_outer_join()&&
>  > + !tab->is_single_inner_of_outer_join())
>  > + {
>  > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
>  > + goto no_join_cache;
>  > + }
>  > + /*
>  > + An inner table of an outer join nest must not use join buffering if
>  > + the first inner table of that outer join nest does not use join buffering.
>  > + This condition is not handled by earlier optimizer stages.
>  > + */
>  > + if (tab->first_inner != NULL&&
>  > + tab->first_inner != tab&&
>  > + !tab->first_inner->use_join_cache)
>  > + goto no_join_cache;
>  > + /*
>  > + The first inner table of an outer join nest must not use join buffering
>  > + if the tables in the embedding outer join nest do not use join buffering.
>  > + This condition is not handled by earlier optimizer stages.
>  > + */
>  > + if (tab->first_upper != NULL&&
>  > + !tab->first_upper->use_join_cache)
>  > + goto no_join_cache;
>  > /*
>  > Use join cache with FirstMatch semi-join strategy only when semi-join
>  > contains only one table.
>  > */
>  > if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&&
>  > !tab->is_single_inner_of_semi_join())
>  > + {
>  > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
>  > goto no_join_cache;
>  > - /*
>  > - Non-linked join buffers can't guarantee one match
>  > - */
>  > - if (force_unlinked_cache&&
>  > - tab->is_inner_table_of_outer_join()&&
>  > - !tab->is_single_inner_of_outer_join())
>  > - goto no_join_cache;
>  > -
>  > - /* No join buffering if prevented by no_jbuf_after */
>  > - if (i> no_jbuf_after)
>  > - goto no_join_cache;
>  > -
>  > + }
>  > /* No join buffering if this semijoin nest is handled by loosescan */
>  > if (tab_sj_strategy == SJ_OPT_LOOSE_SCAN)
>  > + {
>  > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
>  > goto no_join_cache;
>  > -
>  > - /* Neither if semijoin Materialization */
>  > + }
>  > + /*
>  > + No join buffering if this semijoin nest is handled by materialization.
>  > + This condition is not handled by earlier optimizer stages.
>  > + */
>  > if (sj_is_materialize_strategy(tab_sj_strategy))
>  > goto no_join_cache;
>  >
>  > - for (JOIN_TAB *first_inner= tab->first_inner; first_inner;
>  > - first_inner= first_inner->first_upper)
>  > - {
>  > - if (first_inner != tab&& !first_inner->use_join_cache)
>  > - goto no_join_cache;
>  > - }
>  > - if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&&
>  > - tab->first_sj_inner_tab != tab&&
>  > - !tab->first_sj_inner_tab->use_join_cache)
>  > - goto no_join_cache;
>  > - if (!tab[-1].use_join_cache)
>  > - {
>  > - /*
>  > - Check whether table tab and the previous one belong to the same nest of
>  > - inner tables and if so do not use join buffer when joining table tab.
>  > - */
>  > - if (tab->first_inner)
>  > - {
>  > - for (JOIN_TAB *first_inner= tab[-1].first_inner;
>  > - first_inner;
>  > - first_inner= first_inner->first_upper)
>  > - {
>  > - if (first_inner == tab->first_inner)
>  > - goto no_join_cache;
>  > - }
>  > - }
>  > - else if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&&
>  > - tab->first_sj_inner_tab == tab[-1].first_sj_inner_tab)
>  > - goto no_join_cache;
>  > - }
>  > -
>
> ØG: I must admit that it is a bit difficult to verify that the few
> simple if-tests you have added is equivalent to the complex mix of
> for-loops and if-tests above. It would be great if you could give some
> justification for this simplification in the commit comments.

I can try. Often I do not understand the logic of the existing code myself, but 
now I am pretty confident it means that the existing code is wrong, or at least 
more complex than it should be :D

The challenge then is to write the simplest possible code that does what is 
needed. A big part of the job is also to write understandable comments that 
match the code. If the reviewer can understand what goes on based on this, I 
think that I have done a pretty good job...
>
>  > if (!force_unlinked_cache)
>  > prev_cache= tab[-1].cache;
>  >
>  > switch (tab->type) {
>  > case JT_ALL:
>  > + /* Outer joined and semi-joined tables require join cache level> 2 */
>  > if (cache_level<= 2&&
>  > (tab->first_inner || tab_sj_strategy == SJ_OPT_FIRST_MATCH))
>  > + {
>  > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
>  > goto no_join_cache;
>  > + }
>  > if ((options& SELECT_DESCRIBE) ||
>  > ((tab->cache= new JOIN_CACHE_BNL(join, tab, prev_cache))&&
>  > !tab->cache->init()))
>  > {
>  > *icp_other_tables_ok= FALSE;
>  > DBUG_ASSERT(might_do_join_buffering(cache_level, tab));
>  > - return JOIN_CACHE::ALG_BNL | force_unlinked_cache;
>  > + tab->use_join_cache= JOIN_CACHE::ALG_BNL | force_unlinked_cache;
>  > + return false;
>  > }
>  > goto no_join_cache;
>  > case JT_SYSTEM:
>
> ...
>
>  > === modified file 'sql/sql_select.h'
>  > --- a/sql/sql_select.h 2011-04-28 11:55:16 +0000
>  > +++ b/sql/sql_select.h 2011-05-12 15:25:46 +0000
>  > @@ -333,6 +333,7 @@ typedef struct st_join_table : public Sq
>  > */
>  > ha_rows limit;
>  > TABLE_REF ref;
>  > + bool prevent_join_buffer; ///< Never use join buffering for this tbl
>
> join_buffer / join_cache? More consistency in naming would be good here.

I checked the 6.0 user documentation which uses the term "join buffering" before 
settling on this name.
>
>  > /** Join cache type (same as return code of check_join_cache_level() */
>  > uint use_join_cache;
>  > JOIN_CACHE *cache;
>

Thanks,
Roy
Thread
bzr commit into mysql-trunk branch (roy.lyseng:3371) Bug#12316645Roy Lyseng12 May
  • Re: bzr commit into mysql-trunk branch (roy.lyseng:3371) Bug#12316645Øystein Grøvlen23 May
    • Re: bzr commit into mysql-trunk branch (roy.lyseng:3371) Bug#12316645Roy Lyseng23 May
  • Re: bzr commit into mysql-trunk branch (roy.lyseng:3371) Bug#12316645Øystein Grøvlen26 May
    • Re: bzr commit into mysql-trunk branch (roy.lyseng:3371) Bug#12316645Roy Lyseng7 Jun