Hi Roy,
You can find my comments for the code in-line. I will get back to you
with comments for test changes later.
On 12/05/2011 17:26, Roy Lyseng wrote:
> #At file:///home/rl136806/mysql/repo/mysql-work0/ based on
revid:roy.lyseng@stripped
>
> 3371 Roy Lyseng 2011-05-12
> Bug#12316645: Wrong cost calculation with
optimizer_join_cache_level settings
>
> best_access_path() did not take properly into account the
setting of
> optimizer_join_cache_level when calculating the access path
that gives
> the smallest cost to access a given table.
>
> There are three changes applied in this bug fix:
> - best_access_path() properly accounts for join cache level 0, ie
> do not use join caching.
> - best_access_path() does account for join cache level
greater than 3
> when a table is outer-joined to the current join prefix.
> - best_access_path() does account for join cache level
greater than 3
> when a table is semi-joined to the current join prefix.
>
> Static properties for setting join buffering has been
centralized to
> a new function set_join_buffer_properties().
>
> Further on, setting of join cache level is changed.
> check_join_cache_usage() is renamed to setup_join_cache().
>
> Cost calculation for joins with join buffering is better than
it used
> to be, but it is still not fair. Sometimes, cost is calculated
using
> join buffering, but the decision has to be reverted later.
> Cost calculation for BKA join strategy is also not considered
properly
> (but this patch changes nothing in this area).
...
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2011-05-10 06:35:59 +0000
> +++ b/sql/sql_select.cc 2011-05-12 15:25:46 +0000
...
> @@ -10676,12 +10729,28 @@ void revise_cache_usage(JOIN_TAB *join_t
> @param icp_other_tables_ok[out] TRUE if condition pushdown supports
> other tables presence
>
> + @return false if successful, true if error.
> + Currently, allocation errors for join cache objects are
ignored,
> + and regular execution is chosen silently.
> +
> @details
> The function finds out whether the table 'tab' can be joined
using a join
> buffer. This check is performed after the best execution plan
for 'join'
> has been chosen. If the function decides that a join buffer can
be employed
> then it selects the most appropriate join cache object that
contains this
> join buffer.
> + If it has already been decided to not use join buffering for
this table,
> + no action is taken.
> +
> + Often it is already decided that join buffering will be used
earlier in
> + the optimization process, and this will also ensure that the
most correct
> + cost for the operation is calculated, and hence the probability of
> + choosing an optimal join plan is higher. However, some join
buffering
> + decisions cannot currently be taken before this stage, hence we
need this
> + function to decide the most accurate join buffering strategy. But
> + long-term it is the goal that join buffering strategy is decided
> + when the plan is selected.
> +
ØG: Maybe you could create a @todo tag for the latest sentence.
> The result of the check and the type of the the join buffer to
be used
> depend on:
> - the access method to access rows of the joined table
> @@ -10711,6 +10780,12 @@ void revise_cache_usage(JOIN_TAB *join_t
> If the function creates a join cache object it tries to
initialize it. The
> failure to do this results in an invocation of the function
that destructs
> the created object.
> +
> + Bitmap describing the chosen cache's properties is assigned to
> + tab->use_join_cache:
> + 1) the algorithm (JOIN_CACHE::ALG_NONE, JOIN_CACHE::ALG_BNL,
> + JOIN_CACHE::ALG_BKA, JOIN_CACHE::ALG_BKA_UNIQUE)
> + 2) the buffer's type (JOIN_CACHE::NON_INCREMENTAL_BUFFER or not).
>
> @note
> An inner table of a nested outer join or a nested semi-join can
be currently
> @@ -10748,18 +10823,11 @@ void revise_cache_usage(JOIN_TAB *join_t
> }
> #endif
>
> - @return
> - Bitmap describing the chosen cache's properties:
> - 1) the algorithm (JOIN_CACHE::ALG_NONE, JOIN_CACHE::ALG_BNL,
> - JOIN_CACHE::ALG_BKA, JOIN_CACHE::ALG_BKA_UNIQUE)
> - 2) the buffer's type (JOIN_CACHE::NON_INCREMENTAL_BUFFER or not).
> */
>
> -static
> -uint check_join_cache_usage(JOIN_TAB *tab,
> - JOIN *join, ulonglong options,
> - uint no_jbuf_after,
> - bool *icp_other_tables_ok)
> +static bool setup_join_cache(JOIN_TAB *tab, JOIN *join,
> + ulonglong options, uint no_jbuf_after,
> + bool *icp_other_tables_ok)
> {
> uint flags;
> COST_VECT cost;
> @@ -10772,10 +10840,12 @@ uint check_join_cache_usage(JOIN_TAB *ta
> uint i= tab-join->join_tab;
> const uint tab_sj_strategy= tab->get_sj_strategy();
> *icp_other_tables_ok= TRUE;
> -
> - if (cache_level == 0 || i == join->const_tables)
> - return JOIN_CACHE::ALG_NONE;
>
> + if (cache_level == 0 || i == join->const_tables)
> + {
> + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
> + return false;
> + }
> if (options& SELECT_NO_JOIN_CACHE)
> goto no_join_cache;
> /*
> @@ -10784,80 +10854,81 @@ uint check_join_cache_usage(JOIN_TAB *ta
> */
> if (tab->use_quick == QS_DYNAMIC_RANGE)
> goto no_join_cache;
> -
> +
> + /* No join buffering if prevented by no_jbuf_after */
> + if (i> no_jbuf_after)
> + {
> + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
> + goto no_join_cache;
> + }
> + /* Non-linked join buffers can't guarantee one match */
> + if (force_unlinked_cache&&
> + tab->is_inner_table_of_outer_join()&&
> + !tab->is_single_inner_of_outer_join())
> + {
> + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
> + goto no_join_cache;
> + }
> + /*
> + An inner table of an outer join nest must not use join buffering if
> + the first inner table of that outer join nest does not use join
buffering.
> + This condition is not handled by earlier optimizer stages.
> + */
> + if (tab->first_inner != NULL&&
> + tab->first_inner != tab&&
> + !tab->first_inner->use_join_cache)
> + goto no_join_cache;
> + /*
> + The first inner table of an outer join nest must not use join
buffering
> + if the tables in the embedding outer join nest do not use join
buffering.
> + This condition is not handled by earlier optimizer stages.
> + */
> + if (tab->first_upper != NULL&&
> + !tab->first_upper->use_join_cache)
> + goto no_join_cache;
> /*
> Use join cache with FirstMatch semi-join strategy only when
semi-join
> contains only one table.
> */
> if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&&
> !tab->is_single_inner_of_semi_join())
> + {
> + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
> goto no_join_cache;
> - /*
> - Non-linked join buffers can't guarantee one match
> - */
> - if (force_unlinked_cache&&
> - tab->is_inner_table_of_outer_join()&&
> - !tab->is_single_inner_of_outer_join())
> - goto no_join_cache;
> -
> - /* No join buffering if prevented by no_jbuf_after */
> - if (i> no_jbuf_after)
> - goto no_join_cache;
> -
> + }
> /* No join buffering if this semijoin nest is handled by loosescan */
> if (tab_sj_strategy == SJ_OPT_LOOSE_SCAN)
> + {
> + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
> goto no_join_cache;
> -
> - /* Neither if semijoin Materialization */
> + }
> + /*
> + No join buffering if this semijoin nest is handled by
materialization.
> + This condition is not handled by earlier optimizer stages.
> + */
> if (sj_is_materialize_strategy(tab_sj_strategy))
> goto no_join_cache;
>
> - for (JOIN_TAB *first_inner= tab->first_inner; first_inner;
> - first_inner= first_inner->first_upper)
> - {
> - if (first_inner != tab&& !first_inner->use_join_cache)
> - goto no_join_cache;
> - }
> - if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&&
> - tab->first_sj_inner_tab != tab&&
> - !tab->first_sj_inner_tab->use_join_cache)
> - goto no_join_cache;
> - if (!tab[-1].use_join_cache)
> - {
> - /*
> - Check whether table tab and the previous one belong to the
same nest of
> - inner tables and if so do not use join buffer when joining
table tab.
> - */
> - if (tab->first_inner)
> - {
> - for (JOIN_TAB *first_inner= tab[-1].first_inner;
> - first_inner;
> - first_inner= first_inner->first_upper)
> - {
> - if (first_inner == tab->first_inner)
> - goto no_join_cache;
> - }
> - }
> - else if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&&
> - tab->first_sj_inner_tab == tab[-1].first_sj_inner_tab)
> - goto no_join_cache;
> - }
> -
ØG: I must admit that it is a bit difficult to verify that the few
simple if-tests you have added is equivalent to the complex mix of
for-loops and if-tests above. It would be great if you could give some
justification for this simplification in the commit comments.
> if (!force_unlinked_cache)
> prev_cache= tab[-1].cache;
>
> switch (tab->type) {
> case JT_ALL:
> + /* Outer joined and semi-joined tables require join cache level>
2 */
> if (cache_level<= 2&&
> (tab->first_inner || tab_sj_strategy == SJ_OPT_FIRST_MATCH))
> + {
> + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE);
> goto no_join_cache;
> + }
> if ((options& SELECT_DESCRIBE) ||
> ((tab->cache= new JOIN_CACHE_BNL(join, tab, prev_cache))&&
> !tab->cache->init()))
> {
> *icp_other_tables_ok= FALSE;
> DBUG_ASSERT(might_do_join_buffering(cache_level, tab));
> - return JOIN_CACHE::ALG_BNL | force_unlinked_cache;
> + tab->use_join_cache= JOIN_CACHE::ALG_BNL | force_unlinked_cache;
> + return false;
> }
> goto no_join_cache;
> case JT_SYSTEM:
...
> === modified file 'sql/sql_select.h'
> --- a/sql/sql_select.h 2011-04-28 11:55:16 +0000
> +++ b/sql/sql_select.h 2011-05-12 15:25:46 +0000
> @@ -333,6 +333,7 @@ typedef struct st_join_table : public Sq
> */
> ha_rows limit;
> TABLE_REF ref;
> + bool prevent_join_buffer; ///< Never use join buffering
for this tbl
join_buffer / join_cache? More consistency in naming would be good here.
> /** Join cache type (same as return code of
check_join_cache_level() */
> uint use_join_cache;
> JOIN_CACHE *cache;
--
Øystein