From: Roy Lyseng Date: May 23 2011 10:38am Subject: Re: bzr commit into mysql-trunk branch (roy.lyseng:3371) Bug#12316645 List-Archive: http://lists.mysql.com/commits/137846 Message-Id: <4DDA3935.3000208@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit On 23.05.11 11.28, Øystein Grøvlen wrote: > Hi Roy, > > You can find my comments for the code in-line. I will get back to you > with comments for test changes later. > > On 12/05/2011 17:26, Roy Lyseng wrote: > > #At file:///home/rl136806/mysql/repo/mysql-work0/ based on > revid:roy.lyseng@stripped > > > > 3371 Roy Lyseng 2011-05-12 > > Bug#12316645: Wrong cost calculation with optimizer_join_cache_level settings > > > > best_access_path() did not take properly into account the setting of > > optimizer_join_cache_level when calculating the access path that gives > > the smallest cost to access a given table. > > > > There are three changes applied in this bug fix: > > - best_access_path() properly accounts for join cache level 0, ie > > do not use join caching. > > - best_access_path() does account for join cache level greater than 3 > > when a table is outer-joined to the current join prefix. > > - best_access_path() does account for join cache level greater than 3 > > when a table is semi-joined to the current join prefix. > > > > Static properties for setting join buffering has been centralized to > > a new function set_join_buffer_properties(). > > > > Further on, setting of join cache level is changed. > > check_join_cache_usage() is renamed to setup_join_cache(). > > > > Cost calculation for joins with join buffering is better than it used > > to be, but it is still not fair. Sometimes, cost is calculated using > > join buffering, but the decision has to be reverted later. > > Cost calculation for BKA join strategy is also not considered properly > > (but this patch changes nothing in this area). > > ... > > > === modified file 'sql/sql_select.cc' > > --- a/sql/sql_select.cc 2011-05-10 06:35:59 +0000 > > +++ b/sql/sql_select.cc 2011-05-12 15:25:46 +0000 > > ... > > > @@ -10676,12 +10729,28 @@ void revise_cache_usage(JOIN_TAB *join_t > > @param icp_other_tables_ok[out] TRUE if condition pushdown supports > > other tables presence > > > > + @return false if successful, true if error. > > + Currently, allocation errors for join cache objects are ignored, > > + and regular execution is chosen silently. > > + > > @details > > The function finds out whether the table 'tab' can be joined using a join > > buffer. This check is performed after the best execution plan for 'join' > > has been chosen. If the function decides that a join buffer can be employed > > then it selects the most appropriate join cache object that contains this > > join buffer. > > + If it has already been decided to not use join buffering for this table, > > + no action is taken. > > + > > + Often it is already decided that join buffering will be used earlier in > > + the optimization process, and this will also ensure that the most correct > > + cost for the operation is calculated, and hence the probability of > > + choosing an optimal join plan is higher. However, some join buffering > > + decisions cannot currently be taken before this stage, hence we need this > > + function to decide the most accurate join buffering strategy. But > > + long-term it is the goal that join buffering strategy is decided > > + when the plan is selected. > > + > > ØG: Maybe you could create a @todo tag for the latest sentence. Indeed. > > > The result of the check and the type of the the join buffer to be used > > depend on: > > - the access method to access rows of the joined table > > @@ -10711,6 +10780,12 @@ void revise_cache_usage(JOIN_TAB *join_t > > If the function creates a join cache object it tries to initialize it. The > > failure to do this results in an invocation of the function that destructs > > the created object. > > + > > + Bitmap describing the chosen cache's properties is assigned to > > + tab->use_join_cache: > > + 1) the algorithm (JOIN_CACHE::ALG_NONE, JOIN_CACHE::ALG_BNL, > > + JOIN_CACHE::ALG_BKA, JOIN_CACHE::ALG_BKA_UNIQUE) > > + 2) the buffer's type (JOIN_CACHE::NON_INCREMENTAL_BUFFER or not). > > > > @note > > An inner table of a nested outer join or a nested semi-join can be currently > > @@ -10748,18 +10823,11 @@ void revise_cache_usage(JOIN_TAB *join_t > > } > > #endif > > > > - @return > > - Bitmap describing the chosen cache's properties: > > - 1) the algorithm (JOIN_CACHE::ALG_NONE, JOIN_CACHE::ALG_BNL, > > - JOIN_CACHE::ALG_BKA, JOIN_CACHE::ALG_BKA_UNIQUE) > > - 2) the buffer's type (JOIN_CACHE::NON_INCREMENTAL_BUFFER or not). > > */ > > > > -static > > -uint check_join_cache_usage(JOIN_TAB *tab, > > - JOIN *join, ulonglong options, > > - uint no_jbuf_after, > > - bool *icp_other_tables_ok) > > +static bool setup_join_cache(JOIN_TAB *tab, JOIN *join, > > + ulonglong options, uint no_jbuf_after, > > + bool *icp_other_tables_ok) > > { > > uint flags; > > COST_VECT cost; > > @@ -10772,10 +10840,12 @@ uint check_join_cache_usage(JOIN_TAB *ta > > uint i= tab-join->join_tab; > > const uint tab_sj_strategy= tab->get_sj_strategy(); > > *icp_other_tables_ok= TRUE; > > - > > - if (cache_level == 0 || i == join->const_tables) > > - return JOIN_CACHE::ALG_NONE; > > > > + if (cache_level == 0 || i == join->const_tables) > > + { > > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE); > > + return false; > > + } > > if (options& SELECT_NO_JOIN_CACHE) > > goto no_join_cache; > > /* > > @@ -10784,80 +10854,81 @@ uint check_join_cache_usage(JOIN_TAB *ta > > */ > > if (tab->use_quick == QS_DYNAMIC_RANGE) > > goto no_join_cache; > > - > > + > > + /* No join buffering if prevented by no_jbuf_after */ > > + if (i> no_jbuf_after) > > + { > > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE); > > + goto no_join_cache; > > + } > > + /* Non-linked join buffers can't guarantee one match */ > > + if (force_unlinked_cache&& > > + tab->is_inner_table_of_outer_join()&& > > + !tab->is_single_inner_of_outer_join()) > > + { > > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE); > > + goto no_join_cache; > > + } > > + /* > > + An inner table of an outer join nest must not use join buffering if > > + the first inner table of that outer join nest does not use join buffering. > > + This condition is not handled by earlier optimizer stages. > > + */ > > + if (tab->first_inner != NULL&& > > + tab->first_inner != tab&& > > + !tab->first_inner->use_join_cache) > > + goto no_join_cache; > > + /* > > + The first inner table of an outer join nest must not use join buffering > > + if the tables in the embedding outer join nest do not use join buffering. > > + This condition is not handled by earlier optimizer stages. > > + */ > > + if (tab->first_upper != NULL&& > > + !tab->first_upper->use_join_cache) > > + goto no_join_cache; > > /* > > Use join cache with FirstMatch semi-join strategy only when semi-join > > contains only one table. > > */ > > if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&& > > !tab->is_single_inner_of_semi_join()) > > + { > > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE); > > goto no_join_cache; > > - /* > > - Non-linked join buffers can't guarantee one match > > - */ > > - if (force_unlinked_cache&& > > - tab->is_inner_table_of_outer_join()&& > > - !tab->is_single_inner_of_outer_join()) > > - goto no_join_cache; > > - > > - /* No join buffering if prevented by no_jbuf_after */ > > - if (i> no_jbuf_after) > > - goto no_join_cache; > > - > > + } > > /* No join buffering if this semijoin nest is handled by loosescan */ > > if (tab_sj_strategy == SJ_OPT_LOOSE_SCAN) > > + { > > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE); > > goto no_join_cache; > > - > > - /* Neither if semijoin Materialization */ > > + } > > + /* > > + No join buffering if this semijoin nest is handled by materialization. > > + This condition is not handled by earlier optimizer stages. > > + */ > > if (sj_is_materialize_strategy(tab_sj_strategy)) > > goto no_join_cache; > > > > - for (JOIN_TAB *first_inner= tab->first_inner; first_inner; > > - first_inner= first_inner->first_upper) > > - { > > - if (first_inner != tab&& !first_inner->use_join_cache) > > - goto no_join_cache; > > - } > > - if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&& > > - tab->first_sj_inner_tab != tab&& > > - !tab->first_sj_inner_tab->use_join_cache) > > - goto no_join_cache; > > - if (!tab[-1].use_join_cache) > > - { > > - /* > > - Check whether table tab and the previous one belong to the same nest of > > - inner tables and if so do not use join buffer when joining table tab. > > - */ > > - if (tab->first_inner) > > - { > > - for (JOIN_TAB *first_inner= tab[-1].first_inner; > > - first_inner; > > - first_inner= first_inner->first_upper) > > - { > > - if (first_inner == tab->first_inner) > > - goto no_join_cache; > > - } > > - } > > - else if (tab_sj_strategy == SJ_OPT_FIRST_MATCH&& > > - tab->first_sj_inner_tab == tab[-1].first_sj_inner_tab) > > - goto no_join_cache; > > - } > > - > > ØG: I must admit that it is a bit difficult to verify that the few > simple if-tests you have added is equivalent to the complex mix of > for-loops and if-tests above. It would be great if you could give some > justification for this simplification in the commit comments. I can try. Often I do not understand the logic of the existing code myself, but now I am pretty confident it means that the existing code is wrong, or at least more complex than it should be :D The challenge then is to write the simplest possible code that does what is needed. A big part of the job is also to write understandable comments that match the code. If the reviewer can understand what goes on based on this, I think that I have done a pretty good job... > > > if (!force_unlinked_cache) > > prev_cache= tab[-1].cache; > > > > switch (tab->type) { > > case JT_ALL: > > + /* Outer joined and semi-joined tables require join cache level> 2 */ > > if (cache_level<= 2&& > > (tab->first_inner || tab_sj_strategy == SJ_OPT_FIRST_MATCH)) > > + { > > + DBUG_ASSERT(tab->use_join_cache == JOIN_CACHE::ALG_NONE); > > goto no_join_cache; > > + } > > if ((options& SELECT_DESCRIBE) || > > ((tab->cache= new JOIN_CACHE_BNL(join, tab, prev_cache))&& > > !tab->cache->init())) > > { > > *icp_other_tables_ok= FALSE; > > DBUG_ASSERT(might_do_join_buffering(cache_level, tab)); > > - return JOIN_CACHE::ALG_BNL | force_unlinked_cache; > > + tab->use_join_cache= JOIN_CACHE::ALG_BNL | force_unlinked_cache; > > + return false; > > } > > goto no_join_cache; > > case JT_SYSTEM: > > ... > > > === modified file 'sql/sql_select.h' > > --- a/sql/sql_select.h 2011-04-28 11:55:16 +0000 > > +++ b/sql/sql_select.h 2011-05-12 15:25:46 +0000 > > @@ -333,6 +333,7 @@ typedef struct st_join_table : public Sq > > */ > > ha_rows limit; > > TABLE_REF ref; > > + bool prevent_join_buffer; ///< Never use join buffering for this tbl > > join_buffer / join_cache? More consistency in naming would be good here. I checked the 6.0 user documentation which uses the term "join buffering" before settling on this name. > > > /** Join cache type (same as return code of check_join_cache_level() */ > > uint use_join_cache; > > JOIN_CACHE *cache; > Thanks, Roy