List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:August 9 2010 9:13am
Subject:Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)
Bug#50489
View as plain text  
Hello,

Minor comments below, nothing critical, ok to push anyway.

Roy Lyseng a écrit, Le 27.07.2010 10:24:
> #At file:///home/rl136806/mysql/repo/mysql-work5/ based on
> revid:epotemkin@stripped
> 
>  3219 Roy Lyseng	2010-07-27
>       Bug#50489: another segfault in fix_semijoin_strategies...


> === modified file 'sql/sql_class.h'
> --- a/sql/sql_class.h	2010-07-23 18:14:59 +0000
> +++ b/sql/sql_class.h	2010-07-27 08:23:50 +0000
> @@ -3360,43 +3360,16 @@ struct st_table_ref;
>  
>  
>  /*

/** will make doxygen pick this comment up.

> -  Optimizer and executor structure for the materialized semi-join info. This
> -  structure contains
> +  Executor structure for the materialized semi-join info, which contains
>     - The sj-materialization temporary table
>     - Members needed to make index lookup or a full scan of the temptable.
>  */
>  class SJ_MATERIALIZATION_INFO : public Sql_alloc

The is_sj_scan member could optionally be renamed to is_scan, as "sj" is 
obvious in the context of this object.
And sjm_table_cols could optionally be renamed to table_cols.

> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc	2010-07-26 11:34:07 +0000
> +++ b/sql/sql_select.cc	2010-07-27 08:23:50 +0000

> @@ -1040,8 +1039,7 @@ bool subquery_types_allow_materializatio
>  */
>  
>  static 
> -bool semijoin_types_allow_materialization(TABLE_LIST *sj_nest,
> -                                          bool *sjm_scan_allowed)
> +bool semijoin_types_allow_materialization(TABLE_LIST *sj_nest)
>  {
>    DBUG_ENTER("semijoin_types_allow_materialization");
>  
> @@ -1051,7 +1049,7 @@ bool semijoin_types_allow_materializatio
>    List_iterator<Item> it1(sj_nest->nested_join->sj_outer_exprs);
>    List_iterator<Item> it2(sj_nest->nested_join->sj_inner_exprs);
>  
> -  *sjm_scan_allowed= FALSE;
> +  sj_nest->nested_join->sjm.scan_allowed= FALSE;
>  
>    bool all_are_fields= TRUE;
>    Item *outer, *inner;
> @@ -1062,7 +1060,7 @@ bool semijoin_types_allow_materializatio
>      if (!types_allow_materialization(outer, inner))
>        DBUG_RETURN(FALSE);
>    }
> -  *sjm_scan_allowed= all_are_fields;
> +  sj_nest->nested_join->sjm.scan_allowed= all_are_fields;

In the patch there are many occurences of sj_nest->nested_join->sjm, I 
wonder if those repeated indirections are a problem or not (i.e. this 
sjm should be locally cached in a variable or not).
Same question for optimizer_semijoin_nests(), 
semijoin_order_allows_materialization()...

>    DBUG_PRINT("info",("semijoin_types_allow_materialization: ok, allowed"));
>    DBUG_RETURN(TRUE);
>  }
> @@ -1362,7 +1360,7 @@ static bool sj_table_is_included(JOIN *j
>    It may be possible to consolidate the materialization strategies into one.
>    
>    The choice between the strategies is made by the join optimizer (see
> -  advance_sj_state() and fix_semijoin_strategies_for_picked_join_order()).
> +  advance_sj_state() and fix_semijoin_strategies()).
>    This function sets up all fields/structures/etc needed for execution except
>    for setup/initialization of semi-join materialization which is done in 
>    setup_sj_materialization() (todo: can't we move that to here also?)
> @@ -4873,6 +4871,17 @@ static bool optimize_semijoin_nests(JOIN
>    DBUG_ENTER("optimize_semijoin_nests");
>    List_iterator<TABLE_LIST> sj_list_it(join->select_lex->sj_nests);
>    TABLE_LIST *sj_nest;
> +
> +  /*
> +    Perform necessary cleanup in case semijoin nests have been used in prior
> +    executions.
> +  */

ideally cleanup should be done by the prior executions when they end 
(that's supposed to be the "clean house" principle in MySQL code: don't 
leave it to next inhabitants to clean up after you). I don't know 
whether it's safely possible.

> +  while ((sj_nest= sj_list_it++))
> +  {
> +    sj_nest->nested_join->sjm.positions= NULL;
> +    sj_nest->sj_mat_info= NULL;
> +  }
> +  sj_list_it.rewind();
>    /*
>      The statement may have been executed with 'semijoin=on' earlier.
>      We need to verify that 'semijoin=on' still holds.
> @@ -4888,14 +4897,12 @@ static bool optimize_semijoin_nests(JOIN
>          Try semijoin materialization if the semijoin is classified as
>          non-trivially-correlated.
>        */ 
> -      sj_nest->sj_mat_info= NULL;
>        if (sj_nest->nested_join->sj_corr_tables)
>          continue;
>        /*
>          Check whether data types allow execution with materialization.
>        */
> -      bool sjm_scan_allowed;
> -      if (semijoin_types_allow_materialization(sj_nest, &sjm_scan_allowed))
> +      if (semijoin_types_allow_materialization(sj_nest))
>        {
>          join->emb_sjm_nest= sj_nest;
>          if (choose_plan(join, all_table_map & ~join->const_table_map))
> @@ -4905,21 +4912,13 @@ static bool optimize_semijoin_nests(JOIN
>            save it.
>          */
>          uint n_tables= my_count_bits(sj_nest->sj_inner_tables);
> -        SJ_MATERIALIZATION_INFO* sjm;
> -        if (!(sjm= new SJ_MATERIALIZATION_INFO) ||
> -            !(sjm->positions=
> (POSITION*)join->thd->alloc(sizeof(POSITION)*
> -                                                          n_tables)))
> -          DBUG_RETURN(TRUE); /* purecov: inspected */
> -        sjm->tables= n_tables;
> -        sjm->is_used= FALSE;
> -        sjm->sjm_scan_allowed= sjm_scan_allowed;
> -
>          double subjoin_out_rows, subjoin_read_time;
>          get_partial_join_cost(join, n_tables,
>                                &subjoin_read_time, &subjoin_out_rows);
>  
> -        sjm->materialization_cost.convert_from_cost(subjoin_read_time);
> -        sjm->rows= subjoin_out_rows;
> +        sj_nest->nested_join->sjm.materialization_cost
> +          .convert_from_cost(subjoin_read_time);
> +        sj_nest->nested_join->sjm.rows= subjoin_out_rows;
>  
>          List<Item> &right_expr_list= 
>            sj_nest->sj_subq_pred->unit->first_select()->item_list;
> @@ -4939,7 +4938,7 @@ static bool optimize_semijoin_nests(JOIN
>                  nest (the payoff is probably less here?)
>          */
>          {
> -          for (uint i=0 ; i < join->const_tables + sjm->tables ; i++)
> +          for (uint i=0 ; i < join->const_tables + n_tables ; i++)

This uses n_tables quite far after its initialization; to make sure that 
no-one adds code which changes n_tables meanwhile, I would declare it const.

>            {
>              JOIN_TAB *tab= join->best_positions[i].table;
>              join->map2table[tab->table->tablenr]= tab;
> @@ -4955,10 +4954,16 @@ static bool optimize_semijoin_nests(JOIN
>            double rows= 1.0;
>            while ((tableno = tm_it.next_bit()) != Table_map_iterator::BITMAP_END)
>              rows *= join->map2table[tableno]->table->quick_condition_rows;
> -          sjm->rows= min(sjm->rows, rows);
> +          sj_nest->nested_join->sjm.rows=
> min(sj_nest->nested_join->sjm.rows,
> +                                              rows);
>          }
> -        memcpy(sjm->positions, join->best_positions + join->const_tables, 
> -               sizeof(POSITION) * n_tables);
> +        if (!(sj_nest->nested_join->sjm.positions=
> +              (st_position*)join->thd->alloc(sizeof(st_position)*n_tables)))

st_position or POSITION? Further down the patch uses POSITION in the 
reverse memcpy().

> +          DBUG_RETURN(TRUE);
> +
> +        memcpy(sj_nest->nested_join->sjm.positions,
> +               join->best_positions + join->const_tables, 
> +               sizeof(st_position) * n_tables);
>  
>          /*
>            Calculate temporary table parameters and usage costs
> @@ -4974,19 +4979,19 @@ static bool optimize_semijoin_nests(JOIN
>            Let materialization cost include the cost to write the data into the
>            temporary table:
>          */ 
> -        sjm->materialization_cost.add_io(subjoin_out_rows, lookup_cost);
> +        sj_nest->nested_join->sjm.materialization_cost
> +          .add_io(subjoin_out_rows, lookup_cost);
>          
>          /*
>            Set the cost to do a full scan of the temptable (will need this to 
>            consider doing sjm-scan):
>          */ 
> -        sjm->scan_cost.zero();
> -        if (sjm->rows > 0.0)
> -          sjm->scan_cost.add_io(sjm->rows, lookup_cost);
> -
> -        sjm->lookup_cost.convert_from_cost(lookup_cost);
> -        sj_nest->sj_mat_info= sjm;
> -        DBUG_EXECUTE("opt", print_sjm(sjm););
> +        sj_nest->nested_join->sjm.scan_cost.zero();
> +        if (sj_nest->nested_join->sjm.rows > 0.0)
> +          sj_nest->nested_join->sjm.scan_cost
> +           .add_io(sj_nest->nested_join->sjm.rows, lookup_cost);
> +
> +        sj_nest->nested_join->sjm.lookup_cost.convert_from_cost(lookup_cost);
>        }
>      }
>    }
> @@ -7504,7 +7509,7 @@ semijoin_order_allows_materialization(co
>    */
>    const TABLE_LIST *emb_sj_nest= tab->emb_sj_nest;
>    if (!emb_sj_nest ||
> -      !emb_sj_nest->sj_mat_info ||
> +      !emb_sj_nest->nested_join->sjm.positions ||
>        (remaining_tables & emb_sj_nest->sj_inner_tables))
>      return SJ_OPT_NONE;
>  
> @@ -7525,7 +7530,7 @@ semijoin_order_allows_materialization(co
>    */
>    if (remaining_tables & emb_sj_nest->nested_join->sj_depends_on)
>    {
> -    if (emb_sj_nest->sj_mat_info->sjm_scan_allowed)
> +    if (emb_sj_nest->nested_join->sjm.scan_allowed)
>        return SJ_OPT_MATERIALIZE_SCAN;
>      return SJ_OPT_NONE;
>    }
> @@ -8184,7 +8189,7 @@ prev_record_reads(JOIN *join, uint idx, 
>    Fix semi-join strategies for the picked join order
>  
>    SYNOPSIS
> -    fix_semijoin_strategies_for_picked_join_order()
> +    fix_semijoin_strategies()
>        join  The join with the picked join order
>  
>    DESCRIPTION
> @@ -8222,16 +8227,20 @@ prev_record_reads(JOIN *join, uint idx, 
>      join and semi-join order from left to right.
>  */    
>  
> -static void fix_semijoin_strategies_for_picked_join_order(JOIN *join)
> +static bool fix_semijoin_strategies(JOIN *join)

The old name was clear about when it happens. "Fix" is so much used in 
MySQL, to me it just means "let me correct this stuff"; even 
advance_sj_state() could be called fix_semijoin_strategies.

>  {
>    uint table_count=join->tables;
>    uint tablenr;
>    table_map remaining_tables= 0;
>    table_map handled_tabs= 0;
> +
> +  DBUG_ENTER("fix_semijoin_strategies");
> +
>    for (tablenr= table_count - 1 ; tablenr != join->const_tables - 1; tablenr--)
>    {
>      POSITION *pos= join->best_positions + tablenr;
>      JOIN_TAB *s= pos->table;
> +    TABLE_LIST *emb_sj_nest= s->emb_sj_nest;
>      uint first;
>      LINT_INIT(first); // Set by every branch except SJ_OPT_NONE which doesn't use
> it
>  
> @@ -8243,8 +8252,12 @@ static void fix_semijoin_strategies_for_
>  
>      if (pos->sj_strategy == SJ_OPT_MATERIALIZE_LOOKUP)
>      {
> -      SJ_MATERIALIZATION_INFO *sjm= s->emb_sj_nest->sj_mat_info;
> -      sjm->is_used= TRUE;
> +      uint tables= my_count_bits(emb_sj_nest->sj_inner_tables);

I would declare 'tables' const

> +      SJ_MATERIALIZATION_INFO* sjm;
> +      if (!(sjm= new SJ_MATERIALIZATION_INFO))
> +        DBUG_RETURN(TRUE);

I find "sjm" is confusing: here it's used for a SJ_MATERIALIZATION_INFO 
object, whereas in structure st_semijoin_mat "sjm" is a st_semijoin_mat; 
and one goal of the patch is to separate both structs.

> +      s->emb_sj_nest->sj_mat_info= sjm;
> +      sjm->tables= tables;
>        sjm->is_sj_scan= FALSE;
>        /*
>          This memcpy() copies a partial QEP produced by
> @@ -8261,23 +8274,31 @@ static void fix_semijoin_strategies_for_
>          setting it to SJ_OPT_NONE). But until then, pos->sj_strategy should
>          not be read.
>        */
> -      memcpy(pos - sjm->tables + 1, sjm->positions, 
> -             sizeof(POSITION) * sjm->tables);
> -      first= tablenr - sjm->tables + 1;
> -      join->best_positions[first].n_sj_tables= sjm->tables;
> +      memcpy(pos - tables + 1, emb_sj_nest->nested_join->sjm.positions, 
> +             sizeof(POSITION) * tables);
> +      first= tablenr - tables + 1;
> +      join->best_positions[first].n_sj_tables= tables;
>        join->best_positions[first].sj_strategy= SJ_OPT_MATERIALIZE_LOOKUP;
> +
> +      DBUG_EXECUTE("opt", print_sjm(emb_sj_nest););
>      }
>      else if (pos->sj_strategy == SJ_OPT_MATERIALIZE_SCAN)
>      {
>        POSITION *first_inner= join->best_positions + pos->sjm_scan_last_inner;
> -      SJ_MATERIALIZATION_INFO *sjm=
> first_inner->table->emb_sj_nest->sj_mat_info;
> -      sjm->is_used= TRUE;
> +      TABLE_LIST *mat_nest= first_inner->table->emb_sj_nest;
> +      uint tables= my_count_bits(mat_nest->sj_inner_tables);
> +      SJ_MATERIALIZATION_INFO* sjm;
> +      if (!(sjm= new SJ_MATERIALIZATION_INFO))
> +        DBUG_RETURN(TRUE);
> +      mat_nest->sj_mat_info= sjm;
> +      sjm->tables= tables;
>        sjm->is_sj_scan= TRUE;
> -      first= pos->sjm_scan_last_inner - sjm->tables + 1;
> +      first= pos->sjm_scan_last_inner - tables + 1;
>        memcpy(join->best_positions + first, // stale semijoin strategy here too
> -             sjm->positions, sizeof(POSITION) * sjm->tables);
> +             mat_nest->nested_join->sjm.positions,
> +             sizeof(POSITION) * tables);
>        join->best_positions[first].sj_strategy= SJ_OPT_MATERIALIZE_SCAN;
> -      join->best_positions[first].n_sj_tables= sjm->tables;
> +      join->best_positions[first].n_sj_tables= tables;
>        /* 
>          Do what advance_sj_state did: re-run best_access_path for every table
>          in the [last_inner_table + 1; pos..) range
> @@ -8290,22 +8311,24 @@ static void fix_semijoin_strategies_for_
>          prefix_rec_count= join->best_positions[first-1].prefix_record_count;
>        
>        /* Add materialization record count*/
> -      prefix_rec_count *= sjm->rows;
> +      prefix_rec_count *= mat_nest->nested_join->sjm.rows;
>        
>        uint i;
>        table_map rem_tables= remaining_tables;
> -      for (i= tablenr; i != (first + sjm->tables - 1); i--)
> +      for (i= tablenr; i != (first + tables - 1); i--)
>          rem_tables |= join->best_positions[i].table->table->map;
>  
>        POSITION dummy;
>        join->cur_sj_inner_tables= 0;
> -      for (i= first + sjm->tables; i <= tablenr; i++)
> +      for (i= first + tables; i <= tablenr; i++)
>        {
>          best_access_path(join, join->best_positions[i].table, rem_tables, i,
> FALSE,
>                           prefix_rec_count, join->best_positions + i,
> &dummy);
>          prefix_rec_count *= join->best_positions[i].records_read;
>          rem_tables &= ~join->best_positions[i].table->table->map;
>        }
> +
> +      DBUG_EXECUTE("opt", print_sjm(emb_sj_nest););
>      }
>      else if (pos->sj_strategy == SJ_OPT_FIRST_MATCH)
>      {
> @@ -8400,6 +8423,8 @@ static void fix_semijoin_strategies_for_
>        pos->sj_strategy= SJ_OPT_NONE;
>      remaining_tables |= s->table->map;
>    }
> +
> +  DBUG_RETURN(FALSE);
>  }
>  
>  
> @@ -8413,8 +8438,7 @@ static void fix_semijoin_strategies_for_
>  
>    DESCRIPTION
>      Setup join structures according the picked join order
> -    - finalize semi-join strategy choices (see
> -        fix_semijoin_strategies_for_picked_join_order)
> +    - finalize semi-join strategy choices (see fix_semijoin_strategies)
>      - create join->join_tab array and put there the JOIN_TABs in the join order
>      - create data structures describing ref access methods.
>  
> @@ -8441,7 +8465,8 @@ static bool get_best_combination(JOIN *j
>  
>    used_tables= OUTER_REF_TABLE_BIT;		// Outer row is already read
>  
> -  fix_semijoin_strategies_for_picked_join_order(join);
> +  if (fix_semijoin_strategies(join))
> +    DBUG_RETURN(TRUE);
>  
>    for (j=join_tab, tablenr=0 ; tablenr < table_count ; tablenr++,j++)
>    {
> @@ -9219,8 +9244,8 @@ static bool make_join_select(JOIN *join,
>           - If we're looking at the first SJM table, reset used_tables
>             to refer to only allowed tables
>        */
> -      if (tab->emb_sj_nest && tab->emb_sj_nest->sj_mat_info
> && 
> -          tab->emb_sj_nest->sj_mat_info->is_used &&
> +      if (tab->emb_sj_nest &&
> +          tab->emb_sj_nest->sj_mat_info && 
>            !(used_tables & tab->emb_sj_nest->sj_inner_tables))
>        {
>          save_used_tables= used_tables;
> @@ -12358,8 +12383,7 @@ Item *eliminate_item_equal(Item *cond, C
>        */
>        TABLE_LIST *emb_nest= 
>          item_field->field->table->pos_in_table_list->embedding;
> -      if (!item_const && emb_nest && emb_nest->sj_mat_info
> &&
> -          emb_nest->sj_mat_info->is_used)
> +      if (!item_const && emb_nest && emb_nest->sj_mat_info)
>        {
>          /* 
>            Find the first equal expression that refers to a table that is
> @@ -13436,7 +13460,7 @@ void advance_sj_state(JOIN *join, table_
>                        double *current_record_count, double *current_read_time, 
>                        POSITION *loose_scan_pos)
>  {
> -  TABLE_LIST *emb_sj_nest;
> +  TABLE_LIST *emb_sj_nest= new_join_tab->emb_sj_nest;
>    POSITION *pos= join->positions + idx;
>    remaining_tables &= ~new_join_tab->table->map;
>  
> @@ -13484,13 +13508,11 @@ void advance_sj_state(JOIN *join, table_
>    
>    table_map handled_by_fm_or_ls= 0;
>    /* FirstMatch Strategy */
> -  if (new_join_tab->emb_sj_nest &&
> +  if (emb_sj_nest &&
>        join->thd->optimizer_switch_flag(OPTIMIZER_SWITCH_FIRSTMATCH))
>    {
> -    const table_map outer_corr_tables=
> -      new_join_tab->emb_sj_nest->nested_join->sj_depends_on;
> -    const table_map sj_inner_tables=
> -      new_join_tab->emb_sj_nest->sj_inner_tables;
> +    const table_map outer_corr_tables=
> emb_sj_nest->nested_join->sj_depends_on;
> +    const table_map sj_inner_tables=   emb_sj_nest->sj_inner_tables;
>      /* 
>        Enter condition:
>         1. The next join tab belongs to semi-join nest
> @@ -13571,7 +13593,7 @@ void advance_sj_state(JOIN *join, table_
>      */
>      if ((pos->first_loosescan_table != MAX_TABLES) &&   // (1)
>          (first->table->emb_sj_nest->sj_inner_tables & remaining_tables)
> && //(2)
> -        new_join_tab->emb_sj_nest != first->table->emb_sj_nest) //(2)
> +        emb_sj_nest != first->table->emb_sj_nest) //(2)
>      {
>        pos->first_loosescan_table= MAX_TABLES;
>      }
> @@ -13583,9 +13605,8 @@ void advance_sj_state(JOIN *join, table_
>      if (loose_scan_pos->read_time != DBL_MAX)
>      {
>        pos->first_loosescan_table= idx;
> -      pos->loosescan_need_tables=
> -        new_join_tab->emb_sj_nest->sj_inner_tables | 
> -        new_join_tab->emb_sj_nest->nested_join->sj_depends_on;
> +      pos->loosescan_need_tables=  emb_sj_nest->sj_inner_tables |
> +                                   emb_sj_nest->nested_join->sj_depends_on;
>      }
>      
>      if ((pos->first_loosescan_table != MAX_TABLES) && 
> @@ -13629,7 +13650,7 @@ void advance_sj_state(JOIN *join, table_
>      Update join->cur_sj_inner_tables (Used by FirstMatch in this function and
>      LooseScan detector in best_access_path)
>    */
> -  if ((emb_sj_nest= new_join_tab->emb_sj_nest))
> +  if (emb_sj_nest)
>    {
>      join->cur_sj_inner_tables |= emb_sj_nest->sj_inner_tables;
>      pos->dups_producing_tables |= emb_sj_nest->sj_inner_tables;
> @@ -13666,16 +13687,16 @@ void advance_sj_state(JOIN *join, table_
>        we reach the point #2.
>      */
>      pos->sjm_scan_need_tables=
> -      new_join_tab->emb_sj_nest->sj_inner_tables | 
> -      new_join_tab->emb_sj_nest->nested_join->sj_depends_on |
> -      new_join_tab->emb_sj_nest->nested_join->sj_corr_tables;
> +      emb_sj_nest->sj_inner_tables | 
> +      emb_sj_nest->nested_join->sj_depends_on |
> +      emb_sj_nest->nested_join->sj_corr_tables;
>      pos->sjm_scan_last_inner= idx;
>    }
>    else if (sjm_strategy == SJ_OPT_MATERIALIZE_LOOKUP)
>    {
>      COST_VECT prefix_cost; 
> -    SJ_MATERIALIZATION_INFO *mat_info= emb_sj_nest->sj_mat_info;
> -    signed int first_tab= (int)idx - mat_info->tables;
> +    signed int first_tab=

this is not your code, but I wonder whether 'signed' is useful here.

> +      (int)idx - my_count_bits(emb_sj_nest->sj_inner_tables);
>      double prefix_rec_count;
>      if (first_tab < (int)join->const_tables)
>      {
> @@ -13689,8 +13710,9 @@ void advance_sj_state(JOIN *join, table_
>      }
>  
>      double mat_read_time= prefix_cost.total_cost();
> -    mat_read_time += mat_info->materialization_cost.total_cost() +
> -                     prefix_rec_count * mat_info->lookup_cost.total_cost();
> +    mat_read_time +=
> +      emb_sj_nest->nested_join->sjm.materialization_cost.total_cost() +
> +      prefix_rec_count *
> emb_sj_nest->nested_join->sjm.lookup_cost.total_cost();
>  
>      if (mat_read_time < *current_read_time || pos->dups_producing_tables)
>      {
> @@ -13703,8 +13725,7 @@ void advance_sj_state(JOIN *join, table_
>        pos->sj_strategy= SJ_OPT_MATERIALIZE_LOOKUP;
>        *current_read_time=    mat_read_time;
>        *current_record_count= prefix_rec_count;
> -      pos->dups_producing_tables &=
> -        ~new_join_tab->emb_sj_nest->sj_inner_tables;
> +      pos->dups_producing_tables &= ~emb_sj_nest->sj_inner_tables;
>      }
>    }
>    
> @@ -13714,11 +13735,11 @@ void advance_sj_state(JOIN *join, table_
>    {
>      TABLE_LIST *mat_nest= 
>        join->positions[pos->sjm_scan_last_inner].table->emb_sj_nest;
> -    SJ_MATERIALIZATION_INFO *mat_info= mat_nest->sj_mat_info;
> +    uint tables= my_count_bits(mat_nest->sj_inner_tables);

several my_count_bits() calls added over the patch, maybe the number of 
tables should be cached?

>  
>      double prefix_cost;
>      double prefix_rec_count;
> -    int first_tab= pos->sjm_scan_last_inner + 1 - mat_info->tables;
> +    int first_tab= pos->sjm_scan_last_inner + 1 - tables;
>      /* Get the prefix cost */
>      if (first_tab == (int)join->const_tables)
>      {
> @@ -13732,18 +13753,19 @@ void advance_sj_state(JOIN *join, table_
>      }
>  
>      /* Add materialization cost */
> -    prefix_cost += mat_info->materialization_cost.total_cost() +
> -                   prefix_rec_count * mat_info->scan_cost.total_cost();
> -    prefix_rec_count *= mat_info->rows;
> +    prefix_cost+=
> +      mat_nest->nested_join->sjm.materialization_cost.total_cost() +
> +      prefix_rec_count * mat_nest->nested_join->sjm.scan_cost.total_cost();
> +    prefix_rec_count*= mat_nest->nested_join->sjm.rows;
>      
>      uint i;
>      table_map rem_tables= remaining_tables;
> -    for (i= idx; i != (first_tab + mat_info->tables - 1); i--)
> +    for (i= idx; i != (first_tab + tables - 1); i--)
>        rem_tables |= join->positions[i].table->table->map;
>  
>      POSITION curpos, dummy;
>      /* Need to re-run best-access-path as we prefix_rec_count has changed */
> -    for (i= first_tab + mat_info->tables; i <= idx; i++)
> +    for (i= first_tab + tables; i <= idx; i++)
>      {
>        best_access_path(join, join->positions[i].table, rem_tables, i, FALSE,
>                         prefix_rec_count, &curpos, &dummy);
> @@ -13775,14 +13797,13 @@ void advance_sj_state(JOIN *join, table_
>         Duplicate weedout can be applied after all ON-correlated and 
>         correlated 
>      */
> -    TABLE_LIST *nest;
> -    if ((nest= new_join_tab->emb_sj_nest))
> +    if (emb_sj_nest)
>      {
>        if (!pos->dupsweedout_tables)
>          pos->first_dupsweedout_table= idx;
>  
> -      pos->dupsweedout_tables |= nest->sj_inner_tables |
> -                                 nest->nested_join->sj_depends_on;
> +      pos->dupsweedout_tables|= emb_sj_nest->sj_inner_tables |
> +                                emb_sj_nest->nested_join->sj_depends_on;
>      }
>  
>      if (pos->dupsweedout_tables && 
> 
> === modified file 'sql/sql_test.cc'
> --- a/sql/sql_test.cc	2010-07-13 17:29:44 +0000
> +++ b/sql/sql_test.cc	2010-07-27 08:23:50 +0000
> @@ -377,21 +377,22 @@ print_plan(JOIN* join, uint idx, double 
>  }
>  
>  
> -void print_sjm(SJ_MATERIALIZATION_INFO *sjm)
> +void print_sjm(TABLE_LIST *emb_sj_nest)
>  {
>    DBUG_LOCK_FILE;
> +  SJ_MATERIALIZATION_INFO *sjm= emb_sj_nest->sj_mat_info;
>    fprintf(DBUG_FILE, "\nsemi-join nest{\n");
>    fprintf(DBUG_FILE, "  tables { \n");
>    for (uint i= 0;i < sjm->tables; i++)
>    {
>      fprintf(DBUG_FILE, "    %s%s\n", 
> -            sjm->positions[i].table->table->alias,
> +           
> emb_sj_nest->nested_join->sjm.positions[i].table->table->alias,
>              (i == sjm->tables -1)? "": ",");
>    }
>    fprintf(DBUG_FILE, "  }\n");
>    fprintf(DBUG_FILE, "  materialize_cost= %g\n",
> -          sjm->materialization_cost.total_cost());
> -  fprintf(DBUG_FILE, "  rows= %g\n", sjm->rows);
> +          emb_sj_nest->nested_join->sjm.materialization_cost.total_cost());
> +  fprintf(DBUG_FILE, "  rows= %g\n", emb_sj_nest->nested_join->sjm.rows);
>    fprintf(DBUG_FILE, "}\n");
>    DBUG_UNLOCK_FILE;
>  }
> 
> === modified file 'sql/sql_test.h'
> --- a/sql/sql_test.h	2010-07-13 17:29:44 +0000
> +++ b/sql/sql_test.h	2010-07-27 08:23:50 +0000
> @@ -30,7 +30,7 @@ void TEST_join(JOIN *join);
>  void print_plan(JOIN* join,uint idx, double record_count, double read_time,
>                  double current_read_time, const char *info);
>  void dump_TABLE_LIST_graph(SELECT_LEX *select_lex, TABLE_LIST* tl);
> -void print_sjm(SJ_MATERIALIZATION_INFO *sjm);
> +void print_sjm(TABLE_LIST *emb_sj_nest);
>  void print_keyuse_array(DYNAMIC_ARRAY *keyuse_array);
>  #endif
>  void mysql_print_status();
> 
> === modified file 'sql/table.h'
> --- a/sql/table.h	2010-07-23 18:14:59 +0000
> +++ b/sql/table.h	2010-07-27 08:23:50 +0000
> @@ -1939,6 +1939,21 @@ public:
>    Natural_join_column *get_natural_column_ref();
>  };
>  
> +typedef struct st_semijoin_mat

It would be good to have a doxygen comment for this structure, to say 
that this serves for the optimization stage...

> +{
> +  /* Optimal join order calculated for inner tables of this semijoin op. */
> +  struct st_position *positions;
> +  /* True if data types allow the MaterializeScan semijoin strategy */
> +  bool scan_allowed;
> +  /* Expected #rows in the materialized table */
> +  double rows;
> +  /* Materialization cost - execute sub-join and write rows to temp.table */
> +  COST_VECT materialization_cost;
> +  /* Cost to make one lookup in the temptable */
> +  COST_VECT lookup_cost;
> +  /* Cost of scanning the materialized table */
> +  COST_VECT scan_cost;
> +} st_semijoin_mat;
>  
>  typedef struct st_nested_join
>  {
> @@ -1971,6 +1986,7 @@ typedef struct st_nested_join
>      of the semi-join, respectively.
>    */
>    List<Item>        sj_outer_exprs, sj_inner_exprs;
> +  st_semijoin_mat   sjm;
>    /**
>       True if this join nest node is completely covered by the query execution
>       plan. This means two things.
> 
> 
> 
> ------------------------------------------------------------------------
> 
> 


-- 
Mr. Guilhem Bichot <guilhem.bichot@stripped>
Oracle / MySQL / Optimizer team, Lead Software Engineer
Bordeaux, France
www.oracle.com / www.mysql.com
Thread
bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Roy Lyseng27 Jul
  • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Guilhem Bichot27 Jul
  • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Guilhem Bichot9 Aug
    • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Roy Lyseng12 Aug
      • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Guilhem Bichot12 Aug
        • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Roy Lyseng12 Aug