List:Commits« Previous MessageNext Message »
From:Roy Lyseng Date:April 27 2010 8:21am
Subject:Re: bzr commit into mysql-6.0-codebase-bugfixing branch
(roy.lyseng:3829) WL#5266
View as plain text  
Hi Øystein,

thanks for good suggestions!

Øystein Grøvlen wrote:
> Hi Roy,
> 
> Thanks for the updated patch. This looks even better.  See in-line for
> some nit-picking.
> 
> Roy Lyseng wrote:
>  > #At file:///home/rl136806/mysql/repo/mysql-6.0-work2/ based on 
> revid:horst.hunger@stripped
>  >
>  >  3829 Roy Lyseng    2010-04-21
>  >       WL#5266 - Refactor data used to collect correlated expressions 
> in semijoins
> 
> ...
> 
>  > === modified file 'sql/sql_select.cc'
>  > --- a/sql/sql_select.cc    2010-04-17 06:34:02 +0000
>  > +++ b/sql/sql_select.cc    2010-04-21 10:27:39 +0000
>  > @@ -275,7 +275,15 @@ static uint make_join_orderinfo(JOIN *jo
>  >  static int
>  >  join_read_record_no_init(JOIN_TAB *tab);
>  >  static
>  > -bool subquery_types_allow_materialization(Item_in_subselect *in_subs);
>  > +bool subquery_types_allow_materialization(Item_in_subselect 
> *predicate);
>  > +static
>  > +void semijoin_types_allow_materialization(TABLE_LIST *sj_nest,
>  > +                                          bool 
> types_allow_materialization,
>  > +                                          bool *);
> 
> This forward declaration does not match the actual declaration.  Also,
> please, mention all parameters.

Outch...
> 
> ...
> 
>  > +static
>  > +bool resolve_subquery(THD *thd, JOIN *join)
>  > +
>  > +{
>  > +  DBUG_ENTER("resolve_subquery");
>  > +
>  > +  SELECT_LEX *select_lex= join->select_lex;
>  > +
>  > +  /* Return if we are just normalizing a view */
>  > +  if (thd->lex->view_prepare_mode)
>  > +    DBUG_RETURN(FALSE);
>  > +  /* Return if the select lex is tagged as within a DESCRIBE */
>  > +  if (join->select_options & SELECT_DESCRIBE)
>  > +    DBUG_RETURN(FALSE);
>  > +
>  > +  /*
>  > +    @todo for PS, make the whole block execute only on the first 
> execution.
>  > +    resolve_subquery() is only invoked in the first execution for 
> subqueries
>  > +    that are transformed to semijoin, but for other subqueries, this 
> function
>  > +    is called for every execution. One solution is perhaps to define
>  > +    exec_method in class Item_subselect and exit immediately if 
> unequal to
>  > +    EXEC_UNSPECIFIED.
>  > +  */
>  > +  Item_subselect *subq_predicate= select_lex->master_unit()->item;
>  > +  DBUG_ASSERT(subq_predicate);
> 
> Do you really need subq_predicate?  Could not in_exists_predicate be
> used instead?  It is OK to use it as an abbreviation in order to make
> the statement just below shorter, but I think it is a bit confusing to
> have two pointers for the same thing (see comment below).

subq_predicate is required in the call to select_transformer(), so I cannot 
remove it. I could replace it with select_lex->master_unit()->item, but I do not 
think that is a good idea.

As an afterthought, assigning exec_method and the other fields and functions to 
Item_exists_subselect was not my best idea. It would have been better to put 
them on Item_subselect, which would mean that we would have general 
transformation capabilities for all subquery types.

In this case, in_exists_predicate would be replaced with subq_predicate in most 
cases.

However, it would mean violating the approved design, and I will not do that 
unless the reviewers insist...
> 
>  > +
>  > +  /* in_exists_predicate is non-NULL for IN, =ANY and EXISTS 
> predicates */
>  > +  Item_exists_subselect *in_exists_predicate=
>  > +    (subq_predicate->substype() == Item_subselect::IN_SUBS ||
>  > +     subq_predicate->substype() == Item_subselect::EXISTS_SUBS) ?
>  > +        (Item_exists_subselect*)subq_predicate :
>  > +        NULL;
> 
> ...
> 
>  > +    if (thd->optimizer_switch_flag(OPTIMIZER_SWITCH_MATERIALIZATION) 
>  &&
>  > +        subq_predicate->substype() == Item_subselect::IN_SUBS &&   
>  
> // 1
>  > +        !select_lex->is_part_of_union() &&     // 2
>  > +        select_lex->master_unit()->first_select()->leaf_tables
> && 
>     // 3
>  > +        thd->lex->sql_command == SQLCOM_SELECT &&     // *
>  > +        select_lex->outer_select()->leaf_tables &&     // 3A
>  > +        in_predicate->is_top_level_item() &&     // 4
>  > +        !in_predicate->is_correlated &&     // 5
>  > +        in_predicate->exec_method ==
>  > +                      Item_exists_subselect::EXEC_UNSPECIFIED &&     
> // 6
>  > +        subquery_types_allow_materialization(in_predicate))     // 7
>  > +      in_predicate->exec_method= 
> Item_exists_subselect::EXEC_MATERIALIZATION;
> 
> Ditto.

I admit it can be confusing, but in_predicate is not valid unless the test 
subq_predicate->substype() == Item_subselect::IN_SUBS is True. Notice also that 
the assignment to in_predicate is immediately above this big if test, so there 
should not be a big confusion.

Please advice what I should do here...
> 
> ...
> 
>  > @@ -972,46 +1039,66 @@ err:
>  >  */
>  >
>  >  static
>  > -bool subquery_types_allow_materialization(Item_in_subselect *in_subs)
>  > +void semijoin_types_allow_materialization(TABLE_LIST *sj_nest,
>  > +                                          bool 
> *materialization_allowed,
>  > +                                          bool *sjm_scan_allowed)
> 
> I suggest dropping the  materialization_allowed parameter let it be
> the return value.  Then the function call could be used directly in
> the if-statement that checks this.

OK.
> 
> ...
> 
>  >  {
>  > -  DBUG_ENTER("subquery_types_allow_materialization");
>  > +  DBUG_ENTER("semijoin_types_allow_materialization");
>  >
>  > -  DBUG_ASSERT(in_subs->left_expr->fixed);
>  > +  DBUG_ASSERT(sj_nest->nested_join->sj_outer_exprs.elements ==
>  > +              sj_nest->nested_join->sj_inner_exprs.elements);
>  >
>  > -  List_iterator<Item>
> it(in_subs->unit->first_select()->item_list);
>  > -  uint elements= in_subs->unit->first_select()->item_list.elements;
>  > +  List_iterator<Item> it1(sj_nest->nested_join->sj_outer_exprs);
>  > +  List_iterator<Item> it2(sj_nest->nested_join->sj_inner_exprs);
>  > +
>  > +  *materialization_allowed= FALSE;
>  > +  *sjm_scan_allowed= FALSE;
>  >
>  > -  in_subs->types_allow_materialization= FALSE;  // Assign default 
> values
>  > -  in_subs->sjm_scan_allowed= FALSE;
>  > -
>  >    bool all_are_fields= TRUE;
>  > -  for (uint i= 0; i < elements; i++)
>  > +  Item *outer, *inner;
>  > +  while (outer= it1++, inner= it2++)
> 
> Since outer and inner is not needed outside the loop, I wonder whether
> a for loop would be better.  Also, would not
>   (outer= it1++ && inner= it2++)
> be a bit safer?

A for loop will either need a loop variable or it becomes unnecessary 
complicated, I think. Can you suggest a simple solution with a for loop?

About safety, we have already asserted that the element counts are the same.

> 
> ...
> 
>  > @@ -9301,6 +9446,8 @@ static bool make_join_select(JOIN *join,
>  >        if (pushdown_on_conditions(join, tab))
>  >          DBUG_RETURN(1);
>  >
>  > +      DBUG_ASSERT(save_used_tables ? tab->emb_sj_nest != NULL : TRUE);
>  > +
> 
> Why not DBUG_ASSERT(save_used_tables && tab->emb_sj_nest != NULL)?

It would be:
DBUG_ASSERT((!save_used_tables) ||
             (save_used_tables && tab->emb_sj_nest != NULL));

then. I think the assert in the commit is the best one.
> 
>  >       if (save_used_tables && !(used_tables &
>  >                                 ~(tab->emb_sj_nest->sj_inner_tables |
>  >                                   join->const_table_map | 
> PSEUDO_TABLE_BITS)))
> 
> ...
> 
>  > @@ -13476,6 +13611,9 @@ void advance_sj_state(JOIN *join, table_
>  >      join->cur_dups_producing_tables |= emb_sj_nest->sj_inner_tables;
>  >
>  >      /* Remove the sj_nest if all of its SJ-inner tables are in 
> cur_table_map */
>  > +    DBUG_ASSERT((remaining_tables & emb_sj_nest->sj_inner_tables &
>  > +                 ~new_join_tab->table->map) ==
>  > +                (remaining_tables & emb_sj_nest->sj_inner_tables));
>  >      if (!(remaining_tables &
>  >            emb_sj_nest->sj_inner_tables &
> ~new_join_tab->table->map))
>  >        join->cur_sj_inner_tables &= ~emb_sj_nest->sj_inner_tables;
> 
> I not quite sure what/why you are trying to assert here,  but it seems
> like the above assert is equivalent to
> 
> DBUG_ASSERT(!(remaining_tables & emb_sj_nest->sj_inner_tables &
>               new_join_tab->table->map));

It was a start of an experiment I did to see whether it was possible to 
eliminate table->map from the statement below. The DBUG_ASSERT documents that it 
is redundant. I forgot to followup on that, and I also forgot to remove the 
DBUG_ASSERT. Sorry about that...

Now, the DBUG_ASSERT is removed and table->map is eliminated from the statement.
> ...
> 
>  > @@ -13627,10 +13764,12 @@ void advance_sj_state(JOIN *join, table_
>  >          pos->first_dupsweedout_table= idx;
>  >
>  >        pos->dupsweedout_tables |= nest->sj_inner_tables |
>  > -                                 nest->nested_join->sj_depends_on |
>  > -                                 nest->nested_join->sj_corr_tables;
>  > +                                 nest->nested_join->sj_depends_on;
>  >      }
>  >
>  > +    DBUG_ASSERT((remaining_tables & ~new_join_tab->table->map &
>  > +                 pos->dupsweedout_tables) ==
>  > +                (remaining_tables & pos->dupsweedout_tables));
>  >      if (pos->dupsweedout_tables &&
>  >          !(remaining_tables &
>  >            ~new_join_tab->table->map & pos->dupsweedout_tables))
> 
> And the above assert seems to be equivalent to
> 
> DBUG_ASSERT(!(remaining_tables & pos->dupsweedout_tables &
>               new_join_tab->table->map))
> 
> But I do not understand the purpose of this assert.
> 
Same as above.

Thanks,
Roy
Thread
bzr commit into mysql-6.0-codebase-bugfixing branch (roy.lyseng:3829) WL#5266Roy Lyseng21 Apr
  • Re: bzr commit into mysql-6.0-codebase-bugfixing branch(roy.lyseng:3829) WL#5266Øystein Grøvlen26 Apr
    • Re: bzr commit into mysql-6.0-codebase-bugfixing branch(roy.lyseng:3829) WL#5266Øystein Grøvlen26 Apr
      • Re: bzr commit into mysql-6.0-codebase-bugfixing branch(roy.lyseng:3829) WL#5266Roy Lyseng27 Apr
    • Re: bzr commit into mysql-6.0-codebase-bugfixing branch(roy.lyseng:3829) WL#5266Roy Lyseng27 Apr
      • Re: bzr commit into mysql-6.0-codebase-bugfixing branch(roy.lyseng:3829) WL#5266Øystein Grøvlen27 Apr