On 20.08.10 11.19, Jorgen Loland wrote:
> Roy,
>
> Thank you for the patch and a very informative commit message. For the
> most part, the patch looks correct but I have a few
> questions/requirements for a new patch. See inline comments tagged
> with (jl).
>
>
> > === modified file 'sql/item.cc'
> > --- a/sql/item.cc 2010-05-14 10:58:39 +0000
> > +++ b/sql/item.cc 2010-06-10 08:17:31 +0000
> > @@ -2296,21 +2296,73 @@ table_map Item_field::used_tables() cons
> > }
> >
> >
> > -void Item_field::fix_after_pullout(st_select_lex *new_parent, Item **ref)
> > +table_map Item_field::resolved_used_tables() const
> > {
> > - if (new_parent == depended_from)
> > - depended_from= NULL;
> > - Name_resolution_context *ctx= new Name_resolution_context();
> > - ctx->outer_context= NULL; // We don't build a complete name resolver
> > - ctx->table_list= NULL; // We rely on first_name_resolution_table instead
> > - ctx->select_lex= new_parent;
> > - ctx->first_name_resolution_table=
> context->first_name_resolution_table;
> > - ctx->last_name_resolution_table= context->last_name_resolution_table;
> > - ctx->error_processor= context->error_processor;
> > - ctx->error_processor_data= context->error_processor_data;
> > - ctx->resolve_in_select_list= context->resolve_in_select_list;
> > - ctx->security_ctx= context->security_ctx;
> > - this->context=ctx;
> > + if (field->table->const_table)
> > + return 0; // const item
> > + return field->table->map;
> > +}
> > +
> > +void Item_field::fix_after_pullout(st_select_lex *parent_select,
> > + st_select_lex *removed_select,
> > + Item **ref)
> > +{
>
> (snip)
>
> > + if (depended_from)
> > + {
> > + /*
> > + Refresh used_tables information for subqueries between the definition
> > + scope and resolution scope of the field item reference.
> > + */
> > + st_select_lex *child_select= context->select_lex;
> > + while (child_select->outer_select() != depended_from)
> > + {
> > + /*
> > + The subquery on this level is outer-correlated with respect to the field
> > + */
> > + Item_subselect *subq_predicate= child_select->master_unit()->item;
> > +
> > + subq_predicate->used_tables_cache|= OUTER_REF_TABLE_BIT;
> > +
> > + child_select= child_select->outer_select();
> > + }
> > + /*
> > + child_select is select_lex immediately inner to the depended_from level.
> > + Now, locate the subquery predicate that contains this select_lex and
> > + update used tables information.
> > + */
> > + Item_subselect *subq_predicate= child_select->master_unit()->item;
> > +
> > + subq_predicate->used_tables_cache|= this->resolved_used_tables();
> > + subq_predicate->const_item_cache&= this->const_item();
> > + }
>
> (jl)
> The content of this 'while' looks incorrect to me. To my knowledge,
> OUTER_REF_TABLE_BIT should be set only if an item references a table
> in an outer select. Using your convention of selects and tables:
>
> ---------
> select a from t1
> where a in (select c from t2 where d >= some(select e from t3 where b=e));
>
> A - t1 (a, b) <----.
> \ (b has reference to this select)
> B - t2 (c, d) |
> \ |
> C - t3 (e) ----'
>
> ---------
>
> Here, only b in the innermost select should have OUTER_REF_TABLE_BIT.
>
> The above while loop changes this after sj pullout. It sets
> OUTER_REF_TABLE_BIT for all Item_subselect between some Item that has
> an outer reference and the removed select. I need one more nesting
> level to illustrate:
>
> Before pullout:
>
> A <---.
> \ |
> B |
> \ |
> C |
> \ |
> D --' *1)
>
> *1) OUTER_REF_TABLE_BIT is set for the item that refers to a
> table in A)
>
> After pullout of B:
>
> A <---.
> \ |
> C | *2)
> \ |
> D --' *1)
>
> *2) OUTER_REF_TABLE_BIT is set for the Item_subselect in C. This is
> incorrect
>
>
> I suggest you only set OUTER_REF_TABLE_BIT for the Item_field
> fix_after_pullout() is called on; something like:
>
>
> if (depended_from)
> {
> st_select_lex *child_select= context->select_lex;
> if (child_select->outer_select() != depended_from)
> {
> Item_subselect *subq_predicate= child_select->master_unit()->item;
> subq_predicate->used_tables_cache|= OUTER_REF_TABLE_BIT;
> }
> while (child_select->outer_select() != depended_from)
> child_select= child_select->outer_select();
>
> /*
> child_select is select_lex immediately inner to the depended_from level.
> Now, locate the subquery predicate that contains this select_lex and
> update used tables information.
> */
> Item_subselect *subq_predicate= child_select->master_unit()->item;
> subq_predicate->used_tables_cache|= this->resolved_used_tables();
> subq_predicate->const_item_cache&= this->const_item();
> }
Thank you for the analysis. I considered both options originally, but coded this
way in the patch, because I did not know what was the correct way. Now I am
convinced that your proposal is correct, and will change the patch accordingly.
>
> (jl)
> Q: Does const_item_cache change?
I do not know. Just applying the 'better safe than sorry' principle.
>
> > === modified file 'sql/item.h'
> > --- a/sql/item.h 2010-05-14 10:58:39 +0000
> > +++ b/sql/item.h 2010-06-10 08:17:31 +0000
> > @@ -583,12 +583,19 @@ public:
> (snip)
> > + /**
> > + Fix after tables have been moved from one select_lex level to the parent
> > + level. Basically re-calculate all attributes dependent on the tables.
> >
> > + @param parent_select select_lex that tables are moved to.
> > + @param removed_select select_lex that tables are moved away from,
> > + child of parent_select.
> > + @param ref updated with new ref whenever the function substitutes
> > + this item with another.
> > + */
> > + virtual void fix_after_pullout(st_select_lex *parent_select,
> > + st_select_lex *removed_select,
> > + Item **ref) {};
>
> (jl)
> Minor detail: Suggest you write something like "...to the parent level
> by semijoin conversion"
Done.
>
> > === modified file 'sql/item_cmpfunc.cc'
> > --- a/sql/item_cmpfunc.cc 2010-05-14 10:58:39 +0000
> > +++ b/sql/item_cmpfunc.cc 2010-06-10 08:17:31 +0000
> > @@ -1755,6 +1755,32 @@ bool Item_in_optimizer::fix_fields(THD *
> > }
> >
> > +void Item_in_optimizer::fix_after_pullout(st_select_lex *parent_select,
> > + st_select_lex *removed_select,
> > + Item **ref)
> > +{
> > + used_tables_cache=0;
> > + not_null_tables_cache= 0;
> > + const_item_cache= 1;
> > +
> > + /*
> > + No need to call fix_after_pullout() on args[0] (ie left expression),
> > + as Item>in_subselect::fix_after_pullout() will do this.
>
> (jl) typo "Item>"
Fixed.
>
> + */
> + //args[0]->fix_after_pullout(parent_select, removed_select, &args[0]);
> +
> + //used_tables_cache|= args[0]->used_tables();
> + //not_null_tables_cache|= args[0]->not_null_tables();
> + //const_item_cache&= args[0]->const_item();
>
> (jl) Clean up this ^
Done.
>
>
> > === modified file 'sql/item_subselect.cc'
> > --- a/sql/item_subselect.cc 2010-05-14 10:58:39 +0000
> > +++ b/sql/item_subselect.cc 2010-06-10 08:17:31 +0000
> > @@ -303,6 +303,67 @@ bool Item_subselect::exec()
> > }
> > + while ((item=li++))
> > + item->fix_after_pullout(parent_select, removed_select, li.ref());
> > +
> > + // Probably need to call fix_after_pullout for semijoin expressions as well.
> > +
> > + // No need to call fix_after_pullout for JOIN expressions, as these
> > + // cannot have outer references?
> > +
> > + // No need to resolve for ORDER and GROUP BY fields as these contain
> > + // references only?
> > + //for (ORDER *order= (ORDER*) sel->order_list.first;
> > + // order;
> > + // order= order->next)
> > + // order->fix_after_pullout(parent_select, removed_select, ref_arg);
> > +
> > + //for (ORDER *order= (ORDER*) sel->group_list.first;
> > + // order;
> > + // order= order->next)
> > + // order->fix_after_pullout(parent_select, removed_select, ref_arg);
>
> (jl) Clean up this ^
Will investigate further whether some of these loops are needed.
>
> > @@ -3589,16 +3593,16 @@ bool convert_subq_to_sj(JOIN *parent_joi
> > /* Fix the created equality and AND */
> > sj_nest->sj_on_expr->fix_fields(parent_join->thd,
> &sj_nest->sj_on_expr);
> >
> > + /* Unlink the child select_lex so it doesn't show up in EXPLAIN: */
> > + subq_lex->master_unit()->exclude_level();
> > /*
> > Walk through sj nest's WHERE and ON expressions and call
> > item->fix_table_changes() for all items.
> > */
> > - sj_nest->sj_on_expr->fix_after_pullout(parent_lex,
> &sj_nest->sj_on_expr);
> > - fix_list_after_tbl_changes(parent_lex,
> &sj_nest->nested_join->join_list);
> > -
> > -
> > - /* Unlink the child select_lex so it doesn't show up in EXPLAIN: */
> > - subq_lex->master_unit()->exclude_level();
> > + sj_nest->sj_on_expr->fix_after_pullout(parent_lex, subq_lex,
> > + &sj_nest->sj_on_expr);
> > + fix_list_after_tbl_changes(parent_lex, subq_lex,
> > + &sj_nest->nested_join->join_list);
>
> (jl) Q: Was it necessary to move the "/*Unlink the child..." part before
> the "Walk through sj nest's..." part?
>
Yes, it was. The new fix_after_pullout() functions do not make sense when
applied to a removed select_lex level.
Thanks,
Roy