List:Commits« Previous MessageNext Message »
From:Olav Sandstaa Date:February 9 2011 12:45pm
Subject:Re: bzr commit into mysql-trunk branch (olav.sandstaa:3559) Bug#59696
View as plain text  
Hi Roy,

Thanks for the review of this patch.

On 07/02/2011 14:36, Roy Lyseng wrote:
> Hi Olav,
>
> thank you for a clever solution to an ugly bug.
>
> As far as I understand the code, the fix is correct. I just wonder if 
> you should add an assert on the transform of an Item_ref that will 
> trigger when anything but the original object is returned. The reason 
> for this is that I think that Item_ref objects can be used as helper 
> objects to encapsulate other item objects, in order to prevent having 
> to perform multiple updates if an item is being replaced. E.g, if an 
> item is used both in WHERE and HAVING, the HAVING item is encapsulated 
> in an Item_ref, so that substituting the original item does not have a 
> consequence within the HAVING clause, only within the Item_ref.

I agree that this is a good idea and in the updated patch I have added 
this assert. The test suite passes with it.


> Some minor comments and suggestions below.

I have committed an updated version of the patch. This fixes most of 
your suggestions and is available here:

   http://lists.mysql.com/commits/130859

See inline for some responses to your comments.

Olav

>
> On 27.01.11 16.51, Olav Sandstaa wrote:
>> #Atfile:///export/home/tmp/mysql2/opt-bug59696/  based
> onrevid:anitha.gopi@stripped
>>
>>   3559 Olav Sandstaa	2011-01-27
>>        Fix for Bug#59696 Optimizer fails to move WHERE condition on JOIN column
>>                          when joining with a view
>>
>>        When optimizing a JOIN of two tables the optimizer tries to move WHERE
>>        conditions on the join column(s) to the first table in the JOIN in
>>        order to reduce the number of records it has to read. If the second
>>        operand of the JOIN is a view that is identical to the underlying
>>        table the optimizer should be able to perform the same optimization.
>>        In this case it fails to move the WHERE condition on the join column
>>        to the first table.
>>
>>        The cause for this is that when the second operand is a view then the
>>        WHERE clauses is stored using Item_ref object that references the real
> suggest rewrite to:
> "conditions contain Item_ref objects that reference the real item 
> field objects.

Done.

>>        item objects. The existing code fails to evaluate these since it only
>>        evaluates the Item_ref object and fails to look at the real
>>        objects. As a result it fails to detect that the WHERE clause on the
> WHERE clause -> equals predicate

Done.

>>        second join operand can be substituted by a corresponding WHERE clause
>>        on the first operand.
> ditto here

Done.

>>
>>        There are two places where we fail to handle Item_ref correctly and that
>>        causes the optimizer to fail to do this optimization:
>>
>>        1. In optimize_cond() we call build_equal_items() which again calls
>>        build_equal_items_for_cond() where we in Item_func::compile() fail to
>>        evaluate the Item::equal_fields_propagator() for the condition on the
>>        view operand. As a result the field used in the condition for the view
>>        is not considered as an equal field to the join column of the first
>>        table when this later is evaluated in make_join_statistics().
>>
>>        2. Later in JOIN::optimize() we call substitute_for_best_equal_field()
>>        which uses Item_func::transform() to select the best condition of the
>>        tables. Item_func::transform() fails to evaluate the different
>>        conditions that are equal due to these being stored in a Item_ref
>>        object.
>>
>>        The fix for these problems is to implement compile() and transform()
>>        methods for the Item_ref class that will do the proper evaluation of
>>        the compile() and transform on the objects the Item_ref object has the
>>        reference to. This will ensure that when Item_func::compile() and
>>        Item_func::transform() evaluates Item_ref objects both the Item_ref
>>        object and the object it references get evaluated.
> Please try to simplify these two sentences.

Hopefully easier to read now.

>>       @ mysql-test/r/join.result
>>          Test case for Bug#59696 Optimizer fails to move WHERE condition
>>          on JOIN column when joining with a view.
>>       @ mysql-test/r/subquery_sj_none.result
>>          Change in explain/query plan due to the fix for Bug#59696 "Optimizer
>>          fails to move WHERE condition on JOIN column when joining with a view"
>>
>>          The change in query plan for the following query:
>>
>>            select * from t2 where (b,a) in (select a,b from t1 where a=3);
>>
>>          where the table t2 is defined as:
>>
>>            create table t2 as select * from t1;
>>
>>          is caused by that when the optimizer optimizes the subquery it will
>>          try to optimize the conditions (in optimize_cond()). One of the things
>>          it does it to call build_equal_items_for_cond() for the following
>>          condition:
>>
>>            (<cache>(`test`.`t2`.`b`) = `test`.`t1`.`a`)
>>
>>          When this is "compiled" in build_equal_items_for_cond() both the
>>          <cache>(`test`.`t2`.`b`) and `test`.`t1`.`a` are stored in
> Item_ref
>>          objects. So with the existing code the real items does not get
>>          "compiled" and are not considered for alternatives. So the result of
>>          the "compilation" is:
>>
>>            (<cache>(`test`.`t2`.`b`) = `test`.`t1`.`a`)
>>
>>          With the patch applied both of the real items get evaluated and the
>>          code is able to find that `test`.`t1`.`a` can be replaced by the
>>          constant "3". So the result is from the "compilation" of this condition
> is
> typo: remove first "is"

Done.

>>
>>            <cache>(`test`.`t2`.`b`) = 3
>>
>>          Which causes the execution plan for the subquery to change as it has.
>
> Thank you for the detailed explanation.

Your welcome :-)
>
>>       @ mysql-test/t/join.test
>>          Test case for Bug#59696 Optimizer fails to move WHERE condition
>>          on JOIN column when joining with a view.
>>       @ sql/item.cc
>>          Implement transform() and compile() methods for the Item_ref class.
> methods -> functions. Also one line below and four lines below.

Done, done and done.

>>          These methods will then do transform/compile for both the
>>          Item_ref object and on the object it has a reference to.
>>       @ sql/item.h
>>          Implement transform() and compile() methods for the Item_ref class.
>>
>>      modified:
>>        mysql-test/r/join.result
>>        mysql-test/r/subquery_sj_none.result
>>        mysql-test/t/join.test
>>        sql/item.cc
>>        sql/item.h
>> === modified file 'mysql-test/r/join.result'
>>
>> === modified file 'mysql-test/r/join.result'
>> --- a/mysql-test/r/join.result	2010-12-16 17:38:26 +0000
>> +++ b/mysql-test/r/join.result	2011-01-27 15:51:29 +0000
>> @@ -1222,3 +1222,40 @@
>>   DEALLOCATE PREPARE stmt;
>>   DROP TABLE t1;
>>   End of 5.1 tests
>> +#
>> +# Bug #59696 Optimizer fails to move WHERE condition on JOIN column
>> +#            when joining with a view
>> +#
>> +CREATE TABLE t1 (
>> +c1 INTEGER NOT NULL
>> +);
>> +INSERT INTO t1 VALUES (1),(2),(3);
>> +CREATE TABLE t2 (
>> +pk INTEGER NOT NULL,
>> +c1 INTEGER NOT NULL,
>> +PRIMARY KEY (pk)
>> +);
>> +INSERT INTO t2 VALUES (1,4),(3,5),(2,6);
>> +EXPLAIN SELECT t2.pk, t2.c1 FROM t2, t1
>> +WHERE t2.pk = t1.c1 AND t2.pk>= 2;
>> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
>> +1	SIMPLE	t1	ALL	NULL	NULL	NULL	NULL	3	Using where
>> +1	SIMPLE	t2	eq_ref	PRIMARY	PRIMARY	4	test.t1.c1	1	
>> +SELECT t2.pk, t2.c1 FROM t2, t1
>> +WHERE t2.pk = t1.c1 AND t2.pk>= 2;
>> +pk	c1
>> +2	6
>> +3	5
> Proposal: Do
> let $query= SELECT ...;
> eval explain $query;
> eval $query;
>
> so that you need to specify the query only once.

Not done (or rather: it was already done in the original patch for the 
test file - I do not know how to get that into the result file :-) )

>> +CREATE VIEW v_t2 AS SELECT * FROM t2;
>> +EXPLAIN SELECT v_t2.pk, v_t2.c1 FROM v_t2, t1
>> +WHERE v_t2.pk = t1.c1 AND v_t2.pk>= 2;
>> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
>> +1	SIMPLE	t1	ALL	NULL	NULL	NULL	NULL	3	Using where
>> +1	SIMPLE	t2	eq_ref	PRIMARY	PRIMARY	4	test.t1.c1	1	
>> +SELECT v_t2.pk, v_t2.c1 FROM v_t2, t1
>> +WHERE v_t2.pk = t1.c1 AND v_t2.pk>= 2;
>> +pk	c1
>> +2	6
>> +3	5
>> +DROP VIEW v_t2;
>> +DROP TABLE t1, t2;
>>
>> === modified file 'mysql-test/r/subquery_sj_none.result'
>> --- a/mysql-test/r/subquery_sj_none.result	2011-01-13 10:48:28 +0000
>> +++ b/mysql-test/r/subquery_sj_none.result	2011-01-27 15:51:29 +0000
>> @@ -2543,7 +2543,7 @@
>>   explain select * from t2 where (b,a) in (select a,b from t1 where a=3);
>>   id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
>>   1	PRIMARY	t2	ALL	NULL	NULL	NULL	NULL	100	Using where
>> -2	DEPENDENT SUBQUERY	t1	index_subquery	a	a	10	func,func	1	Using index; Using
> where
>> +2	DEPENDENT SUBQUERY	t1	ref	a	a	10	const,func	1	Using where; Using index
>>   drop table t1,t2;
>>   create table t1 (a int, b int);
>>   insert into t1 select a,a from t0;
>>
>> === modified file 'mysql-test/t/join.test'
>> --- a/mysql-test/t/join.test	2010-12-16 17:38:26 +0000
>> +++ b/mysql-test/t/join.test	2011-01-27 15:51:29 +0000
>> @@ -934,3 +934,43 @@
>>   DROP TABLE t1;
>>
>>   --echo End of 5.1 tests
>> +
>> +--echo #
>> +--echo # Bug #59696 Optimizer fails to move WHERE condition on JOIN column
>> +--echo #            when joining with a view
>> +--echo #
>> +
>> +CREATE TABLE t1 (
>> +  c1 INTEGER NOT NULL
>> +);
>> +
>> +INSERT INTO t1 VALUES (1),(2),(3);
>> +
>> +CREATE TABLE t2 (
>> +  pk INTEGER NOT NULL,
>> +  c1 INTEGER NOT NULL,
>> +  PRIMARY KEY (pk)
>> +);
>> +
>> +INSERT INTO t2 VALUES (1,4),(3,5),(2,6);
>> +
>> +let $query=
>> +SELECT t2.pk, t2.c1 FROM t2, t1
>> +WHERE t2.pk = t1.c1 AND t2.pk>= 2;
>> +
>> +eval EXPLAIN $query;
>> +eval $query;
>> +
>> +# Create a view on one of the tables. The same query plan should
>> +# be used when joining with this view as with the underlying table.
>> +CREATE VIEW v_t2 AS SELECT * FROM t2;
>> +
>> +let $query=
>> +SELECT v_t2.pk, v_t2.c1 FROM v_t2, t1
>> +WHERE v_t2.pk = t1.c1 AND v_t2.pk>= 2;
>> +
>> +eval EXPLAIN $query;
>> +eval $query;
>> +
>> +DROP VIEW v_t2;
>> +DROP TABLE t1, t2;
>>
>> === modified file 'sql/item.cc'
>> --- a/sql/item.cc	2011-01-18 10:32:38 +0000
>> +++ b/sql/item.cc	2011-01-27 15:51:29 +0000
>> @@ -6518,6 +6518,85 @@
>>   }
>>
>>
>> +/**
>> +  Transform an Item_ref object with a transformer callback function.
>> +
>> +  The function first applies the transform method to the item
> method -> function

Done.

>> +  referenced by this Item_reg object. If this returns a new item the
> type: Item_reg -> Item_ref

Done.

>> +  old item is substituted for a new one. After this the transformer
>> +  is applied to the Item_ref object.
>
> I think that you can remove references to error handling in the 
> comment without loss of clarity.

Not done. I do not consider what I wrote to be related to error handling.

>> +
>> +  @param transformer   the transformer callback function to be applied to
>> +                       the nodes of the tree of the object
>> +  @param argument      parameter to be passed to the transformer
>> +
>> +  @return Item returned as the result of transformation of the Item_ref object
>> +    @retval !NULL The transformation was successful
>> +    @retval NULL  Out of memory error
>> +*/
>> +
>> +Item* Item_ref::transform(Item_transformer transformer, uchar *arg)
>> +{
>> +  DBUG_ASSERT(!current_thd->is_stmt_prepare());
>> +  DBUG_ASSERT((*ref) != NULL);
>> +
>> +  /* Transform the object we are referencing. */
>> +  Item *new_item= (*ref)->transform(transformer, arg);
>> +  if (!new_item)
>> +    return NULL;
>> +
>> +  /*
>> +    THD::change_item_tree() should be called only if the tree was
>> +    really transformed, i.e. when a new item has been created.
>> +    Otherwise we'll be allocating a lot of unnecessary memory for
>> +    change records at each execution.
>> +  */
> Proposal:
> /* Record the new item in the 'ref' pointer, in a manner safe for 
> prepared execution. */
>
> IMHO, the argumentation for how this is called should be documented 
> with change_item_tree() and not where it is called.

Done (what I had in the original patch was copied from a "standard" 
comment that I found in about 7 other implementations of the 
Item::transform() member function :-) )

>> +  if (*ref != new_item)
>> +    current_thd->change_item_tree(ref, new_item);
>> +
>> +  /* Transform the item ref object. */
>> +  return (this->*transformer)(arg);
>
> Consider asserting that the return value from the transform is the 
> same as "this", as suggested above.

Done.

>> +}
>> +
>> +
>> +/**
>> +  Compile an Item_ref object with a processor and a transformer
>> +  callback functions.
> functions -> function.

Done.
>> +
>> +  First the function applies the analyzer to the Item_ref object. Then
>> +  if the analizer succeeeds we first applies the compile method to the
>> +  object the Item_ref object is referencing.  If this returns a new
>> +  item the old item is substituted for a new one.  After this the
>> +  transformer is applied to the Item_ref object itself.
>
> I think that you can omit error handling details here too.

Done (I think).

>> +
>> +  @param analyzer      the analyzer callback function to be applied to the
>> +                       nodes of the tree of the object
>> +  @param[in,out] arg_p parameter to be passed to the processor
>> +  @param transformer   the transformer callback function to be applied to the
>> +                       nodes of the tree of the object
>> +  @param arg_t         parameter to be passed to the transformer
>> +
>> +  @return Item returned as the result of transformation of the Item_ref object
>> +*/
>> +
>> +Item* Item_ref::compile(Item_analyzer analyzer, uchar **arg_p,
>> +                        Item_transformer transformer, uchar *arg_t)
>> +{
>> +  /* Analyze this Item object. */
>> +  if (!(this->*analyzer)(arg_p))
>> +    return NULL;
>> +
>> +  /* Compile the Item we are referencing. */
>> +  DBUG_ASSERT((*ref) != NULL);
>> +  Item *new_item= (*ref)->compile(analyzer, arg_p, transformer, arg_t);
>> +  if (new_item&&  *ref != new_item)
>> +    current_thd->change_item_tree(ref, new_item);
>> +
>> +  /* Transform this Item object. */
>> +  return (this->*transformer)(arg_t);
>> +}
>
> Considering the detailed explanation in the Doxygen header, I think 
> that the code comments are redundant.

Done.

>> +
>> +
>>   void Item_ref::print(String *str, enum_query_type query_type)
>>   {
>>     if (ref)
>>
>> === modified file 'sql/item.h'
>> --- a/sql/item.h	2011-01-10 16:37:47 +0000
>> +++ b/sql/item.h	2011-01-27 15:51:29 +0000
>> @@ -2620,6 +2620,9 @@
>>       return (*ref)->walk(processor, walk_subquery, arg) ||
>>              (this->*processor)(arg);
>>     }
>> +  virtual Item* transform(Item_transformer, uchar *arg);
>> +  virtual Item* compile(Item_analyzer analyzer, uchar **arg_p,
>> +                        Item_transformer transformer, uchar *arg_t);
>>     virtual void print(String *str, enum_query_type query_type);
>>     bool result_as_longlong()
>>     {
>>
>>
>>
>>
> Thanks,
> Roy


Thread
bzr commit into mysql-trunk branch (olav.sandstaa:3559) Bug#59696Olav Sandstaa27 Jan
  • Re: bzr commit into mysql-trunk branch (olav.sandstaa:3559) Bug#59696Roy Lyseng7 Feb
    • Re: bzr commit into mysql-trunk branch (olav.sandstaa:3559) Bug#59696Olav Sandstaa9 Feb
      • Re: bzr commit into mysql-trunk branch (olav.sandstaa:3559) Bug#59696Roy Lyseng9 Feb