From: Roy Lyseng Date: April 27 2011 10:21am Subject: Re: bzr commit into mysql-5.5 branch (jorgen.loland:3448) Bug#11765699 List-Archive: http://lists.mysql.com/commits/136184 Message-Id: <4DB7EE29.90403@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Hi Jørgen, thank you for fixing this problem. The bug fix is approved, but please note some comments below. On 15.04.11 10.29, Jorgen Loland wrote: > #At file:///export/home/jl208045/mysql/mysql-5.5-11765699/ based on revid:serge.kozlov@stripped > > 3448 Jorgen Loland 2011-04-15 > Bug#11765699 - 58690: !TABLE || > (!TABLE->READ_SET || BITMAP_IS_SET(TABLE->READ_SET, FIELD_INDEX > > JOIN::conds points to the same expression tree as SELECT_LEX::where, so > modifying the expression tree of JOIN::conds effectively does the same > modifications to SELECT_LEX::where. > > JOIN::conds is optimized in JOIN::optimize() > > conds= optimize_cond(conds); > > optimize_cond() makes modifications to the conds input parameter. One > of these modifications removes an item from the input conds parameter. > This was not a problem for JOIN::conds because removed items are > included in the condition returned from the function and therefore > assigned to conds as can be seen above. However, the removed parts of > the condition were lost from SELECT_LEX::where. > > This bug was about a query with three levels of subselects: > > SELECT 1 FROM t1 WHERE a = > (SELECT 1 FROM t2 WHERE b = > (SELECT 1 FROM t1 t11 WHERE c = 1 OR t1.a = 1 AND 1 = 2) > GROUP BY b > ); > > Prior to optimization, select#3 has > > join->conds = select_lex->where = "c = 1 OR t1.a = 1 AND 1 = 2" > > As part of optimize_cond(), remove_eq_conds() sees that > "t1.a = 1 AND 1 = 2" is FALSE and correctly removes this condition. This > leaves the OR item with only one argument: "c = 1". With only one > argument left in this OR, internal_remove_eq_conds() empties the argument > list of the OR and returns "c = 1". The effect is that after > optimize_cond(), select#3 has: > > join->conds = "c = 1" > select_lex->where = "" //<- !! > > Fixed by not emptying the argument list of an Item_cond even when there > is only one argument. The effect is that after optimize_cond(), > select_lex->where is an OR-item with only one argument: "c = 1". This is > slightly inefficient (evaluation would have to go through the OR), but > it works correctly. Note that condition evaluation is performed using the > correctly optimized JOIN::conds. > > Note: This fix has to be reconsidered when the optimizer has been split > into separate phases. The problem here is not necessarily the lack of phasing, but rather the sloppy handling and missing semantic definition of JOIN::conds and SELECT_LEX::where (and the similar having fields). > @ mysql-test/r/subselect_innodb.result > Added test for BUG#11765699 > @ mysql-test/t/subselect_innodb.test > Added test for BUG#11765699 > @ sql/sql_select.cc > In internal_remove_eq_conds(): Do not empty argument_list() of an Item_cond > with only one argument. This will result in lost condition parts in > SELECT_LEX::where/having > > modified: > mysql-test/r/subselect_innodb.result > mysql-test/t/subselect_innodb.test Is there a reason that you added this to an innodb-specific file? I got the same error with a MyISAM table. > sql/sql_select.cc > === modified file 'mysql-test/r/subselect_innodb.result' > --- a/mysql-test/r/subselect_innodb.result 2011-02-17 12:41:25 +0000 > +++ b/mysql-test/r/subselect_innodb.result 2011-04-15 08:29:20 +0000 > @@ -254,3 +254,21 @@ SELECT * FROM t1 WHERE b< (SELECT CAST( > a b > 2011-05-13 0 > DROP TABLE t1; > +# > +# Bug 11765699 - 58690: !TABLE || (!TABLE->READ_SET || > +# BITMAP_IS_SET(TABLE->READ_SET, FIELD_INDEX > +# > +CREATE TABLE t1( a INT ); > +INSERT INTO t1 VALUES (0), (1); > +CREATE TABLE t2( b TEXT, c INT, PRIMARY KEY (b( 1 )) ) ENGINE = INNODB; > +INSERT INTO t2 VALUES ('a', ''), ('b', ''); > +Warnings: > +Warning 1366 Incorrect integer value: '' for column 'c' at row 1 > +Warning 1366 Incorrect integer value: '' for column 'c' at row 2 Can you modify this insert statement so that we avoid the warnings? > +SELECT 1 FROM t1 WHERE a = > +(SELECT 1 FROM t2 WHERE b = > +(SELECT 1 FROM t1 t11 WHERE c = 1 OR t1.a = 1 AND 1 = 2) > +GROUP BY b > +); > +1 > +DROP TABLE t1, t2; > > === modified file 'mysql-test/t/subselect_innodb.test' > --- a/mysql-test/t/subselect_innodb.test 2011-02-17 12:41:25 +0000 > +++ b/mysql-test/t/subselect_innodb.test 2011-04-15 08:29:20 +0000 > @@ -247,3 +247,22 @@ CREATE TABLE t1(a date, b int, unique(b) > INSERT INTO t1 VALUES ('2011-05-13', 0); > SELECT * FROM t1 WHERE b< (SELECT CAST(a as date) FROM t1 GROUP BY a); > DROP TABLE t1; > + > +--echo # > +--echo # Bug 11765699 - 58690: !TABLE || (!TABLE->READ_SET || > +--echo # BITMAP_IS_SET(TABLE->READ_SET, FIELD_INDEX > +--echo # > + > +CREATE TABLE t1( a INT ); > +INSERT INTO t1 VALUES (0), (1); > + > +CREATE TABLE t2( b TEXT, c INT, PRIMARY KEY (b( 1 )) ) ENGINE = INNODB; > +INSERT INTO t2 VALUES ('a', ''), ('b', ''); > + > +SELECT 1 FROM t1 WHERE a = > + (SELECT 1 FROM t2 WHERE b = > + (SELECT 1 FROM t1 t11 WHERE c = 1 OR t1.a = 1 AND 1 = 2) > + GROUP BY b > + ); > + > +DROP TABLE t1, t2; > > === modified file 'sql/sql_select.cc' > --- a/sql/sql_select.cc 2011-04-14 09:10:11 +0000 > +++ b/sql/sql_select.cc 2011-04-15 08:29:20 +0000 > @@ -919,6 +919,13 @@ JOIN::optimize() > thd->restore_active_arena(arena,&backup); > } > > + /* > + Note: optimize_cond makes changes to conds. Since > + select_lex->where and conds points to the same condition, this > + function call effectively changes select_lex->where as well. This > + needs to be reconsidered after splitting the optimizer into > + properly separated phases. > + */ Perhaps make this a Doxygen @note? > conds= optimize_cond(this, conds, join_list,&cond_value); > if (thd->is_error()) > { > @@ -928,6 +935,7 @@ JOIN::optimize() > } > > { > + // Note above about optimize_cond also applies to SELEC_LEX->having SELEC_LEX -> select_lex > having= optimize_cond(this, having, join_list,&having_value); > if (thd->is_error()) > { > @@ -9441,7 +9449,8 @@ optimize_cond(JOIN *join, COND *conds, L > SYNPOSIS > remove_eq_conds() > thd THD environment > - cond the condition to handle > + cond the condition to handle. Note that cond > + is changed by this function > cond_value the resulting value of the condition Rewriting the interface like this would be better: bool remove_eq_conds(THD *const thd, Item **condition, Item::cond_result cond_value); Then the return value would be just the regular error return indication. (This is not a proposal, just an observation). > > RETURN > @@ -9503,9 +9512,34 @@ internal_remove_eq_conds(THD *thd, COND > *cond_value != Item::COND_OK) > return (COND*) 0; > if (((Item_cond*) cond)->argument_list()->elements == 1) > - { // Remove list > + { > + /* > + BUG#11765699: > + We're dealing with an AND or OR item that has only one > + argument. However, it is not an option to empty the list > + because: > + > + - this function is called for either JOIN::conds or > + JOIN::having, but these point to the same condition as > + SELECT_LEX::where and SELECT_LEX::having do. > + > + - The return value of remove_eq_conds() is assigned to > + JOIN::conds and JOIN::having, so emptying the list and > + returning the only remaining item "replaces" the AND or OR > + with item for the variables in JOIN. However, the return > + value is not assigned to the SELECT_LEX counterparts. Thus, > + if argument_list is emptied, SELECT_LEX forgets the item in > + argument_list()->head(). > + > + item is therefore returned, but argument_list is not emptied. > + */ > item= ((Item_cond*) cond)->argument_list()->head(); > - ((Item_cond*) cond)->argument_list()->empty(); > + /* > + Consider reenabling the line below when the optimizer has been > + split into properly separated phases. > + > + ((Item_cond*) cond)->argument_list()->empty(); > + */ > return item; > } > } Thanks, Roy