Hi Jørgen,
thank you for fixing this problem. The bug fix is approved, but please note some
comments below.
On 15.04.11 10.29, Jorgen Loland wrote:
> #At file:///export/home/jl208045/mysql/mysql-5.5-11765699/ based on
> revid:serge.kozlov@stripped
>
> 3448 Jorgen Loland 2011-04-15
> Bug#11765699 - 58690: !TABLE ||
> (!TABLE->READ_SET || BITMAP_IS_SET(TABLE->READ_SET, FIELD_INDEX
>
> JOIN::conds points to the same expression tree as SELECT_LEX::where, so
> modifying the expression tree of JOIN::conds effectively does the same
> modifications to SELECT_LEX::where.
>
> JOIN::conds is optimized in JOIN::optimize()
>
> conds= optimize_cond(conds);
>
> optimize_cond() makes modifications to the conds input parameter. One
> of these modifications removes an item from the input conds parameter.
> This was not a problem for JOIN::conds because removed items are
> included in the condition returned from the function and therefore
> assigned to conds as can be seen above. However, the removed parts of
> the condition were lost from SELECT_LEX::where.
>
> This bug was about a query with three levels of subselects:
>
> SELECT 1 FROM t1 WHERE a =
> (SELECT 1 FROM t2 WHERE b =
> (SELECT 1 FROM t1 t11 WHERE c = 1 OR t1.a = 1 AND 1 = 2)
> GROUP BY b
> );
>
> Prior to optimization, select#3 has
>
> join->conds = select_lex->where = "c = 1 OR t1.a = 1 AND 1 = 2"
>
> As part of optimize_cond(), remove_eq_conds() sees that
> "t1.a = 1 AND 1 = 2" is FALSE and correctly removes this condition. This
> leaves the OR item with only one argument: "c = 1". With only one
> argument left in this OR, internal_remove_eq_conds() empties the argument
> list of the OR and returns "c = 1". The effect is that after
> optimize_cond(), select#3 has:
>
> join->conds = "c = 1"
> select_lex->where = "" //<- !!
>
> Fixed by not emptying the argument list of an Item_cond even when there
> is only one argument. The effect is that after optimize_cond(),
> select_lex->where is an OR-item with only one argument: "c = 1". This is
> slightly inefficient (evaluation would have to go through the OR), but
> it works correctly. Note that condition evaluation is performed using the
> correctly optimized JOIN::conds.
>
> Note: This fix has to be reconsidered when the optimizer has been split
> into separate phases.
The problem here is not necessarily the lack of phasing, but rather the sloppy
handling and missing semantic definition of JOIN::conds and SELECT_LEX::where
(and the similar having fields).
> @ mysql-test/r/subselect_innodb.result
> Added test for BUG#11765699
> @ mysql-test/t/subselect_innodb.test
> Added test for BUG#11765699
> @ sql/sql_select.cc
> In internal_remove_eq_conds(): Do not empty argument_list() of an Item_cond
> with only one argument. This will result in lost condition parts in
> SELECT_LEX::where/having
>
> modified:
> mysql-test/r/subselect_innodb.result
> mysql-test/t/subselect_innodb.test
Is there a reason that you added this to an innodb-specific file? I got the same
error with a MyISAM table.
> sql/sql_select.cc
> === modified file 'mysql-test/r/subselect_innodb.result'
> --- a/mysql-test/r/subselect_innodb.result 2011-02-17 12:41:25 +0000
> +++ b/mysql-test/r/subselect_innodb.result 2011-04-15 08:29:20 +0000
> @@ -254,3 +254,21 @@ SELECT * FROM t1 WHERE b< (SELECT CAST(
> a b
> 2011-05-13 0
> DROP TABLE t1;
> +#
> +# Bug 11765699 - 58690: !TABLE || (!TABLE->READ_SET ||
> +# BITMAP_IS_SET(TABLE->READ_SET, FIELD_INDEX
> +#
> +CREATE TABLE t1( a INT );
> +INSERT INTO t1 VALUES (0), (1);
> +CREATE TABLE t2( b TEXT, c INT, PRIMARY KEY (b( 1 )) ) ENGINE = INNODB;
> +INSERT INTO t2 VALUES ('a', ''), ('b', '');
> +Warnings:
> +Warning 1366 Incorrect integer value: '' for column 'c' at row 1
> +Warning 1366 Incorrect integer value: '' for column 'c' at row 2
Can you modify this insert statement so that we avoid the warnings?
> +SELECT 1 FROM t1 WHERE a =
> +(SELECT 1 FROM t2 WHERE b =
> +(SELECT 1 FROM t1 t11 WHERE c = 1 OR t1.a = 1 AND 1 = 2)
> +GROUP BY b
> +);
> +1
> +DROP TABLE t1, t2;
>
> === modified file 'mysql-test/t/subselect_innodb.test'
> --- a/mysql-test/t/subselect_innodb.test 2011-02-17 12:41:25 +0000
> +++ b/mysql-test/t/subselect_innodb.test 2011-04-15 08:29:20 +0000
> @@ -247,3 +247,22 @@ CREATE TABLE t1(a date, b int, unique(b)
> INSERT INTO t1 VALUES ('2011-05-13', 0);
> SELECT * FROM t1 WHERE b< (SELECT CAST(a as date) FROM t1 GROUP BY a);
> DROP TABLE t1;
> +
> +--echo #
> +--echo # Bug 11765699 - 58690: !TABLE || (!TABLE->READ_SET ||
> +--echo # BITMAP_IS_SET(TABLE->READ_SET, FIELD_INDEX
> +--echo #
> +
> +CREATE TABLE t1( a INT );
> +INSERT INTO t1 VALUES (0), (1);
> +
> +CREATE TABLE t2( b TEXT, c INT, PRIMARY KEY (b( 1 )) ) ENGINE = INNODB;
> +INSERT INTO t2 VALUES ('a', ''), ('b', '');
> +
> +SELECT 1 FROM t1 WHERE a =
> + (SELECT 1 FROM t2 WHERE b =
> + (SELECT 1 FROM t1 t11 WHERE c = 1 OR t1.a = 1 AND 1 = 2)
> + GROUP BY b
> + );
> +
> +DROP TABLE t1, t2;
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2011-04-14 09:10:11 +0000
> +++ b/sql/sql_select.cc 2011-04-15 08:29:20 +0000
> @@ -919,6 +919,13 @@ JOIN::optimize()
> thd->restore_active_arena(arena,&backup);
> }
>
> + /*
> + Note: optimize_cond makes changes to conds. Since
> + select_lex->where and conds points to the same condition, this
> + function call effectively changes select_lex->where as well. This
> + needs to be reconsidered after splitting the optimizer into
> + properly separated phases.
> + */
Perhaps make this a Doxygen @note?
> conds= optimize_cond(this, conds, join_list,&cond_value);
> if (thd->is_error())
> {
> @@ -928,6 +935,7 @@ JOIN::optimize()
> }
>
> {
> + // Note above about optimize_cond also applies to SELEC_LEX->having
SELEC_LEX -> select_lex
> having= optimize_cond(this, having, join_list,&having_value);
> if (thd->is_error())
> {
> @@ -9441,7 +9449,8 @@ optimize_cond(JOIN *join, COND *conds, L
> SYNPOSIS
> remove_eq_conds()
> thd THD environment
> - cond the condition to handle
> + cond the condition to handle. Note that cond
> + is changed by this function
> cond_value the resulting value of the condition
Rewriting the interface like this would be better:
bool remove_eq_conds(THD *const thd, Item **condition, Item::cond_result
cond_value);
Then the return value would be just the regular error return indication.
(This is not a proposal, just an observation).
>
> RETURN
> @@ -9503,9 +9512,34 @@ internal_remove_eq_conds(THD *thd, COND
> *cond_value != Item::COND_OK)
> return (COND*) 0;
> if (((Item_cond*) cond)->argument_list()->elements == 1)
> - { // Remove list
> + {
> + /*
> + BUG#11765699:
> + We're dealing with an AND or OR item that has only one
> + argument. However, it is not an option to empty the list
> + because:
> +
> + - this function is called for either JOIN::conds or
> + JOIN::having, but these point to the same condition as
> + SELECT_LEX::where and SELECT_LEX::having do.
> +
> + - The return value of remove_eq_conds() is assigned to
> + JOIN::conds and JOIN::having, so emptying the list and
> + returning the only remaining item "replaces" the AND or OR
> + with item for the variables in JOIN. However, the return
> + value is not assigned to the SELECT_LEX counterparts. Thus,
> + if argument_list is emptied, SELECT_LEX forgets the item in
> + argument_list()->head().
> +
> + item is therefore returned, but argument_list is not emptied.
> + */
> item= ((Item_cond*) cond)->argument_list()->head();
> - ((Item_cond*) cond)->argument_list()->empty();
> + /*
> + Consider reenabling the line below when the optimizer has been
> + split into properly separated phases.
> +
> + ((Item_cond*) cond)->argument_list()->empty();
> + */
> return item;
> }
> }
Thanks,
Roy