List:Commits« Previous MessageNext Message »
From:Roy Lyseng Date:March 23 2011 9:49am
Subject:Re: bzr commit into mysql-trunk branch (olav.sandstaa:3306) Bug#11873324
View as plain text  
Hi Olav,

thank you for a patch with a nice encapsulation solution.

The patch is approved, I have just a few nitpicking comments below.

On 22.03.11 11.09, Olav Sandstaa wrote:
> #At file:///export/home/tmp/mysql2/bug11873324-fix/ based on
> revid:vinay.fisrekar@stripped
>
>   3306 Olav Sandstaa	2011-03-22
>        Fix for Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
>
>        This bug causes NOT IN subqueries to in some cases return too few
>        records. This can happen when we have a subquery where a table is
nitpicking: records -> rows
>        normally access using ref access and we have pushed an index condition
>        down to the storage engine based on this but in some cases the access
>        method is changed during execution to use table scan instead of ref
>        access. Queries like this are marked with 'Full scan on NULL key' in
>        the explain output.
>
>        Details about how this situation occurs.
>
>        1. Index condition pushdown: Since the optimizer decides that the
>           access method for one of the tables in the subquery should be ref
>           access this is candidate for using index condition pushdown. The ICP
>           will push part of the table's condition down to the storage engine. As
>           part of this it will also remove this part from the table's condition
>           that should be evaluated by the server.
>
>        2. Next looking at what 'Full scan on NULL key' means for these queries:
>
>        This is a NOT IN subquery. So for every record produced by the outer
>        query we will execute the subquery and see if this particular value
>        is produced by the subquery. The important parts relevant for this
>        case is found in the following two functions:
>
>        A. longlong Item_in_optimizer::val_int() (called from
>           evaluate_join_record via a few other Item::val... functions)
>
>           This functions checks if the value to check for is NULL. If it is NULL
> then
>           the function does (among other things):
>
>            1. Turn off the predicates that are based on column compares for
>               which the left part is currently NULL. This is done by setting
>               the subquery's "cond guards" to false.
>
>            2. Executes the subquery (see next step)
>
>            3. Turn all predicates back on. This is done by setting the
>               subquery's "cond guards" back to true.
>
>        B. The execution of the subquery will eventually call the following function:
>
>            subselect_single_select_engine::exec()
>
>           In this function we have the following relevant actions:
>
>            1. We check if the subquery has "guarded conditions" and if so we do
>               a detailed check for all join tabs of the subquery to see if any of
>               the "guarded conditions" are "false" (see step 1 above).
>
>            2. If any of the "guarded conditions" are false we do as commented in the
>               code:
>
>              /*
>                For at least one of the pushed predicates the following is true:
>                We should not apply optimizations based on the condition that was
>                pushed down into the subquery. Those optimizations are ref[_or_null]
>                acceses. Change them to be full table scans.
>              */
>
>               So we basically changes from using ref access to table scan.
>
>        Summary for how this produces missing results:
>
>        1. When optimizing the subquery we push and index condition down to InnoDB
>           based on that the table should be accessed using ref access.
>
>        2. If the outer query produces (in this case) an integer to be checked
>           for "NOT IN subquery", the subquery is executed using ref access
>           (and everything is fine).
>
>        3. If the outer query produces a NULL value the execution of the
>           subquery is changed to use table scan. In this case the pushed index
>           condition will no longer be evaluated by InnoDB and the subquery will
>           produce more records than it should and the NOT IN will evaluate to false
>           (when it should evaluate to true) and the query will produce fewer
>           result records than it should (which is not fine).
>
>        The fix for this is to add a check in push_index_cond() to determine
>        if the jointab has "guarded conditions". If it has then do not push
>        any index conditions down to the storage engine.
>       @ mysql-test/include/icp_tests.inc
>          Test case for Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
>       @ mysql-test/r/innodb_icp.result
>          Test case for Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
>       @ mysql-test/r/innodb_icp_none.result
>          Test case for Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
>       @ mysql-test/r/myisam_icp.result
>          Test case for Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
>       @ mysql-test/r/myisam_icp_none.result
>          Test case for Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
>       @ mysql-test/r/subquery_nomat_nosj.result
>          Change in plan after fix for Bug#11873324. After this fix we
>          will no longer use index condition pushdown for subqueries that
>          can do a 'Full scan on NULL key' (ie. a subquery that
>          can change from using ref access to do a full table scan during
>          execution).
>       @ sql/sql_select.cc
>          Fix for Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
>
>          Some NOT IN (and some IN) subqueries that have "guarded conditions"
I think that "and some IN" is too unspecific. I propose that you specify "IN 
queries for which is_top_level_item() returns false". This would e.g. happen for 
an IN predicate under XOR (because IS TRUE is not propagated past XOR), or an IN 
(subquery) IS FALSE (which is equivalent to NOT IN).

If you feel for it, you may add extra test cases for these types of queries.
>          can be evaluated using ref access when the outer query produces a row
>          that is non-NULL. If the outer query produces a NULL value and need to
>          check that this is NOT IN the result from the subquery the code for
>          executing the subquery changes from using ref access to use table
>          scan. In this case any index conditions pushed for this table will no
>          longer be evaluated.
>
>          To avoid that this situation occurs we check in push_index_cond() that
>          the jointab does not have any "guarded conditions". If it has we do not
>          try to push any index conditions for this table.
>       @ sql/sql_select.h
>          Add methods to TABLE_REF and JOIN_TAB to check if they contains
>          guarded conditions.
>
>      modified:
>        mysql-test/include/icp_tests.inc
>        mysql-test/r/innodb_icp.result
>        mysql-test/r/innodb_icp_none.result
>        mysql-test/r/myisam_icp.result
>        mysql-test/r/myisam_icp_none.result
>        mysql-test/r/subquery_nomat_nosj.result
>        sql/sql_select.cc
>        sql/sql_select.h
> === modified file 'mysql-test/include/icp_tests.inc'
>
> === modified file 'mysql-test/include/icp_tests.inc'
> --- a/mysql-test/include/icp_tests.inc	2011-03-17 11:48:04 +0000
> +++ b/mysql-test/include/icp_tests.inc	2011-03-22 10:09:26 +0000
> @@ -838,3 +838,43 @@
>   SELECT col999 FROM t1 WHERE col1000 = "3" AND col1003<=>  sysdate();
>
>   DROP TABLE t1;
> +
> +--echo #
> +--echo # Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
> +--echo #
> +
> +CREATE TABLE t1 (
> +  pk INTEGER NOT NULL,
> +  i1 INTEGER,
> +  PRIMARY KEY (pk),
> +  KEY col_int_key (i1)
> +);
> +
> +INSERT INTO t1 VALUES (14,NULL), (18,133);
> +
> +CREATE TABLE t2 (
> +   pk INTEGER NOT NULL,
> +   i1 INTEGER,
> +   c1 VARCHAR(1),
> +   PRIMARY KEY (pk),
> +   KEY col_int_key (i1)
> +);
> +
> +INSERT INTO t2 VALUES (1,7,'f');
> +
> +let query=
> +SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +  FROM t1 AS SUBQUERY_t1
> +    JOIN t2 AS SUBQUERY_t2
> +    ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +  WHERE SUBQUERY_t1.i1>  0
> +    OR SUBQUERY_t2.c1 = 'a'
> +);
> +
> +eval EXPLAIN $query;
> +eval $query;
> +
> +DROP TABLE t1,t2;
>
> === modified file 'mysql-test/r/innodb_icp.result'
> --- a/mysql-test/r/innodb_icp.result	2011-03-17 11:48:04 +0000
> +++ b/mysql-test/r/innodb_icp.result	2011-03-22 10:09:26 +0000
> @@ -776,5 +776,51 @@
>   SELECT col999 FROM t1 WHERE col1000 = "3" AND col1003<=>  sysdate();
>   col999
>   DROP TABLE t1;
> +#
> +# Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
> +#
> +CREATE TABLE t1 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t1 VALUES (14,NULL), (18,133);
> +CREATE TABLE t2 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +c1 VARCHAR(1),
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t2 VALUES (1,7,'f');
> +EXPLAIN SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
> +1	PRIMARY	t1	index	NULL	col_int_key	5	NULL	2	Using where; Using index
> +2	DEPENDENT SUBQUERY	SUBQUERY_t1	eq_ref	PRIMARY,col_int_key	PRIMARY	4	func	1	Using
> where; Full scan on NULL key
> +2	DEPENDENT SUBQUERY	SUBQUERY_t2	ref	col_int_key	col_int_key	5	func	1	Using where;
> Full scan on NULL key
> +SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +i1
> +NULL
> +133
> +DROP TABLE t1,t2;
>   set default_storage_engine= @save_storage_engine;
>   set optimizer_switch=default;
>
> === modified file 'mysql-test/r/innodb_icp_none.result'
> --- a/mysql-test/r/innodb_icp_none.result	2011-03-17 11:48:04 +0000
> +++ b/mysql-test/r/innodb_icp_none.result	2011-03-22 10:09:26 +0000
> @@ -775,5 +775,51 @@
>   SELECT col999 FROM t1 WHERE col1000 = "3" AND col1003<=>  sysdate();
>   col999
>   DROP TABLE t1;
> +#
> +# Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
> +#
> +CREATE TABLE t1 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t1 VALUES (14,NULL), (18,133);
> +CREATE TABLE t2 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +c1 VARCHAR(1),
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t2 VALUES (1,7,'f');
> +EXPLAIN SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
> +1	PRIMARY	t1	index	NULL	col_int_key	5	NULL	2	Using where; Using index
> +2	DEPENDENT SUBQUERY	SUBQUERY_t1	eq_ref	PRIMARY,col_int_key	PRIMARY	4	func	1	Using
> where; Full scan on NULL key
> +2	DEPENDENT SUBQUERY	SUBQUERY_t2	ref	col_int_key	col_int_key	5	func	1	Using where;
> Full scan on NULL key
> +SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +i1
> +NULL
> +133
> +DROP TABLE t1,t2;
>   set default_storage_engine= @save_storage_engine;
>   set optimizer_switch=default;
>
> === modified file 'mysql-test/r/myisam_icp.result'
> --- a/mysql-test/r/myisam_icp.result	2011-03-17 11:48:04 +0000
> +++ b/mysql-test/r/myisam_icp.result	2011-03-22 10:09:26 +0000
> @@ -774,4 +774,50 @@
>   SELECT col999 FROM t1 WHERE col1000 = "3" AND col1003<=>  sysdate();
>   col999
>   DROP TABLE t1;
> +#
> +# Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
> +#
> +CREATE TABLE t1 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t1 VALUES (14,NULL), (18,133);
> +CREATE TABLE t2 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +c1 VARCHAR(1),
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t2 VALUES (1,7,'f');
> +EXPLAIN SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
> +1	PRIMARY	t1	index	NULL	col_int_key	5	NULL	2	Using where; Using index
> +2	DEPENDENT SUBQUERY	SUBQUERY_t1	eq_ref	PRIMARY,col_int_key	PRIMARY	4	func	1	Using
> where; Full scan on NULL key
> +2	DEPENDENT SUBQUERY	SUBQUERY_t2	ALL	col_int_key	NULL	NULL	NULL	1	Using where; Using
> join buffer (BNL, incremental buffers)
> +SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +i1
> +NULL
> +133
> +DROP TABLE t1,t2;
>   set optimizer_switch=default;
>
> === modified file 'mysql-test/r/myisam_icp_none.result'
> --- a/mysql-test/r/myisam_icp_none.result	2011-03-17 11:48:04 +0000
> +++ b/mysql-test/r/myisam_icp_none.result	2011-03-22 10:09:26 +0000
> @@ -773,4 +773,50 @@
>   SELECT col999 FROM t1 WHERE col1000 = "3" AND col1003<=>  sysdate();
>   col999
>   DROP TABLE t1;
> +#
> +# Bug#11873324 "WRONG RESULT WITH ICP AND STRAIGHT_JOIN"
> +#
> +CREATE TABLE t1 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t1 VALUES (14,NULL), (18,133);
> +CREATE TABLE t2 (
> +pk INTEGER NOT NULL,
> +i1 INTEGER,
> +c1 VARCHAR(1),
> +PRIMARY KEY (pk),
> +KEY col_int_key (i1)
> +);
> +INSERT INTO t2 VALUES (1,7,'f');
> +EXPLAIN SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
> +1	PRIMARY	t1	index	NULL	col_int_key	5	NULL	2	Using where; Using index
> +2	DEPENDENT SUBQUERY	SUBQUERY_t1	eq_ref	PRIMARY,col_int_key	PRIMARY	4	func	1	Using
> where; Full scan on NULL key
> +2	DEPENDENT SUBQUERY	SUBQUERY_t2	ALL	col_int_key	NULL	NULL	NULL	1	Using where; Using
> join buffer (BNL, incremental buffers)
> +SELECT t1.i1
> +FROM t1
> +WHERE t1.i1 NOT IN
> +( SELECT STRAIGHT_JOIN SUBQUERY_t1.pk
> +FROM t1 AS SUBQUERY_t1
> +JOIN t2 AS SUBQUERY_t2
> +ON SUBQUERY_t2.i1 = SUBQUERY_t1.pk
> +WHERE SUBQUERY_t1.i1>  0
> +OR SUBQUERY_t2.c1 = 'a'
> +);
> +i1
> +NULL
> +133
> +DROP TABLE t1,t2;
>   set optimizer_switch=default;
>
> === modified file 'mysql-test/r/subquery_nomat_nosj.result'
> --- a/mysql-test/r/subquery_nomat_nosj.result	2011-02-14 11:21:26 +0000
> +++ b/mysql-test/r/subquery_nomat_nosj.result	2011-03-22 10:09:26 +0000
> @@ -4923,7 +4923,7 @@
>   id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
>   1	PRIMARY	t1	ALL	NULL	NULL	NULL	NULL	4	Using where
>   2	DEPENDENT SUBQUERY	t2	eq_ref	PRIMARY	PRIMARY	4	func	1	Using where; Using index;
> Full scan on NULL key
> -2	DEPENDENT SUBQUERY	t3	eq_ref	PRIMARY	PRIMARY	4	func	1	Using index condition; Using
> where; Full scan on NULL key
> +2	DEPENDENT SUBQUERY	t3	eq_ref	PRIMARY	PRIMARY	4	func	1	Using where; Full scan on
> NULL key
>   SELECT * FROM t1
>   WHERE t1.id NOT IN (SELECT t2.id FROM t2,t3
>   WHERE t3.name='xxx' AND t2.id=t3.id);
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc	2011-03-17 12:45:19 +0000
> +++ b/sql/sql_select.cc	2011-03-22 10:09:26 +0000
> @@ -10295,13 +10295,17 @@
>          condition might then also be applied by the storage engine
>          when doing the update part and result in either not finding
>          the record to update or updating the wrong record.
> +    4. The jointab is not part of a subquery that has guarded conditions
> +       that can be turned on or off during execution of a 'Full scan on NULL
> +       key'.
>     */
>     if (tab->select_cond&&
>         tab->table->file->index_flags(keyno, 0, 1)&
>         HA_DO_INDEX_COND_PUSHDOWN&&
>        
> tab->join->thd->optimizer_switch_flag(OPTIMIZER_SWITCH_INDEX_CONDITION_PUSHDOWN)&&
>         tab->join->thd->lex->sql_command !=
> SQLCOM_UPDATE_MULTI&&
> -      tab->join->thd->lex->sql_command != SQLCOM_DELETE_MULTI)
> +      tab->join->thd->lex->sql_command != SQLCOM_DELETE_MULTI&&
> +      !tab->have_guarded_conds())
>     {
>       DBUG_EXECUTE("where", print_where(tab->select_cond, "full cond",
>                    QT_ORDINARY););
>
> === modified file 'sql/sql_select.h'
> --- a/sql/sql_select.h	2011-03-17 09:43:28 +0000
> +++ b/sql/sql_select.h	2011-03-22 10:09:26 +0000
> @@ -179,6 +179,26 @@
>       return FALSE;
>     }
>
> +
> +  /**
> +    Check if there are triggered/guarded conditions that might be
> +    'switched off' by the subquery code when executing 'Full scan on
> +    NULL' subqueries.
> +
> +    @return true if there are guarded conditions, false otherwise
> +  */
> +
> +  bool have_guarded_conds() const
> +  {
> +    DBUG_ASSERT(key_parts == 0 || cond_guards != NULL);
> +
> +    for (uint i = 0; i<  key_parts; i++)
> +    {
> +      if (cond_guards[i])
> +        return true;
> +    }
> +    return false;
> +  }
>   } TABLE_REF;
>
>
> @@ -431,6 +451,20 @@
>       return tmp_select_cond;
>     }
>     uint get_sj_strategy() const;
> +
> +
> +  /**
> +    Check if there are triggered/guarded conditions that might be
> +    'switched off' by the subquery code when executing 'Full scan on
> +    NULL' subqueries.
> +
> +    @return true if there are guarded conditions, false otherwise
> +  */
> +
> +  bool have_guarded_conds() const
> +  {
> +    return ref.have_guarded_conds();
> +  }
>   } JOIN_TAB;

nitpicking: I think that it would be better English to call the functions 
has_guarded_conds().

Thanks,

Roy
Thread
bzr commit into mysql-trunk branch (olav.sandstaa:3306) Bug#11873324Olav Sandstaa22 Mar
  • Re: bzr commit into mysql-trunk branch (olav.sandstaa:3306) Bug#11873324Roy Lyseng23 Mar
    • Re: bzr commit into mysql-trunk branch (olav.sandstaa:3306) Bug#11873324Olav Sandstaa23 Mar