Hi Øystein,
I think this change is good enough to be pushed.
However, I do question the sanity of the implementation of the subquery
materialization feature. It is strange that we do algorithm selection so early
in the query preparation phase, and then keep the Item_in_subselect object in an
inconsistent state for so long. The inconsistency is cleared when
setup_subquery_materialization() is called, but the need for this makes
JOIN::optimize() unnecessarily clumsy.
It would have been better if algorithm selection was done later in the
preparation phase, or if setup_engine() could be called on-demand in execution.
I realize that some of this is mandated by the "transformation" of the IN
subquery into an EXISTS execution plan. However, I think that this could be
better handled by proper code generation that did not involve generating
additional special Item objects.
This is in contrast to the semijoin transformation which is a true
transformation and keeps its data structures consistent throughout the query
preparation phase.
There are a pair of nit-picking comments below, feel free to ignore them if you
like.
Thanks,
Roy
On 27.07.10 13.05, Oystein Grovlen wrote:
> #At file:///home/og136792/mysql/mysql-next-mr-opt-backporting/ based on
> revid:epotemkin@stripped
>
> 3219 Oystein Grovlen 2010-07-27
> BUG#52344 - Subquery materialization:
> Assertion if subquery in on-clause of outer join
>
> Problem: If tables of an outer join is constant tables,
is -> are
> the associated on-clause will be evaluated in the optimization
> phase. If the on-clause contains a query that is to be
> executed with subquery materialization, this will not work
> since the infrastructure for such execution is not yet set up.
>
> Solution: Do not evaluate on-clause in optimization phase if
> is_expensive() returns true for this clause. This is how the
> problem is currently avoided for where-clauses. This works
> because, Item_in_subselect::is_expensive_processor returns true
> if query is to be executed with subquery materialization.
> @ mysql-test/include/subquery_mat.inc
> Added test case for BUG#52344.
> @ mysql-test/r/subquery_mat.result
> Updated result file with test case for BUG#52344.
> @ mysql-test/r/subquery_mat_all.result
> Updated result file with test case for BUG#52344.
> @ mysql-test/r/subquery_mat_none.result
> Updated result file with test case for BUG#52344.
> @ sql/sql_select.cc
> Do not evaluate on-clause in optimization phase if
> is_expensive() returns true for this clause. This prevents
> executing materialized subqueries in optimization phase.
> (Proper setup for such execution has not been done at this
> stage.)
>
> modified:
> mysql-test/include/subquery_mat.inc
> mysql-test/r/subquery_mat.result
> mysql-test/r/subquery_mat_all.result
> mysql-test/r/subquery_mat_none.result
> sql/sql_select.cc
> === modified file 'mysql-test/include/subquery_mat.inc'
> --- a/mysql-test/include/subquery_mat.inc 2010-06-30 06:35:23 +0000
> +++ b/mysql-test/include/subquery_mat.inc 2010-07-27 11:05:20 +0000
> @@ -1008,6 +1008,33 @@ FROM t2);
> --echo #
> DROP TABLE IF EXISTS t1,t2,t3,empty1;
>
> +
> +--echo #
> +--echo # BUG#52344 - Subquery materialization:
> +--echo # Assertion if subquery in on-clause of outer join
> +--echo #
> +
> +CREATE TABLE t1 (i INTEGER);
> +INSERT INTO t1 VALUES (10);
> +
> +CREATE TABLE t2 (j INTEGER);
> +INSERT INTO t2 VALUES (5);
> +
> +CREATE TABLE t3 (k INTEGER);
> +
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +
> +DROP TABLE t1, t2, t3;
> +
> +--echo # End BUG#52344
> +
> +
> #
> # Bug #52538 Valgrind bug: Item_in_subselect::init_left_expr_cache()
> #
> @@ -1113,3 +1140,5 @@ DROP TABLE t1,t2,t3;
>
> --echo # End BUG#54511
>
> +
> +
>
> === modified file 'mysql-test/r/subquery_mat.result'
> --- a/mysql-test/r/subquery_mat.result 2010-06-30 06:35:23 +0000
> +++ b/mysql-test/r/subquery_mat.result 2010-07-27 11:05:20 +0000
> @@ -1355,6 +1355,35 @@ id select_type table type possible_keys
> # Cleanup for BUG#46680
> #
> DROP TABLE IF EXISTS t1,t2,t3,empty1;
> +#
> +# BUG#52344 - Subquery materialization:
> +# Assertion if subquery in on-clause of outer join
> +#
> +CREATE TABLE t1 (i INTEGER);
> +INSERT INTO t1 VALUES (10);
> +CREATE TABLE t2 (j INTEGER);
> +INSERT INTO t2 VALUES (5);
> +CREATE TABLE t3 (k INTEGER);
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY t1 system NULL NULL NULL NULL 1
> +1 PRIMARY t2 system NULL NULL NULL NULL 1
> +2 SUBQUERY t3 system NULL NULL NULL NULL 0 const row not found
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +i
> +10
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY t1 system NULL NULL NULL NULL 1
> +1 PRIMARY t2 system NULL NULL NULL NULL 1
> +2 SUBQUERY t3 system NULL NULL NULL NULL 0 const row not found
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +i
> +10
> +DROP TABLE t1, t2, t3;
> +# End BUG#52344
> CREATE TABLE t1 (
> pk INTEGER AUTO_INCREMENT,
> col_int_nokey INTEGER,
>
> === modified file 'mysql-test/r/subquery_mat_all.result'
> --- a/mysql-test/r/subquery_mat_all.result 2010-06-30 06:35:23 +0000
> +++ b/mysql-test/r/subquery_mat_all.result 2010-07-27 11:05:20 +0000
> @@ -1354,6 +1354,35 @@ id select_type table type possible_keys
> # Cleanup for BUG#46680
> #
> DROP TABLE IF EXISTS t1,t2,t3,empty1;
> +#
> +# BUG#52344 - Subquery materialization:
> +# Assertion if subquery in on-clause of outer join
> +#
> +CREATE TABLE t1 (i INTEGER);
> +INSERT INTO t1 VALUES (10);
> +CREATE TABLE t2 (j INTEGER);
> +INSERT INTO t2 VALUES (5);
> +CREATE TABLE t3 (k INTEGER);
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY t1 system NULL NULL NULL NULL 1
> +1 PRIMARY t2 system NULL NULL NULL NULL 1
> +2 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL no matching row in const
> table
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +i
> +10
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY t1 system NULL NULL NULL NULL 1
> +1 PRIMARY t2 system NULL NULL NULL NULL 1
> +2 SUBQUERY t3 system NULL NULL NULL NULL 0 const row not found
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +i
> +10
> +DROP TABLE t1, t2, t3;
> +# End BUG#52344
> CREATE TABLE t1 (
> pk INTEGER AUTO_INCREMENT,
> col_int_nokey INTEGER,
>
> === modified file 'mysql-test/r/subquery_mat_none.result'
> --- a/mysql-test/r/subquery_mat_none.result 2010-07-13 17:29:44 +0000
> +++ b/mysql-test/r/subquery_mat_none.result 2010-07-27 11:05:20 +0000
> @@ -1354,6 +1354,35 @@ id select_type table type possible_keys
> # Cleanup for BUG#46680
> #
> DROP TABLE IF EXISTS t1,t2,t3,empty1;
> +#
> +# BUG#52344 - Subquery materialization:
> +# Assertion if subquery in on-clause of outer join
> +#
> +CREATE TABLE t1 (i INTEGER);
> +INSERT INTO t1 VALUES (10);
> +CREATE TABLE t2 (j INTEGER);
> +INSERT INTO t2 VALUES (5);
> +CREATE TABLE t3 (k INTEGER);
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY t1 system NULL NULL NULL NULL 1
> +1 PRIMARY t2 system NULL NULL NULL NULL 1
> +2 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL no matching row in const
> table
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3);
> +i
> +10
> +EXPLAIN
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 PRIMARY t1 system NULL NULL NULL NULL 1
> +1 PRIMARY t2 system NULL NULL NULL NULL 1
> +2 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL no matching row in const
> table
> +SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT max(k) FROM t3);
> +i
> +10
> +DROP TABLE t1, t2, t3;
> +# End BUG#52344
> CREATE TABLE t1 (
> pk INTEGER AUTO_INCREMENT,
> col_int_nokey INTEGER,
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2010-07-26 11:34:07 +0000
> +++ b/sql/sql_select.cc 2010-07-27 11:05:20 +0000
> @@ -17517,7 +17517,14 @@ join_read_const_table(JOIN_TAB *tab, POS
> DBUG_RETURN(error);
> }
> }
> - if (*tab->on_expr_ref&& !table->null_row)
> + /* We will evaluate on-expressions here only if it is not considered
> + expensive. This also prevents executing materialized subqueries
> + in optimization phase. This is necessary since proper setup for
> + such execution has not been done at this stage. (See comment in
> + internal_remove_eq_conds() tagged DontEvaluateMaterializedSubqueryTooEarly).
Please break up this line.
> + */
> + if (*tab->on_expr_ref&& !table->null_row&&
> + !(*tab->on_expr_ref)->is_expensive())
> {
> if ((table->null_row= test((*tab->on_expr_ref)->val_int() == 0)))
> mark_as_null_row(table);
>
>
>
>
>