Hi Øystein,
thanks for fixing this.
Some small comments in addition to Guilhem's comments.
On 29.08.10 08.31, oystein.grovlen@stripped wrote:
> #At file:///home/oysteing/mysql-new/mysql-next-mr-opt-backporting-2/ based on
revid:epotemkin@stripped
>
> 3232 oystein.grovlen@stripped 2010-08-29
> Bug#48213 Materialized subselect crashes if using GEOMETRY type
>
> The problem occurred because during execution of MaterializationLookup
> strategy for semi-join, a materialized table was created which
> contained a GEOMETRY column, a specialized BLOB column. This caused
> an segmentation fault because such tables will have extra columns, and
> the semi-join code was not prepared for that.
The terms used are "MaterializeLookup" and "MaterializeScan".
>
> The solution is to disable MaterializationLookup strategy when
> Blob/Geometry columns would need to be materialized. Blob columns
> cannot be used for index look-up anyway, so it does not makes sense to
> use MaterializationLookup.
>
> This fix also implies that it is detected earlier that subquery
> materialization can not be used either. This means that code to
> revert the decision to use subquery materialization is no longer
> needed, and this has been removed.
>
> In addition to adding a new test case for this bug, some existing
> tests for subqueries involving BLOB columns have been moved from
> subquery_mat.inc to subquery_sj.inc. This way, these tests will also
> be run for more cases involving semijoin materialization. Also, the
> extended query plans for many of these tests had to be updated since
> it is now reflected that in->exist optimization will be performed for
> the involved queries.
> @ mysql-test/include/subquery_mat.inc
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/include/subquery_sj.inc
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_mat.result
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_mat_all.result
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_mat_none.result
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_all.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_all_jcl6.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_all_jcl7.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_dupsweed.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_dupsweed_jcl6.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_dupsweed_jcl7.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_firstmatch.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_firstmatch_jcl6.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_firstmatch_jcl7.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_loosescan.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_loosescan_jcl6.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_loosescan_jcl7.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_mat.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_mat_jcl6.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_mat_jcl7.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_mat_nosj.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
Some
> extended query plans for the moved tests had to be updated since
> it is now reflected that in->exist optimization will be performed for
> the involved queries.
> @ mysql-test/r/subquery_sj_none.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_none_jcl6.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ mysql-test/r/subquery_sj_none_jcl7.result
> Added new test case for Bug#48213
> Moved tests for subqueries involving BLOB columns from
subquery_mat.inc to
> subquery_sj.inc to get better coverage for semijoin materialization.
> @ sql/item.cc
> Added method Item::allows_materialization_lookup() that checks if
this item
> allows MaterializationLookup strategy to be used for semijoin. That is,
> if BLOB type is involved, MaterializationLookup cannot be used.
> @ sql/item.h
> Added method Item::allows_materialization_lookup() that checks if
this item
> allows MaterializationLookup strategy to be used for semijoin.
> @ sql/item_subselect.cc
> Removed code to revert the decision to use subquery materialization.
> It will now be detected earlier that subquery materialization is not
possible,
> so this code is no longer needed.
> @ sql/sql_select.cc
> Added check for whether column types of subquery allow index to be
created
> on materialized table. If not, subquery materialization and
> MaterializationLookup strategy for semijoin cannot be used.
> @ sql/table.h
> Added Semijoin_mat_optimize::lookup_allowed to record whether it is
possible
> to use MaterializationLookup strategy for this semijoin materialization.
>
> modified:
> mysql-test/include/subquery_mat.inc
> mysql-test/include/subquery_sj.inc
> mysql-test/r/subquery_mat.result
> mysql-test/r/subquery_mat_all.result
> mysql-test/r/subquery_mat_none.result
> mysql-test/r/subquery_sj_all.result
> mysql-test/r/subquery_sj_all_jcl6.result
> mysql-test/r/subquery_sj_all_jcl7.result
> mysql-test/r/subquery_sj_dupsweed.result
> mysql-test/r/subquery_sj_dupsweed_jcl6.result
> mysql-test/r/subquery_sj_dupsweed_jcl7.result
> mysql-test/r/subquery_sj_firstmatch.result
> mysql-test/r/subquery_sj_firstmatch_jcl6.result
> mysql-test/r/subquery_sj_firstmatch_jcl7.result
> mysql-test/r/subquery_sj_loosescan.result
> mysql-test/r/subquery_sj_loosescan_jcl6.result
> mysql-test/r/subquery_sj_loosescan_jcl7.result
> mysql-test/r/subquery_sj_mat.result
> mysql-test/r/subquery_sj_mat_jcl6.result
> mysql-test/r/subquery_sj_mat_jcl7.result
> mysql-test/r/subquery_sj_mat_nosj.result
> mysql-test/r/subquery_sj_none.result
> mysql-test/r/subquery_sj_none_jcl6.result
> mysql-test/r/subquery_sj_none_jcl7.result
> sql/item.cc
> sql/item.h
> sql/item_subselect.cc
> sql/sql_select.cc
> sql/table.h
>
<Lots of test changes clipped>.
Moving the tests should be a separate patch - it is very hard to follow what are
caused by your changes and what are caused by moving tests.
> === modified file 'sql/item.cc'
> --- a/sql/item.cc 2010-08-14 09:38:42 +0000
> +++ b/sql/item.cc 2010-08-29 06:31:42 +0000
> @@ -1081,6 +1081,27 @@ int Item::save_in_field_no_warnings(Fiel
> }
>
>
> +/**
> + @brief Check if item allows MaterializationLookup semijoin strategy
> +
> + @details MaterializationLookup strategy requires an index on the
> + materialized subquery. Indexes are not supported for BLOB and
> + GEOMETRY columns. Note also that original non-BLOB columns that are
> + longer than CONVERT_IF_BIGGER_TO_BLOB will be converted to BLOB columns.
> +*/
> +bool Item::allows_materialization_lookup()
> +
> +{
I think that you should add DBUG_ASSERT(fixed); here.
> + if (field_type() == MYSQL_TYPE_BLOB || field_type() == MYSQL_TYPE_GEOMETRY)
> + return FALSE;
> +
> + if (max_length > CONVERT_IF_BIGGER_TO_BLOB)
> + return FALSE;
> +
> + return TRUE;
> +}
> +
> +
> /*****************************************************************************
> Item_sp_variable methods
> *****************************************************************************/
>
> === modified file 'sql/item.h'
> --- a/sql/item.h 2010-08-19 12:54:22 +0000
> +++ b/sql/item.h 2010-08-29 06:31:42 +0000
> @@ -1276,6 +1276,14 @@ public:
> Return TRUE if the item points to a column of an outer-joined table.
> */
> virtual bool is_outer_field() const { DBUG_ASSERT(fixed); return FALSE; }
> +
> + /**
> + Check if item allows MaterializationLookup semijoin strategy
> +
> + @retval TRUE If subquery types allow MaterializationLookup.
> + @retval FALSE Otherwise.
> + */
> + bool allows_materialization_lookup();
> };
>
>
>
> === modified file 'sql/item_subselect.cc'
> --- a/sql/item_subselect.cc 2010-08-04 10:34:01 +0000
> +++ b/sql/item_subselect.cc 2010-08-29 06:31:42 +0000
> @@ -1847,7 +1847,8 @@ bool Item_in_subselect::fix_fields(THD *
> execution.
>
> @returns
> - @retval TRUE memory allocation error occurred
> + @retval TRUE memory allocation error occurred, or was not able to create
> + temporary table
> @retval FALSE an execution method was chosen successfully
> */
>
> @@ -1875,21 +1876,13 @@ bool Item_in_subselect::setup_engine()
> old_engine)) ||
> new_engine->init_permanent(unit->get_unit_column_types()))
> {
> - Item_subselect::trans_res trans_res;
> /*
> - If for some reason we cannot use materialization for this IN predicate,
> - delete all materialization-related objects, and apply the IN=>EXISTS
> - transformation.
> + For some reason we cannot use materialization for this IN predicate.
> + Delete all materialization-related objects, and return error.
> */
> delete new_engine;
> new_engine= NULL;
> - exec_method= EXEC_UNSPECIFIED;
> - if (left_expr->cols() == 1)
> - trans_res= single_value_in_to_exists_transformer(old_engine->join,
> - &eq_creator);
> - else
> - trans_res= row_value_in_to_exists_transformer(old_engine->join);
> - res= (trans_res != Item_subselect::RES_OK);
> + DBUG_RETURN(TRUE);
> }
> if (new_engine)
> engine= new_engine;
new_engine is never NULL here, hence the test can be removed.
AFAIU, you can also eliminate the "if (new_engine)" test around the
init_runtime() call, and you can eliminate the local variable "res" (do
DBUG_RETURN(TRUE) and DBUG_RETURN(FALSE) instead of DBUG_RETURN(res).
Nice little cleanup that eliminates two tests and a local variable :)
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2010-08-19 16:44:09 +0000
> +++ b/sql/sql_select.cc 2010-08-29 06:31:42 +0000
> @@ -984,7 +984,10 @@ bool subquery_types_allow_materializatio
> if (!types_allow_materialization(predicate->left_expr->element_index(i),
> inner))
> DBUG_RETURN(FALSE);
> + if (!inner->allows_materialization_lookup())
> + DBUG_RETURN(FALSE);
> }
> +
> DBUG_PRINT("info",("subquery_types_allow_materialization: ok, allowed"));
> DBUG_RETURN(TRUE);
> }
> @@ -1049,6 +1052,7 @@ bool semijoin_types_allow_materializatio
> List_iterator<Item> it2(sj_nest->nested_join->sj_inner_exprs);
>
> sj_nest->nested_join->sjm.scan_allowed= FALSE;
> + sj_nest->nested_join->sjm.lookup_allowed= TRUE;
>
> bool all_are_fields= TRUE;
> Item *outer, *inner;
> @@ -1058,10 +1062,13 @@ bool semijoin_types_allow_materializatio
> inner->real_item()->type() == Item::FIELD_ITEM);
> if (!types_allow_materialization(outer, inner))
> DBUG_RETURN(FALSE);
> + sj_nest->nested_join->sjm.lookup_allowed&=
> + inner->allows_materialization_lookup();
> }
> sj_nest->nested_join->sjm.scan_allowed= all_are_fields;
> DBUG_PRINT("info",("semijoin_types_allow_materialization: ok, allowed"));
> - DBUG_RETURN(TRUE);
> + DBUG_RETURN(sj_nest->nested_join->sjm.scan_allowed ||
> + sj_nest->nested_join->sjm.lookup_allowed);
> }
>
>
> @@ -7504,9 +7511,10 @@ semijoin_order_allows_materialization(co
>
> /*
> Must use MaterializeScan strategy if there are outer correlated tables
> - among the remaining tables, otherwise use MaterializeLookup.
> + among the remaining tables, otherwise, if possible, use MaterializeLookup.
> */
> - if (remaining_tables & emb_sj_nest->nested_join->sj_depends_on)
> + if (remaining_tables & emb_sj_nest->nested_join->sj_depends_on ||
> + !emb_sj_nest->nested_join->sjm.lookup_allowed)
> {
> if (emb_sj_nest->nested_join->sjm.scan_allowed)
> return SJ_OPT_MATERIALIZE_SCAN;
>
> === modified file 'sql/table.h'
> --- a/sql/table.h 2010-08-23 12:05:47 +0000
> +++ b/sql/table.h 2010-08-29 06:31:42 +0000
> @@ -1976,6 +1976,8 @@ struct Semijoin_mat_optimize
> {
> /* Optimal join order calculated for inner tables of this semijoin op. */
> struct st_position *positions;
> + /* True if data types allow the MaterializeLookup semijoin strategy */
> + bool lookup_allowed;
> /* True if data types allow the MaterializeScan semijoin strategy */
> bool scan_allowed;
> /* Expected #rows in the materialized table */
I am about to propose a refactoring of semijoin strategy that among other things
will replace scan_allowed (and the new lookup_allowed) with an enum/bitmask
field. It should be easy to integrate this after your patch, though.
Thanks,
Roy