Hi Roy,
Thanks for the patch. Looks good. See in-line for comments.
I also miss a test case for this bug. Why not add the test case from
the bug report?
--
Øystein
On 11/ 9/10 04:29 PM, Roy Lyseng wrote:
> #At file:///home/rl136806/mysql/repo/mysql-work3/ based on
> revid:olav.sandstaa@stripped
>
> 3280 Roy Lyseng 2010-11-09
> Bug#56080: Assertion 'inited=INDEX in sql/handler.h
>
> The problem occurs because some TABLE objects are associated with
> the optimizer execution structures even after close_thread_tables()
> has been called. In turn, this means that final cleanup of some
> TABLE objects is done from within THD::cleanup_after_query().
> At this point in time, the TABLE objects may already have been
> given to another session, meaning that we may end another session's
> index or table scan.
>
> The safest solution seems to be to always perform a thorough cleanup
> of all execution data structures, including releasing the TABLE
> objects, before close_thread_tables() is called. In other words,
> this will have to be done at the end of each execution.
>
> However, this is currently not possible in the implementation of
> subquery materialization. This part of the optimizer assigns some
> objects, including a TABLE object, for the lifetime of the
> statement object, and not for each execution. This gives an odd
> situation, where some parts of the data structures are set up for
> reuse and others are not. Combine this with the fact that the
> user may decide to use another strategy for a subsequent
> subquery execution, the "permanent" data structures may not even
> be used later.
>
> It is therefore reasonable to allocate materialization execution
> structures for the lifetime of one execution. It means that the
> functions init_permanent() and init_runtime() of class
> subselect_hash_sj_engine are combined into one setup() function.
> In addition, the function Item_subselect::cleanup() will be called
> when the cleanup() function for the underlying query expression
> is called, and this function will cleanup and destroy the
> materialization data structures, so that the subquery item object
> is ready for a new optimization process in the next round of
> execution.
>
> However, the EXPLAIN functionality relies on these data
> structures being present when explaining a query containing
> a materialized subquery. Thus, we have two conflicting
> requirements for query cleanup:
> - For maximum efficiency we should release resources as early as
> possible after execution.
> - We cannot release resources that are needed for EXPLAIN queries.
I do not quite understand this. Is this related to the issue that
subqueries are executed as part of EXPLAIN?
...
> @@ -1917,43 +1914,30 @@ void Item_in_subselect::fix_after_pullou
>
> bool Item_in_subselect::setup_engine()
> {
> - subselect_hash_sj_engine *hash_engine;
> DBUG_ENTER("Item_in_subselect::setup_engine");
>
> - if (engine->engine_type() == subselect_engine::SINGLE_SELECT_ENGINE)
> - {
> - /* Create/initialize objects in permanent memory. */
> - Query_arena *arena= thd->stmt_arena;
> - Query_arena backup;
> - if (arena->is_conventional())
> - arena= 0;
> - else
> - thd->set_n_backup_active_arena(arena,&backup);
> + /* Currently, the decision whether to materialize is already taken. */
> + if (exec_method != Item_exists_subselect::EXEC_MATERIALIZATION)
> + DBUG_RETURN(FALSE);
Maybe it should be specified in the function documentation that
setup_engine() does nothing if query is not to be executed with
materialization. As it is, it seems like it should only be called if
materialization has been selected.
I also see that I forgot to remove the comment about reverting to
IN=>EXISTS transformation. Good if you could remove that, too.
By the way, it should be OK to use built-in lower case false, now.
>
> - subselect_single_select_engine *old_engine=
> - static_cast<subselect_single_select_engine*>(engine);
> - if (!(hash_engine= new subselect_hash_sj_engine(thd, this,
> - old_engine)) ||
> - hash_engine->init_permanent(unit->get_unit_column_types()))
> - {
> - /*
> - For some reason we cannot use materialization for this IN predicate.
> - Delete all materialization-related objects, and return error.
> - */
> - delete hash_engine;
> - if (arena)
> - thd->restore_active_arena(arena,&backup);
> - DBUG_RETURN(TRUE);
> - }
> - engine= hash_engine;
> + DBUG_ASSERT(engine->engine_type() == subselect_engine::SINGLE_SELECT_ENGINE);
>
> - if (arena)
> - thd->restore_active_arena(arena,&backup);
> - }
> - else
> + old_engine= engine;
> +
> + if (!(engine= new subselect_hash_sj_engine(thd, this,
> + static_cast<subselect_single_select_engine*>(old_engine))))
> + DBUG_RETURN(TRUE);
> + if (((subselect_hash_sj_engine *) engine)
> + ->setup(unit->get_unit_column_types()))
> {
> - DBUG_ASSERT(engine->engine_type() == subselect_engine::HASH_SJ_ENGINE);
> - hash_engine= static_cast<subselect_hash_sj_engine*>(engine);
> + /*
> + For some reason we cannot use materialization for this IN predicate.
> + Delete all materialization-related objects, and return error.
> + */
> + delete engine;
> + engine= old_engine;
> + old_engine= NULL;
> + DBUG_RETURN(TRUE);
Or DBUG_RETURN(true);
...
> subselect_hash_sj_engine::~subselect_hash_sj_engine()
> {
> delete result;
> - if (tab)
> - free_tmp_table(thd, tab->table);
> + DBUG_ASSERT(!tab);
I suggest adding a comment that explains that the above assert will
verify that cleanup has been called.
...
> @@ -495,11 +494,11 @@ public:
> 1 - Either an execution error, or the engine was "changed", and the
> caller should call exec() again for the new engine.
> */
> - virtual int exec()= 0;
> + virtual bool exec()= 0;
> virtual uint cols()= 0; /* return number of columns in select */
> virtual uint8 uncacheable()= 0; /* query is uncacheable */
> - enum Item_result type() { return res_type; }
> - enum_field_types field_type() { return res_field_type; }
> + virtual const enum Item_result type() const { return res_type; }
> + virtual const enum_field_types field_type() const { return res_field_type; }
I do not think the first const of each line makes much sense. You will
copy this value anyway during assignment. If the return value was a
pointer of a reference, it would be a different matter.
...
> @@ -553,24 +553,26 @@ public:
>
> class subselect_union_engine: public subselect_engine
> {
> +private:
> st_select_lex_unit *unit; /* corresponding unit structure */
If private is made explicit, you might as well move it to the end of the
class definition.
...
> @@ -615,26 +617,28 @@ public:
> {
> set_thd(thd_arg);
> }
> - void cleanup();
> - int prepare();
> - void fix_length_and_dec(Item_cache** row);
> - int exec();
> - uint cols() { return 1; }
> - uint8 uncacheable() { return UNCACHEABLE_DEPENDENT; }
> - void exclude();
> - table_map upper_select_const_tables() { return 0; }
> + virtual void cleanup() {}
> + virtual bool prepare();
> + virtual void fix_length_and_dec(Item_cache** row);
> + virtual bool exec();
> + virtual uint cols() { return 1; }
> + virtual uint8 uncacheable() { return UNCACHEABLE_DEPENDENT; }
> + virtual void exclude();
> + virtual table_map upper_select_const_tables() { return 0; }
> virtual void print (String *str, enum_query_type query_type);
> - bool change_result(Item_subselect *si, select_result_interceptor *result);
> - bool no_tables();
> - int scan_table();
> + virtual bool change_result(Item_subselect *si,
> + select_result_interceptor *result);
> + virtual bool no_tables();
> + bool scan_table();
> bool copy_ref_key();
> - bool no_rows() { return empty_result_set; }
> + virtual bool no_rows() { return empty_result_set; }
I think no_rows() with some effort could be made const.
> virtual enum_engine_type engine_type() { return UNIQUESUBQUERY_ENGINE; }
> };
>
>
> class subselect_indexsubquery_engine: public subselect_uniquesubquery_engine
> {
> +private:
> /* FALSE for 'ref', TRUE for 'ref-or-null'. */
> bool check_null;
> /*
> @@ -677,7 +681,7 @@ public:
> check_null(chk_null),
> having(having_arg)
> {}
> - int exec();
> + virtual bool exec();
> virtual void print (String *str, enum_query_type query_type);
> virtual enum_engine_type engine_type() { return INDEXSUBQUERY_ENGINE; }
> };
> @@ -712,21 +716,17 @@ inline bool Item_subselect::is_uncacheab
>
> class subselect_hash_sj_engine: public subselect_uniquesubquery_engine
> {
> -protected:
> +private:
> /* TRUE if the subquery was materialized into a temp table. */
> bool is_materialized;
> /*
> The old engine already chosen at parse time and stored in permanent memory.
> - Through this member we can re-create and re-prepare materialize_join for
> - each execution of a prepared statement. We also reuse the functionality
> - of subselect_single_select_engine::[prepare | cols].
> + Through this member we can re-create and re-prepare the join object
> + used to materialize the subquery for each execution of a prepared
> + statement. We also reuse the functionality of
> + subselect_single_select_engine::[prepare | cols].
> */
> subselect_single_select_engine *materialize_engine;
> - /*
> - QEP to execute the subquery and materialize its result into a
> - temporary table. Created during the first call to exec().
> - */
> - JOIN *materialize_join;
> /* Temp table context of the outer select's JOIN. */
> TMP_TABLE_PARAM *tmp_param;
>
> @@ -735,20 +735,19 @@ public:
> subselect_single_select_engine *old_engine)
> :subselect_uniquesubquery_engine(thd, NULL, in_predicate, NULL),
> is_materialized(FALSE), materialize_engine(old_engine),
> - materialize_join(NULL), tmp_param(NULL)
> + tmp_param(NULL)
> {}
> ~subselect_hash_sj_engine();
>
> - bool init_permanent(List<Item> *tmp_columns);
> - bool init_runtime();
> - void cleanup();
> - int prepare()
> + bool setup(List<Item> *tmp_columns);
> + virtual void cleanup();
> + virtual bool prepare()
> {
> return materialize_engine->prepare();
> }
> - int exec();
> - void print (String *str, enum_query_type query_type);
> - uint cols()
> + virtual bool exec();
> + virtual void print (String *str, enum_query_type query_type);
> + virtual uint cols()
> {
> return materialize_engine->cols();
> }
Can cols() be made const?
>
> === modified file 'sql/sql_lex.h'
> --- a/sql/sql_lex.h 2010-10-25 16:07:05 +0000
> +++ b/sql/sql_lex.h 2010-11-09 15:29:16 +0000
> @@ -596,7 +596,7 @@ public:
> inline bool is_union ();
>
> friend void lex_start(THD *thd);
> - friend int subselect_union_engine::exec();
> + friend bool subselect_union_engine::exec();
>
> List<Item> *get_unit_column_types();
> };
>
> === modified file 'sql/sql_parse.cc'
> --- a/sql/sql_parse.cc 2010-10-27 14:46:44 +0000
> +++ b/sql/sql_parse.cc 2010-11-09 15:29:16 +0000
> @@ -5551,6 +5551,8 @@ void mysql_parse(THD *thd, char *rawbuf,
> thd->end_statement();
> thd->cleanup_after_query();
> DBUG_ASSERT(thd->change_list.is_empty());
> + //sql_print_error("Query: %*s", length, rawbuf);
> + //report_memory_plugin();
I guess this is not intentionally included.
> }
>
> DBUG_VOID_RETURN;
>
> === modified file 'sql/sql_plugin.cc'
> --- a/sql/sql_plugin.cc 2010-11-05 22:14:29 +0000
> +++ b/sql/sql_plugin.cc 2010-11-09 15:29:16 +0000
> @@ -1554,6 +1554,19 @@ error:
> DBUG_RETURN(TRUE);
> }
>
> +void report_memory_plugin()
> +
> +{
> + struct st_plugin_int *p;
> +
> + for (uint i= 0; i< plugin_array.elements; i++)
> + {
> + p= *dynamic_element(&plugin_array, i, struct st_plugin_int **);
> + if (!strcmp(p->name.str, "MEMORY"))
> + sql_print_error("Plugin '%s' has ref_count=%d",
> + p->name.str, p->ref_count);
> + }
> +}
This is not mentioned in commit comments. Included by accident?
>
> void plugin_shutdown(void)
> {
>
> === modified file 'sql/sql_plugin.h'
> --- a/sql/sql_plugin.h 2010-10-07 17:34:38 +0000
> +++ b/sql/sql_plugin.h 2010-11-09 15:29:16 +0000
> @@ -160,4 +160,5 @@ typedef my_bool (plugin_foreach_func)(TH
> #define plugin_foreach(A,B,C,D) plugin_foreach_with_mask(A,B,C,PLUGIN_IS_READY,D)
> extern bool plugin_foreach_with_mask(THD *thd, plugin_foreach_func *func,
> int type, uint state_mask, void *arg);
> +void report_memory_plugin();
> #endif
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2010-11-05 22:19:41 +0000
> +++ b/sql/sql_select.cc 2010-11-09 15:29:16 +0000
> @@ -2626,10 +2626,14 @@ void JOIN::restore_tmp()
> }
>
>
> -int
> -JOIN::reinit()
> +/**
> + Reset the state of this join object so that it is ready for a
> + new execution.
> +*/
> +
> +void JOIN::reset()
> {
> - DBUG_ENTER("JOIN::reinit");
> + DBUG_ENTER("JOIN::reset");
>
> unit->offset_limit_cnt= (ha_rows)(select_lex->offset_limit ?
> select_lex->offset_limit->val_uint() :
> @@ -2677,7 +2681,7 @@ JOIN::reinit()
> func->clear();
> }
>
> - DBUG_RETURN(0);
> + DBUG_VOID_RETURN;
> }
>
> /**
> @@ -3309,14 +3313,12 @@ JOIN::exec()
>
>
> /**
> - Clean up join.
> + Clean up and destroy join object.
>
> - @return
> - Return error that hold JOIN.
> + @return false if previous execution was successful, and true otherwise
Why is this return value needed? Cannot this be picked up after execution?
...
> === modified file 'sql/sql_union.cc'
> --- a/sql/sql_union.cc 2010-10-25 09:18:21 +0000
> +++ b/sql/sql_union.cc 2010-11-09 15:29:16 +0000
> @@ -223,7 +223,7 @@ bool st_select_lex_unit::prepare(THD *th
> DBUG_RETURN(TRUE);
> }
> sl->join->select_options|= SELECT_DESCRIBE;
> - sl->join->reinit();
> + sl->join->reset();
Please, remove TAB indentation.
> }
> }
> DBUG_RETURN(FALSE);
> @@ -495,7 +495,10 @@ bool st_select_lex_unit::exec()
> thd->lex->current_select= sl;
>
> if (optimized)
> - saved_error= sl->join->reinit();
> + {
> + saved_error= false;
TAB indentation.
...
--
Øystein