iHello Guilhem,
On 05/10/2011 12:31 AM, Guilhem Bichot wrote:
> Hello Gleb,
>
> Good, good job.
Thank you!
> The patch is well commented, I find the design (classes) easy to figure out. This has
> thus been a shorter review than expected :-)
>
> Gleb Shchepa a écrit, Le 01.04.2011 15:54:
>> #At file:///mnt/sda7/work/mysql-next-mr-opt-backporting-wl4897/ based on
> revid:tor.didriksen@stripped
>>
>> 3358 Gleb Shchepa 2011-04-01
>> WL#4897: Add EXPLAIN INSERT/UPDATE/DELETE
>> WL#4897 implements EXPLAIN command for INSERT, REPLACE and
>> single- and multi-table UPDATE and DELETE queries.
>> @ mysql-test/include/explain_non_select.inc
>> Coverage and regresstion tests and sanity checks for WL#4897.
>
> GB5 s/regresstion/regression
Yes. But it's a commit commentary, and I can't fix it since you prefer separate sequential
commits.
>
>> @ mysql-test/r/explain_non_select.result
>> Coverage and regresstion tests and sanity checks for WL#4897.
>
> GB6 same
See GB5.
>
>> @ mysql-test/t/explain_non_select.test
>> Coverage and regresstion tests and sanity checks for WL#4897.
>
> GB7 same
See GB5.
>
>> @ sql/opt_explain.cc
>> WL#4897: Add EXPLAIN INSERT/UPDATE/DELETE
>
>> === added file 'mysql-test/include/explain_non_select.inc'
>> --- a/mysql-test/include/explain_non_select.inc 1970-01-01 00:00:00 +0000
>> +++ b/mysql-test/include/explain_non_select.inc 2011-04-01 13:53:57 +0000
>> @@ -0,0 +1,63 @@
>> +--echo #
>> +--echo # query: $query
>> +--echo # select: $select
>> +--echo #
>> +
>> +if ($select) {
>> +--disable_query_log
>> +--eval $select INTO OUTFILE '$MYSQLTEST_VARDIR/tmp/before_explain.txt'
>> +--enable_query_log
>> +}
>> +
>> +--eval EXPLAIN $query
>> +if (`SELECT ROW_COUNT() > 0`) {
>> +--echo # Erroneous query: EXPLAIN $query
>> +--die Unexpected ROW_COUNT() <> 0
>> +}
>> +
>> +--eval EXPLAIN EXTENDED $query
>> +if (`SELECT ROW_COUNT() > 0`) {
>> +--echo # Erroneous query: EXPLAIN EXTENDED $query
>> +--die Unexpected ROW_COUNT() <> 0
>> +}
>> +
>> +if ($select) {
>> +--eval EXPLAIN EXTENDED $select
>> +}
>> +
>
> GB8 I suggest to add some FLUSH TABLES and SHOW STATUS (like you did below), to show
> that "EXPLAIN DELETE/UPDATE" does not generate more "handler_read/write" calls than
> "EXPLAIN SELECT".
> This will prove that EXPLAIN DELETE/UPDATE is not guilty of reading the tables' data
> more too much (I already know it's true, but better put it in the test).
Done.
>
>> === added file 'mysql-test/r/explain_non_select.result'
>> +DROP TABLE t1;
>> +CREATE TABLE t1 (a INT);
>> +INSERT INTO t1 VALUES (1), (2), (3);
>> +#
>> +# query: DELETE FROM t1 USING t1 WHERE a = 1
>
> GB9 what does "USING t1" mean here (there's no join)?
This is an easiest way to make a multitable DELETE (from single-table one).
>
>
>> +#
>> +# query: DELETE FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>> +# select: SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>> +#
>> +EXPLAIN DELETE FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>> +id select_type table type possible_keys key key_len ref
> rows Extra
>> +1 SIMPLE t1 ALL PRIMARY PRIMARY 4 NULL 4 Using where
>> +EXPLAIN EXTENDED DELETE FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 SIMPLE t1 ALL PRIMARY PRIMARY 4 NULL 4 100.00
> Using where
>> +EXPLAIN EXTENDED SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 SIMPLE t1 index NULL PRIMARY 4 NULL 1 400.00
> Using where; Using index
>
> GB10 400.0 is surprising.
> The manual says "The filtered column indicates an estimated percentage of table rows
> that will be filtered by the table condition", we could imagine it should be <=100. I
> never saw values greater than 100 until now.
> This isn't due to your patch, but I suggest filing a bug report. There's
> another case below (search for 352040 below).
Will report. I can't find an existent bug report for this, but I remember a similar one
(probably just a déjà vu).
>
>> +# query: UPDATE t1 SET a = 10 WHERE a IN (SELECT a FROM t2)
>> +# select: SELECT * FROM t1 WHERE a IN (SELECT a FROM t2)
>> +#
>> +EXPLAIN UPDATE t1 SET a = 10 WHERE a IN (SELECT a FROM t2);
>> +id select_type table type possible_keys key key_len ref
> rows Extra
>> +1 PRIMARY t1 ALL NULL NULL NULL NULL 3 Using where
>> +2 DEPENDENT SUBQUERY t2 ALL NULL NULL NULL NULL 3
> Using where
>> +EXPLAIN EXTENDED UPDATE t1 SET a = 10 WHERE a IN (SELECT a FROM t2);
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 PRIMARY t1 ALL NULL NULL NULL NULL 3 100.00
> Using where
>> +2 DEPENDENT SUBQUERY t2 ALL NULL NULL NULL NULL 3
> 100.00 Using where
>> +EXPLAIN EXTENDED SELECT * FROM t1 WHERE a IN (SELECT a FROM t2);
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 PRIMARY t1 ALL NULL NULL NULL NULL 3 100.00
>> +1 PRIMARY t2 ALL NULL NULL NULL NULL 3 100.00
> Using where; FirstMatch(t1); Using join buffer (BNL, incremental buffers)
>
> GB11 I wondered why semijoin is not used for UPDATE above; debugging shows it's
> intentional, per point "6" of resolve_subquery().
As #6 says, single-table UPDATE and DELETE don't have a top-level JOIN (only WHERE item
tree etc), so the current semijoin implementation is not suitable for those queries:
: 6. We are not in a subquery of a single table UPDATE/DELETE that
: doesn't have a JOIN (TODO: We should handle this at some
: point by switching to multi-table UPDATE/DELETE)
>
>> +CREATE TABLE t1 (i INT, j INT);
>> +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
>> +#
>> +# query: DELETE FROM t1
>> +# select: SELECT * FROM t1
>> +#
>> +EXPLAIN DELETE FROM t1;
>> +id select_type table type possible_keys key key_len ref
> rows Extra
>> +1 SIMPLE NULL NULL NULL NULL NULL NULL 5 Using
> delete_all_rows
>
> GB12 it would be good to test with an innodb table too, to show that
> EXPLAIN properly reports that innodb doesn't use "delete_all_rows". See
> also my comment about sql_delete.cc.
Done. Also I'm thinking about one large include file and two small test files, one for
MyISAM and one for InnoDB.
>
>> +# query: DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>> +# select: SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>> +#
>> +EXPLAIN DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>> +id select_type table type possible_keys key key_len ref
> rows Extra
>> +1 SIMPLE t2 ALL a a 15 NULL 17602 Using where
>> +EXPLAIN EXTENDED DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 SIMPLE t2 ALL a a 15 NULL 17602 100.00 Using
> where
>> +EXPLAIN EXTENDED SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 SIMPLE t2 index NULL a 15 NULL 5 352040.00 Using
> where
>
> GB13 352040.0 is fishy
See GB10.
>
>> +#
>> +# query: DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>> +# select: SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>> +#
>> +EXPLAIN DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>> +id select_type table type possible_keys key key_len ref
> rows Extra
>> +1 SIMPLE t2 ALL a NULL NULL NULL 26 Using where;
> Using filesort
>> +EXPLAIN EXTENDED DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 SIMPLE t2 ALL a NULL NULL NULL 26 100.00 Using
> where; Using filesort
>> +Warnings:
>> +Warning 1713 Cannot use range access on index 'a' due to type or collation
> conversion on field 'b'
>> +EXPLAIN EXTENDED SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>> +id select_type table type possible_keys key key_len ref
> rows filtered Extra
>> +1 SIMPLE t2 ALL NULL NULL NULL NULL 26 100.00
> Using where; Using filesort
>> +Warnings:
>> +Note 1003 select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS
> `b`,`test`.`t2`.`c` AS `c`,`test`.`t2`.`d` AS `d` from `test`.`t2` where (`test`.`t2`.`b`
> = 10) order by `test`.`t2`.`a`,`test`.`t2`.`c` limit 5
>
> GB14 here DELETE has warning 1713 but SELECT doesn't. It's understandable, as DELETE
> considered "a" (in "possible_keys"), then realized it was not usable; whereas SELECT
> didn't consider "a" so didn't even come to seeing the type/collation problem. However I
> don't know why DELETE considered index "a" and SELECT didn't...
I don't have a detailed answer for you ATM. Will answer later. Single-table DELETE
implementation differs from SELECT implementation significantly.
I think that SELECT filters this index out before the "quick select" probe.
>
>> === added file 'sql/opt_explain.cc'
>> --- a/sql/opt_explain.cc 1970-01-01 00:00:00 +0000
>> +++ b/sql/opt_explain.cc 2011-04-01 13:53:57 +0000
>> @@ -0,0 +1,1450 @@
>> +/* Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
>> +
>> + This program is free software; you can redistribute it and/or modify
>> + it under the terms of the GNU General Public License as published by
>> + the Free Software Foundation; version 2 of the License.
>> +
>> + This program is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + GNU General Public License for more details.
>> +
>> + You should have received a copy of the GNU General Public License
>> + along with this program; if not, write to the Free Software Foundation,
>> + 51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA */
>> +
>> +#include "opt_explain.h"
>> +#include "sql_partition.h" // for make_used_partitions_str()
>> +
>> +
>> +/**
>> + A base for all Explain_* classes
>> +
>> + This class hierarchy is a successor of the old select_describe() function
>> + implementation. It extends old select_describe() functionality to deal with
>> + single-table data-modifying commands (UPDATE and DELETE).
>
> GB15 and INSERT/REPLACE VALUES() too?
Yes. But I have to remove this message, see GB16.
>
> GB16 in the final push, I guess you will delete references to old code
> (like "old select_describe()"). Those mentions have helped me greatly to
> understand the transition between trunk and your tree. But once your
> tree is trunk, they will not be needed anymore...?
Ok... (Actually it was added not for review only but for easier code browsing in the
future.)
>
>> +*/
>> +
>> +class Explain
>> +{
>> +private:
>> + List<Item> items; ///< item list to feed select_result::send_data()
>> + Item_null *nil; ///< pre-allocated NULL item to fill empty columns in
> EXPLAIN
>> +
>> +protected:
>> + /*
>> + Next "col_*" fields are intended for the filling by "explain_*()" methods.
>> + Then the make_list() method links these Items into "items" list.
>> +
>> + NOTE: NULL value means that Item_null object will be pushed into "items"
>> + list instead.
>> + */
>> + Item_uint *col_id; ///< "id" column: seq. number of SELECT withing the
> query
>
> GB17 protected member variables (not member functions) are discouraged in the two C++
> books I have (Stroustrup, Meyers). Stroustrup: "declaring data members protected is
> usually a design error".
> But I don't have yet a good alternative to suggest. Maybe I will find one.
If you see a more elegant way, you are welcome.
>
>> + Item_string *col_select_type; ///< "select_type" column
>> + Item_string *col_table_name; ///< "table" to which the row of output
> refers
>> + Item_string *col_partitions; ///< "partitions" column
>> + Item_string *col_join_type; ///< "type" column, see join_type_str array
>> + Item_string *col_possible_keys; ///< "possible_keys": comma-separated list
>> + Item_string *col_key; ///< "key" column: index that is actually decided to
> use
>> + Item_string *col_key_len; ///< "key_length" column: length of the "key"
> above
>> + Item_string *col_ref; ///< "ref":columns/constants which are compared to
> "key"
>> + Item_int *col_rows; ///< "rows": estimated number of examined table
> rows
>> + Item_float *col_filtered; ///< "filtered": % of rows filtered by
> condition
>> + Item_string *col_extra; ///< "extra" column: additional information
>> +
>> + THD *thd; ///< cached THD pointer
>> + const CHARSET_INFO *cs; ///< cached pointer to system_charset_info
>> + JOIN *join; ///< top-level JOIN (if any) provided by caller
>> + SELECT_LEX *select_lex; ///< cached select_lex pointer
>> +
>> + select_result *external_result; ///< result stream (if any) provided by
> caller
>> +
>> +public:
>> + explicit Explain(JOIN *join_arg= NULL)
>> + : nil(NULL),
>> + thd(current_thd),
>
> GB18 I think that in all places where we create an Explain, we know the THD already,
> so can pass it, avoiding current_thd.
Ok...
>
>> + cs(system_charset_info),
>> + join(join_arg),
>> + select_lex(join ? join->select_lex : &thd->lex->select_lex),
>> + external_result(join ? join->result : NULL)
>> + {
>> + init_columns();
>> + }
>> + virtual ~Explain() {}
>
> GB19 Explain and Explain_table_base are only used as base classes in
> practice, so I'd declare their constructors/destructors protected, to
> prevent an accidental direct instantation.
Done.
>
>> +
>> + bool send();
>> +
>> +private:
>> + void init_columns();
>> + bool make_list();
>
> GB20 The doxygen comments for those two are in front of the
> implementation; you are allowed to move them in front of the
> declaration. I finds this helps the reader (he can understand the
> "class' API" just by reading the class declaration). But it's up to you.
Private methods are not an API (Interface) ;-)
>
>> + bool push(Item *item) { return items.push_back(item ? item : nil); }
>> +
>> +protected:
>> + bool describe(uint8 mask) { return thd->lex->describe & mask; }
>> +
>> + /**
>> + Prepare the self-allocated result object
>> +
>> + For queries with top-level JOIN the caller provides pre-allocated
>> + select_send object. Then that JOIN object prepares the select_send
>> + object calling result->prepare() in JOIN::prepare(),
>> + result->initalize_tables() in JOIN::optimize() and result->prepare2()
>> + in JOIN::exe().
>
> GB21 s/exe/exec
Fixed.
>
>> + However witout the presence of the top-level JOIN we have to
>
> GB22 s/witout/without
Fixed.
>
>> + prepare/initialize select_send object manually.
>> + */
>> + bool prepare(select_result *result)
>> + {
>> + DBUG_ASSERT(join == NULL);
>> + List<Item> dummy;
>> + return result->prepare(dummy, select_lex->master_unit()) ||
>> + result->prepare2();
>> + }
>> +
>> + virtual bool send_to(select_result *to);
>> +
>> + /*
>> + Rest of the methods are overloadable functions those calculate and fill
>
> GB23 missing comma?
Fixed.
>
>> + "col_*" fields with Items for further sending as EXPLAIN columns.
>> +
>> + "explain_*" methods return FALSE on success and TRUE on error (usually
> OOM).
>> + */
>> + virtual bool explain_id();
>> + virtual bool explain_select_type();
>> + virtual bool explain_table_name() { return FALSE; }
>> + virtual bool explain_partitions() { return FALSE; }
>> + virtual bool explain_join_type() { return FALSE; }
>> + virtual bool explain_possible_keys() { return FALSE; }
>> + /* fill col_key and and col_key_len fields together */
>
> GB24 /**
Fixed.
>
>> + virtual bool explain_key_and_len() { return FALSE; }
>> + virtual bool explain_ref() { return FALSE; }
>> + /* fill col_rows and col_filtered fields together */
>
> GB25 /**
Fixed.
>
>> + virtual bool explain_rows_and_filtered() { return FALSE; }
>> + virtual bool explain_extra();
>> +};
>> +
>> +
>> +/**
>> + Explain_msg class outputs a trivial EXPLAIN row with "extra" column
>> +
>> + Fromer part of the old select_describe() function
>
>
> GB26 s/fromer/former.
Same as GB15.
>
> GB27 in the final push, I guess you will delete references to old code?
Done.
>
>> + This class is intended for simple cases to produce EXPLAIN output
>> + with "No tables used", "No matching records" etc.
>> + Optionally it can output number of estimated rows in the "row"
>> + column.
>> +
>> + NOTE: this class also produces EXPLAIN rows for inner units (if any).
>> +*/
>> +
>> +class Explain_msg: public Explain
>> +{
>> +private:
>> + const char *message; ///< cached "message" argument
>> + const ha_rows rows; ///< HA_POS_ERROR or cached "rows" argument
>> +
>> +public:
>> + Explain_msg(JOIN *join_arg, const char *message_arg)
>
> GB28 should this constructor be explicit? same question for other ctors in the file.
AFAIU it is explicit already since it has more than one parameters without default values.
>
>> + : Explain(join_arg), message(message_arg), rows(HA_POS_ERROR)
>> + {}
>> + + explicit Explain_msg(const char *message_arg, ha_rows rows_arg=
> HA_POS_ERROR)
>> + : message(message_arg), rows(rows_arg)
>> + {}
>> +
>> +protected:
>> + virtual bool explain_rows_and_filtered();
>> + virtual bool explain_extra();
>> +};
>> +
>> +
>> +/**
>> + Explain_union class outputs EXPLAIN row for UNION
>> +
>> + Former part of the old select_describe() function.
>
> GB29 in the final push, I guess you will delete references to old code?
Done.
>
>> +*/
>> +
>> +class Explain_union : public Explain
>> +{
>> +private:
>> + char table_name_buffer[NAME_CHAR_LEN];
>> +
>> +public:
>> + Explain_union(JOIN *join_arg) : Explain(join_arg)
>> + {
>> + /* it's a UNION: */
>> + DBUG_ASSERT(join_arg->select_lex ==
> join_arg->unit->fake_select_lex);
>> + }
>> +
>> +protected:
>> + virtual bool explain_id();
>> + virtual bool explain_table_name();
>> + virtual bool explain_join_type();
>> + virtual bool explain_extra();
>> +};
>> +
>> +
>> +
>> +/**
>> + Common base class for Explain_join and Explain_table
>> +
>> + Former part of the old select_describe() function.
>
> GB30 in the final push, I guess you will delete references to old code?
Done.
>
>> +*/
>> +
>> +class Explain_table_base : public Explain {
>> +protected:
>> + TABLE *table;
>> + key_map usable_keys;
>> +
>> + char buff_possible_keys[512];
>> + String str_possible_keys;
>
> GB31 those two members are used only in
> Explain_table_base::explain_possible_keys()
> so I'd declare them as local variables of that function instead.
> I don't imagine there will be a notable performance loss by doing that.
> It's just allocating 512 bytes on the stack (incrementing the stack
> pointer), and setting a few String members. And it keeps things as
> localized as possible, which is good practice. Then they don't even need a Doxygen
> comment (as members variables, they would).
There is a problem with these fields: since Item_string doesn't copy the string value (in
most cases), we have to declare them all at least at the scope where we use Items.
We can't move char[] buffers, since Item_string have pointers to them.
And we can't move String wrapper variables, since String deallocates its data if
String::alloced==true (pre-allocated char[] buffer is to small to fit String's data).
To solve this trouble I've created a new "memroot_str" class. Instances/data of that class
live at the same scope as Items, but other temporary stuff (char[] buffers and Strings are
moved into methods).
The drawback of this approach is a number (not a big number) of new alloc_root() calls,
but IMO it is acceptable for EXPLAIN call.
>
>> +
>> +public:
>> + explicit Explain_table_base(JOIN *join_arg)
>> + : Explain(join_arg), table(NULL),
>> + str_possible_keys(buff_possible_keys, sizeof(buff_possible_keys), cs)
>> + {}
>> +
>> + explicit Explain_table_base(TABLE *table_arg)
>> + : table(table_arg),
>> + str_possible_keys(buff_possible_keys, sizeof(buff_possible_keys), cs)
>> + {}
>> +
>> +protected:
>> + virtual bool explain_partitions();
>> + virtual bool explain_possible_keys();
>> +};
>> +
>> +
>> +/**
>> + Explain_join class produces EXPLAIN output for JOINs
>> +
>> + Former part of the old select_describe() function.
>
>
>> +*/
>> +
>> +class Explain_join : public Explain_table_base
>> +{
>> +private:
>> + const bool need_tmp_table;
>> + const bool need_order;
>> + const bool distinct;
>> +
>> + uint tabnum;
>> + JOIN_TAB *tab;
>
> GB32 new members need a doxygen comment, like with ///<
Done.
>
>> + int quick_type;
>> + table_map used_tables;
>> + uint last_sjm_table;
>> +
>> + char table_name_buffer[NAME_LEN];
>
> GB33 I'd declare it local to Explain_join::explain_table_name(). Same
> comment for Explain_union's table_name_buffer.
See GB31.
>
>> + char buff_key[512];
>> + String str_key;
>
> GB34 same
>
See commentary for GB31.
>> + char buff_key_len[512];
>> + String str_key_len;
>
> GB35 same
>
See commentary for GB31.
>> + char buff_ref[512];
>> + String str_ref;
>
> GB36 same
>
See commentary for GB31.
>> + char buff_extra[512];
>> + String str_extra;
>
> GB37 same
>
See commentary for GB31.
>> +public:
>> + Explain_join(JOIN *join_arg,
>> + bool need_tmp_table_arg, bool need_order_arg,
>> + bool distinct_arg) + : Explain_table_base(join_arg),
> need_tmp_table(need_tmp_table_arg),
>> + need_order(need_order_arg), distinct(distinct_arg),
>> + tabnum(0), used_tables(0), last_sjm_table(MAX_TABLES),
>> + str_key(buff_key, sizeof(buff_key), cs),
>> + str_key_len(buff_key_len, sizeof(buff_key_len), cs),
>> + str_ref(buff_ref, sizeof(buff_ref), cs),
>> + str_extra(buff_extra, sizeof(buff_extra), cs)
>> + {
>> + /* it is not UNION: */
>> + DBUG_ASSERT(join_arg->select_lex !=
> join_arg->unit->fake_select_lex);
>> + }
>> +
>> +protected:
>> + virtual bool send_to(select_result *to);
>> + virtual bool explain_table_name();
>> + virtual bool explain_join_type();
>> + virtual bool explain_key_and_len();
>> + virtual bool explain_ref();
>> + virtual bool explain_rows_and_filtered();
>> + virtual bool explain_extra();
>> +};
>> +
>> +
>> +/**
>> + Explain_table class produce EXPLAIN output for queries without top-level JOIN
>> +
>> + This class is a simplified version of the Explain_join class. It works in the
>> + context of queries which implementation lacks top-level JOIN object (EXPLAIN
>> + single-table UPDATE and DELETE).
>> +*/
>> +
>> +class Explain_table: public Explain_table_base
>> +{
>> +private:
>> + const SQL_SELECT *select; ///< cached "select" argument
>> + const uint key; ///< cached "key" number argument
>> + const ha_rows limit; ///< HA_POS_ERROR or cached "limit" argument
>> + const bool need_sort; ///< cached need_sort argument
>> +
>> + /*
>> + Pre-allocated buffers and String wrapper objects for them
>> + */
>> + char buff_key[512];
>> + String str_key;
>> + char buff_key_len[512];
>> + String str_key_len;
>> + char buff_extra[512];
>> + String str_extra;
>> +
>> +public:
>> + Explain_table(TABLE *table_arg, SQL_SELECT *select_arg,
>> + uint key_arg, ha_rows limit_arg, bool need_sort_arg)
>> + : Explain_table_base(table_arg), select(select_arg), key(key_arg),
>> + limit(limit_arg), need_sort(need_sort_arg),
>> + str_key(buff_key, sizeof(buff_key), cs),
>> + str_key_len(buff_key_len, sizeof(buff_key_len), cs),
>> + str_extra(buff_extra, sizeof(buff_extra), cs)
>> + {
>> + usable_keys= table->keys_in_use_for_query;
>> + }
>> + +private:
>> + virtual bool explain_table_name();
>> + virtual bool explain_join_type();
>> + virtual bool explain_key_and_len();
>> + virtual bool explain_rows_and_filtered();
>> + virtual bool explain_extra();
>> +};
>> +
>> +
>> +static join_type calc_join_type(int quick_type)
>> +{
>> + if ((quick_type == QUICK_SELECT_I::QS_TYPE_INDEX_MERGE) ||
>> + (quick_type == QUICK_SELECT_I::QS_TYPE_ROR_INTERSECT) ||
>> + (quick_type == QUICK_SELECT_I::QS_TYPE_ROR_UNION))
>> + return JT_INDEX_MERGE;
>> + else
>> + return JT_RANGE;
>> +}
>> +
>> +
>> +/* Explain class methods
> ******************************************************/
>> +
>> +
>> +/**
>> + Explain class main function
>> +
>
> GB38 there is some trailing whitespace in the file
>
Removed.
>> + This method:
>> + a) allocates a select_send object (if no one pre-allocated available),
>> + b) calculates and sends whole EXPLAIN data.
>> +
>> + @returns
>> + @retval FALSE Ok
>> + @retval TRUE Error
>
> GB39 nowadays the coding style allows true/false too, it's up to you.
Ok.
>
>> +*/
>> +
>> +bool Explain::send()
>> +{
>> + /* Don't log this into the slow query log */
>> + thd->server_status&= ~(SERVER_QUERY_NO_INDEX_USED |
>> + SERVER_QUERY_NO_GOOD_INDEX_USED);
>> +
>> + select_result *result;
>> + if (external_result == NULL)
>> + {
>> + /* Create select_result object if the called doesn't provide one: */
>> + result= new select_send;
>> + if (!(result= new select_send))
>
> GB40 two calls to "new", not what you wanted
Oops. Fixed.
>
>> + return TRUE;
>> + if (thd->send_explain_fields(result) || prepare(result))
>
> GB41 it's not related to your patch, but I wonder why
> send_explain_fields() is a member of THD and not a free function; it
> uses only little of THD... Sending fields for a particular command is
> not the job of THD, at least there isn't a similar member function for
> the other SQL commands (or I missed it).
I wonder too.
>
>> + {
>> + delete result;
>> + return TRUE;
>> + }
>> + }
>> + else
>> + {
>> + result= external_result;
>> + external_result->reset_offset_limit_cnt();
>
> GB42 I commented out the call above and no test failed. Do you have an idea how to
> get coverage for it?
AFAIU we need a query with multi-line EXPLAIN output and a LIMIT N clause, where N is
shorter than the number of EXPLAIN output rows.
>
>> + }
>> +
>> + if (!(nil= new Item_null))
>> + return TRUE;
>
> GB43 Can send() be called twice? If it can, we don't have a real leak
> above, because Item_int is created in a MEMROOT, not on the heap. But I
> hope it would still not be called too many times for one statement?
It can be called twice for SELECT queries (UPDATE/DELETE are protected from re-execution).
Fixed.
>
>> + bool ret= send_to(result);
>> +
>> + if (ret && join)
>> + join->error= 1;
>> +
>> + for (SELECT_LEX_UNIT *unit= select_lex->first_inner_unit();
>> + unit && !ret;
>> + unit= unit->next_unit())
>> + ret= mysql_explain_union(thd, unit, result);
>> +
>> + if (external_result == NULL)
>> + {
>> + if (ret)
>> + result->abort_result_set();
>> + else
>> + result->send_eof();
>> + delete result;
>> + }
>> + return ret;
>> +}
>> +
>> +
>> +/**
>> + Reset all "col_*" fields
>> +*/
>> +
>> +void Explain::init_columns()
>> +{
>> + col_id= NULL;
>> + col_select_type= NULL;
>> + col_table_name= NULL;
>> + col_partitions= NULL;
>> + col_join_type= NULL;
>> + col_possible_keys= NULL;
>> + col_key= NULL;
>> + col_key_len= NULL;
>> + col_ref= NULL;
>> + col_rows= NULL;
>> + col_filtered= NULL;
>> + col_extra= NULL;
>> +}
>> +
>> +
>> +/**
>> + Calculate EXPLAIN column values and link them into "items" list
>> +
>> + @returns
>> + @retval FALSE Ok
>> + @retval TRUE Error
>> +*/
>> +
>> +bool Explain::make_list()
>> +{
>> + if (explain_id() ||
>> + explain_select_type() ||
>> + explain_table_name() ||
>> + explain_partitions() ||
>> + explain_join_type() ||
>> + explain_possible_keys() ||
>> + explain_key_and_len() ||
>> + explain_ref() ||
>> + explain_rows_and_filtered() ||
>> + explain_extra())
>> + return TRUE;
>> + + /* + NOTE: the number/types of items pushed into item_list must be in
> sync with
>> + EXPLAIN column types as they're "defined" in THD::send_explain_fields()
>> + */
>> + return push(col_id) ||
>> + push(col_select_type) ||
>> + push(col_table_name) ||
>> + (describe(DESCRIBE_PARTITIONS) && push(col_partitions)) ||
>> + push(col_join_type) ||
>> + push(col_possible_keys) ||
>> + push(col_key) ||
>> + push(col_key_len) ||
>> + push(col_ref) ||
>> + push(col_rows) ||
>> + (describe(DESCRIBE_EXTENDED) && push(col_filtered)) ||
>> + push(col_extra);
>> +}
>> +
>> +
>> +/**
>> + Make "items" list and send it to select_result output stream
>> +
>> + This method is virtual since the base implementation is intended for sending
>> + the "items" list once, but the overloaded Explain_join implementation sends
>> + it many times (once for each JOIN::join_tab[] element).
>> +
>> + @returns
>> + @retval FALSE Ok
>> + @retval TRUE Error
>> +*/
>> +
>> +bool Explain::send_to(select_result *to)
>> +{
>> + bool ret= make_list() || to->send_data(items);
>
> GB44 const bool
Ok.
>
>> + items.empty();
>> + init_columns();
>> + return ret;
>> +}
>> +
>> +
>> +bool Explain::explain_id()
>> +{
>> + col_id= new Item_uint(select_lex->select_number);
>> + return col_id == NULL;
>> +}
>> +
>> +
>> +bool Explain::explain_select_type()
>> +{
>> + if (select_lex->type)
>> + col_select_type= new Item_string(select_lex->type,
>> + strlen(select_lex->type), cs);
>> + else if (select_lex->first_inner_unit() || select_lex->next_select())
>> + col_select_type= new Item_string(STRING_WITH_LEN("PRIMARY"), cs);
>> + else
>> + col_select_type= new Item_string(STRING_WITH_LEN("SIMPLE"), cs);
>> + return col_select_type == NULL; +}
>> +
>> +
>> +bool Explain::explain_extra()
>> +{
>> + col_extra= new Item_string("", 0, cs);
>> + return col_extra == NULL;
>> +}
>> +
>> +
>> +/* Explain_msg class methods
> **************************************************/
>> +
>> +
>> +bool Explain_msg::explain_rows_and_filtered()
>> +{
>> + if (rows == HA_POS_ERROR)
>> + return FALSE;
>> + col_rows= new Item_int(rows, MY_INT64_NUM_DECIMAL_DIGITS);
>> + return col_rows == NULL;
>> +}
>> +
>> +
>> +bool Explain_msg::explain_extra()
>> +{
>> + col_extra= new Item_string(message, strlen(message), cs);
>> + return col_extra == NULL;
>> +}
>> +
>> +
>> +/* Explain_union class methods
> ************************************************/
>> +
>> +
>> +bool Explain_union::explain_id()
>> +{
>> + col_id= NULL;
>
> GB45 IIUC due to init_columns() col_id should already be NULL when we
> enter the function?
Yes. Removed.
>
>> + return FALSE;
>> +}
>> +
>> +
>
>> +bool Explain_union::explain_extra()
>> +{
>> + /* + Moved from select_describe():
>
> GB46 in the final push, I guess you will delete references to old code?
Removed.
>
>> + here we assume that the query will return at least two rows, so we
>> + show "filesort" in EXPLAIN. Of course, sometimes we'll be wrong
>> + and no filesort will be actually done, but executing all selects in
>> + the UNION to provide precise EXPLAIN information will hardly be
>> + appreciated :)
>> + */
>> + if (join->unit->global_parameters->order_list.first)
>> + {
>> + col_extra= new Item_string(STRING_WITH_LEN("Using filesort"), cs);
>> + return col_extra == NULL;
>> + }
>> + return Explain::explain_extra();
>> +}
>> +
>> +
>
>> +bool Explain_join::explain_ref()
>> +{
>> + str_ref.length(0);
>
> GB47 can move the line above into "if (tab->ref.key_parts)". But (see other
> command) making it local would be even better.
>
See commentary for GB31.
>> + if (tab->ref.key_parts)
>> + {
>> + for (store_key **ref= tab->ref.key_copy; *ref; ref++)
>> + {
>> + if (str_ref.length())
>> + str_ref.append(',');
>> + str_ref.append((*ref)->name(), strlen((*ref)->name()), cs);
>> + }
>> + col_ref= new Item_string(str_ref.ptr(), str_ref.length(), cs);
>> + return col_ref == NULL;
>> + }
>> + return FALSE;
>> +}
>> +
>> +
>> +bool Explain_join::explain_rows_and_filtered()
>> +{
>> + if (tab->table->pos_in_table_list->schema_table)
>
> GB48 several 'tab->table' could be replaced with 'table'.
Done.
>
>> + return FALSE;
>> +
>> + double examined_rows;
>> + if (tab->select && tab->select->quick)
>> + examined_rows= rows2double(tab->select->quick->records);
>> + else if (tab->type == JT_NEXT || tab->type == JT_ALL)
>> + {
>> + if (tab->limit)
>> + examined_rows= rows2double(tab->limit);
>> + else
>> + {
>> + tab->table->file->info(HA_STATUS_VARIABLE);
>> + examined_rows= rows2double(tab->table->file->stats.records);
>> + }
>> + }
>> + else
>> + examined_rows= join->best_positions[tabnum].records_read; +
>> + col_rows= new Item_int((longlong) (ulonglong) examined_rows, +
> MY_INT64_NUM_DECIMAL_DIGITS);
>
> GB49 it's not your code but it's strange: Item_int has a constructor which accepts
> ulonglong, why do we cast to longlong here...
I don't understand too why the unsigned == false flag is needed there.
>
>> + if (col_rows == NULL)
>> + return TRUE;
>> +
>> + /* Add "filtered" field */
>> + if (describe(DESCRIBE_EXTENDED))
>> + {
>> + float f= 0.0; + if (examined_rows)
>> + f= (float) (100.0 * join->best_positions[tabnum].records_read /
>> + examined_rows);
>
> GB50 not your code, but the cast to (float) looks weird, as 'f' is already a float.
Yes. Removed.
>
>> + col_filtered= new Item_float(f, 2);
>> + if (col_filtered == NULL)
>> + return TRUE;
>
> GB51 in Explain_join::explain_ref(), you do
> return (col_xxx == NULL);
> instead of the if().
Harmonized :-)
>
>> + }
>> + return FALSE;
>> +}
>> +
>> +
>> +bool Explain_join::explain_extra()
>> +{
>> + str_extra.length(0);
>> +
>> + my_bool key_read=table->key_read;
>
> GB52 space after =
Fixed them all.
>
>> + if ((tab->type == JT_NEXT || tab->type == JT_CONST) &&
>> + table->covering_keys.is_set(tab->index))
>> + key_read=1;
>
> GB53 same. Please do a pass over this function to fix other little issues of this
> kind.
Same.
>
>> + if (quick_type == QUICK_SELECT_I::QS_TYPE_ROR_INTERSECT &&
>> +
> !((QUICK_ROR_INTERSECT_SELECT*)tab->select->quick)->need_to_fetch_row)
>> + key_read=1;
>
> GB54 not your code, but can change "key_read" to bool, store true/false in it, and
> declare only closer to where it's used (which is an if() further down).
Done.
>
>> + if (distinct && test_all_bits(used_tables,thd->used_tables))
>> + str_extra.append(STRING_WITH_LEN("; Distinct"));
>> +
>> + if (tab->loosescan_match_tab)
>> + {
>> + str_extra.append(STRING_WITH_LEN("; LooseScan"));
>> + }
>
> GB55 can delete {}
Done.
>
>
>> +bool Explain_table::explain_key_and_len()
>> +{
>> + str_key.length(0);
>> + str_key_len.length(0);
>> +
>> + if (key != MAX_KEY)
>> + {
>> + KEY *key_info= table->key_info + key;
>> + col_key= new Item_string(key_info->name, strlen(key_info->name), cs);
>> + int length= longlong2str(key_info->key_length, buff_key_len, 10) - +
> buff_key_len;
>> + col_key_len= new Item_string(buff_key_len, length, cs);
>> + return col_key == NULL || col_key_len == NULL;
>> + }
>> + else if (select && select->quick)
>> + {
>> + select->quick->add_keys_and_lengths(&str_key, &str_key_len);
>> + col_key= new Item_string(str_key.ptr(), str_key.length(), cs);
>> + col_key_len= new Item_string(str_key_len.ptr(), str_key_len.length(), cs);
>> + return col_key == NULL || col_key_len == NULL;
>> + }
>> + return FALSE;
>
> GB56 this is a copy-paste of half of Explain_join::explain_key_and_len(); is there a
> good way to avoid this?
Moved common parts to separate methods.
>
>> +}
>> +
>> +
>> +bool Explain_table::explain_rows_and_filtered()
>> +{
>> + double examined_rows;
>> + if (select && select->quick)
>> + examined_rows= rows2double(select->quick->records);
>> + else if (!select && !need_sort && limit != HA_POS_ERROR)
>> + examined_rows= rows2double(limit);
>> + else
>> + {
>> + table->file->info(HA_STATUS_VARIABLE);
>> + examined_rows= rows2double(table->file->stats.records);
>> + }
>> + col_rows= new Item_int((longlong) (ulonglong) examined_rows,
>> + MY_INT64_NUM_DECIMAL_DIGITS);
>> + if (col_rows == NULL)
>> + return TRUE;
>> +
>> + if (describe(DESCRIBE_EXTENDED))
>> + {
>> + col_filtered= new Item_float(100.0, 2);
>> + if (col_filtered == NULL)
>> + return TRUE;
>> + }
>
> GB57 this is also partially a copy-paste of
> Explain_join::explain_rows_and_filtered().
I don't see an acceptable way to split it (do you want a number of very short methods with
very long names? ;-)
>
>> + return FALSE;
>> +}
>> +
>> +
>> +bool Explain_table::explain_extra()
>> +{
>
> GB58 this one also looks like copy-paste of selected bits of Explain_join::extra(),
> is it avoidable?
Ok... But in this case some old regression tests will fail on reordered "extra" output.
>
>> + str_extra.length(0);
>> +
>> + uint keyno= (select && select->quick) ? select->quick->index
> : key;
>> +
>
>> +class explain_send : public select_send {
>> +protected:
>> + /**
>> + Bits for result_state bitmap
>> +
>> + As far as we use explain_send object in a place of select_send,
> explain_send
>> + have to pass multiple invocation of its prepare(), prepare2() and
>> + initialize_tables() methods, since JOIN::exec() of subqueries runs
>> + these methods of select_send multiple times by design.
>> + insert_select, multi_update and multi_delete class methods are not intended
>> + for multiple invocations, so result_state bitmap guards data interceptor
>> + object from method re-invocation.
>> + */
>> + enum result_state_enum { + SELECT_RESULT_PREPARED = 0x01, // 1st bit
> set: prepare() is complete
>> + SELECT_RESULT_PREPARED2 = 0x02, // 2nd bit set: prepare2() is complete
>> + SELECT_RESULT_INITIALIZED = 0x04 // 3rd bit set: initialize_tables() done
>
> GB59 ///< instead of //. Can replace "complete" with "done", to be uniform.
Ok.
>
>> + };
>> + int result_state; ///< bitmap of result_state_enum bits
>
> GB60 "result_state" can be of type enum result_state_enum. Will be easier to read in
> gdb.
Ok. (Actually it's a bitmap, and in most cases gdb will show it as an integer number).
>
>> +
>> + /**
>> + Pointer to underlying insert_select, multi_update or multi_delete object
>> + */
>> + select_result_interceptor *interceptor;
>> +
>> +public:
>> + explain_send(select_result_interceptor *interceptor_arg) + : result_state(0),
> interceptor(interceptor_arg)
>> + {}
>> +
>> +protected:
>> + virtual int prepare(List<Item> &list, SELECT_LEX_UNIT *u)
>> + {
>> + if (result_state & SELECT_RESULT_PREPARED)
>> + return FALSE;
>> + else
>> + result_state|= SELECT_RESULT_PREPARED;
>
> GB61 can remove "else". If you prefer to keep it, then I suggest moving
> the final return into the else block.
Ok.
>
>> + return select_send::prepare(list, u) || interceptor->prepare(list, u);
>> + }
>> +
>> + virtual int prepare2(void)
>> + {
>> + if (result_state & SELECT_RESULT_PREPARED2)
>> + return FALSE;
>> + else
>> + result_state|= SELECT_RESULT_PREPARED2;
>
> GB62 same
Ok.
>
>> + return select_send::prepare2() || interceptor->prepare2();
>> + }
>> +
>> + virtual bool initialize_tables(JOIN *join)
>> + {
>> + if (result_state & SELECT_RESULT_INITIALIZED)
>> + return FALSE;
>> + else
>
> GB63 same
Ok.
>
>> + result_state|= SELECT_RESULT_INITIALIZED;
>> + return select_send::initialize_tables(join) ||
>> + interceptor->initialize_tables(join);
>> + }
>> +
>> + virtual void cleanup()
>> + {
>> + select_send::cleanup();
>> + interceptor->cleanup();
>> + }
>> +};
>> +
>> +
>> +/******************************************************************************
>> + External function implementations
>> +******************************************************************************/
>> +
>> +
>> +/**
>> + Send a messages as an "extra" column value
>
> GB64 s/messages/message
Fixed.
>
>> +
>> + This function forms the 1st row of the QEP output with a simple text message.
>> + This is useful to explain such trivial cases as "No tables used" etc.
>> + + NOTE: Also this function explains the rest of QEP (subqueries or joined
>> + tables if any).
>
> GB65 NOTE -> @note (same remark for all NOTE in the patch).
Ok.
>
>> + @param message text message for the "extra" column.
>> + @param rows HA_POS_ERROR or a value for the "rows" column.
>
> GB66 params above do not correspond to real params
Fixed.
>
>> +
>> + @returns
>> + @retval FALSE OK
>> + @retval TRUE Error
>> +*/
>> +
>> +bool msg_describe(JOIN *join, const char *message)
>> +{
>> + return Explain_msg(join, message).send();
>> +}
>> +
>> +
>> +/**
>> + Send a messages as an "extra" column value
>
> GB67 s/messages/message
Fixed.
>
>> + This function forms the 1st row of the QEP output with a simple text message.
>> + This is useful to explain such trivial cases as "No tables used" etc.
>> + + NOTE: Also this function explains the rest of QEP (subqueries if any).
>
> GB68 @note
Fixed.
>
>> +
>> + @param message text message for the "extra" column.
>> + @param rows HA_POS_ERROR or a value for the "rows" column.
>> +
>> + @returns
>> + @retval FALSE OK
>> + @retval TRUE Error
>> +*/
>> +
>> +bool msg_describe(const char *message, ha_rows rows)
>> +{
>> + return Explain_msg(message, rows).send();
>> +}
>> +
>> +
>> +/**
>> + EXPLAIN handling for single-table UPDATE and DELETE queries
>> +
>> + Send to the client a QEP data set for single-table EXPLAIN UPDATE/DELETE
>> + queries. As far as single-table UPDATE/DELETE are implemented without
>> + the regular JOIN tree, we can't reuse mysql_explain_union() directly,
>> + thus we deal with this single table in a special way and then call
>> + mysql_explain_union() for subqueries (if any).
>> +
>> + @param table TABLE object to update/delete rows in the UPDATE/DELETE
>> + query.
>> + @param select SQL_SELECT object that represents quick access methods/
>> + WHERE clause.
>> + @param key MAX_KEY or and index number of the key that was chosen to
>> + access table data.
>> + @param limit HA_POS_ERROR or LIMIT value.
>> + @param need_sort TRUE if it requires filesort() -- "Using filesort"
>> + string in the "extra" column.
>> +
>> + @returns
>> + @retval FALSE OK
>> + @retval TRUE Error
>> +*/
>> +
>> +bool table_describe(TABLE *table, SQL_SELECT *select, uint key, ha_rows limit,
>> + bool need_sort)
>> +{
>
> GB69 I'd add a few DBUG_ENTER/RETURN. Either to the top functions
> ("table_describe") or to Explain::send(). So that in --debug we at least
> have an idea of what's going on.
Done.
>
>> + return Explain_table(table, select, key, limit, need_sort).send();
>> +}
>> +
>> +
>> +/**
>> + EXPLAIN handling for EXPLAIN SELECT queries
>> +
>> + Send a description about what how the select will be done to the client
>
> GB70 typo
Fixed.
>
>> +
>> + @param join JOIN
>> + @param need_tmp_table TRUE if it requires a temporary table --
>> + "Using temporary" string in the "extra" column.
>> + @param need_order TRUE if it requires filesort() -- "Using filesort"
>> + string in the "extra" column.
>> + @param distinct TRUE if there is the DISTINCT clause (not optimized
>> + out) -- "Distinct" string in the "extra" column.
>> +
>> + @returns
>> + @retval FALSE OK
>> + @retval TRUE Error
>> +*/
>> +
>> +bool select_describe(JOIN *join, bool need_tmp_table, bool need_order,
>> + bool distinct)
>> +{
>> + DBUG_ENTER("select_describe");
>> + DBUG_PRINT("info", ("Select 0x%lx, type %s",
>
> GB71 can change 0x%lx to %p
Ok.
>
>> + (ulong)join->select_lex, join->select_lex->type));
>> + if (join->select_lex == join->unit->fake_select_lex)
>> + DBUG_RETURN(Explain_union(join).send());
>> + else
>> + DBUG_RETURN(Explain_join(join, need_tmp_table, need_order,
> distinct).send());
>
> GB72 the drawback of
> f(()
> {
> DBUG_RETURN(g());
> }
> is that in --debug output it looks like
> >f
> <f
> >g
> <g
> which does not reflect that f() called g().
> I suggest to set a const bool and then DBUG_RETURN it.
Fixed.
>
>> +}
>> +
>> +
>> +/**
>> + EXPLAIN handling for INSERT, REPLACE and multi-table UPDATE/DELETE queries
>> +
>> + Send to the client a QEP data set for data-modifying commands those have a
>> + regular JOIN tree (INSERT...SELECT, REPLACE...SELECT and multi-table
>> + UPDATE and DELETE queries) like mysql_select() does for SELECT queries in
>> + the "describe" mode.
>> +
>> + NOTE: See table_describe() for single-table UPDATE/DELETE EXPLAIN handling.
>> +
>> + NOTE: Unlike the mysql_select() function, explain_data_modification()
>> + calls abort_result_set() itself in the case of failure (OOM etc.)
>> + since explain_data_modification() uses internally created select_result
>> + stream.
>> +
>> + @param result pointer to select_insert, multi_delete or multi_update object:
>> + the function uses it to call result->prepare(),
>> + result->prepare2() and result->initialize_tables() only
> but
>> + not to modify table data or to send a result to client.
>> + @returns
>> + @retval FALSE OK
>> + @retval TRUE Error
>> +*/
>> +
>> +bool explain_data_modification(select_result_interceptor *result)
>
> GB73 I think we need another name.
> Even though you explained everything well in comments, if someone looks at the
> function's name (like, he's reading opt_explain.h), he could wonder why
> explain_data_modification() is not called for single-table UPDATE/DELETE which are "data
> modifications". I propose
> explain_data_modification -> explain_data_modification_having_JOIN
> table_describe -> explain_data_modification_having_no_JOIN
> Those names do speak about low-level details (presence or absence of JOIN), but it's
> unavoidable, because it's this (damn) low-level detail which forces to have two separate
> functions.
> Even msg_describe() could have a new name: after all, all EXPLAINs send messages
> ("msg")...
> Also some functions have "describe" in their name ("table_describe" etc), others have
> "explain" ("explain_data_modification"), could we use a single word?
> You don't need to change names right now, let's see what the other reviewer thinks,
> he may have better proposals.
Done.
>
>> +{
>> + THD *thd= current_thd;
>
> GB74 in places where explain_data_modification() is called, THD is known, we can
> avoid current_thd.
Fixed.
>
>> + explain_send explain(result);
>> + bool res= thd->send_explain_fields(&explain) ||
>> + mysql_explain_union(thd, &thd->lex->unit, &explain)
> ||
>> + thd->is_error();
>> + if (res)
>> + explain.abort_result_set();
>> + else
>> + explain.send_eof();
>> + return res;
>
> GB75 this code above resembles what is done in execute_sqlcom_select() (if
> lex->describe is true), which is:
>
> thd->send_explain_fields(result);
> res= mysql_explain_union(thd, &thd->lex->unit, result);
> ...
> if (res)
> result->abort_result_set();
> else
> result->send_eof();
> Can we make execute_sqlcom_select() call explain_data_modification(), to
> avoid duplication?
> The creation of the warning (of EXPLAIN EXTENDED) can also move into
> explain_data_modification().
Will do with the next commit (need to finish thd->lex->unit.print() implementation).
>
>> +}
>> +
>>
>> === added file 'sql/opt_explain.h'
>> --- a/sql/opt_explain.h 1970-01-01 00:00:00 +0000
>> +++ b/sql/opt_explain.h 2011-04-01 13:53:57 +0000
>> @@ -0,0 +1,30 @@
>> +/* Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
>> +
>> + This program is free software; you can redistribute it and/or modify
>> + it under the terms of the GNU General Public License as published by
>> + the Free Software Foundation; version 2 of the License.
>> +
>> + This program is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + GNU General Public License for more details.
>> +
>> + You should have received a copy of the GNU General Public License
>> + along with this program; if not, write to the Free Software Foundation,
>> + 51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA */
>> +
>> +
>> +#ifndef SQL_EXPLAIN_INCLUDED
>
> GB76 or OPT_EXPLAIN_INCLUDED?
Fixed.
>
>> +#define SQL_EXPLAIN_INCLUDED
>> +
>> +#include "sql_select.h"
>
> GB77 I don't remember whether the "way to do" nowadays is the include above,
> or just forward declarations:
> class JOIN;
> class TABLE;
> etc
> #include "my_base.h" // for ha_rows.
> sql_select.h is 2280 lines, it describes JOIN... so if I could choose I'd prefer to
> not include it.
Done.
>
>> +bool msg_describe(JOIN *join, const char *message);
>> +bool msg_describe(const char *message, ha_rows rows= HA_POS_ERROR);
>> +bool table_describe(TABLE *table, SQL_SELECT *select, uint key, ha_rows limit,
>> + bool need_sort);
>> +bool select_describe(JOIN *join, bool need_tmp_table, bool need_order,
>> + bool distinct);
>> +bool explain_data_modification(select_result_interceptor *result);
>
> GB78 some "const" can be added over the code; patch is attached.
>
>> +#endif /* SQL_EXPLAIN_INCLUDED */
>>
>
>> === modified file 'sql/sql_class.h'
>> --- a/sql/sql_class.h 2011-03-24 08:00:03 +0000
>> +++ b/sql/sql_class.h 2011-04-01 13:53:57 +0000
>> @@ -3111,6 +3111,14 @@ public:
>> */
>> virtual void cleanup();
>> void set_thd(THD *thd_arg) { thd= thd_arg; }
>> +
>> + /*
>
> GB79 /**
Ok.
>
>> + If we execute EXPLAIN SELECT ... LIMIT (or any other EXPLAIN query)
>> + we have to ignore LIMIT value sending EXPLAIN output rows since
>> + LIMIT value belongs to the underlying query, not to the whole EXPLAIN.
>> + */
>> + void reset_offset_limit_cnt() { unit->offset_limit_cnt= 0; }
>> +
>> #ifdef EMBEDDED_LIBRARY
>> virtual void begin_dataset() {}
>> #else
>>
>> === modified file 'sql/sql_delete.cc'
>> --- a/sql/sql_delete.cc 2011-03-17 17:39:31 +0000
>> +++ b/sql/sql_delete.cc 2011-04-01 13:53:57 +0000
>> @@ -35,6 +35,7 @@
>> #include "sp_head.h"
>> #include "sql_trigger.h"
>> #include "transaction.h"
>> +#include "opt_explain.h"
>> #include "records.h" // init_read_record,
>> // end_read_record
>>
>> @@ -60,6 +61,7 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>> ha_rows deleted= 0;
>> bool reverse= FALSE;
>> bool skip_record;
>> + bool need_sort= FALSE;
>> ORDER *order= (ORDER *) ((order_list && order_list->elements) ?
>> order_list->first : NULL);
>> uint usable_index= MAX_KEY;
>> @@ -144,6 +146,15 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>> /* Update the table->file->stats.records number */
>> table->file->info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK);
>> ha_rows const maybe_deleted= table->file->stats.records;
>> +
>> + if (thd->lex->describe)
>> + {
>> + bool err= msg_describe("Using delete_all_rows", maybe_deleted);
>> + delete select;
>> + free_underlaid_joins(thd, select_lex);
>
> GB80 same remarks as for sql_update.cc (below), regarding having a common exit path.
Fixed.
>
>> + DBUG_RETURN(err);
>> + }
>> +
>> DBUG_PRINT("debug", ("Trying to use delete_all_rows()"));
>> if (!(error=table->file->ha_delete_all_rows()))
>
> GB81 this call may legally fail as "unsupported" (happens in innodb), in
> which case we fall back to row-by-row delete. So the msg_describe()
> above is premature, it gives false information for an innodb table.
Changed to "Deleting all rows".
>
>> {
>> @@ -169,14 +180,31 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>> Item::cond_result result;
>> conds= remove_eq_conds(thd, conds, &result);
>> if (result == Item::COND_FALSE) // Impossible where
>> + {
>> limit= 0;
>> +
>> + if (thd->lex->describe)
>> + {
>> + bool err= msg_describe("Impossible WHERE");
>> + delete select;
>> + free_underlaid_joins(thd, select_lex);
>
> GB82 same note about exit paths.
Fixed.
>
>> + DBUG_RETURN(err);
>> + }
>> + }
>> }
>>
>> #ifdef WITH_PARTITION_STORAGE_ENGINE
>> if (prune_partitions(thd, table, conds))
>> - {
>> + { // No matching records
>> + if (thd->lex->describe)
>> + {
>> + bool err= msg_describe("No matching records");
>> + delete select;
>> + free_underlaid_joins(thd, select_lex);
>
> GB83 same
Fixed.
>
>> + DBUG_RETURN(err);
>> + }
>> +
>> free_underlaid_joins(thd, select_lex);
>> - // No matching record
>> my_ok(thd, 0);
>> DBUG_RETURN(0);
>> }
>> @@ -219,23 +247,34 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>> DBUG_RETURN(TRUE);
>> }
>> }
>> +
>> + if (order)
>> + {
>> + table->update_const_key_parts(conds);
>> + order= simple_remove_const(order, conds);
>> +
>> + usable_index= get_index_for_order(order, table, select, limit,
>> + &need_sort, &reverse);
>> + }
>> +
>> + if (thd->lex->describe)
>> + {
>> + bool err= table_describe(table, select, usable_index, limit, need_sort);
>> + delete select;
>> + free_underlaid_joins(thd, select_lex);
>> + DBUG_RETURN(err);
>> + }
>> +
>> if (options & OPTION_QUICK)
>> (void) table->file->extra(HA_EXTRA_QUICK);
>
> GB84 moving the "quick" down this way is apparently ok. Only MyISAM's
> "index page deleting" code makes use of this information, and that code
> is called only later in this function.
Do you prefer to move it up?
>
>
>> - if (order)
>> + if (need_sort)
>> {
>> uint length= 0;
>> SORT_FIELD *sortorder;
>> ha_rows examined_rows;
>> ha_rows found_rows;
>> - table->update_const_key_parts(conds);
>> - order= simple_remove_const(order, conds);
>> -
>> - bool need_sort;
>> - usable_index= get_index_for_order(order, table, select, limit,
>> - &need_sort, &reverse);
>> - if (need_sort)
>> {
>> DBUG_ASSERT(usable_index == MAX_KEY);
>> table->sort.io_cache= (IO_CACHE *) my_malloc(sizeof(IO_CACHE),
>> @@ -361,6 +400,7 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>> (void) table->file->extra(HA_EXTRA_NORMAL);
>>
>> cleanup:
>> + DBUG_ASSERT(!thd->lex->describe);
>> /*
>> Invalidate the table in the query cache if something changed. This must
>> be before binlog writing and ha_autocommit_...
>>
>> === modified file 'sql/sql_insert.cc'
>> --- a/sql/sql_insert.cc 2011-02-15 17:14:15 +0000
>> +++ b/sql/sql_insert.cc 2011-04-01 13:53:57 +0000
>> @@ -76,6 +76,7 @@
>> #include "transaction.h"
>> #include "sql_audit.h"
>> #include "debug_sync.h"
>> +#include "opt_explain.h"
>>
>> #ifndef EMBEDDED_LIBRARY
>> static bool delayed_get_table(THD *thd, MDL_request *grl_protection_request,
>> @@ -770,6 +771,19 @@ bool mysql_insert(THD *thd,TABLE_LIST *t
>> /* Restore the current context. */
>> ctx_state.restore_state(context, table_list);
>>
>> + if (thd->lex->describe)
>> + {
>> + /*
>> + Obviously INSERT without the SELECT is not suitable for EXPLAIN, since we
>> + don't plan how read tables.
>
> GB85 this sentence is strange English, even for a French :-)
Oops.
>
>> + So we simply send "No tables used" and stop execution here.
>
> GB86 it might be interesting to tell the user whether the INSERT
> will be using bulk insert or not (this can influence speed). But this is
> an engine's decision (in its implementation of start_bulk_insert()), and
> we usually do not and cannot report in EXPLAIN what the engine decided...
Currently I don't see a way how to show it.
>
>> + */
>> +
>> + bool err= msg_describe("No tables used");
>
> GB87 const bool
Ok.
>
>> + free_underlaid_joins(thd, &thd->lex->select_lex);
>> + DBUG_RETURN(err);
>
> GB88 We should try to not add a new exit path. free_underlaid_joins() is
> already present in two places in this function, let's not add a third
> here, it's too much copy-pasting.
> It also has a bug: if executing EXPLAIN INSERT DELAYED, we'll be opening
> the table in open_and_lock_for_insert_delayed(), and then we must
> release it in end_delayed_insert(); we forget to do it in the current
> patch, and so if we
> - start mysqld
> - type
> create table t1 (a int) engine=myisam;
> explain insert delayed into t1 values(1);
> - send SIGTERM to mysqld
> then mysqld hangs, because one thread is in
> #1 0x00000000006e3b53 in inline_mysql_mutex_lock (that=0x1406460,
> src_file=0xc671c0 "sql_insert.cc", src_line=2714) at
> include/mysql/psi/mysql_thread.h:615
> #2 0x00000000006ea124 in handle_delayed_insert (arg=0x23608e0) at
> sql_insert.cc:2714
> because it's waiting for some counter (di->tables_in_use) to be 0, which
> doesn't happen as EXPLAIN INSERT DELAYED didn't call end_delayed_insert().
> So I would prefer if instead of DBUG_RETURN(err) we would go to the
> existing label "abort", which has all we need... except that it returns
> TRUE. Maybe we can change it a bit:
> - initialize existing variable "error" to 'true'
> - rename "abort:" to "exit_without_my_ok:"
> - replace all "goto abort" with "goto exit_without_my_ok" and make sure
> that error==true at those places
> - in code after "exit_without_my_ok:", return "error" instead of TRUE
> - in EXPLAIN INSERT, set 'error' to the return value of msg_describe()
> and go to "exit_without_my_ok"
> ?
Fixed.
>
>> + }
>> +
>> /*
>> Fill in the given fields and dump it to the table file
>> */
>
>> === modified file 'sql/sql_update.cc'
>> --- a/sql/sql_update.cc 2011-02-25 16:41:57 +0000
>> +++ b/sql/sql_update.cc 2011-04-01 13:53:57 +0000
>> @@ -1,4 +1,4 @@
>> -/* Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
>> +/* Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
>>
>> This program is free software; you can redistribute it and/or modify
>> it under the terms of the GNU General Public License as published by
>> @@ -38,6 +38,7 @@
>> #include "records.h" // init_read_record,
>> // end_read_record
>> #include "filesort.h" // filesort
>> +#include "opt_explain.h"
>> #include "sql_derived.h" // mysql_derived_prepare,
>> // mysql_handle_derived,
>> // mysql_derived_filling
>> @@ -392,7 +393,14 @@ int mysql_update(THD *thd,
>>
>> #ifdef WITH_PARTITION_STORAGE_ENGINE
>> if (prune_partitions(thd, table, conds))
>> - {
>> + { // No matching records
>> + if (thd->lex->describe)
>> + {
>> + bool err= msg_describe("No matching records");
>> + free_underlaid_joins(thd, select_lex);
>> + DBUG_RETURN(err);
>> + }
>
> GB89 I'd prefer to avoid adding this call to free_underlaid_joins().
> http://dev.mysql.com/doc/refman/5.5/en/explain-output.html
> does not mention "No matching records". How about "impossible WHERE"
> instead?
Fixed.
>
>> +
>> free_underlaid_joins(thd, select_lex);
>> my_ok(thd); // No matching records
>> DBUG_RETURN(0);
>> @@ -419,6 +427,13 @@ int mysql_update(THD *thd,
>> {
>> DBUG_RETURN(1); // Error in where
>> }
>> + if (thd->lex->describe)
>> + {
>> + bool err= msg_describe("Impossible WHERE");
>
> GB90 const bool
Ok.
>
>> + delete select;
>> + free_underlaid_joins(thd, select_lex);
>
> GB91 "delete" and free_underlaid_joins() have been called in the same code branch
> just a few lines before (line 417), I suspect we mustn't repeat that (double "delete"
> could cause a memory corruption).
Fixed.
>
>> + DBUG_RETURN(err);
>> + }
>> my_ok(thd); // No matching records
>> DBUG_RETURN(0);
>> }
>> @@ -450,6 +465,15 @@ int mysql_update(THD *thd,
>> used_key_is_modified= is_key_used(table, used_index, table->write_set);
>> }
>>
>> + if (thd->lex->describe)
>> + {
>> + bool err= table_describe(table, select, used_index, limit, +
> order && (need_sort||used_key_is_modified));
>
> GB92 const bool
Ok.
>
> GB93 'order && (need_sort||used_key_is_modified)' is an expression copied
> from code a bit further down, could we not copy-paste but rather store that in some const
> and use it in the two places?
Fixed. (Actually I've tried that before, but in this case we have to declare non-cost
variable at the top of the function -- "thanks" to goto. I don't know, what is uglier:
copy&paste of short expression or top-level declaration of locally-used variable).
>
> GB94 there are 3 situations:
> a) filesort
> b) scan table, collect row ids, update table (it's the "else" branch
> after filesort)
> c) neither a or b.
> In table_describe() above we show "a". Could we also show "b"? Or do we already?
Currently we don't show this case. What "extra" message you suggest to use at this place?
>
>> + delete select;
>> + free_underlaid_joins(thd, select_lex);
>
> GB95 would be great to not add "delete + free_underlaird_joins" again.
> I.e. having a common exit path.
Fixed.
>
>> + DBUG_RETURN(err);
>> + }
>> +
>> #ifdef WITH_PARTITION_STORAGE_ENGINE
>> if (used_key_is_modified || order ||
>> partition_key_modified(table, table->write_set))
>> @@ -1338,24 +1362,29 @@ bool mysql_multi_update(THD *thd,
>> (MODE_STRICT_TRANS_TABLES |
>> MODE_STRICT_ALL_TABLES));
>>
>> - List<Item> total_list;
>> + if (thd->lex->describe)
>> + res= explain_data_modification(*result);
>> + else
>> + {
>> + List<Item> total_list;
>>
>> - res= mysql_select(thd, &select_lex->ref_pointer_array,
>> - table_list, select_lex->with_wild,
>> - total_list,
>> - conds, 0, (ORDER *) NULL, (ORDER *)NULL, (Item *) NULL,
>> - (ORDER *)NULL,
>> - options | SELECT_NO_JOIN_CACHE | SELECT_NO_UNLOCK |
>> - OPTION_SETUP_TABLES_DONE,
>> - *result, unit, select_lex);
>> -
>> - DBUG_PRINT("info",("res: %d report_error: %d", res, (int)
> thd->is_error()));
>> - res|= thd->is_error();
>> - if (unlikely(res))
>> - {
>> - /* If we had a another error reported earlier then this will be ignored */
>> - (*result)->send_error(ER_UNKNOWN_ERROR, ER(ER_UNKNOWN_ERROR));
>> - (*result)->abort_result_set();
>> + res= mysql_select(thd, &select_lex->ref_pointer_array,
>> + table_list, select_lex->with_wild,
>> + total_list,
>> + conds, 0, (ORDER *) NULL, (ORDER *)NULL, (Item *) NULL,
>> + (ORDER *)NULL,
>> + options | SELECT_NO_JOIN_CACHE | SELECT_NO_UNLOCK |
>> + OPTION_SETUP_TABLES_DONE,
>> + *result, unit, select_lex);
>> +
>> + DBUG_PRINT("info",("res: %d report_error: %d",res, (int)
> thd->is_error()));
>> + res|= thd->is_error();
>> + if (unlikely(res))
>> + {
>> + /* If we had a another error reported earlier then this will be ignored
> */
>> + (*result)->send_error(ER_UNKNOWN_ERROR, ER(ER_UNKNOWN_ERROR));
>> + (*result)->abort_result_set();
>> + }
>> }
>> thd->abort_on_warning= 0;
>> DBUG_RETURN(res);
>>
>> === modified file 'sql/sql_yacc.yy'
>> --- a/sql/sql_yacc.yy 2011-03-09 20:54:55 +0000
>> +++ b/sql/sql_yacc.yy 2011-04-01 13:53:57 +0000
>> @@ -11376,6 +11376,34 @@ describe:
>> LEX *lex=Lex;
>> lex->select_lex.options|= SELECT_DESCRIBE;
>> }
>> + | describe_command opt_extended_describe
>> + { Lex->describe|= DESCRIBE_NORMAL; }
>> + insert + {
>> + LEX *lex=Lex;
>> + lex->select_lex.options|= SELECT_DESCRIBE;
>> + }
>> + | describe_command opt_extended_describe
>> + { Lex->describe|= DESCRIBE_NORMAL; }
>> + replace + {
>> + LEX *lex=Lex;
>> + lex->select_lex.options|= SELECT_DESCRIBE;
>> + }
>> + | describe_command opt_extended_describe
>> + { Lex->describe|= DESCRIBE_NORMAL; }
>> + update + {
>> + LEX *lex=Lex;
>> + lex->select_lex.options|= SELECT_DESCRIBE;
>> + }
>> + | describe_command opt_extended_describe
>> + { Lex->describe|= DESCRIBE_NORMAL; }
>> + delete + {
>> + LEX *lex=Lex;
>> + lex->select_lex.options|= SELECT_DESCRIBE;
>> + }
>> ;
>
> GB96 I'm not a parser expert, but isn't it possible to group all those
> this way:
> | describe_command opt_extended_describe
> { Lex->describe|= DESCRIBE_NORMAL; }
> explanable_command
> {
> LEX *lex=Lex;
> lex->select_lex.options|= SELECT_DESCRIBE;
> }
> explanable_command:
> select | insert | replace | update | delete;
> ?
Yep, it should be fine.
>
>>
>> describe_command:
>
> GB97 as discussed, it would be good to have EXPLAIN EXTENDED print the
> transformed query as it does for SELECT.
Not done yet. Will be a separate commit a bit later.
>
> GB98 testing "EXPLAIN PARTITIONS DELETE/UPDATE/...", without and with
> EXTENDED, would be good.
>
> GB99 it would be good to have coverage for all new places where we now
> send EXPLAIN output; for example the ones where prune_partitions()
> shortcuts DELETE/UPDATE; and others.
Added some tests. Still working on "100%" coverage.
>
> GB100 in the WL, there is "The solution is a new explain_send class that
> initializes tables like INSERT/REPLACE and multi-table UPDATE/DELETE
> do". I suggest to change "INSERT/REPLACE" to "INSERT/REPLACE...SELECT".
>
> GB101 what about unit testing? I'm not sure it is easily doable, given the dependency
> on THD, Field, range optimizer, JOIN, JOIN_TAB...?
I'm thinking about that, but I don't see a good (and profitable) way to cover this code
with unit tests.
>
> GB102 I have to think a bit more about explain_send wrapping the interceptor, there
> may be design changes to suggest there. Also about the relationship between mysql_select()
> and explain_data_modification(). Nothing serious though, and not yet clear in my mind.
> Will keep you posted.
>
Ok...
Thank you,
Gleb.