Hello,
I deleted all "ok" points where I have nothing to add.
Gleb Shchepa a écrit, Le 25.05.2011 16:21:
> On 05/10/2011 12:31 AM, Guilhem Bichot wrote:
>>> 3358 Gleb Shchepa 2011-04-01
>>> WL#4897: Add EXPLAIN INSERT/UPDATE/DELETE
>>> WL#4897 implements EXPLAIN command for INSERT, REPLACE and
>>> single- and multi-table UPDATE and DELETE queries.
>>> === added file 'mysql-test/r/explain_non_select.result'
>>> +#
>>> +# query: DELETE FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>>> +# select: SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>>> +#
>>> +EXPLAIN DELETE FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>> +id select_type table type possible_keys key
>>> key_len ref rows Extra
>>> +1 SIMPLE t1 ALL PRIMARY PRIMARY 4 NULL 4
>>> Using where
>>> +EXPLAIN EXTENDED DELETE FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>> +id select_type table type possible_keys key
>>> key_len ref rows filtered Extra
>>> +1 SIMPLE t1 ALL PRIMARY PRIMARY 4 NULL 4
>>> 100.00 Using where
>>> +EXPLAIN EXTENDED SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>> +id select_type table type possible_keys key
>>> key_len ref rows filtered Extra
>>> +1 SIMPLE t1 index NULL PRIMARY 4 NULL 1
>>> 400.00 Using where; Using index
>>
>> GB10 400.0 is surprising.
>> The manual says "The filtered column indicates an estimated percentage
>> of table rows that will be filtered by the table condition", we could
>> imagine it should be <=100. I never saw values greater than 100 until
>> now.
>> This isn't due to your patch, but I suggest filing a bug report. There's
>> another case below (search for 352040 below).
>
> Will report. I can't find an existent bug report for this, but I
> remember a similar one (probably just a déjà vu).
as you said, it's BUG#34124.
>>> +CREATE TABLE t1 (i INT, j INT);
>>> +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
>>> +#
>>> +# query: DELETE FROM t1
>>> +# select: SELECT * FROM t1
>>> +#
>>> +EXPLAIN DELETE FROM t1;
>>> +id select_type table type possible_keys key
>>> key_len ref rows Extra
>>> +1 SIMPLE NULL NULL NULL NULL NULL NULL 5
>>> Using delete_all_rows
>>
>> GB12 it would be good to test with an innodb table too, to show that
>> EXPLAIN properly reports that innodb doesn't use "delete_all_rows". See
>> also my comment about sql_delete.cc.
>
> Done. Also I'm thinking about one large include file and two small test
> files, one for MyISAM and one for InnoDB.
good idea
>>> +#
>>> +# query: DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>>> +# select: SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>>> +#
>>> +EXPLAIN DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>> +id select_type table type possible_keys key
>>> key_len ref rows Extra
>>> +1 SIMPLE t2 ALL a NULL NULL NULL 26 Using
>>> where; Using filesort
>>> +EXPLAIN EXTENDED DELETE FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>> +id select_type table type possible_keys key
>>> key_len ref rows filtered Extra
>>> +1 SIMPLE t2 ALL a NULL NULL NULL 26
>>> 100.00 Using where; Using filesort
>>> +Warnings:
>>> +Warning 1713 Cannot use range access on index 'a' due to type
>>> or collation conversion on field 'b'
>>> +EXPLAIN EXTENDED SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>> +id select_type table type possible_keys key
>>> key_len ref rows filtered Extra
>>> +1 SIMPLE t2 ALL NULL NULL NULL NULL 26
>>> 100.00 Using where; Using filesort
>>> +Warnings:
>>> +Note 1003 select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS
>>> `b`,`test`.`t2`.`c` AS `c`,`test`.`t2`.`d` AS `d` from `test`.`t2`
>>> where (`test`.`t2`.`b` = 10) order by `test`.`t2`.`a`,`test`.`t2`.`c`
>>> limit 5
>>
>> GB14 here DELETE has warning 1713 but SELECT doesn't. It's
>> understandable, as DELETE considered "a" (in "possible_keys"), then
>> realized it was not usable; whereas SELECT didn't consider "a" so
>> didn't even come to seeing the type/collation problem. However I don't
>> know why DELETE considered index "a" and SELECT didn't...
>
> I don't have a detailed answer for you ATM. Will answer later.
> Single-table DELETE implementation differs from SELECT implementation
> significantly.
> I think that SELECT filters this index out before the "quick select" probe.
ok. don't bother researching this, this is probably not related to your code
>>> === added file 'sql/opt_explain.cc'
>>> --- a/sql/opt_explain.cc 1970-01-01 00:00:00 +0000
>>> +++ b/sql/opt_explain.cc 2011-04-01 13:53:57 +0000
>>> +/**
>>> + A base for all Explain_* classes
>>> +
>>> + This class hierarchy is a successor of the old select_describe()
>>> function
>>> + implementation. It extends old select_describe() functionality to
>>> deal with
>>> + single-table data-modifying commands (UPDATE and DELETE).
>> GB16 in the final push, I guess you will delete references to old code
>> (like "old select_describe()"). Those mentions have helped me greatly to
>> understand the transition between trunk and your tree. But once your
>> tree is trunk, they will not be needed anymore...?
>
> Ok... (Actually it was added not for review only but for easier code
> browsing in the future.)
Ah, indeed, so it was useful. Then you can just leave the text above
("This class hierarchy is a successor of the old select_describe()
function of 5.5"), and delete other mentions of old code; that should be
sufficient for the code readers of the future.
>>> +*/
>>> +
>>> +class Explain
>>> +{
>>> +private:
>>> + List<Item> items; ///< item list to feed
> select_result::send_data()
>>> + Item_null *nil; ///< pre-allocated NULL item to fill empty columns
>>> in EXPLAIN
>>> +
>>> +protected:
>>> + /*
>>> + Next "col_*" fields are intended for the filling by
>>> "explain_*()" methods.
>>> + Then the make_list() method links these Items into "items" list.
>>> +
>>> + NOTE: NULL value means that Item_null object will be pushed into
>>> "items"
>>> + list instead.
>>> + */
>>> + Item_uint *col_id; ///< "id" column: seq. number of SELECT
>>> withing the query
>>
>> GB17 protected member variables (not member functions) are discouraged
>> in the two C++ books I have (Stroustrup, Meyers). Stroustrup:
>> "declaring data members protected is usually a design error".
>> But I don't have yet a good alternative to suggest. Maybe I will find
>> one.
>
> If you see a more elegant way, you are welcome.
I don't see a better way. I think we are in a case where things are
unlikely to go wrong. I.e. I agree with the guy here:
http://www.parashift.com/c++-faq-lite/basics-of-inheritance.html#faq-19.8
>>
>>> + Item_string *col_select_type; ///< "select_type" column
>>> + Item_string *col_table_name; ///< "table" to which the row of
>>> + return TRUE;
>>> + if (thd->send_explain_fields(result) || prepare(result))
>>
>> GB41 it's not related to your patch, but I wonder why
>> send_explain_fields() is a member of THD and not a free function; it
>> uses only little of THD... Sending fields for a particular command is
>> not the job of THD, at least there isn't a similar member function for
>> the other SQL commands (or I missed it).
>
> I wonder too.
It's not mandatory to address this in this WL.
>>> + {
>>> + delete result;
>>> + return TRUE;
>>> + }
>>> + }
>>> + else
>>> + {
>>> + result= external_result;
>>> + external_result->reset_offset_limit_cnt();
>>
>> GB42 I commented out the call above and no test failed. Do you have an
>> idea how to get coverage for it?
>
> AFAIU we need a query with multi-line EXPLAIN output and a LIMIT N
> clause, where N is shorter than the number of EXPLAIN output rows.
do you think you can find a testcase? or is it too difficult? I wonder
if it could be code which used to serve but has become useless at some
unknown point of the past. As this reset_offset_limit_cnt() function is
a bit of a hack, it would be good if we could remove it...
>>> + if (quick_type == QUICK_SELECT_I::QS_TYPE_ROR_INTERSECT &&
>>> +
>>>
> !((QUICK_ROR_INTERSECT_SELECT*)tab->select->quick)->need_to_fetch_row)
>>> + key_read=1;
>>
>> GB54 not your code, but can change "key_read" to bool, store
>> true/false in it, and declare only closer to where it's used (which is
>> an if() further down).
>
> Done.
"key_read" can be declared further down, closer to where it's used.
>>> +bool Explain_table::explain_extra()
>>> +{
>>
>> GB58 this one also looks like copy-paste of selected bits of
>> Explain_join::extra(), is it avoidable?
>
> Ok... But in this case some old regression tests will fail on reordered
> "extra" output.
I think it's an acceptable behaviour change. Maybe worth telling the
docs team, so that they put a note in the changelog?
>>> + This function forms the 1st row of the QEP output with a simple
>>> text message.
>>> + This is useful to explain such trivial cases as "No tables used" etc.
>>> + + NOTE: Also this function explains the rest of QEP (subqueries
>>> or joined
>>> + tables if any).
>>
>> GB65 NOTE -> @note (same remark for all NOTE in the patch).
>
> Ok.
remains just one NOTE->@note change to do in the doxygen comment of
Explain_msg
>>> + explain_send explain(result);
>>> + bool res= thd->send_explain_fields(&explain) ||
>>> + mysql_explain_union(thd, &thd->lex->unit,
> &explain) ||
>>> + thd->is_error();
>>> + if (res)
>>> + explain.abort_result_set();
>>> + else
>>> + explain.send_eof();
>>> + return res;
>>
>> GB75 this code above resembles what is done in execute_sqlcom_select()
>> (if lex->describe is true), which is:
>>
>> thd->send_explain_fields(result);
>> res= mysql_explain_union(thd, &thd->lex->unit, result);
>> ...
>> if (res)
>> result->abort_result_set();
>> else
>> result->send_eof();
>> Can we make execute_sqlcom_select() call explain_data_modification(), to
>> avoid duplication?
>> The creation of the warning (of EXPLAIN EXTENDED) can also move into
>> explain_data_modification().
>
> Will do with the next commit (need to finish thd->lex->unit.print()
> implementation).
ok
>>> === added file 'sql/opt_explain.h'
>>> --- a/sql/opt_explain.h 1970-01-01 00:00:00 +0000
>>> +++ b/sql/opt_explain.h 2011-04-01 13:53:57 +0000
>>> +bool msg_describe(JOIN *join, const char *message);
>>> +bool msg_describe(const char *message, ha_rows rows= HA_POS_ERROR);
>>> +bool table_describe(TABLE *table, SQL_SELECT *select, uint key,
>>> ha_rows limit,
>>> + bool need_sort);
>>> +bool select_describe(JOIN *join, bool need_tmp_table, bool need_order,
>>> + bool distinct);
>>> +bool explain_data_modification(select_result_interceptor *result);
>>
>> GB78 some "const" can be added over the code; patch is attached.
Did you see the patch which was attached? It proposed to add a few "const".
>>> === modified file 'sql/sql_delete.cc'
>>> --- a/sql/sql_delete.cc 2011-03-17 17:39:31 +0000
>>> +++ b/sql/sql_delete.cc 2011-04-01 13:53:57 +0000
>>> @@ -219,23 +247,34 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>>> DBUG_RETURN(TRUE);
>>> }
>>> }
>>> +
>>> + if (order)
>>> + {
>>> + table->update_const_key_parts(conds);
>>> + order= simple_remove_const(order, conds);
>>> +
>>> + usable_index= get_index_for_order(order, table, select, limit,
>>> + &need_sort, &reverse);
>>> + }
>>> +
>>> + if (thd->lex->describe)
>>> + {
>>> + bool err= table_describe(table, select, usable_index, limit,
>>> need_sort);
>>> + delete select;
>>> + free_underlaid_joins(thd, select_lex);
>>> + DBUG_RETURN(err);
>>> + }
>>> +
>>> if (options & OPTION_QUICK)
>>> (void) table->file->extra(HA_EXTRA_QUICK);
>>
>> GB84 moving the "quick" down this way is apparently ok. Only MyISAM's
>> "index page deleting" code makes use of this information, and that code
>> is called only later in this function.
>
> Do you prefer to move it up?
no, I was just explaining (to myself, and speaking loud) why your change
was correct.
>>> === modified file 'sql/sql_insert.cc'
>>> --- a/sql/sql_insert.cc 2011-02-15 17:14:15 +0000
>>> +++ b/sql/sql_insert.cc 2011-04-01 13:53:57 +0000
>>> + if (thd->lex->describe)
>>> + {
>>> + /*
>>> + Obviously INSERT without the SELECT is not suitable for
>>> EXPLAIN, since we
>>> + don't plan how read tables.
>>
>> GB85 this sentence is strange English, even for a French :-)
>
> Oops.
>
>>
>>> + So we simply send "No tables used" and stop execution here.
>>
>> GB86 it might be interesting to tell the user whether the INSERT
>> will be using bulk insert or not (this can influence speed). But this is
>> an engine's decision (in its implementation of start_bulk_insert()), and
>> we usually do not and cannot report in EXPLAIN what the engine decided...
>
> Currently I don't see a way how to show it.
me neither. It's like for "DELETE FROM t;" : I woud like to tell the
user whether we used delete_all_rows() or row-by-row delete, but this
can be known only if we try the delete, which isn't possible for EXPLAIN
DELETE...
>>> === modified file 'sql/sql_update.cc'
>>> --- a/sql/sql_update.cc 2011-02-25 16:41:57 +0000
>>> +++ b/sql/sql_update.cc 2011-04-01 13:53:57 +0000
>> GB93 'order && (need_sort||used_key_is_modified)' is an expression
>> copied from code a bit further down, could we not copy-paste but
>> rather store that in some const and use it in the two places?
>
> Fixed. (Actually I've tried that before, but in this case we have to
> declare non-cost variable at the top of the function -- "thanks" to
> goto. I don't know, what is uglier: copy&paste of short expression or
> top-level declaration of locally-used variable).
indeed... sigh...
>> GB94 there are 3 situations:
>> a) filesort
>> b) scan table, collect row ids, update table (it's the "else" branch
>> after filesort)
>> c) neither a or b.
>> In table_describe() above we show "a". Could we also show "b"? Or do
>> we already?
>
> Currently we don't show this case. What "extra" message you suggest to
> use at this place?
I see that in (b) we go into open_cached_file(). Maybe "Using temporary"?
>> GB97 as discussed, it would be good to have EXPLAIN EXTENDED print the
>> transformed query as it does for SELECT.
>
> Not done yet. Will be a separate commit a bit later.
ok
>> GB98 testing "EXPLAIN PARTITIONS DELETE/UPDATE/...", without and with
>> EXTENDED, would be good.
>>
>> GB99 it would be good to have coverage for all new places where we now
>> send EXPLAIN output; for example the ones where prune_partitions()
>> shortcuts DELETE/UPDATE; and others.
>
> Added some tests. Still working on "100%" coverage.
nice goal.
>> GB100 in the WL, there is "The solution is a new explain_send class that
>> initializes tables like INSERT/REPLACE and multi-table UPDATE/DELETE
>> do". I suggest to change "INSERT/REPLACE" to "INSERT/REPLACE...SELECT".
how about this?
>> GB101 what about unit testing? I'm not sure it is easily doable, given
>> the dependency on THD, Field, range optimizer, JOIN, JOIN_TAB...?
>
> I'm thinking about that, but I don't see a good (and profitable) way to
> cover this code with unit tests.
Me neither. You could have a unit test which creates some Explain_*
objects and checks that they return the correct data, but is that useful?
>> GB102 I have to think a bit more about explain_send wrapping the
>> interceptor, there may be design changes to suggest there. Also about
>> the relationship between mysql_select() and
>> explain_data_modification(). Nothing serious though, and not yet clear
>> in my mind. Will keep you posted.
I found nothing to post :-)
Something to do later: working with docs team to document the EXPLAIN
output of "EXPLAIN non-SELECT".