List:Commits« Previous MessageNext Message »
From:Gleb Shchepa Date:June 2 2011 1:57pm
Subject:Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897
View as plain text  
Hello Guilhem,

On 05/27/2011 05:41 PM, Guilhem Bichot wrote:
> Hello,
>
> I deleted all "ok" points where I have nothing to add.
>
> Gleb Shchepa a écrit, Le 25.05.2011 16:21:
>> On 05/10/2011 12:31 AM, Guilhem Bichot wrote:
>
>>>>  3358 Gleb Shchepa    2011-04-01
>>>>       WL#4897: Add EXPLAIN INSERT/UPDATE/DELETE
>>>>             WL#4897 implements EXPLAIN command for INSERT, REPLACE and
>>>>       single- and multi-table UPDATE and DELETE queries.
>
>>>> === added file 'mysql-test/r/explain_non_select.result'
>>>> +#
>>>> +# query:  DELETE   FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>>>> +# select: SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>>>> +#
>>>> +EXPLAIN DELETE   FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>>> +id    select_type    table    type    possible_keys    key    key_len   
> ref    rows    Extra
>>>> +1    SIMPLE    t1    ALL    PRIMARY    PRIMARY    4    NULL    4   
> Using where
>>>> +EXPLAIN EXTENDED DELETE   FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>>> +id    select_type    table    type    possible_keys    key    key_len   
> ref    rows    filtered    Extra
>>>> +1    SIMPLE    t1    ALL    PRIMARY    PRIMARY    4    NULL    4   
> 100.00    Using where
>>>> +EXPLAIN EXTENDED SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>>> +id    select_type    table    type    possible_keys    key    key_len   
> ref    rows    filtered    Extra
>>>> +1    SIMPLE    t1    index    NULL    PRIMARY    4    NULL    1   
> 400.00    Using where; Using index
>>>
>>> GB10 400.0 is surprising.
>>> The manual says "The filtered column indicates an estimated percentage of
> table rows that will be filtered by the table condition", we could imagine it should be
> <=100. I never saw values greater than 100 until now.
>>> This isn't due to your patch, but I suggest filing a bug report. There's
>>> another case below (search for 352040 below).
>>
>> Will report. I can't find an existent bug report for this, but I remember a
> similar one (probably just a déjà vu).
>
> as you said, it's BUG#34124.

Yep.

>
>>>> +CREATE TABLE t1 (i INT, j INT);
>>>> +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
>>>> +#
>>>> +# query:  DELETE FROM t1
>>>> +# select: SELECT * FROM t1
>>>> +#
>>>> +EXPLAIN DELETE FROM t1;
>>>> +id    select_type    table    type    possible_keys    key    key_len   
> ref    rows    Extra
>>>> +1    SIMPLE    NULL    NULL    NULL    NULL    NULL    NULL    5   
> Using delete_all_rows
>>>
>>> GB12 it would be good to test with an innodb table too, to show that
>>> EXPLAIN properly reports that innodb doesn't use "delete_all_rows". See
>>> also my comment about sql_delete.cc.
>>
>> Done. Also I'm thinking about one large include file and two small test files,
> one for MyISAM and one for InnoDB.
>
> good idea

Done.

>
>>>> +#
>>>> +# query:  DELETE   FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>>>> +# select: SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>>>> +#
>>>> +EXPLAIN DELETE   FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>>> +id    select_type    table    type    possible_keys    key    key_len   
> ref    rows    Extra
>>>> +1    SIMPLE    t2    ALL    a    NULL    NULL    NULL    26    Using
> where; Using filesort
>>>> +EXPLAIN EXTENDED DELETE   FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>>> +id    select_type    table    type    possible_keys    key    key_len   
> ref    rows    filtered    Extra
>>>> +1    SIMPLE    t2    ALL    a    NULL    NULL    NULL    26    100.00   
> Using where; Using filesort
>>>> +Warnings:
>>>> +Warning    1713    Cannot use range access on index 'a' due to type or
> collation conversion on field 'b'
>>>> +EXPLAIN EXTENDED SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>>> +id    select_type    table    type    possible_keys    key    key_len   
> ref    rows    filtered    Extra
>>>> +1    SIMPLE    t2    ALL    NULL    NULL    NULL    NULL    26    100.00
>    Using where; Using filesort
>>>> +Warnings:
>>>> +Note    1003    select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS
> `b`,`test`.`t2`.`c` AS `c`,`test`.`t2`.`d` AS `d` from `test`.`t2` where (`test`.`t2`.`b`
> = 10) order by `test`.`t2`.`a`,`test`.`t2`.`c` limit 5
>>>
>>> GB14 here DELETE has warning 1713 but SELECT doesn't. It's understandable, as
> DELETE considered "a" (in "possible_keys"), then realized it was not usable; whereas
> SELECT didn't consider "a" so didn't even come to seeing the type/collation problem.
> However I don't know why DELETE considered index "a" and SELECT didn't...
>>
>> I don't have a detailed answer for you ATM. Will answer later. Single-table
> DELETE implementation differs from SELECT implementation significantly.
>> I think that SELECT filters this index out before the "quick select" probe.
>
> ok. don't bother researching this, this is probably not related to your code

It seems like so.

>
>>>> === added file 'sql/opt_explain.cc'
>>>> --- a/sql/opt_explain.cc    1970-01-01 00:00:00 +0000
>>>> +++ b/sql/opt_explain.cc    2011-04-01 13:53:57 +0000
>
>>>> +/**
>>>> +  A base for all Explain_* classes
>>>> +
>>>> +  This class hierarchy is a successor of the old select_describe()
> function
>>>> +  implementation. It extends old select_describe() functionality to deal
> with
>>>> +  single-table data-modifying commands (UPDATE and DELETE).
>
>>> GB16 in the final push, I guess you will delete references to old code
>>> (like "old select_describe()"). Those mentions have helped me greatly to
>>> understand the transition between trunk and your tree. But once your
>>> tree is trunk, they will not be needed anymore...?
>>
>> Ok... (Actually it was added not for review only but for easier code browsing in
> the future.)
>
> Ah, indeed, so it was useful. Then you can just leave the text above ("This class
> hierarchy is a successor of the old select_describe() function of 5.5"), and delete other
> mentions of old code; that should be sufficient for the code readers of the future.

Ok.

>
>>>> +*/
>>>> +
>>>> +class Explain
>>>> +{
>>>> +private:
>>>> +  List<Item> items; ///< item list to feed
> select_result::send_data()
>>>> +  Item_null *nil; ///< pre-allocated NULL item to fill empty columns
> in EXPLAIN
>>>> +
>>>> +protected:
>>>> +  /*
>>>> +    Next "col_*" fields are intended for the filling by "explain_*()"
> methods.
>>>> +    Then the make_list() method links these Items into "items" list.
>>>> +
>>>> +    NOTE: NULL value means that Item_null object will be pushed into
> "items"
>>>> +          list instead.
>>>> +  */
>>>> +  Item_uint   *col_id; ///< "id" column: seq. number of SELECT
> withing the query
>>>
>>> GB17 protected member variables (not member functions) are discouraged in the
> two C++ books I have (Stroustrup, Meyers). Stroustrup: "declaring data members protected
> is usually a design error".
>>> But I don't have yet a good alternative to suggest. Maybe I will find one.
>>
>> If you see a more elegant way, you are welcome.
>
> I don't see a better way. I think we are in a case where things are unlikely to go
> wrong. I.e. I agree with the guy here:
> http://www.parashift.com/c++-faq-lite/basics-of-inheritance.html#faq-19.8

Ok.

>
>>>
>>>> +  Item_string *col_select_type; ///< "select_type" column
>>>> +  Item_string *col_table_name; ///< "table" to which the row of 
>
>>>> +      return TRUE;
>>>> +    if (thd->send_explain_fields(result) || prepare(result))
>>>
>>> GB41 it's not related to your patch, but I wonder why
>>> send_explain_fields() is a member of THD and not a free function; it
>>> uses only little of THD... Sending fields for a particular command is
>>> not the job of THD, at least there isn't a similar member function for
>>> the other SQL commands (or I missed it).
>>
>> I wonder too.
>
> It's not mandatory to address this in this WL.

Ok.

>
>>>> +    {
>>>> +      delete result;
>>>> +      return TRUE;
>>>> +    }
>>>> +  }
>>>> +  else
>>>> +  {
>>>> +    result= external_result;
>>>> +    external_result->reset_offset_limit_cnt();
>>>
>>> GB42 I commented out the call above and no test failed. Do you have an idea
> how to get coverage for it?
>>
>> AFAIU we need a query with multi-line EXPLAIN output and a LIMIT N clause, where
> N is shorter than the number of EXPLAIN output rows.
>
> do you think you can find a testcase? or is it too difficult? I wonder if it could be
> code which used to serve but has become useless at some unknown point of the past. As this
> reset_offset_limit_cnt() function is a bit of a hack, it would be good if we could remove
> it...

Sorry, this function resets _offset_ value (not limit).
I've added a test to explain.test, since INSERT/UPDATE/... don't support offset values.
Also I've updated the commentary before the reset_offset_limit_cnt().

>
>>>> +  if (quick_type == QUICK_SELECT_I::QS_TYPE_ROR_INTERSECT &&
>>>> +     
> !((QUICK_ROR_INTERSECT_SELECT*)tab->select->quick)->need_to_fetch_row)
>>>> +    key_read=1;
>>>
>>> GB54 not your code, but can change "key_read" to bool, store true/false in
> it, and declare only closer to where it's used (which is an if() further down).
>>
>> Done.
>
> "key_read" can be declared further down, closer to where it's used.

Moved.

>
>>>> +bool Explain_table::explain_extra()
>>>> +{
>>>
>>> GB58 this one also looks like copy-paste of selected bits of
> Explain_join::extra(), is it avoidable?
>>
>> Ok... But in this case some old regression tests will fail on reordered "extra"
> output.
>
> I think it's an acceptable behaviour change. Maybe worth telling the docs team, so
> that they put a note in the changelog?

Agree.

>
>>>> +  This function forms the 1st row of the QEP output with a simple text
> message.
>>>> +  This is useful to explain such trivial cases as "No tables used" etc.
>>>> +  +  NOTE: Also this function explains the rest of QEP (subqueries or
> joined
>>>> +        tables if any).
>>>
>>> GB65 NOTE -> @note (same remark for all NOTE in the patch).
>>
>> Ok.
>
> remains just one NOTE->@note change to do in the doxygen comment of Explain_msg

Fixed.

>
>>>> +  explain_send explain(result);
>>>> +  bool res= thd->send_explain_fields(&explain) ||
>>>> +            mysql_explain_union(thd, &thd->lex->unit,
> &explain) ||
>>>> +            thd->is_error();
>>>> +  if (res)
>>>> +    explain.abort_result_set();
>>>> +  else
>>>> +    explain.send_eof();
>>>> +  return res;
>>>
>>> GB75 this code above resembles what is done in execute_sqlcom_select() (if
> lex->describe is true), which is:
>>>
>>>       thd->send_explain_fields(result);
>>>       res= mysql_explain_union(thd, &thd->lex->unit, result);
>>>       ...
>>>       if (res)
>>>         result->abort_result_set();
>>>       else
>>>         result->send_eof();
>>> Can we make execute_sqlcom_select() call explain_data_modification(), to
>>> avoid duplication?
>>> The creation of the warning (of EXPLAIN EXTENDED) can also move into
> explain_data_modification().
>>
>> Will do with the next commit (need to finish thd->lex->unit.print()
> implementation).
>
> ok

Sorry, not finished yet (probably will be a next WL).

>
>>>> === added file 'sql/opt_explain.h'
>>>> --- a/sql/opt_explain.h    1970-01-01 00:00:00 +0000
>>>> +++ b/sql/opt_explain.h    2011-04-01 13:53:57 +0000
>>>> +bool msg_describe(JOIN *join, const char *message);
>>>> +bool msg_describe(const char *message, ha_rows rows= HA_POS_ERROR);
>>>> +bool table_describe(TABLE *table, SQL_SELECT *select, uint key, ha_rows
> limit,
>>>> +                    bool need_sort);
>>>> +bool select_describe(JOIN *join, bool need_tmp_table, bool need_order,
>>>> +                     bool distinct);
>>>> +bool explain_data_modification(select_result_interceptor *result);
>>>
>>> GB78 some "const" can be added over the code; patch is attached.
>
> Did you see the patch which was attached? It proposed to add a few "const".

Thanks, have added some additional consts.

>
>>>> === modified file 'sql/sql_delete.cc'
>>>> --- a/sql/sql_delete.cc    2011-03-17 17:39:31 +0000
>>>> +++ b/sql/sql_delete.cc    2011-04-01 13:53:57 +0000
>
>>>> @@ -219,23 +247,34 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>>>>        DBUG_RETURN(TRUE);
>>>>      }
>>>>    }
>>>> +
>>>> +  if (order)
>>>> +  {
>>>> +    table->update_const_key_parts(conds);
>>>> +    order= simple_remove_const(order, conds);
>>>> +
>>>> +    usable_index= get_index_for_order(order, table, select, limit,
>>>> + &need_sort, &reverse);
>>>> +  }
>>>> +
>>>> +  if (thd->lex->describe)
>>>> +  {
>>>> +    bool err= table_describe(table, select, usable_index, limit,
> need_sort);
>>>> +    delete select;
>>>> +    free_underlaid_joins(thd, select_lex);
>>>> +    DBUG_RETURN(err);
>>>> +  }
>>>> +
>>>>    if (options & OPTION_QUICK)
>>>>      (void) table->file->extra(HA_EXTRA_QUICK);
>>>
>>> GB84 moving the "quick" down this way is apparently ok. Only MyISAM's
>>> "index page deleting" code makes use of this information, and that code
>>> is called only later in this function.
>>
>> Do you prefer to move it up?
>
> no, I was just explaining (to myself, and speaking loud) why your change was
> correct.

Ok.

>
>>>> === modified file 'sql/sql_insert.cc'
>>>> --- a/sql/sql_insert.cc    2011-02-15 17:14:15 +0000
>>>> +++ b/sql/sql_insert.cc    2011-04-01 13:53:57 +0000
>>>> +  if (thd->lex->describe)
>>>> +  {
>>>> +    /*
>>>> +      Obviously INSERT without the SELECT is not suitable for EXPLAIN,
> since we
>>>> +      don't plan how read tables.
>>>
>>> GB85 this sentence is strange English, even for a French :-)
>>
>> Oops.
>>
>>>
>>>> +      So we simply send "No tables used" and stop execution here.
>>>
>>> GB86 it might be interesting to tell the user whether the INSERT
>>> will be using bulk insert or not (this can influence speed). But this is
>>> an engine's decision (in its implementation of start_bulk_insert()), and
>>> we usually do not and cannot report in EXPLAIN what the engine decided...
>>
>> Currently I don't see a way how to show it.
>
> me neither. It's like for "DELETE FROM t;" : I woud like to tell the user whether we
> used delete_all_rows() or row-by-row delete, but this can be known only if we try the
> delete, which isn't possible for EXPLAIN DELETE...

Ok.

>
>>>> === modified file 'sql/sql_update.cc'
>>>> --- a/sql/sql_update.cc    2011-02-25 16:41:57 +0000
>>>> +++ b/sql/sql_update.cc    2011-04-01 13:53:57 +0000
>
>>> GB93 'order && (need_sort||used_key_is_modified)' is an expression
> copied from code a bit further down, could we not copy-paste but rather store that in some
> const and use it in the two places?
>>
>> Fixed. (Actually I've tried that before, but in this case we have to declare
> non-cost variable at the top of the function -- "thanks" to goto. I don't know, what is
> uglier: copy&paste of short expression or top-level declaration of locally-used
> variable).
>
> indeed... sigh...
>
>>> GB94 there are 3 situations:
>>> a) filesort
>>> b) scan table, collect row ids, update table (it's the "else" branch
>>> after filesort)
>>> c) neither a or b.
>>> In table_describe() above we show "a". Could we also show "b"? Or do we
> already?
>>
>> Currently we don't show this case. What "extra" message you suggest to use at
> this place?
>
> I see that in (b) we go into open_cached_file(). Maybe "Using temporary"?

Done.
Before the final push I'll remove rearrangements of "Using temporary" and "Using filesort"
in result files (many result files were affected, now these changes are unnecessary).

>
>>> GB97 as discussed, it would be good to have EXPLAIN EXTENDED print the
>>> transformed query as it does for SELECT.
>>
>> Not done yet. Will be a separate commit a bit later.
>
> ok

Probably it will be in some short WL in the close future. Sorry.

>
>>> GB98 testing "EXPLAIN PARTITIONS DELETE/UPDATE/...", without and with
>>> EXTENDED, would be good.
>>>
>>> GB99 it would be good to have coverage for all new places where we now
>>> send EXPLAIN output; for example the ones where prune_partitions()
>>> shortcuts DELETE/UPDATE; and others.
>>
>> Added some tests. Still working on "100%" coverage.
>
> nice goal.

Done (98.5% or so ;-)
BTW I've hacked the old good dgcov.pl tool to work with cmake! This is really great
script.

>
>>> GB100 in the WL, there is "The solution is a new explain_send class that
>>> initializes tables like INSERT/REPLACE and multi-table UPDATE/DELETE
>>> do". I suggest to change "INSERT/REPLACE" to "INSERT/REPLACE...SELECT".
>
> how about this?

Done.

>
>>> GB101 what about unit testing? I'm not sure it is easily doable, given the
> dependency on THD, Field, range optimizer, JOIN, JOIN_TAB...?
>>
>> I'm thinking about that, but I don't see a good (and profitable) way to cover
> this code with unit tests.
>
> Me neither. You could have a unit test which creates some Explain_* objects and
> checks that they return the correct data, but is that useful?

I think this is a task for regression tests, not for unit tests.

>
>>> GB102 I have to think a bit more about explain_send wrapping the interceptor,
> there may be design changes to suggest there. Also about the relationship between
> mysql_select() and explain_data_modification(). Nothing serious though, and not yet clear
> in my mind. Will keep you posted.
>
> I found nothing to post :-)
>
> Something to do later: working with docs team to document the EXPLAIN output of
> "EXPLAIN non-SELECT".

Ok.

Thank you,
Gleb.

Thread
bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa1 Apr
  • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot28 Apr
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa25 May
      • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot27 May
  • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot9 May
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa25 May
      • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot27 May
        • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa2 Jun
          • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot9 Jun
            • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa9 Jun
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Roy Lyseng27 May
      • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa2 Jun
  • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Roy Lyseng27 May
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa2 Jun