List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:May 27 2011 1:41pm
Subject:Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897
View as plain text  
Hello,

I deleted all "ok" points where I have nothing to add.

Gleb Shchepa a écrit, Le 25.05.2011 16:21:
> On 05/10/2011 12:31 AM, Guilhem Bichot wrote:

>>>  3358 Gleb Shchepa    2011-04-01
>>>       WL#4897: Add EXPLAIN INSERT/UPDATE/DELETE
>>>             WL#4897 implements EXPLAIN command for INSERT, REPLACE and
>>>       single- and multi-table UPDATE and DELETE queries.

>>> === added file 'mysql-test/r/explain_non_select.result'
>>> +#
>>> +# query:  DELETE   FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>>> +# select: SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1
>>> +#
>>> +EXPLAIN DELETE   FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>> +id    select_type    table    type    possible_keys    key    
>>> key_len    ref    rows    Extra
>>> +1    SIMPLE    t1    ALL    PRIMARY    PRIMARY    4    NULL    4    
>>> Using where
>>> +EXPLAIN EXTENDED DELETE   FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>> +id    select_type    table    type    possible_keys    key    
>>> key_len    ref    rows    filtered    Extra
>>> +1    SIMPLE    t1    ALL    PRIMARY    PRIMARY    4    NULL    4    
>>> 100.00    Using where
>>> +EXPLAIN EXTENDED SELECT * FROM t1 WHERE (@a:= a) ORDER BY a LIMIT 1;
>>> +id    select_type    table    type    possible_keys    key    
>>> key_len    ref    rows    filtered    Extra
>>> +1    SIMPLE    t1    index    NULL    PRIMARY    4    NULL    1    
>>> 400.00    Using where; Using index
>>
>> GB10 400.0 is surprising.
>> The manual says "The filtered column indicates an estimated percentage 
>> of table rows that will be filtered by the table condition", we could 
>> imagine it should be <=100. I never saw values greater than 100 until 
>> now.
>> This isn't due to your patch, but I suggest filing a bug report. There's
>> another case below (search for 352040 below).
> 
> Will report. I can't find an existent bug report for this, but I 
> remember a similar one (probably just a déjà vu).

as you said, it's BUG#34124.

>>> +CREATE TABLE t1 (i INT, j INT);
>>> +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
>>> +#
>>> +# query:  DELETE FROM t1
>>> +# select: SELECT * FROM t1
>>> +#
>>> +EXPLAIN DELETE FROM t1;
>>> +id    select_type    table    type    possible_keys    key    
>>> key_len    ref    rows    Extra
>>> +1    SIMPLE    NULL    NULL    NULL    NULL    NULL    NULL    5    
>>> Using delete_all_rows
>>
>> GB12 it would be good to test with an innodb table too, to show that
>> EXPLAIN properly reports that innodb doesn't use "delete_all_rows". See
>> also my comment about sql_delete.cc.
> 
> Done. Also I'm thinking about one large include file and two small test 
> files, one for MyISAM and one for InnoDB.

good idea

>>> +#
>>> +# query:  DELETE   FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>>> +# select: SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5
>>> +#
>>> +EXPLAIN DELETE   FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>> +id    select_type    table    type    possible_keys    key    
>>> key_len    ref    rows    Extra
>>> +1    SIMPLE    t2    ALL    a    NULL    NULL    NULL    26    Using 
>>> where; Using filesort
>>> +EXPLAIN EXTENDED DELETE   FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>> +id    select_type    table    type    possible_keys    key    
>>> key_len    ref    rows    filtered    Extra
>>> +1    SIMPLE    t2    ALL    a    NULL    NULL    NULL    26    
>>> 100.00    Using where; Using filesort
>>> +Warnings:
>>> +Warning    1713    Cannot use range access on index 'a' due to type 
>>> or collation conversion on field 'b'
>>> +EXPLAIN EXTENDED SELECT * FROM t2 WHERE b = 10 ORDER BY a, c LIMIT 5;
>>> +id    select_type    table    type    possible_keys    key    
>>> key_len    ref    rows    filtered    Extra
>>> +1    SIMPLE    t2    ALL    NULL    NULL    NULL    NULL    26    
>>> 100.00    Using where; Using filesort
>>> +Warnings:
>>> +Note    1003    select `test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS 
>>> `b`,`test`.`t2`.`c` AS `c`,`test`.`t2`.`d` AS `d` from `test`.`t2` 
>>> where (`test`.`t2`.`b` = 10) order by `test`.`t2`.`a`,`test`.`t2`.`c` 
>>> limit 5
>>
>> GB14 here DELETE has warning 1713 but SELECT doesn't. It's 
>> understandable, as DELETE considered "a" (in "possible_keys"), then 
>> realized it was not usable; whereas SELECT didn't consider "a" so 
>> didn't even come to seeing the type/collation problem. However I don't 
>> know why DELETE considered index "a" and SELECT didn't...
> 
> I don't have a detailed answer for you ATM. Will answer later. 
> Single-table DELETE implementation differs from SELECT implementation 
> significantly.
> I think that SELECT filters this index out before the "quick select" probe.

ok. don't bother researching this, this is probably not related to your code

>>> === added file 'sql/opt_explain.cc'
>>> --- a/sql/opt_explain.cc    1970-01-01 00:00:00 +0000
>>> +++ b/sql/opt_explain.cc    2011-04-01 13:53:57 +0000

>>> +/**
>>> +  A base for all Explain_* classes
>>> +
>>> +  This class hierarchy is a successor of the old select_describe() 
>>> function
>>> +  implementation. It extends old select_describe() functionality to 
>>> deal with
>>> +  single-table data-modifying commands (UPDATE and DELETE).

>> GB16 in the final push, I guess you will delete references to old code
>> (like "old select_describe()"). Those mentions have helped me greatly to
>> understand the transition between trunk and your tree. But once your
>> tree is trunk, they will not be needed anymore...?
> 
> Ok... (Actually it was added not for review only but for easier code 
> browsing in the future.)

Ah, indeed, so it was useful. Then you can just leave the text above 
("This class hierarchy is a successor of the old select_describe() 
function of 5.5"), and delete other mentions of old code; that should be 
sufficient for the code readers of the future.

>>> +*/
>>> +
>>> +class Explain
>>> +{
>>> +private:
>>> +  List<Item> items; ///< item list to feed
> select_result::send_data()
>>> +  Item_null *nil; ///< pre-allocated NULL item to fill empty columns 
>>> in EXPLAIN
>>> +
>>> +protected:
>>> +  /*
>>> +    Next "col_*" fields are intended for the filling by 
>>> "explain_*()" methods.
>>> +    Then the make_list() method links these Items into "items" list.
>>> +
>>> +    NOTE: NULL value means that Item_null object will be pushed into 
>>> "items"
>>> +          list instead.
>>> +  */
>>> +  Item_uint   *col_id; ///< "id" column: seq. number of SELECT 
>>> withing the query
>>
>> GB17 protected member variables (not member functions) are discouraged 
>> in the two C++ books I have (Stroustrup, Meyers). Stroustrup: 
>> "declaring data members protected is usually a design error".
>> But I don't have yet a good alternative to suggest. Maybe I will find 
>> one.
> 
> If you see a more elegant way, you are welcome.

I don't see a better way. I think we are in a case where things are 
unlikely to go wrong. I.e. I agree with the guy here:
http://www.parashift.com/c++-faq-lite/basics-of-inheritance.html#faq-19.8

>>
>>> +  Item_string *col_select_type; ///< "select_type" column
>>> +  Item_string *col_table_name; ///< "table" to which the row of 

>>> +      return TRUE;
>>> +    if (thd->send_explain_fields(result) || prepare(result))
>>
>> GB41 it's not related to your patch, but I wonder why
>> send_explain_fields() is a member of THD and not a free function; it
>> uses only little of THD... Sending fields for a particular command is
>> not the job of THD, at least there isn't a similar member function for
>> the other SQL commands (or I missed it).
> 
> I wonder too.

It's not mandatory to address this in this WL.

>>> +    {
>>> +      delete result;
>>> +      return TRUE;
>>> +    }
>>> +  }
>>> +  else
>>> +  {
>>> +    result= external_result;
>>> +    external_result->reset_offset_limit_cnt();
>>
>> GB42 I commented out the call above and no test failed. Do you have an 
>> idea how to get coverage for it?
> 
> AFAIU we need a query with multi-line EXPLAIN output and a LIMIT N 
> clause, where N is shorter than the number of EXPLAIN output rows.

do you think you can find a testcase? or is it too difficult? I wonder 
if it could be code which used to serve but has become useless at some 
unknown point of the past. As this reset_offset_limit_cnt() function is 
a bit of a hack, it would be good if we could remove it...

>>> +  if (quick_type == QUICK_SELECT_I::QS_TYPE_ROR_INTERSECT &&
>>> +      
>>>
> !((QUICK_ROR_INTERSECT_SELECT*)tab->select->quick)->need_to_fetch_row)
>>> +    key_read=1;
>>
>> GB54 not your code, but can change "key_read" to bool, store 
>> true/false in it, and declare only closer to where it's used (which is 
>> an if() further down).
> 
> Done.

"key_read" can be declared further down, closer to where it's used.

>>> +bool Explain_table::explain_extra()
>>> +{
>>
>> GB58 this one also looks like copy-paste of selected bits of 
>> Explain_join::extra(), is it avoidable?
> 
> Ok... But in this case some old regression tests will fail on reordered 
> "extra" output.

I think it's an acceptable behaviour change. Maybe worth telling the 
docs team, so that they put a note in the changelog?

>>> +  This function forms the 1st row of the QEP output with a simple 
>>> text message.
>>> +  This is useful to explain such trivial cases as "No tables used" etc.
>>> +  +  NOTE: Also this function explains the rest of QEP (subqueries 
>>> or joined
>>> +        tables if any).
>>
>> GB65 NOTE -> @note (same remark for all NOTE in the patch).
> 
> Ok.

remains just one NOTE->@note change to do in the doxygen comment of 
Explain_msg

>>> +  explain_send explain(result);
>>> +  bool res= thd->send_explain_fields(&explain) ||
>>> +            mysql_explain_union(thd, &thd->lex->unit,
> &explain) ||
>>> +            thd->is_error();
>>> +  if (res)
>>> +    explain.abort_result_set();
>>> +  else
>>> +    explain.send_eof();
>>> +  return res;
>>
>> GB75 this code above resembles what is done in execute_sqlcom_select() 
>> (if lex->describe is true), which is:
>>
>>       thd->send_explain_fields(result);
>>       res= mysql_explain_union(thd, &thd->lex->unit, result);
>>       ...
>>       if (res)
>>         result->abort_result_set();
>>       else
>>         result->send_eof();
>> Can we make execute_sqlcom_select() call explain_data_modification(), to
>> avoid duplication?
>> The creation of the warning (of EXPLAIN EXTENDED) can also move into 
>> explain_data_modification().
> 
> Will do with the next commit (need to finish thd->lex->unit.print() 
> implementation).

ok

>>> === added file 'sql/opt_explain.h'
>>> --- a/sql/opt_explain.h    1970-01-01 00:00:00 +0000
>>> +++ b/sql/opt_explain.h    2011-04-01 13:53:57 +0000
>>> +bool msg_describe(JOIN *join, const char *message);
>>> +bool msg_describe(const char *message, ha_rows rows= HA_POS_ERROR);
>>> +bool table_describe(TABLE *table, SQL_SELECT *select, uint key, 
>>> ha_rows limit,
>>> +                    bool need_sort);
>>> +bool select_describe(JOIN *join, bool need_tmp_table, bool need_order,
>>> +                     bool distinct);
>>> +bool explain_data_modification(select_result_interceptor *result);
>>
>> GB78 some "const" can be added over the code; patch is attached.

Did you see the patch which was attached? It proposed to add a few "const".

>>> === modified file 'sql/sql_delete.cc'
>>> --- a/sql/sql_delete.cc    2011-03-17 17:39:31 +0000
>>> +++ b/sql/sql_delete.cc    2011-04-01 13:53:57 +0000

>>> @@ -219,23 +247,34 @@ bool mysql_delete(THD *thd, TABLE_LIST *
>>>        DBUG_RETURN(TRUE);
>>>      }
>>>    }
>>> +
>>> +  if (order)
>>> +  {
>>> +    table->update_const_key_parts(conds);
>>> +    order= simple_remove_const(order, conds);
>>> +
>>> +    usable_index= get_index_for_order(order, table, select, limit,
>>> + &need_sort, &reverse);
>>> +  }
>>> +
>>> +  if (thd->lex->describe)
>>> +  {
>>> +    bool err= table_describe(table, select, usable_index, limit, 
>>> need_sort);
>>> +    delete select;
>>> +    free_underlaid_joins(thd, select_lex);
>>> +    DBUG_RETURN(err);
>>> +  }
>>> +
>>>    if (options & OPTION_QUICK)
>>>      (void) table->file->extra(HA_EXTRA_QUICK);
>>
>> GB84 moving the "quick" down this way is apparently ok. Only MyISAM's
>> "index page deleting" code makes use of this information, and that code
>> is called only later in this function.
> 
> Do you prefer to move it up?

no, I was just explaining (to myself, and speaking loud) why your change 
was correct.

>>> === modified file 'sql/sql_insert.cc'
>>> --- a/sql/sql_insert.cc    2011-02-15 17:14:15 +0000
>>> +++ b/sql/sql_insert.cc    2011-04-01 13:53:57 +0000
>>> +  if (thd->lex->describe)
>>> +  {
>>> +    /*
>>> +      Obviously INSERT without the SELECT is not suitable for 
>>> EXPLAIN, since we
>>> +      don't plan how read tables.
>>
>> GB85 this sentence is strange English, even for a French :-)
> 
> Oops.
> 
>>
>>> +      So we simply send "No tables used" and stop execution here.
>>
>> GB86 it might be interesting to tell the user whether the INSERT
>> will be using bulk insert or not (this can influence speed). But this is
>> an engine's decision (in its implementation of start_bulk_insert()), and
>> we usually do not and cannot report in EXPLAIN what the engine decided...
> 
> Currently I don't see a way how to show it.

me neither. It's like for "DELETE FROM t;" : I woud like to tell the 
user whether we used delete_all_rows() or row-by-row delete, but this 
can be known only if we try the delete, which isn't possible for EXPLAIN 
DELETE...

>>> === modified file 'sql/sql_update.cc'
>>> --- a/sql/sql_update.cc    2011-02-25 16:41:57 +0000
>>> +++ b/sql/sql_update.cc    2011-04-01 13:53:57 +0000

>> GB93 'order && (need_sort||used_key_is_modified)' is an expression 
>> copied from code a bit further down, could we not copy-paste but 
>> rather store that in some const and use it in the two places?
> 
> Fixed. (Actually I've tried that before, but in this case we have to 
> declare non-cost variable at the top of the function -- "thanks" to 
> goto. I don't know, what is uglier: copy&paste of short expression or 
> top-level declaration of locally-used variable).

indeed... sigh...

>> GB94 there are 3 situations:
>> a) filesort
>> b) scan table, collect row ids, update table (it's the "else" branch
>> after filesort)
>> c) neither a or b.
>> In table_describe() above we show "a". Could we also show "b"? Or do 
>> we already?
> 
> Currently we don't show this case. What "extra" message you suggest to 
> use at this place?

I see that in (b) we go into open_cached_file(). Maybe "Using temporary"?

>> GB97 as discussed, it would be good to have EXPLAIN EXTENDED print the
>> transformed query as it does for SELECT.
> 
> Not done yet. Will be a separate commit a bit later.

ok

>> GB98 testing "EXPLAIN PARTITIONS DELETE/UPDATE/...", without and with
>> EXTENDED, would be good.
>>
>> GB99 it would be good to have coverage for all new places where we now
>> send EXPLAIN output; for example the ones where prune_partitions()
>> shortcuts DELETE/UPDATE; and others.
> 
> Added some tests. Still working on "100%" coverage.

nice goal.

>> GB100 in the WL, there is "The solution is a new explain_send class that
>> initializes tables like INSERT/REPLACE and multi-table UPDATE/DELETE
>> do". I suggest to change "INSERT/REPLACE" to "INSERT/REPLACE...SELECT".

how about this?

>> GB101 what about unit testing? I'm not sure it is easily doable, given 
>> the dependency on THD, Field, range optimizer, JOIN, JOIN_TAB...?
> 
> I'm thinking about that, but I don't see a good (and profitable) way to 
> cover this code with unit tests.

Me neither. You could have a unit test which creates some Explain_* 
objects and checks that they return the correct data, but is that useful?

>> GB102 I have to think a bit more about explain_send wrapping the 
>> interceptor, there may be design changes to suggest there. Also about 
>> the relationship between mysql_select() and 
>> explain_data_modification(). Nothing serious though, and not yet clear 
>> in my mind. Will keep you posted.

I found nothing to post :-)

Something to do later: working with docs team to document the EXPLAIN 
output of "EXPLAIN non-SELECT".
Thread
bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa1 Apr
  • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot28 Apr
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa25 May
      • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot27 May
  • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot9 May
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa25 May
      • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot27 May
        • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa2 Jun
          • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Guilhem Bichot9 Jun
            • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa9 Jun
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Roy Lyseng27 May
      • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa2 Jun
  • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Roy Lyseng27 May
    • Re: bzr commit into mysql-trunk branch (gleb.shchepa:3358) WL#4897Gleb Shchepa2 Jun