MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:June 14 2010 8:08pm
Subject:Re: bzr commit into mysql-next-mr-opt-backporting branch
(tor.didriksen:3197) Bug#52538
View as plain text  
Hello,

No big objections, just more questions and suggestions.

Tor Didriksen a écrit, Le 14.06.2010 17:03:
> On 2010-06-14 15:17, Guilhem Bichot wrote:
>> Hello,
>>
>> Tor Didriksen a écrit, Le 14.06.2010 14:38:
>>> #At file:///export/home/didrik/repo/next-mr-opt-backporting-bug52538/ 
>>> based on revid:jorgen.loland@stripped
>>>
>>>  3197 Tor Didriksen    2010-06-14
>>>       Bug #52538 Valgrind bug: Item_in_subselect::init_left_expr_cache()
>>>             The JOIN_TAB created in make_simple_join() was not 
>>> properly initialized.
>>>       Use new (thd->mem_root) rather than thd->alloc()
>>>      @ mysql-test/r/subselect_mat.result
>>>         Add test case.
>>>      @ mysql-test/t/subselect_mat.test
>>>         Add test case.
>>>      @ sql/item_subselect.cc
>>>         Do not ignore return value of init_left_expr_cache()
>>>         Initialize allocated JOIN_TAB.
>>>         Don't do function calls in DBUG_RETURN().
>>>      @ sql/item_subselect.h
>>>         Introduce new member need_expr_cache.
>>
>> what is this for? How is this connected to the bug? What was wrong 
>> before, which is correct now?
> 
> I changed the signature of init_left_expr_cache()
> -  @retval TRUE  if a memory allocation error occurred or the cache is
> -                not applicable to the current query
> +  @retval TRUE  if a memory allocation error occurred
>    @retval FALSE if success
> 
> so that it is possible to catch memory allocation errors
> (previously, the return value was ignored)

I see the two facts (return value ignored and introduction of 
need_expr_cache) as distinct.
Before, the return value was ignored. That itself could be fixed by just:
- making init_left_expr_cache() return FALSE if it doesn't think that a 
cache is needed
- testing whether init_left_expr_cache() returns TRUE, in which case it 
means OOM, and bailing out.
This doesn't need need_expr_cache.
My impression is that need_expr_cache is only an optimization so that, 
if init_left_expr_cache() decided that there should be no cache (so 
left_expr_cache==0), we won't call this function again and again, once 
per ::exec(), for no use. Is that it? If it is, I would add a line to 
this existing comment:
   /*
     Initialize the cache of the left predicate operand. This has to be 
done as
     late as now, because Cached_item directly contains a resolved field 
(not
     an item, and in some cases (when temp tables are created), these fields
     end up pointing to the wrong field. One solution is to change 
Cached_item
     to not resolve its field upon creation, but to resolve it dynamically
     from a given Item_ref object.
     TODO: the cache should be applied conditionally based on:
     - rules - e.g. only if the left operand is known to be ordered, and/or
     - on a cost-based basis, that takes into account the cost of a cache
       lookup, the cache hit rate, and the savings per cache hit.
   */
   if (need_expr_cache && !left_expr_cache && exec_method == 
MATERIALIZATION &&
       init_left_expr_cache())
     DBUG_RETURN(TRUE);

I would add "don't even try to init the cache if a previous execution 
decided that it's not needed (i.e. need_expr_cache is false)".

>>>      @ sql/sql_bitmap.h
>>>         Initialize Bitmap in constructor.
>>
>> is that needed?
> 
> JOIN_TAB contains a few key_map's, I want them to initialize themselves.
> The alternative is to do const_keys.set_prefix(0); etc. in the body of 
> JOIN_TAB constructor.

Your change is fine.
It touches Bitmap<64>, you could make your change more similar to the 
other Bitmap class in this file (this one:
template <uint default_width> class Bitmap ): it has a constructor which 
only calls init(), and init() does the zero-ing. You could do the same 
for Bitmap<64>: it currently has an empty init(); init() could do the 
zero-ing and the constructor could call init().

It seems that your change allows to eliminate some explicit settings to 
0 in other places of code. For example field.cc:
Field::Field(uchar *ptr_arg,uint32 length_arg,uchar *null_ptr_arg,
	     uchar null_bit_arg,
	     utype unireg_check_arg, const char *field_name_arg)
   :ptr(ptr_arg), null_ptr(null_ptr_arg),
    table(0), orig_table(0), table_name(0),
    field_name(field_name_arg),
    key_start(0), part_of_key(0), part_of_key_not_clustered(0),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    at least those three are Bitmap<64>, and there are more I think (I 
only grep-ped for "part_of_key" which is a key_part which is a typedef 
for Bitmap<64>).
Another one, sql_select.cc:

     field= new Field_varstring(uniq_tuple_length_arg, FALSE, "rowids", 
share,
                                &my_charset_bin);
     if (!field)
       DBUG_RETURN(0);
     field->table= table;
!    field->key_start.init(0);
!    field->part_of_key.init(0);
!    field->part_of_sortkey.init(0);

I suggest eliminating those now-useless settings (shorter code - 
yessss!), if it doesn't scare you. It would be possible to do it in a 
second patch.

JFYI A drawback of the change is that now this code:
   bool check_quick(THD *thd, bool force_quick_range, ha_rows limit)
   {
     key_map tmp;
     tmp.set_all();
is less efficient (first sets to 0 and sets all bits).

>>> === modified file 'mysql-test/r/subselect_mat.result'
>>> --- a/mysql-test/r/subselect_mat.result    2010-06-09 14:16:33 +0000
>>> +++ b/mysql-test/r/subselect_mat.result    2010-06-14 12:37:58 +0000
>>
>>> +SELECT table2.col_varchar_key AS field1,
>>> +table2.col_int_nokey AS field2
>>> +FROM ( t1 AS table1 LEFT OUTER JOIN t1 AS table2
>>> +ON (table2.col_varchar_key = table1.col_varchar_key  ) ) +WHERE 
>>> table1.pk = 6
>>> +HAVING  ( field2 ) IN +( SELECT SUBQUERY2_t2.col_int_nokey AS 
>>> SUBQUERY2_field2 +FROM ( t1 AS SUBQUERY2_t1 JOIN t1 AS SUBQUERY2_t2
>>> +ON (SUBQUERY2_t2.col_varchar_key = SUBQUERY2_t1.col_varchar_key ) ) )
>>> +ORDER BY field2 +;
>>> +field1    field2
>>> +t    1
>>> +t    2
>>
>> is that a correct result?
> 
> I hope so.
> That's what I'm getting in trunk and next-mr as well.

I checked this mentally and it looks correct indeed.

>>> === modified file 'sql/sql_select.h'
>>> --- a/sql/sql_select.h    2010-06-11 12:27:38 +0000
>>> +++ b/sql/sql_select.h    2010-06-14 12:37:58 +0000
>>> @@ -179,9 +179,10 @@ inline bool sj_is_materialize_strategy(u
>>>    return strategy >= SJ_OPT_MATERIALIZE_LOOKUP;
>>>  }
>>>
>>> -typedef struct st_join_table
>>> +typedef struct st_join_table : public Sql_alloc
>>>  {
>>> -  st_join_table() {}                          /* Remove gcc warning */
>>> +  st_join_table();
>>> +
>>>    TABLE        *table;
>>>    KEYUSE    *keyuse;            /**< pointer to first used key */
>>>    SQL_SELECT    *select;
>>> @@ -380,6 +381,91 @@ typedef struct st_join_table
>>>    uint get_sj_strategy() const;
>>>  } JOIN_TAB;
>>>
>>> +
>>> +inline
>>> +st_join_table::st_join_table()
>>> +  : table(NULL),
>>> +    keyuse(NULL),
>>> +    select(NULL),
>>> +    select_cond(NULL),
>>> +    quick(NULL),
>>> +    on_expr_ref(NULL),
>> ...
>>> +
>>> +    keep_current_rowid(0),
>>> +    embedding_map(0)
>>
>> FYI JOIN::save_join_tab() does
>> join_tab_save= (JOIN_TAB*)thd->memdup((uchar*) join_tab,
>>                         sizeof(JOIN_TAB) * tables)))
>> I'm not sure this is a problem though.
> 
> I know. There should have been an assignment operator instead.

Should we do it, while we are at this?

>>
>>> +{
>>> +  /*
>>> +    @todo Add constructor to READ_RECORD.
>>
>> If /** */ were used, doxygen would pick up the @todo.
> 
> done
> 
>>
>>> +    All users do init_read_record(), which does bzero(),
>>> +    rather than invoking a constructor.
>>> +  */
>>> +  bzero(&read_record, sizeof(read_record));
>>
>> so here we do a bzero() and another bzero() of the same bytes in 
>> init_read_record()?
> 
> Actually yes we do, somewhere near the end of make_simple_join()
> That can be removed I think.

Yes, let's not introduce a duplicate bzero now. Either put bzero in the 
JOIN_TAB constructor and remove it elsewhere, or leave it elsewhere.

> I would like to cleanup READ_RECORD as well, but in a separate patch.

Fine.
Thread
bzr commit into mysql-next-mr-opt-backporting branch (tor.didriksen:3197)Bug#52538Tor Didriksen14 Jun
  • Re: bzr commit into mysql-next-mr-opt-backporting branch(tor.didriksen:3197) Bug#52538Guilhem Bichot14 Jun
    • Re: bzr commit into mysql-next-mr-opt-backporting branch(tor.didriksen:3197) Bug#52538Tor Didriksen14 Jun
      • Re: bzr commit into mysql-next-mr-opt-backporting branch(tor.didriksen:3197) Bug#52538Guilhem Bichot14 Jun
        • Re: bzr commit into mysql-next-mr-opt-backporting branch(tor.didriksen:3197) Bug#52538Tor Didriksen15 Jun
          • Re: bzr commit into mysql-next-mr-opt-backporting branch(tor.didriksen:3197) Bug#52538Guilhem Bichot15 Jun