MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:Øystein Grøvlen Date:October 4 2009 8:37am
Subject:Re: bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384
View as plain text  
Evgeny,

I will leave up the Serg (2nd reviewer) to decide whether he thinks
the performance improvements are of value.  As I said, I am a bit
sceptic.

See inline for comments to the actual patch.  My comments are
just suggestions.  Feel free to disregard, but please give me some
arguments when you disagree (so that I can learn from it).

--
Øystein

Evgeny Potemkin wrote:
 > #At file:///work/bzrroot/34384-bug-next/ based on 
revid:epotemkin@stripped
 >
 >  2864 Evgeny Potemkin	2009-09-19
 >       Bug#34384: Slow down on constant conversion.
 >
 >       When values of different types are compared they're converted 
to a type that
 >       allows correct comparison. This conversion is done for each 
comparison and
 >       takes some time. When a constant is being compared it's 
possible to cache the
 >       value after conversion to speedup comparison.
 >
 >       A test case isn't provided because all changes are internal and 
isn't visible
 >       outside.
 >
 >       A flag named uncacheable is added to the Item class. It denies 
caching of the
 >       item.
 >       A flag named late_caching is added to the Item_cache class. 
It's set to TRUE
 >       when we need to postpone caching till the first use of cached 
value.
 >       A function named check_late_caching is added to the Item_cache 
flag. Its
 >       purpose is to check state of the late_caching flag and cache 
value accordingly.
 >       Item_cache_xxx::val_xxx functions are changed to call it prior 
to return
 >       cached value.

and Item_cache_xxx::store functions


 >       The Arg_comparator::set_cmp_func function is changed to cache a 
value being
 >       compared if it's a constant and needs type conversion.
 >       Now the Item_func_like::fix_fields function denies caching 
arguments of the LIKE
 >       function.
 >       The Item_cache::get_cache function is overloaded to allow 
setting of the
 >       late_caching flag and cache type.
 >      @ mysql-test/r/range.result
 >         Adjusted a test case after fix for the bug#34384.
 >      @ sql/item.cc
 >         Bug#34384: Slow down on constant conversion.
 >         A flag named uncacheable is added to the Item class. It 
denies caching of the
 >         item.
 >         The Item_cache::get_cache function is overloaded to allow 
setting of the
 >         late_caching flag and cache type.
 >         Item_cache_xxx::val_xxx functions are changed to call it 
prior to return
 >         cached value.
 >      @ sql/item.h
 >         Bug#34384: Slow down on constant conversion.
 >         A function named check_late_caching is added to the 
Item_cache flag. Its
 >         purpose is to check state of the late_caching flag and cache 
value accordingly.
 >         The Item_cache::get_cache function is overloaded to allow 
setting of the
 >         late_caching flag and cache type.
 >      @ sql/item_cmpfunc.cc
 >         Bug#34384: Slow down on constant conversion.
 >         The Arg_comparator::set_cmp_func function is changed to cache 
a value being
 >         compared if it's a constant and needs type conversion.
 >      @ sql/item_cmpfunc.h
 >         Bug#34384: Slow down on constant conversion.
 >      @ sql/item_xmlfunc.cc
 >         Bug#34384: Slow down on constant conversion.
 >
 >     modified:
 >       mysql-test/r/range.result
 >       sql/item.cc
 >       sql/item.h
 >       sql/item_cmpfunc.cc
 >       sql/item_cmpfunc.h
 >       sql/item_xmlfunc.cc
 > === modified file 'mysql-test/r/range.result'
 > --- a/mysql-test/r/range.result	2008-06-10 22:27:52 +0000
 > +++ b/mysql-test/r/range.result	2009-09-19 17:55:58 +0000
 > @@ -1230,8 +1230,6 @@ c1
 >  Warnings:
 >  Warning	1366	Incorrect decimal value: 'A' for column 'c1' at row 1
 >  Warning	1292	Truncated incorrect DOUBLE value: 'A'
 > -Warning	1292	Truncated incorrect DOUBLE value: 'A'
 > -Warning	1292	Truncated incorrect DOUBLE value: 'A'

I guess the extra warnings disappear because there are only one
conversion now.

 >  DROP TABLE t1;
 >  create table t1 (a int,b int,key (b),key (a),key (b,a));
 >  insert into t1(a,b) values (1,2),(3,4),(5,6),(7,8);
 >
 > === modified file 'sql/item.cc'
 > --- a/sql/item.cc	2009-07-23 12:21:41 +0000
 > +++ b/sql/item.cc	2009-09-19 17:55:58 +0000
 > @@ -376,7 +376,8 @@ int Item::save_str_value_in_field(Field
 >  Item::Item():
 >    is_expensive_cache(-1), rsize(0), name(0), orig_name(0), 
name_length(0),
 >    fixed(0), is_autogenerated_name(TRUE),
 > -  collation(&my_charset_bin, DERIVATION_COERCIBLE)
 > +  collation(&my_charset_bin, DERIVATION_COERCIBLE),
 > +  uncacheable(0)
 >  {
 >    marker= 0;
 >    maybe_null=null_value=with_sum_func=unsigned_flag=0;
 > @@ -7136,17 +7137,22 @@ bool field_is_equal_to_item(Field *field
 >
 >  Item_cache* Item_cache::get_cache(const Item *item)
 >  {
 > -  switch (item->result_type()) {
 > +  return get_cache(item->result_type(), item, FALSE);
 > +}
 > +
 > +Item_cache* Item_cache::get_cache(const Item_result type, const Item 
*item, bool late_caching)
 > +{

You could consider using a default parameter here, but I am not quite sure
whether that is kosher in MySQL.

...

 > === modified file 'sql/item.h'
 > --- a/sql/item.h	2009-06-24 17:00:59 +0000
 > +++ b/sql/item.h	2009-09-19 17:55:58 +0000
 > @@ -532,6 +532,7 @@ public:
 >                                             of its arguments is or 
contains a
 >                                             subselect. Computed by 
fix_fields. */
 >    Item_result cmp_context;              /* Comparison context */
 > +  bool uncacheable;

A (doxygen) comment explaining this field would be good.
I think you should consider making this field private.

 >    // alloc & destruct is done as start of select using sql_alloc
 >    Item();
 >    /*
 > @@ -2995,14 +2996,24 @@ protected:
 >    Field *cached_field;
 >    enum enum_field_types cached_field_type;
 >  public:
 > +  bool late_caching;

I think you should add a comment here explaining the late caching
mechanism.  I think you should also consider making this field private.

 >    Item_cache():
 > -    example(0), used_table_map(0), cached_field(0), 
cached_field_type(MYSQL_TYPE_STRING)
 > +    example(0), used_table_map(0), cached_field(0), 
cached_field_type(MYSQL_TYPE_STRING),
 > +    late_caching(0)
 > +  {
 > +    fixed= 1;
 > +    null_value= 1;
 > +  }
 > +  Item_cache(bool late_caching_arg):
 > +    example(0), used_table_map(0), cached_field(0), 
cached_field_type(MYSQL_TYPE_STRING),
 > +    late_caching(late_caching_arg)
 >    {
 >      fixed= 1;
 >      null_value= 1;
 >    }

A default argument could also be used here.

 >    Item_cache(enum_field_types field_type_arg):
 > -    example(0), used_table_map(0), cached_field(0), 
cached_field_type(field_type_arg)
 > +    example(0), used_table_map(0), cached_field(0), 
cached_field_type(field_type_arg),
 > +    late_caching(0)
 >    {
 >      fixed= 1;
 >      null_value= 1;
 > @@ -3026,6 +3037,8 @@ public:
 >    enum Type type() const { return CACHE_ITEM; }
 >    enum_field_types field_type() const { return cached_field_type; }
 >    static Item_cache* get_cache(const Item *item);
 > +  static Item_cache* get_cache(const Item_result type, const Item* item,
 > +                                    bool late_caching);
 >    table_map used_tables() const { return used_table_map; }
 >    virtual void keep_array() {}
 >    virtual void print(String *str, enum_query_type query_type);
 > @@ -3037,6 +3050,18 @@ public:
 >    {
 >      return this == item;
 >    }
 > +  inline bool check_late_caching(Item *item)
 > +  {
 > +    if (!late_caching)
 > +      return FALSE;
 > +    if (item)
 > +      example= item;
 > +    else {
 > +      late_caching= FALSE;
 > +      store(example);
 > +    }
 > +    return TRUE;
 > +  }
 >  };

It seems like this method is used for two different purposes, both for
accessing cached values and for storing them.  Generally, I do not
think functions with quite different behavior based on the values of
parameters is a good idea.  Splitting this into two, would also give
opportunities for more descriptive names.  (BTW, placement of curly
braces for else-block is not according to coding standard.)

Maybe something like this could be done:

inline bool init_late_caching(Item *item)
{
   if (!late_caching)
     return FALSE;
   example= item;
   return TRUE;
}

inline void cache_value()
{
   late_caching= FALSE;
   store(example);
}

Item_cache_xxx::store(...)
{
   if (init_late_caching(item))
     return ...;
   ...
}

Item_cache_xxx::val_xxx(...)
{
   if (late_caching)
     cache_value();
   ...
}


 >
 >
 > @@ -3046,13 +3071,24 @@ protected:
 >    longlong value;
 >  public:
 >    Item_cache_int(): Item_cache(), value(0) {}
 > +  Item_cache_int(bool late_caching_arg): 
Item_cache(late_caching_arg), value(0) {}
 >    Item_cache_int(enum_field_types field_type_arg):
 >      Item_cache(field_type_arg), value(0) {}
 >
 >    void store(Item *item);
 >    void store(Item *item, longlong val_arg);
 > -  double val_real() { DBUG_ASSERT(fixed == 1); return (double) value; }
 > -  longlong val_int() { DBUG_ASSERT(fixed == 1); return value; }
 > +  double val_real()
 > +  {
 > +    DBUG_ASSERT(fixed == 1);
 > +    check_late_caching(NULL);
 > +    return (double) value;
 > +  }
 > +  longlong val_int()
 > +  {
 > +    DBUG_ASSERT(fixed == 1);
 > +    check_late_caching(NULL);
 > +    return value;
 > +  }


Maybe the above functions should be moved to the .cc-file when they
are not one-liners any more.  This has the added value that all
val_xxx implementations can be co-located.

 >    String* val_str(String *str);
 >    my_decimal *val_decimal(my_decimal *);
 >    enum Item_result result_type() const { return INT_RESULT; }
 > @@ -3065,9 +3101,15 @@ class Item_cache_real: public Item_cache
 >    double value;
 >  public:
 >    Item_cache_real(): Item_cache(), value(0) {}
 > +  Item_cache_real(bool late_caching_arg): 
Item_cache(late_caching_arg), value(0) {}
 >
 >    void store(Item *item);
 > -  double val_real() { DBUG_ASSERT(fixed == 1); return value; }
 > +  double val_real()
 > +  {
 > +    DBUG_ASSERT(fixed == 1);
 > +    check_late_caching(NULL);
 > +    return value;
 > +  }
 >    longlong val_int();
 >    String* val_str(String *str);
 >    my_decimal *val_decimal(my_decimal *);
 > @@ -3081,6 +3123,7 @@ protected:
 >    my_decimal decimal_value;
 >  public:
 >    Item_cache_decimal(): Item_cache() {}
 > +  Item_cache_decimal(bool late_caching_arg): 
Item_cache(late_caching_arg) {}
 >
 >    void store(Item *item);
 >    double val_real();
 > @@ -3105,10 +3148,22 @@ public:
 >                     MYSQL_TYPE_VARCHAR &&
 >                   !((const Item_field *) item)->field->has_charset())
 >    {}
 > +  Item_cache_str(const Item *item, bool late_caching_arg) :
 > +    Item_cache(late_caching_arg), value(0),
 > +    is_varbinary(item->type() == FIELD_ITEM &&
 > +                 ((const Item_field *) item)->field->type() ==
 > +                   MYSQL_TYPE_VARCHAR &&
 > +                 !((const Item_field *) item)->field->has_charset())
 > +  {}

You are duplicating some complex logic here, and duplication is the
root of all evil ;-) Making late_caching_arg have a default value is
one of several ways to avoid this.

 >    void store(Item *item);
 >    double val_real();
 >    longlong val_int();
 > -  String* val_str(String *) { DBUG_ASSERT(fixed == 1); return value; }
 > +  String* val_str(String *)
 > +  {
 > +    DBUG_ASSERT(fixed == 1);
 > +    check_late_caching(NULL);
 > +    return value;
 > +  }
 >    my_decimal *val_decimal(my_decimal *);
 >    enum Item_result result_type() const { return STRING_RESULT; }
 >    CHARSET_INFO *charset() const { return value->charset(); };
 > @@ -3123,6 +3178,9 @@ class Item_cache_row: public Item_cache
 >  public:
 >    Item_cache_row()
 >      :Item_cache(), values(0), item_count(2), save_array(0) {}
 > +  Item_cache_row(bool late_caching_arg)
 > +    :Item_cache(late_caching_arg), values(0), item_count(2),
 > +    save_array(0) {}
 >
 >    /*
 >      'allocate' used only in row transformer, to preallocate space 
for row
 >
 > === modified file 'sql/item_cmpfunc.cc'
 > --- a/sql/item_cmpfunc.cc	2009-07-03 10:39:01 +0000
 > +++ b/sql/item_cmpfunc.cc	2009-09-19 17:55:58 +0000
 > @@ -524,6 +524,13 @@ void Item_bool_func2::fix_length_and_dec
 >    set_cmp_func();
 >  }
 >
 > +void Item_bool_func2::set_cmp_func()
 > +{
 > +  /* Don't need cache for context analysis. */
 > +  if (!current_thd->is_context_analysis_only())
 > +    tmp_arg[0]->uncacheable= tmp_arg[1]->uncacheable= TRUE;

The comment and the name of the method called seems contradictory.
 From looking at the if-statement, it seems like the comment should be:
"If not doing only context_analysis, disable caching", but maybe
is_context_analysis_only is just another bad name ...

 > +  cmp.set_cmp_func(this, tmp_arg, tmp_arg+1);
 > +}
 >
 >  int Arg_comparator::set_compare_func(Item_bool_func2 *item, 
Item_result type)
 >  {
 > @@ -847,6 +854,7 @@ int Arg_comparator::set_cmp_func(Item_bo
 >                                          Item_result type)
 >  {
 >    ulonglong const_value= (ulonglong)-1;
 > +  Item_cache *cache;

Note that variables with same name is created locally in statements
blocks within this method.  While quite legal, it may create some
confusion.  Maybe yours should be local to?

 >    a= a1;
 >    b= a2;
 >
 > @@ -906,6 +914,28 @@ int Arg_comparator::set_cmp_func(Item_bo
 >        return 1;
 >    }
 >
 > +  {
 > +    /*
 > +      When one of items being compared is a constant and its type
 > +      differs from comparison type then cache such item to avoid
 > +      type conversion of this constant on each evaluation.
 > +    */
 > +    if ((*a)->const_item() && !(*a)->uncacheable &&
 > +        type != (*a)->result_type())
 > +    {
 > +      cache= Item_cache::get_cache(type, *a, TRUE);
 > +      cache->store(*a);
 > +      a_cache= cache;
 > +      a= (Item **)&a_cache;
 > +    } else if ((*b)->const_item() && !(*b)->uncacheable &&
 > +               type != (*b)->result_type())
 > +    {
 > +      cache= Item_cache::get_cache(type, *b, TRUE);
 > +      cache->store(*b);
 > +      b_cache= cache;
 > +      b= (Item **)&b_cache;
 > +    }
 > +  }

Quite some code duplication here, too.  You could make a helper
function to avoid that.  For example, something like this:

void cache_converted_constant(Item **value, Item **cached_value,
                                Item_result type)
{
   if (!value->const_item() || value->uncacheable ||
       type == value->result_type())
     return;

   Item_cache *cache= Item_cache::get_cache(type, value, TRUE);
   cache->store(value);
   *cached_value= cache;
   *cache= *cached_value;
}

And used like this:

   cache_converted_constant(a, &a_cache, type);
   cache_converted_constant(b, &b_cache, type);

 >    return set_compare_func(owner_arg, type);
 >  }
 >
 > @@ -4465,6 +4495,11 @@ Item_func::optimize_type Item_func_like:
 >  bool Item_func_like::fix_fields(THD *thd, Item **ref)
 >  {
 >    DBUG_ASSERT(fixed == 0);
 > +
 > +  /* LIKE doesn't compare arguments directly, thus no reason to 
cache them. */
 > +  for (int i= 0; i < arg_count; i++)
 > +    args[i]->uncacheable= TRUE;
 > +
 >    if (Item_bool_func2::fix_fields(thd, ref) ||
 >        escape_item->fix_fields(thd, &escape_item))
 >      return TRUE;
 >
 > === modified file 'sql/item_cmpfunc.h'
 > --- a/sql/item_cmpfunc.h	2009-06-22 09:36:50 +0000
 > +++ b/sql/item_cmpfunc.h	2009-09-19 17:55:58 +0000
 > @@ -335,10 +335,7 @@ public:
 >    Item_bool_func2(Item *a,Item *b)
 >      :Item_int_func(a,b), cmp(tmp_arg, tmp_arg+1), 
abort_on_null(FALSE) {}
 >    void fix_length_and_dec();
 > -  void set_cmp_func()
 > -  {
 > -    cmp.set_cmp_func(this, tmp_arg, tmp_arg+1);
 > -  }
 > +  void set_cmp_func();
 >    optimize_type select_optimize() const { return OPTIMIZE_OP; }
 >    virtual enum Functype rev_functype() const { return UNKNOWN_FUNC; }
 >    bool have_rev_func() const { return rev_functype() != UNKNOWN_FUNC; }
 >
 > === modified file 'sql/item_xmlfunc.cc'
 > --- a/sql/item_xmlfunc.cc	2009-07-28 14:16:37 +0000
 > +++ b/sql/item_xmlfunc.cc	2009-09-19 17:55:58 +0000
 > @@ -952,6 +952,8 @@ static Item *create_comparator(MY_XPATH
 >      */
 >
 >      Item *fake= new Item_string("", 0, xpath->cs);
 > +    /* Don't cache fake because its value will be changed during 
comparison.*/
 > +    fake->uncacheable= TRUE;
 >      Item_nodeset_func *nodeset;
 >      Item *scalar, *comp;
 >      if (a->type() == Item::XPATH_NODESET)
 >
 >
 >
 > ------------------------------------------------------------------------
 >
 >

Thread
bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384Evgeny Potemkin19 Sep
  • Re: bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384Øystein Grøvlen25 Sep
    • Re: bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384Evgeny Potemkin25 Sep
      • Re: bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384Øystein Grøvlen28 Sep
        • Re: bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384Sergei Golubchik28 Sep
        • Re: bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384Evgeny Potemkin29 Sep
  • Re: bzr commit into mysql-5.4 branch (epotemkin:2864) Bug#34384Øystein Grøvlen4 Oct