Hello Sergei,
Thank you for review, please see my answers below.
Sergei Golubchik wrote:
> Hi!
>
> On Aug 14, Gleb Shchepa wrote:
>> #At file:///work/bzr/5.1-bugteam-26020/
>>
>> 2688 Gleb Shchepa 2008-08-14
>> Bug#26020: User-Defined Variables are not consistent with
>> columns data types
>>
>> The "SELECT @lastId, @lastId := Id FROM t" query returns
>> different result sets depending on the type of the Id column
>> (INT or BIGINT).
>>
>> === modified file 'mysql-test/r/user_var.result'
>> --- a/mysql-test/r/user_var.result 2007-12-13 11:56:04 +0000
>> +++ b/mysql-test/r/user_var.result 2008-08-13 20:00:41 +0000
>> @@ -121,8 +121,8 @@ select @a:=0;
>> select @a+0, @a:=@a+0+count(*), count(*), @a+0 from t1 group by i;
>> @a+0 @a:=@a+0+count(*) count(*) @a+0
>> 0 1 1 0
>> -1 3 2 0
>> -3 6 3 0
>> +0 2 2 0
>> +0 3 3 0
>
> Why is new result better than the old ?
1. Please take a look at the last column (second @a+0): that column
consists of zeroes. At the same time first column (first @a+0) consists
of 0, 1, 3. So, first and last columns are under different grouping
rules, and that is confusing for me. My patch set up same grouping
rules for all entries of @a at the same select list.
2. There is similar query (see user_var.test, one line above our query):
select @a, @a:=@a+count(*), count(*), @a from t1 group by i;
@a @a:=@a+count(*) count(*) @a
0 1 1 0
0 2 2 0
0 3 3 0
As you can see, the result of this query is same as result of patched
query. IMHO this is not bad to make results of these (same) queries equal.
>
>> === modified file 'sql/item_func.cc'
>> --- a/sql/item_func.cc 2008-07-31 09:50:24 +0000
>> +++ b/sql/item_func.cc 2008-08-13 20:00:41 +0000
>> @@ -4667,9 +4661,36 @@ void Item_func_get_user_var::fix_length_
>> }
>>
>>
>> -bool Item_func_get_user_var::const_item() const
>> +bool Item_func_get_user_var::fix_fields(THD *thd, Item **ref)
>> {
>> - return (!var_entry || current_thd->query_id !=
> var_entry->update_query_id);
>> + DBUG_ENTER("Item_func_get_user_var::fix_fields");
>> + DBUG_ASSERT(fixed == 0);
>> + /* fix_fields will call Item_func_set_user_var::fix_length_and_dec */
>> + if (Item_func::fix_fields(thd, ref))
>> + DBUG_RETURN(TRUE);
>> +
>> + const_item_cache= TRUE;
>> +
>> + if (!var_entry)
>> + DBUG_RETURN(FALSE);
>> +
>> + if (thd->lex->sql_command == SQLCOM_LOAD)
>> + {
>> + const_item_cache= current_thd->query_id !=
> var_entry->update_query_id;
>
> Why do you want to treat LOAD DATA specially ? I'd prefer to handle all
> statements uniformly.
>
> Actually, you can keep var_entry->update_query_id for all statements.
> Consider
>
> SELECT @a, @a:=col, @a FROM ...
>
> In this case in the first @a item you'll do a search in the
> thd->lex->set_var_list, and notice that @a is not constant. You can
> set var_entry->update_query_id right away then, and in the second @a
> item you won't need to search the list, you can know that the @a is not
> constant immediately.
>
> That is, you only need to search the list if
>
> current_thd->query_id == var_entry->update_query_id
>
> Even better would be to have a three-state status field here, "var_entry
> is const", "var_entry is not const", "unknown, need to search the list".
>
> Note that typically, if a variable is used and modified in the same
> query, it's used in a pretty complex expressions, appearing many times
> in a query. With the optimization as above, you will only need to search
> the list once.
>
> A different approach would be to mark variables as "modifiable" before
> fix_fields. That is,
I like this approach, patch is coming soon.
>
> * move entry= get_variable(&thd->user_vars, name, 1) to a separate
> method, Item_func_set_user_var::set_entry().
> * call ::set_entry() for all items from the thd->lex->set_var_list
> from setup_fields() before the first call of ::fix_fields().
> * when calling ::set_entry() update var_entry->update_query_id to the
> current_thd->query_id.
>
> This way you have no memcmp and no list searches at all.
>
>> + DBUG_RETURN(FALSE);
>> + }
>> + List_iterator<Item_func_set_user_var> li(thd->lex->set_var_list);
>> + Item_func_set_user_var *var;
>> + while ((var= li++))
>> + {
>> + if (name.length == var->name.length &&
>> + !memcmp(name.str, var->name.str, name.length))
>
> memcmp() is probably wrong, it should be charset-aware
> comparison.
IMHO the trouble with var->name is even more general:
set names koi8r;
set @тест = 10; # variable name is in KOI8-R
set names utf8;
select @тест; # variable name is in UTF-8
# result is NULL
How do you think, is it a bug?
>
>> + {
>> + const_item_cache= FALSE;
>> + break;
>> + }
>> + }
>> + DBUG_RETURN(FALSE);
>> }
>
> Regards / Mit vielen Grüssen,
> Sergei
>
Thank you,
Gleb.