List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:April 13 2011 1:47pm
Subject:Re: bzr commit into mysql-5.1 branch (tor.didriksen:3648) Bug#11765713
View as plain text  
Hello Tor,

Tor Didriksen a écrit, Le 07.04.2011 13:56:
> #At file:///export/home/didrik/repo/5.1-foo/ based on
> revid:georgi.kodinov@stripped
> 
>  3648 Tor Didriksen	2011-04-07
>       Bug#11765713 58705: OPTIMIZER LET ENGINE DEPEND ON UNINITIALIZED VALUES CREATED
> BY OPT_SUM_QU

> === modified file 'sql/opt_sum.cc'
> --- a/sql/opt_sum.cc	2010-06-11 07:38:29 +0000
> +++ b/sql/opt_sum.cc	2011-04-07 11:56:55 +0000

> @@ -437,6 +442,10 @@ int opt_sum_query(TABLE_LIST *tables, Li
>          const_result= 0;
>      }
>    }
> +
> +  if (thd->is_error())
> +    DBUG_RETURN(thd->main_da.sql_errno());

The function's comment mentions only HA_ERR_... (i.e. handler) errors. I 
think that this new value above can be a ER_ (i.e. server) error (like 
"subquery has more than one row").
So the function comment should be updated.

> +
>    /*
>      If we have a where clause, we can only ignore searching in the
>      tables if MIN/MAX optimisation replaced all used tables
> @@ -446,7 +455,7 @@ int opt_sum_query(TABLE_LIST *tables, Li
>    */
>    if (removed_tables && used_tables != removed_tables)
>      const_result= 0;                            // We didn't remove all tables
> -  return const_result;
> +  DBUG_RETURN(const_result);
>  }
>  
>  
> @@ -565,6 +574,7 @@ bool simple_pred(Item_func *func_item, I
>     field identifiers), which will obviously fail, leading to 5 being stored in
>     the Field object.
>     
> +   @param[in]     thd            thread handler
>     @param[in]     max_fl         Set to true if we are optimizing MAX(),
>                                   false means we are optimizing %MIN()
>     @param[in, out] ref           Reference to the structure where the function 
> @@ -589,7 +599,7 @@ bool simple_pred(Item_func *func_item, I
>      true     We can use the index to get MIN/MAX value
>  */
>  
> -static bool matching_cond(bool max_fl, TABLE_REF *ref, KEY *keyinfo, 
> +static bool matching_cond(THD *thd, bool max_fl, TABLE_REF *ref, KEY *keyinfo, 
>                            KEY_PART_INFO *field_part, COND *cond,
>                            key_part_map *key_part_used, uint *range_fl,
>                            uint *prefix_len)
> @@ -613,7 +623,7 @@ static bool matching_cond(bool max_fl, T
>      Item *item;
>      while ((item= li++))
>      {
> -      if (!matching_cond(max_fl, ref, keyinfo, field_part, item,
> +      if (!matching_cond(thd, max_fl, ref, keyinfo, field_part, item,
>                           key_part_used, range_fl, prefix_len))
>          DBUG_RETURN(FALSE);
>      }
> @@ -732,6 +742,9 @@ static bool matching_cond(bool max_fl, T
>  
>      if (is_null || (is_null_safe_eq && args[1]->is_null()))

Speaking to myself::
- opt_sum_query() is called for the second subquery (the right argument 
of '>SOME').
- this is_null() above is what evaluates the most inner subquery (the 
right argument of '<=>') and sets error flag (of thd->is_error()).

Below ("if (thd->is_error())"), the patch wants to detect the error. It 
tests is_error() only if is_null() is true. It's good to note that this 
is relying on the contract of subquery items ("if it gets an error it 
pretends to be NULL"). Maybe this contract should be written somewhere. 
Otherwise it would be tempting, for a future refactoring developer, to 
break this contract (after all, if there's an error, the user is not 
supposed to rely the value, in all logic; here we do rely on the value 
of is_null() to decide if we have to test is_error()...).

Note that the function (matching_cond()) calls is_null() earlier too:
   if (!is_null_safe_eq && !is_null &&
       (args[1]->is_null() || (between && args[2]->is_null())))
     DBUG_RETURN(FALSE);
but fortunately, if there are errors in one of those two is_null(), 
return value is 'true' (thanks to contract) and we return immediately, 
so there is no bug there.

>      {
> +      if (thd->is_error())
> +        DBUG_RETURN(false);

I would put a comment around this, explaining how there can be an error 
here, something like:

// possible e.g. if argument is failed subquery

Because it's not trivial without having studied this particular testcase.

Function matching_cond() should announce in its comment that "false" 
means "index can't be used, or error (check thd->is_error())".
Same for find_key_for_maxmin().

> +
>        part->field->set_null();
>        *key_ptr= (uchar) 1;
>      }

I tried hard to find why there was unitialized data, which led me to the 
following analysis.

Below I consider code without the patch.

The most inner subquery (the right argument of '<=>') reported that it's 
NULL (due to error, but we don't need to know that).
"Are you NULL or not?" is all that we asked the subquery to say. We 
didn't expect it to fill any other info. And the subquery's internal 
state, in case of error, is exactly as if it contained NULL (see 
Item_singlerow_subselect::val_int()). There's no "stuff which the 
subquery should have filled but hasn't because of error".

 From that we conclude that as we observed Valgrind errors in the case 
of error-in-subq, we should also observe Valgrind errors in the case of 
no-error-in-subq, as the subq is NULL in both cases.

And indeed, the case of no-error-in-subq:

SELECT 1 FROM t1 WHERE a < SOME  (SELECT a FROM t1 WHERE a <=> 
(SELECT a FROM t1 where a is null) );

has the same Valgrind errors.
That isn't fixed by the patch, which focused on the error-in-subq case.

I propose to fix this new case in the same go.

Possibly, after fixing it, there will be less need for is_error() tests. 
Theoretically, it's possible to let matching_cond() and 
find_key_for_maxmin() continue (with their NULL 
due-to-error-but-they-do-not-know-it-and-they-do-not-need-to, by virtue 
of the "contract of subq items"), let opt_sum_query() continue, let the 
entire statement continue, build its result set and whatever stupid 
data, until the final error reporting happens in net_end_statement():

void net_end_statement(THD *thd)
{
   switch (thd->main_da.status()) {
   case Diagnostics_area::DA_ERROR:
     /* The query failed, send error to log and abort bootstrap. */
     error= net_send_error(thd,

which sends to the client the error recorded by the most inner subq item 
long ago.

Which means that the fix for the new testcase (no-error-in-subq) is in 
theory enough to fix the original testcase (error-in-subq)
Maybe you'll want to still add code to catch errors early. I don't know. 
   It's if()s added to the code paths...
Let's see after the new testcase is fixed.

If you prefer to not work on the new testcase, feel free to file a new 
bug report, and mark yours as duplicate of the new one.
Thread
bzr commit into mysql-5.1 branch (tor.didriksen:3648) Bug#11765713Tor Didriksen7 Apr
  • Re: bzr commit into mysql-5.1 branch (tor.didriksen:3648) Bug#11765713Guilhem Bichot7 Apr
    • Re: bzr commit into mysql-5.1 branch (tor.didriksen:3648) Bug#11765713Tor Didriksen8 Apr
  • Re: bzr commit into mysql-5.1 branch (tor.didriksen:3648) Bug#11765713Sergey Glukhov11 Apr
  • Re: bzr commit into mysql-5.1 branch (tor.didriksen:3648) Bug#11765713Guilhem Bichot13 Apr
    • Re: bzr commit into mysql-5.1 branch (tor.didriksen:3648) Bug#11765713Tor Didriksen14 Apr