excellent review Guilhem, new patch coming
-- didrik
On Wed, Apr 13, 2011 at 3:47 PM, Guilhem Bichot
<guilhem.bichot@stripped>wrote:
> Hello Tor,
>
> Tor Didriksen a écrit, Le 07.04.2011 13:56:
>
>> #At file:///export/home/didrik/repo/5.1-foo/ based on
>> revid:georgi.kodinov@stripped
>>
>>
>> 3648 Tor Didriksen 2011-04-07
>> Bug#11765713 58705: OPTIMIZER LET ENGINE DEPEND ON UNINITIALIZED
>> VALUES CREATED BY OPT_SUM_QU
>>
>
> === modified file 'sql/opt_sum.cc'
>> --- a/sql/opt_sum.cc 2010-06-11 07:38:29 +0000
>> +++ b/sql/opt_sum.cc 2011-04-07 11:56:55 +0000
>>
>
> @@ -437,6 +442,10 @@ int opt_sum_query(TABLE_LIST *tables, Li
>> const_result= 0;
>> }
>> }
>> +
>> + if (thd->is_error())
>> + DBUG_RETURN(thd->main_da.sql_errno());
>>
>
> The function's comment mentions only HA_ERR_... (i.e. handler) errors. I
> think that this new value above can be a ER_ (i.e. server) error (like
> "subquery has more than one row").
> So the function comment should be updated.
>
>
> +
>> /*
>> If we have a where clause, we can only ignore searching in the
>> tables if MIN/MAX optimisation replaced all used tables
>> @@ -446,7 +455,7 @@ int opt_sum_query(TABLE_LIST *tables, Li
>> */
>> if (removed_tables && used_tables != removed_tables)
>> const_result= 0; // We didn't remove all
>> tables
>> - return const_result;
>> + DBUG_RETURN(const_result);
>> }
>> @@ -565,6 +574,7 @@ bool simple_pred(Item_func *func_item, I
>> field identifiers), which will obviously fail, leading to 5 being
>> stored in
>> the Field object.
>> + @param[in] thd thread handler
>> @param[in] max_fl Set to true if we are optimizing MAX(),
>> false means we are optimizing %MIN()
>> @param[in, out] ref Reference to the structure where the
>> function @@ -589,7 +599,7 @@ bool simple_pred(Item_func *func_item, I
>> true We can use the index to get MIN/MAX value
>> */
>> -static bool matching_cond(bool max_fl, TABLE_REF *ref, KEY *keyinfo,
>> +static bool matching_cond(THD *thd, bool max_fl, TABLE_REF *ref, KEY
>> *keyinfo, KEY_PART_INFO *field_part, COND *cond,
>> key_part_map *key_part_used, uint *range_fl,
>> uint *prefix_len)
>> @@ -613,7 +623,7 @@ static bool matching_cond(bool max_fl, T
>> Item *item;
>> while ((item= li++))
>> {
>> - if (!matching_cond(max_fl, ref, keyinfo, field_part, item,
>> + if (!matching_cond(thd, max_fl, ref, keyinfo, field_part, item,
>> key_part_used, range_fl, prefix_len))
>> DBUG_RETURN(FALSE);
>> }
>> @@ -732,6 +742,9 @@ static bool matching_cond(bool max_fl, T
>> if (is_null || (is_null_safe_eq && args[1]->is_null()))
>>
>
> Speaking to myself::
> - opt_sum_query() is called for the second subquery (the right argument of
> '>SOME').
> - this is_null() above is what evaluates the most inner subquery (the right
> argument of '<=>') and sets error flag (of thd->is_error()).
>
> Below ("if (thd->is_error())"), the patch wants to detect the error. It
> tests is_error() only if is_null() is true. It's good to note that this is
> relying on the contract of subquery items ("if it gets an error it pretends
> to be NULL"). Maybe this contract should be written somewhere. Otherwise it
> would be tempting, for a future refactoring developer, to break this
> contract (after all, if there's an error, the user is not supposed to rely
> the value, in all logic; here we do rely on the value of is_null() to decide
> if we have to test is_error()...).
>
> Note that the function (matching_cond()) calls is_null() earlier too:
> if (!is_null_safe_eq && !is_null &&
> (args[1]->is_null() || (between && args[2]->is_null())))
> DBUG_RETURN(FALSE);
> but fortunately, if there are errors in one of those two is_null(), return
> value is 'true' (thanks to contract) and we return immediately, so there is
> no bug there.
>
>
> {
>> + if (thd->is_error())
>> + DBUG_RETURN(false);
>>
>
> I would put a comment around this, explaining how there can be an error
> here, something like:
>
> // possible e.g. if argument is failed subquery
>
> Because it's not trivial without having studied this particular testcase.
>
> Function matching_cond() should announce in its comment that "false" means
> "index can't be used, or error (check thd->is_error())".
> Same for find_key_for_maxmin().
>
>
> +
>> part->field->set_null();
>> *key_ptr= (uchar) 1;
>> }
>>
>
> I tried hard to find why there was unitialized data, which led me to the
> following analysis.
>
> Below I consider code without the patch.
>
> The most inner subquery (the right argument of '<=>') reported that it's
> NULL (due to error, but we don't need to know that).
> "Are you NULL or not?" is all that we asked the subquery to say. We didn't
> expect it to fill any other info. And the subquery's internal state, in case
> of error, is exactly as if it contained NULL (see
> Item_singlerow_subselect::val_int()). There's no "stuff which the subquery
> should have filled but hasn't because of error".
>
> From that we conclude that as we observed Valgrind errors in the case of
> error-in-subq, we should also observe Valgrind errors in the case of
> no-error-in-subq, as the subq is NULL in both cases.
>
> And indeed, the case of no-error-in-subq:
>
> SELECT 1 FROM t1 WHERE a < SOME (SELECT a FROM t1 WHERE a <=> (SELECT a
> FROM t1 where a is null) );
>
> has the same Valgrind errors.
> That isn't fixed by the patch, which focused on the error-in-subq case.
>
> I propose to fix this new case in the same go.
>
> Possibly, after fixing it, there will be less need for is_error() tests.
> Theoretically, it's possible to let matching_cond() and
> find_key_for_maxmin() continue (with their NULL
> due-to-error-but-they-do-not-know-it-and-they-do-not-need-to, by virtue of
> the "contract of subq items"), let opt_sum_query() continue, let the entire
> statement continue, build its result set and whatever stupid data, until the
> final error reporting happens in net_end_statement():
>
> void net_end_statement(THD *thd)
> {
> switch (thd->main_da.status()) {
> case Diagnostics_area::DA_ERROR:
> /* The query failed, send error to log and abort bootstrap. */
> error= net_send_error(thd,
>
> which sends to the client the error recorded by the most inner subq item
> long ago.
>
> Which means that the fix for the new testcase (no-error-in-subq) is in
> theory enough to fix the original testcase (error-in-subq)
> Maybe you'll want to still add code to catch errors early. I don't know.
> It's if()s added to the code paths...
> Let's see after the new testcase is fixed.
>
> If you prefer to not work on the new testcase, feel free to file a new bug
> report, and mark yours as duplicate of the new one.
>
>
> --
> MySQL Code Commits Mailing List
> For list archives: http://lists.mysql.com/commits
> To unsubscribe:
> http://lists.mysql.com/commits?unsub=1
>
>