List:Commits« Previous MessageNext Message »
From:Roy Lyseng Date:February 15 2011 12:12pm
Subject:Re: bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839
View as plain text  
Hi Martin,

thank you for fixing this problem.

The fix is approved, but please fix the comments before pushing.

On 09.02.11 10.58, Martin Hansson wrote:
> #At file:///data0/martin/bzrroot/bug59839/5.1-commit/ based on
> revid:dmitry.shulga@stripped
>
>   3587 Martin Hansson	2011-02-09
>        Bug#59839: Aggregation followed by subquery yields wrong result
>
>        The loop that was looping over subqueries' references to outer field used a
>        local boolean variable to tell whether the field was grouped or not. But the
>        implementor failed to reset the variable after each iteration. Thus a field
>        that was not directly aggregated appeared to be.
>
>        Fixed by resetting the variable upon each new iteration.
>
>      modified:
>        mysql-test/r/group_by.result
>        mysql-test/t/group_by.test
>        sql/sql_select.cc
> === modified file 'mysql-test/r/group_by.result'
> --- a/mysql-test/r/group_by.result	2010-10-29 08:23:06 +0000
> +++ b/mysql-test/r/group_by.result	2011-02-09 09:58:43 +0000
> @@ -1855,4 +1855,40 @@ ON 1 WHERE t2.f1>  1 GROUP BY t2.f1;
>   COUNT(*)
>   2
>   DROP TABLE t1;
> +#
> +# Bug#59839: Aggregation followed by subquery yields wrong result
> +#
> +CREATE TABLE t1 (
> +a INT,
> +b INT,
> +c INT,
> +KEY (a, b)
> +);
> +INSERT INTO t1 VALUES
> +( 1, 1,  1 ),
> +( 1, 2,  2 ),
> +( 1, 3,  3 ),
> +( 1, 4,  6 ),
> +( 1, 5,  5 ),
> +( 1, 9, 13 ),
> +( 2, 1,  6 ),
> +( 2, 2,  7 ),
> +( 2, 3,  8 );
> +EXPLAIN
> +SELECT a, AVG(t1.b),
> +(SELECT t11.c FROM t1 t11 WHERE t11.a = t1.a AND t11.b = AVG(t1.b)) AS t11c,
> +(SELECT t12.c FROM t1 t12 WHERE t12.a = t1.a AND t12.b = AVG(t1.b)) AS t12c
> +FROM t1 GROUP BY a;
> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
> +1	PRIMARY	t1	index	NULL	a	10	NULL	9	Using index
> +3	DEPENDENT SUBQUERY	t12	ref	a	a	10	func,func	2	Using where
> +2	DEPENDENT SUBQUERY	t11	ref	a	a	10	func,func	2	Using where
> +SELECT a, AVG(t1.b),
> +(SELECT t11.c FROM t1 t11 WHERE t11.a = t1.a AND t11.b = AVG(t1.b)) AS t11c,
> +(SELECT t12.c FROM t1 t12 WHERE t12.a = t1.a AND t12.b = AVG(t1.b)) AS t12c
> +FROM t1 GROUP BY a;
> +a	AVG(t1.b)	t11c	t12c
> +1	4.0000	6	6
> +2	2.0000	7	7
> +DROP TABLE t1;
>   # End of 5.1 tests
>
> === modified file 'mysql-test/t/group_by.test'
> --- a/mysql-test/t/group_by.test	2010-10-29 08:23:06 +0000
> +++ b/mysql-test/t/group_by.test	2011-02-09 09:58:43 +0000
> @@ -1247,4 +1247,41 @@ ON 1 WHERE t2.f1>  1 GROUP BY t2.f1;
>
>   DROP TABLE t1;
>
> +--echo #
> +--echo # Bug#59839: Aggregation followed by subquery yields wrong result
> +--echo #
> +
> +CREATE TABLE t1 (
> +  a INT,
> +  b INT,
> +  c INT,
> +  KEY (a, b)
> +);
> +
> +INSERT INTO t1 VALUES
> +  ( 1, 1,  1 ),
> +  ( 1, 2,  2 ),
> +  ( 1, 3,  3 ),
> +  ( 1, 4,  6 ),
> +  ( 1, 5,  5 ),
> +  ( 1, 9, 13 ),
> +
> +  ( 2, 1,  6 ),
> +  ( 2, 2,  7 ),
> +  ( 2, 3,  8 );
> +
> +EXPLAIN
> +SELECT a, AVG(t1.b),
> +(SELECT t11.c FROM t1 t11 WHERE t11.a = t1.a AND t11.b = AVG(t1.b)) AS t11c,
> +(SELECT t12.c FROM t1 t12 WHERE t12.a = t1.a AND t12.b = AVG(t1.b)) AS t12c
> +FROM t1 GROUP BY a;
> +
> +SELECT a, AVG(t1.b),
> +(SELECT t11.c FROM t1 t11 WHERE t11.a = t1.a AND t11.b = AVG(t1.b)) AS t11c,
> +(SELECT t12.c FROM t1 t12 WHERE t12.a = t1.a AND t12.b = AVG(t1.b)) AS t12c
> +FROM t1 GROUP BY a;

I'd suggest that you write something like:
set $query=SELECT ...;
eval EXPLAIN $query;
eval $query;

That way you have to write complex query strings only once.
> +
> +DROP TABLE t1;
> +
> +
>   --echo # End of 5.1 tests
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc	2011-02-08 11:52:33 +0000
> +++ b/sql/sql_select.cc	2011-02-09 09:58:43 +0000
> @@ -281,46 +281,54 @@ bool handle_select(THD *thd, LEX *lex, s
>   /*

/* -> /** (for Doxygen)
>     Fix fields referenced from inner selects.
>
> -  SYNOPSIS
> -    fix_inner_refs()
> -    thd               Thread handle
> -    all_fields        List of all fields used in select
> -    select            Current select
> -    ref_pointer_array Array of references to Items used in current select
> -    group_list        GROUP BY list (is NULL by default)
> +  @param thd               Thread handle
> +  @param all_fields        List of all fields used in select
> +  @param select            Current select
> +  @param ref_pointer_array Array of references to Items used in current select
> +  @param group_list        GROUP BY list (is NULL by default)
>
> -  DESCRIPTION
> -    The function serves 3 purposes - adds fields referenced from inner
> -    selects to the current select list, resolves which class to use
> -    to access referenced item (Item_ref of Item_direct_ref) and fixes
> -    references (Item_ref objects) to these fields.
> +  @details
> +    The function serves 3 purposes
> +
> +    - adds fields referenced from inner query blocks to the current select list
>
> -    If a field isn't already in the select list and the ref_pointer_array
> +    - resolves which class to use to access referenced item (Item_ref or
resolves -> decides
access referenced item -> reference the items
> +      Item_direct_ref)
> +
> +    - fixes references (Item_ref objects) to these fields.
fields -> items
> +
> +    If a field isn't already on the select list and the ref_pointer_array
>       is provided then it is added to the all_fields list and the pointer to
>       it is saved in the ref_pointer_array.
>
>       The class to access the outer field is determined by the following rules:
> -    1. If the outer field isn't used under an aggregate function
> -      then the Item_ref class should be used.
> -    2. If the outer field is used under an aggregate function and this
> -      function is aggregated in the select where the outer field was
> -      resolved or in some more inner select then the Item_direct_ref
> -      class should be used.
> -      Also it should be used if we are grouping by a subquery containing
> -      the outer field.
> +
> +    -#. If the outer field isn't used under an aggregate function then the
> +        Item_ref class should be used.
> +
> +    -#. If the outer field is used under an aggregate function and this
> +        function is, in turn, aggregated in the query block where the outer
> +        field was resolved or some query nested therein, then the
> +        Item_direct_ref class should be used. Also it should be used if we are
> +        grouping by a subquery containing the outer field.
> +
>       The resolution is done here and not at the fix_fields() stage as
> -    it can be done only after sum functions are fixed and pulled up to
> -    selects where they are have to be aggregated.
> +    it can be done only after aggregate functions are fixed and pulled up to
> +    selects where they are to be aggregated.
> +
>       When the class is chosen it substitutes the original field in the
>       Item_outer_ref object.
>
>       After this we proceed with fixing references (Item_outer_ref objects) to
>       this field from inner subqueries.
>
> -  RETURN
> -    TRUE  an error occured
> -    FALSE ok
> -*/
> +  @return Status
> +  @retval true An error occured.
> +  @retval false OK.
This return convention is so common, that I would advice you to write:
   @return true if an error occurred, false if success.
> +
> +  @todo This comment must be restructured. It is jumping back and forth
> +        between abstraction levels and it uses inconsistent terminology.

Hey! There is actual documentation here! You can either remove the @todo, fix 
the comment, or pinpoint the jumping and use of inconsistent terminology.

I guess the inconsistet terminology issues are: field vs. item. select vs query 
block (I prefer query specification or query expression, they are SQL official 
terms),
> + */
>
>   bool
>   fix_inner_refs(THD *thd, List<Item>  &all_fields, SELECT_LEX *select,
> @@ -328,11 +336,11 @@ fix_inner_refs(THD *thd, List<Item>  &all
>   {
>     Item_outer_ref *ref;
>     bool res= FALSE;
The local variable res is unused and can be removed.
> -  bool direct_ref= FALSE;
>
>     List_iterator<Item_outer_ref>  ref_it(select->inner_refs_list);
>     while ((ref= ref_it++))
>     {
> +    bool direct_ref= false;
>       Item *item= ref->outer_ref;
>       Item **item_ref= ref->ref;
>       Item_ref *new_ref;
Thanks,
Roy
Thread
bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839Martin Hansson9 Feb
Re: bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839Jorgen Loland9 Feb
  • Re: bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839Martin Hansson9 Feb
Re: bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839Roy Lyseng15 Feb
  • Re: bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839Martin Hansson15 Feb
    • Re: bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839Roy Lyseng15 Feb
      • Re: bzr commit into mysql-5.1 branch (martin.hansson:3587) Bug#59839Martin Hansson15 Feb