MySQL Lists are EOL. Please join:

List:Internals« Previous MessageNext Message »
From:Martin Friebe Date:March 21 2007 9:54pm
Subject:intended behaviour for subquery aggregate with outer column refs?
View as plain text  
I am trying to establish the wanted behaviour for certain queries.
I had been looking at them according to the draft of the SQL:2003 that I 
have (which might not be correct in all aspects, as it only is a (late) 
draft).
The mysql documentation does not specify anything on the behaviour, but 
I have now found some docs within the source code.

Those docs in the source code, differ form my understanding of the draft.
I wish to establish:
a) is the draft I have correct on the subject
b) is my understanding correct, and have I not missed out essential parts
c) if yes to the above: Is there a reason for the difference in the 
current implementation, including the in-code documentation. Is there an 
intention to change this, and in which version(s)?

looking at the 2 queries:

select
   t1.a,
   (select count(t1.b) from t2 limit 1) # aggregated in t1
 from t1
 group by t1.a

VERSUS
select
   t1.a,
 from t1
where   (select count(t1.b) from t2 limit 1) # aggregated in t2
 group by t1.a

Looking at the bit of the code-embedded docs, which I have marked as 
"important bit":

In the first query "count(b)" can be aggregated in the outer query, as 
it appears in the outer query's "select list". Therefore mysql chooses 
the outer query as aggregation query for "count(b)"

In the 2nd part, an attempt to to this, would lead to an error, as 
"count(b)" would then appear in the outer queries "where" clause, which 
is not permitted.
MySql therefore chooses to do, what it used to do in version 4.1:
The aggregation query for "count(b)" is the inner query. "b" is imported 
as a column from the outer query.

This does go in-line with the documentation found inside the source 
code, as quoted below.

However it does not seem to be the same as the SQL2003 draft. refering 
to parts 6a/b and 7
> 6) The aggregation query of a <set function specification> SFS is 
> determined as follows. Case:
> a) If SFS has no aggregated column reference, then the aggregation 
> query of SFS is the innermost <query specification> that contains SFS.
> b) Otherwise, the innermost qualifying query of the aggregated column 
> references of SFS is the aggregation query of SFS.
> 7) SFS shall be contained in the <having clause>, <window clause>, or 
> <select list> of its aggregation query. 
6b) does specify the aggregation query for the set-function, the same 
way as mysql does for the *first* query.
It makes no exception-rules from that.
7) declares in which part of the query the set-function can appear.
My understanding would be that appearance, in other location should 
trigger an error.

As far as I see, mysql extends those possibilities, and falls back to an 
behaviour outside this spec for all cases that would lead to an error, 
if the fallback is not an error itself.
This is of course a nice extension, but may on the other hand lead to 
quite a confusing way of how to fid out what a query actually does?

My question is:
Is this extension to the spec intended, and is it intended to be kept.

There is http://bugs.mysql.com/bug.php?id=27348 which is exactly about 
the above behaviour. If this behaviour is intended to stay, this 
particular bug can be closed. I will add a link to the mailing list to 
the bug, and await feedback

Best Regards
Martin



In code documentation, which I found
> The general rule to detect the query where a set function is to be
>   evaluated can be formulated as follows.
>   Consider a set function S(E) where E is an expression with occurrences
>   of column references C1, ..., CN. Resolve these column references 
> against
>   subqueries that contain the set function S(E). Let Q be the innermost
>   subquery of those subqueries. (It should be noted here that S(E)
>   in no way can be evaluated in the subquery embedding the subquery Q,
>   otherwise S(E) would refer to at least one unbound column reference)
important part follows
>   If S(E) is used in a construct of Q where set functions are allowed then
>   we evaluate S(E) in Q.
>   Otherwise we look for a innermost subquery containing S(E) of those 
> where
>   usage of S(E) is allowed.

quotes from the sql:2003 DRAFT
> 4.15.2 Group functions
> A group function may only appear in the <select list>, <having clause> 
> or <window clause> of a <query specification> or <select statement: 
> single row>, or in the <order by clause> of a cursor that is a simple 
> table query.
>
> ---------------------
> 6.9 <set function specification>
> Format
> <set function specification> ::= <aggregate function> | <grouping 
> operation>
> Syntax Rules
> 1) If <aggregate function> specifies a <general set function>, then 
> the <value expression> simply contained
> in the <general set function> shall not contain a <set function 
> specification> or a <subquery>.
> 2) If <aggregate function> specifies <binary set function>, then 
> neither the <dependent variable expression>
> nor the <independent variable expression> simply contained in the 
> <binary set function> shall contain a
> <set function specification> or a <subquery>.
> 3) A <value expression> VE simply contained in a <set function 
> specification> SFE is an aggregated argument
> of SFE if either SFE is not an <ordered set function> or VE is simply 
> contained in a <within group specification>;
> otherwise, VE is a non-aggregated argument of SFE.
> 4) A column reference CR contained in an aggregated argument of a <set 
> function specification> SFS is called
> an aggregated column reference of SFS.
> 5) If <aggregate function> specifies a <filter clause>, then the 
> <search condition> immediately contained in
> <filter clause> shall not contain a <set function specification>.
> 6) The aggregation query of a <set function specification> SFS is 
> determined as follows.
> Case:
> a) If SFS has no aggregated column reference, then the aggregation 
> query of SFS is the innermost <query
> specification> that contains SFS.
> b) Otherwise, the innermost qualifying query of the aggregated column 
> references of SFS is the aggregation
> query of SFS.
> 7) SFS shall be contained in the <having clause>, <window clause>, or 
> <select list> of its aggregation query. 



Thread
intended behaviour for subquery aggregate with outer column refs?Martin Friebe21 Mar