List:General Discussion« Previous MessageNext Message »
From:Jose Miguel Pérez Date:September 21 2004 11:00pm
Subject:RE: Query with group by
View as plain text  
Hi Michael! 

	Talking about the "query with group by issue"...

> I'll explain my reasoning below.
[...]
>  From the manual, section "7.2.8 How MySQL Optimizes LEFT 
> JOIN and RIGHT 
> JOIN" <http://dev.mysql.com/doc/mysql/en/LEFT_JOIN_optimization.html>:
> 
>    A LEFT JOIN B join_condition is implemented in MySQL as follows:
>    ...
>    * The LEFT JOIN condition is used to decide how to 
> retrieve rows from table B. (In other words, any condition in the
> WHERE clause is not used.)
>    ...
> 
> So, the "WHERE c2.id IS NULL" cannot be applied until after 
> the rows which match the ON clause (and the NULL rows) have been
> fetched.

	Thanks for your explanations here, it's clear now I was confused. I
missed this LEFT JOIN behaviour. I was in the idea the WHERE xx IS NULL was
taken into account.

[...]
> You are certainly right that "temporary" and "filesort" are 
> to be avoided. And they will be, if the table is properly indexed. Single 
> column indexing won't help much here, because the WHERE condition,
> the GROUP BY column, and the MAX column are all different.  A multi-column
> index on (content, location, date), however, will allow mysql to use the
> index to find the matching rows, find the groups, and calculate the MAX
date.

	I still refuse to use the temporary table solution, call me fussy
here. Indeed, I think giving more indexes than necessary is a bit redundant,
unless completely necessary. (You will end up having more space on indexes
than data itself). It's a "handle with care" issue for me.

>>     Anyway, I don't know if one can program an agregate UDF 
>>  called something like EXTERNAL_MAX(...) or something, so that we
>>  could do like:
>> 
>>         SELECT EXTERNAL_MAX(date, version)  ---> i.e: 
>>  Returns the "version" value for the row with MAX(date).
>>
>>     This, for sure, will be the best solution. ;-)
> 
> That would have to do the same thing behind the scenes.

	I have to beg you pardon here. ;-) Think again this solution doesn't
require a JOIN, nor a temporary table. I think the UDF solution doesn't have
to be less efficient than a MAX or AVG aggregate function by itself. I'm in
the process of creating such a monster :-) Let me know if you are curious
about and have time to test it, I will test MAX() and EXTERNAL_MAX() against
a very large table.

	Cheers,
	Jose Miguel.


Thread
Query with group byVincent.Badier16 Sep
  • Re: Query with group byRhino16 Sep
    • Re: Query with group byVincent.Badier16 Sep
  • Re: Query with group byJose Miguel Pérez16 Sep
    • Re: Query with group byMichael Stassen16 Sep
  • Re: Query with group byRhino16 Sep
    • Re: Query with group byMichael Stassen17 Sep
  • Re: Query with group byJose Miguel Pérez16 Sep
    • Re: Query with group byMichael Stassen17 Sep
      • RE: Query with group byJose Miguel Pérez22 Sep
  • Re: Query with group byRhino17 Sep
    • Re: Query with group byMichael Stassen18 Sep
  • Re: Query with group byRhino18 Sep