List:General Discussion« Previous MessageNext Message »
From:Michael Stassen Date:March 17 2005 5:16am
Subject:Re: Erroneus column using MAX() and GROUP BY
View as plain text  
You've misunderstood how GROUP BY and MAX() work.  GROUP BY divides your 
data into groups, and aggregate functions such as MAX() tell you something 
about each group, but they *do not* return *rows* from your table.  Consider 
the following example rows in a larger table:

   cat   val1   val2
    4      1      1
    4      5      3
    4      7      2
    4      3      4
    4      7      1

Now consider the query

   SELECT cat, MIN(val1), MAX(val1), MIN(val2), MAX(val2)
   FROM mytable
   GROUP BY cat;

I think it should be easy to see that for the group where cat is 4, I'll get 
the following result:

   +-----+-----------+-----------+-----------+-----------+
   | cat | MIN(val1) | MAX(val1) | MIN(val2) | MAX(val2) |
   +-----+-----------+-----------+-----------+-----------+
   |   4 |         1 |         7 |         1 |         4 |
   +-----+-----------+-----------+-----------+-----------+

Which row is that in my table?  You see?  Even if I only asked for 
MAX(val1), there are 2 rows with the max value of 7.  AS I said before, we 
get information about each group, but not rows from the table.

Other systems wouldn't even allow your query, because clienthistory_id is 
neither an aggregate function nor a grouped column.  MySQL allows this as a 
convenience, but you are warned not to use columns whose values are not 
unique per group, as you will get random (first found, I believe) results. 
See the manual for more 
<http://dev.mysql.com/doc/mysql/en/group-by-hidden-fields.html>.

Fortunately, yours is such a frequently asked question, that the manual has 
a page describing three solutions.  See 
<http://dev.mysql.com/doc/mysql/en/example-maximum-column-group-row.html>.

Michael


Daevid Vincent wrote:

> I have this table:
> 
> mysql> select historyvlan_time, historyvlan_vlan, v.clienthistory_id from
> pe_historyvlan as v join pe_clienthistory  using (clienthistory_id) order by
> historyvlan_vlan, historyvlan_time desc;
> +------------------+------------------+------------------+
> | historyvlan_time | historyvlan_vlan | clienthistory_id |
> +------------------+------------------+------------------+
> |       0503011446 |                4 |               55 |  <--
> |       0503011440 |                4 |               54 |
> |       0502181640 |                4 |               29 |
> |       0502181638 |                4 |               26 |
> |       0502181508 |                4 |               24 |
> |       0503021500 |                5 |               73 |  <--
> |       0503011808 |                6 |               71 |  <--
> |       0503011452 |                6 |               56 |
> |       0502181626 |                6 |               25 |
> |       0502181640 |                7 |               28 |  <--
> |       0503011805 |                8 |               70 |  <--
> |       0503011801 |                8 |               68 |
> |       0503011731 |                8 |               61 |
> |       0503011730 |                8 |               60 |
> +------------------+------------------+------------------+
> 14 rows in set (0.00 sec)
> 
> I am trying to find the id and vlan for the most recent time:
> 
> mysql> select max(historyvlan_time), historyvlan_vlan, v.clienthistory_id
> from pe_historyvlan as v join pe_clienthistory  using (clienthistory_id)
> group by historyvlan_vlan order by historyvlan_vlan, historyvlan_time desc;
> +-----------------------+------------------+------------------+
> | max(historyvlan_time) | historyvlan_vlan | clienthistory_id |
> +-----------------------+------------------+------------------+
> | 0503011446            |                4 |               24 | <--
> | 0503021500            |                5 |               73 |
> | 0503011808            |                6 |               25 | <--
> | 0502181640            |                7 |               28 | 
> | 0503011805            |                8 |               60 | <--
> +-----------------------+------------------+------------------+
> 5 rows in set (0.00 sec)
> 
> 
> Why do I get '24' when I should get '55', '25' when I should get '61', and
> '60' when I should get '70'.
> 
> 
Thread
Erroneus column using MAX() and GROUP BYDaevid Vincent17 Mar
  • Re: Erroneus column using MAX() and GROUP BYMichael Stassen17 Mar
    • RE: Erroneus column using MAX() and GROUP BYElton Clark18 Mar