List:General Discussion« Previous MessageNext Message »
From:Dan Nelson Date:June 18 2007 4:24pm
Subject:Re: Hiding columns used in GROUP BY and HAVING clauses
View as plain text  
In the last episode (Jun 18), Edward Kay said:
>From: Dan Nelson [mailto:dnelson@stripped]
> <snip>
> > > > > At the moment, I have this and it works:
> > > > >
> > > > > 	select * from contact_address
> > > > > 	group by primary_entity_id
> > > > > 	having count(primary_entity_id) = 1
> > > > > 	and is_primary = 0;
> > > > >
> > > > > This is fine except I want to use the result in a sub-query.
> > > > > Since it returns two columns this doesn't work:
> > > > >
> > > > > 	update contact_address set is_primary = 1 where address_id in (
> > > > > 		select * from contact_address
> > > > > 		group by primary_entity_id
> > > > > 		having count(primary_entity_id) = 1
> > > > > 		and is_primary = 0
> > > > > 	);
> > > > >
> > > > > Normally, I'd only return the address_id in the sub-SELECT,
> > > > > but I need the is_primary column for the HAVING clause.
> > > >
> > > > I did some tests, and it looks like you can use aggregate
> > > > functions in your HAVING clause without actually selecting the
> > > > column.  So "HAVING COUNT(primary_entity_id) = 1" should work
> > > > even if you only select address_id.
> > >
> > > Yes, that is true and it does work.
> > >
> > > What doesn't work however, is the extra 'AND is_primary = 0' HAVING
> > > clause.
> >
> > Couldn't you move that up into a WHERE clause (still in the
> > subquery)? It's just a regular field comparison, so it doesn't have
> > to be in the HAVING clause.
> 
> No, because that's not the same thing. Imagine the following data:
> 
> address_id   primary_entity_id   is_primary
> -------------------------------------------
>   1            293                 0
>   2            293                 1
> 
> With my query above, the sub-query result set would be empty since
> count(primary_entity_id) = 2.
> 
> If I move the is_primary = 0 requirement to a WHERE clause, then the
> first row would be incorrectly updated since the group function would
> only be working on the sub-set of data (where is_primary = 0).
> 
> It's a subtle but important difference.

Yes, that complicates things a bit...  That does mean you'll need to
pull another column in your subquery.  I guess you could strip that
column out with a subquery :) Something like:

update contact_address set is_primary = 1 where address_id in ( 
  select address_id from (
    select address_id, primary_entity_id from contact_address
      group by primary_entity_id
      having count(primary_entity_id) = 1
      and primary_entity_id = 0 
  )
); 

Note that you're using a mysql-ism by referencing a non-aggregated
field in your HAVING clause.  It's okay in this particular case because
you're only aggregating one row, but if you group more than one row
together the results are indeterminate:
http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-fields.html

-- 
	Dan Nelson
	dnelson@stripped
Thread
Hiding columns used in GROUP BY and HAVING clausesEdward Kay15 Jun
  • Re: Hiding columns used in GROUP BY and HAVING clausesBaron Schwartz15 Jun
  • Re: Hiding columns used in GROUP BY and HAVING clausesDan Nelson15 Jun
    • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay18 Jun
      • Re: Hiding columns used in GROUP BY and HAVING clausesDan Nelson18 Jun
        • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay18 Jun
          • Re: Hiding columns used in GROUP BY and HAVING clausesBaron Schwartz18 Jun
            • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay18 Jun
          • Re: Hiding columns used in GROUP BY and HAVING clausesDan Nelson18 Jun
            • Re: Hiding columns used in GROUP BY and HAVING clausesBaron Schwartz18 Jun
              • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay19 Jun