List:General Discussion« Previous MessageNext Message »
From:Edward Kay Date:June 19 2007 8:43am
Subject:RE: Hiding columns used in GROUP BY and HAVING clauses
View as plain text  
> -----Original Message-----
> From: Baron Schwartz [mailto:baron@stripped]

> Dan Nelson wrote:
> > In the last episode (Jun 18), Edward Kay said:
> >> From: Dan Nelson [mailto:dnelson@stripped]
> >> <snip>
> >>>>>> At the moment, I have this and it works:
> >>>>>>
> >>>>>> 	select * from contact_address
> >>>>>> 	group by primary_entity_id
> >>>>>> 	having count(primary_entity_id) = 1
> >>>>>> 	and is_primary = 0;
> >>>>>>
> >>>>>> This is fine except I want to use the result in a
> sub-query.
> >>>>>> Since it returns two columns this doesn't work:
> >>>>>>
> >>>>>> 	update contact_address set is_primary = 1 where 
> address_id in (
> >>>>>> 		select * from contact_address
> >>>>>> 		group by primary_entity_id
> >>>>>> 		having count(primary_entity_id) = 1
> >>>>>> 		and is_primary = 0
> >>>>>> 	);
> >>>>>>
> >>>>>> Normally, I'd only return the address_id in the sub-SELECT,
> >>>>>> but I need the is_primary column for the HAVING clause.
> >>>>> I did some tests, and it looks like you can use aggregate
> >>>>> functions in your HAVING clause without actually selecting the
> >>>>> column.  So "HAVING COUNT(primary_entity_id) = 1" should work
> >>>>> even if you only select address_id.
> >>>> Yes, that is true and it does work.
> >>>>
> >>>> What doesn't work however, is the extra 'AND is_primary = 0' HAVING
> >>>> clause.
> >>> Couldn't you move that up into a WHERE clause (still in the
> >>> subquery)? It's just a regular field comparison, so it doesn't have
> >>> to be in the HAVING clause.
> >> No, because that's not the same thing. Imagine the following data:
> >>
> >> address_id   primary_entity_id   is_primary
> >> -------------------------------------------
> >>   1            293                 0
> >>   2            293                 1
> >>
> >> With my query above, the sub-query result set would be empty since
> >> count(primary_entity_id) = 2.
> >>
> >> If I move the is_primary = 0 requirement to a WHERE clause, then the
> >> first row would be incorrectly updated since the group function would
> >> only be working on the sub-set of data (where is_primary = 0).
> >>
> >> It's a subtle but important difference.
> > 
> > Yes, that complicates things a bit...  That does mean you'll need to
> > pull another column in your subquery.  I guess you could strip that
> > column out with a subquery :) Something like:
> > 
> > update contact_address set is_primary = 1 where address_id in ( 
> >   select address_id from (
> >     select address_id, primary_entity_id from contact_address
> >       group by primary_entity_id
> >       having count(primary_entity_id) = 1
> >       and primary_entity_id = 0 
> >   )
> > ); 
> > 
> 
> This is fine, but as I wrote earlier, MySQL does not do well with IN() 
> subqueries, and nesting a subquery in a dependent FROM clause is 
> probably even 
> harder for it to optimize well, since that will end up being an 
> un-indexed 
> temporary table.  It is probably better to write this as a JOIN.  
> I gave an 
> example a few days ago.

Thanks Baron and Dan for your comments, you were both really helpful.

Edward
Thread
Hiding columns used in GROUP BY and HAVING clausesEdward Kay15 Jun
  • Re: Hiding columns used in GROUP BY and HAVING clausesBaron Schwartz15 Jun
  • Re: Hiding columns used in GROUP BY and HAVING clausesDan Nelson15 Jun
    • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay18 Jun
      • Re: Hiding columns used in GROUP BY and HAVING clausesDan Nelson18 Jun
        • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay18 Jun
          • Re: Hiding columns used in GROUP BY and HAVING clausesBaron Schwartz18 Jun
            • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay18 Jun
          • Re: Hiding columns used in GROUP BY and HAVING clausesDan Nelson18 Jun
            • Re: Hiding columns used in GROUP BY and HAVING clausesBaron Schwartz18 Jun
              • RE: Hiding columns used in GROUP BY and HAVING clausesEdward Kay19 Jun