List:General Discussion« Previous MessageNext Message »
From:rob.pollock@gmail.com Date:September 26 2013 12:55am
Subject:Re: Problem with having
View as plain text  
I

Sent from my D

----- Reply message -----
From: "Rick James" <rjames@stripped>
To: "Larry Martell" <larry.martell@stripped>, "shawn green" <shawn.l.green@strippedom>
Cc: "mysql mailing list" <mysql@stripped>
Subject: Problem with having
Date: Thu, Sep 26, 2013 12:11 PM


Still more to this saga....

Comment 1:
... HAVING x;
The expression ( x ) is evaluated as a true/false value, based on whether x is nonzero (true) or zero (false).  Your 'x' is  MIN(date_time) , which is very likely to be nonzero, hence TRUE.  That is, the HAVING does nothing useful.

Comment 2:
This shows
1. a technique
2. how MariaDB optimizes it away, and
3. how you can get MariaDB to still do the "group by trick":
https://mariadb.com/kb/en/group-by-trick-has-been-optimized-away/
(I do not know of other MySQL variants that have any trouble with the trick.)

> -----Original Message-----
> From: Larry Martell [mailto:larry.martell@stripped]
> Sent: Tuesday, September 24, 2013 1:44 PM
> To: shawn green
> Cc: mysql mailing list
> Subject: Re: Problem with having
> 
> On Tue, Sep 24, 2013 at 9:05 AM, shawn green
> <shawn.l.green@stripped>wrote:
> 
> > Hello Larry,
> >
> >
> > On 9/23/2013 6:22 PM, Larry Martell wrote:
> >
> >> On Mon, Sep 23, 2013 at 3:15 PM, shawn green
> >> <shawn.l.green@stripped>**
> >> wrote:
> >>
> >>  Hi Larry,
> >>>
> >>>
> >>> On 9/23/2013 3:58 PM, Larry Martell wrote:
> >>>
> >>>  On Mon, Sep 23, 2013 at 1:51 PM, Sukhjinder K. Narula
> >>>> <narula.sk@stripped>wrote:
> >>>>
> >>>>   Hi,
> >>>>
> >>>>>
> >>>>> In your second query, you seem to have MIN(date_time), but you are
> >>>>> talking about maximum. So your group by query is actually pulling
> >>>>> the minimum date for this recipe.
> >>>>>
> >>>>>
> >>>>>  I pasted the wrong query in. I get the same results regardless of
> >>>>> if I
> >>>> have
> >>>> MIN or MAX - I get the id of the max, but the date_time of the min.
> >>>>
> >>>>
> >>>>
> >>>>   On Mon, Sep 23, 2013 at 3:35 PM, Larry Martell <
> >>>> larry.martell@stripped>
> >>>>
> >>>>> **wrote:
> >>>>>
> >>>>>
> >>>>>   I want to find the rows from a table that have the max date_time
> >>>>> for
> >>>>>
> >>>>>> each
> >>>>>> recipe. I know I've done this before with group by and having,
> >>>>>> but I can't seem to get it to work now. I get the correct row id,
> >>>>>> but not the correct date_time. I'm sure I'm missing something
> >>>>>> simple.
> >>>>>>
> >>>>>> For purposes of showing an example, I'll use one recipe, 19166.
> >>>>>>
> >>>>>>
> >>>>>> For that recipe here's the row I would want:
> >>>>>>
> >>>>>> mysql> select id, MAX(date_time) from data_cstmeta  where
> >>>>>> mysql> recipe_id =
> >>>>>> 19166;
> >>>>>> +---------+-------------------****--+
> >>>>>> | id      | MAX(date_time)      |
> >>>>>> +---------+-------------------****--+
> >>>>>>
> >>>>>> | 1151701 | 2013-02-07 18:38:13 |
> >>>>>> +---------+-------------------****--+
> >>>>>>
> >>>>>> 1 row in set (0.01 sec)
> >>>>>>
> >>>>>> I would think this query would give me that - it gives me the
> >>>>>> correct id, but not the correct date_time:
> >>>>>>
> >>>>>> mysql> SELECT id, date_time as MaxDateTime  FROM data_cstmeta
> >>>>>> mysql> where
> >>>>>> recipe_id = 19166 group by recipe_id HAVING MIN(date_time);
> >>>>>> +---------+-------------------****--+
> >>>>>> | id      | MaxDateTime         |
> >>>>>> +---------+-------------------****--+
> >>>>>>
> >>>>>> | 1151701 | 2010-12-13 16:16:55 |
> >>>>>> +---------+-------------------****--+
> >>>>>>
> >>>>>> 1 row in set (0.01 sec)
> >>>>>>
> >>>>>> How can I fix this?
> >>>>>>
> >>>>>> Thanks!
> >>>>>> -larry
> >>>>>>
> >>>>>>
> >>>>>>  You have to do a two-stage match. One stage to find the MAX() of
> >>>>>> a
> >>> value
> >>> for each recipe_id, the other to match that MAX() to one or more
> >>> rows to give you the best ID values.
> >>>
> >>> Here's a subquery method of doing it.  There are many many others
> >>> (google for "groupwize maximum")
> >>>
> >>> SELECT a.id, b.MaxDateTime
> >>> FROM data_cstmeta a
> >>> INNER JOIN (
> >>>      SELECT MAX(date_time) MaxDateTime
> >>>      FROM data_cstmeta
> >>>      WHERE recipe_id = 19166
> >>> ) b
> >>>    on b.MaxDateTime = a.date_time
> >>> WHERE recipe_id = 19166;
> >>>
> >>>
> >>>  Having the recipe_id in the query was just to show an example. I
> >>> really
> >> want the id's with the max date for each recipe_id:
> >>
> >> This is what I changed it to, which works, but is too slow. I need to
> >> find a more efficient solution:
> >>
> >> SELECT d1.id, d1.date_time as MaxDateTime
> >>     FROM data_cstmeta d1
> >>     LEFT JOIN data_cstmeta d2 ON d1.recipe_id = d2.recipe_id AND
> >> d1.date_time < d2.date_time
> >>     WHERE d2.recipe_id IS NULL
> >>
> >>
> > As I said, there are many many ways to solve this problem. Here is one
> > that is going to perform much better for the generic case than what
> > you are doing.
> >
> > CREATE TEMPORARY TABLE tmpMaxDates (key(recipe_id,maxdatetime)) SELECT
> > recipe_id, max(date_time) maxdatetime FROM data_cstmeta GROUP BY
> > recipe_id;
> >
> > SELECT a.id, b.maxdatetime
> > FROM data_cstmeta a
> > INNER JOIN tmpMaxDates b
> >   on a.recipe_id = b.recipe_id
> >   and a.date_time = b.maxdatetime;
> >
> > DROP TEMPORARY TABLE tmpMaxDates;
> >
> >
> > Of course, an appropriate multi-column index on data_cstmeta would
> > also make your technique much faster than it is today.
> >
> >
> Thanks much Shawn! This ran in a few  seconds vs. 30 minutes for my
> solution.
Thread
Re: Problem with havingrob.pollock@gmail.com26 Sep