Eh? Don't you need two SELECTs to "Retrieve ... only one id ... and
list the unique ids ..."?
Rick James
MySQL Geeks - Consulting & Review
> -----Original Message-----
> From: Mariella Petrini [mailto:mariellapetrini@stripped]
> Sent: Sunday, September 14, 2008 8:04 PM
> To: Sergey Petrunia; internals@stripped; Rick James;
> mysql@stripped
> Subject: Re: SELECT DISTINCT with ORDER BY implementation
>
> Hi,
>
> Thanks a lot for your answers.
>
>
> >
> > SELECT ... FROM (SELECT ... ORDER BY x) tbl
> >
> > With MySQL, the inner ORDER BY just adds overhead, because
> > execution
> > proceeds as follows:
> >
> > * Run the inner query, making it to produce result in the
> > specified
> > ordering. Dump the result into a temporaty table.
> > * Run the upper query, using the temporary table as a
> > regular base table,
> > in particular assuming that the data in the temporary
> > table is not
> > ordered.
> >
> > That is, the ORDER BY just causes overhead and no benefit.
>
> Did I understand correctly that the ORDER BY executed in the
> inner query could get lost when the "result set" is dumped
> into the temporary table to be used by the upper query ?
>
>
>
> My original intent was the following:
>
> Retrieve from the "sort" table only one id (several rows with
> the same id match the WHERE condition)
> that would match the WHERE criteria and list the unique ids
> sorted by the "sort" field in DESC mode
>
>
> WHERE condition is :
> WHERE info1=7 AND info2=7 and info3=7 and
> (sort >='2008-10-10' AND sort <= '2008-11-11') .... )
>
> SORT is :
> ORDER BY sort DESC
>
>
>
>
>
>
> 1)
> The following query does not retrieve what I would like:
>
> SELECT DISTINCT (id) from sort_table WHERE info1=7 AND
> info2=7 and info3=7 and (sort >='2008-10-10' AND sort <=
> '2008-11-11') ORDER BY sort DESC)
> because in the case of DISTINCT the execution of DISTINCT is
> run before of the ORDER BY (so the unique rows are not
> ordered by sort field)
>
>
> 2)
> The following query it seems to retrieve only one row that
> matches the WHERE condition and the unique rows (unique ids)
> are ordered by the sort field DESC
>
>
> select DISTINCT(id) from (select id from
> sort_table FORCE INDEX
> (info1) where info1=7 AND info2=7 and info3=7 and (sort
> >='2008-10-10' AND sort <= '2008-11-11') ORDER BY sort DESC) as t
>
>
>
> 3) If there is a better way (more efficient and possibly
> without changing the structure of the table and possibly the index)
> to get the list of the unique ids
> that satisfy the WHERE condition and get them ordered by the
> 'sort' field DESC, I would be happy to follow it
>
>
> P.S. The index "info" is a combined index on (info1,info2,info3)
> P.S. The table contains one extra field (info4) that is listed
> in the WHERE condition as well
> So in reality the WHERE conditions is a little longer
> and looks like:
>
>
> WHERE info1=7 AND info2=7 and info3=7 and (sort
> >='2008-10-10' AND sort <= '2008-11-11') and (info4 >= ...
> and info4 <=...) ORDER BY sort DESC
>
>
>
> Thanks a lot for your help,
>
> Mariella
>
>
> --- On Fri, 9/12/08, Sergey Petrunia <spetrunia@stripped> wrote:
>
> > From: Sergey Petrunia <spetrunia@stripped>
> > Subject: Re: SELECT DISTINCT with ORDER BY implementation
> > To: mariellapetrini@stripped
> > Cc: internals@stripped, "Rick James"
> <rjames@stripped>, mysql@stripped
> > Date: Friday, September 12, 2008, 1:24 AM
> > Hi Mariella,
> >
> > Short answers:
> > * you're seeing an EXPLAIN bug
> > * FROM (SELECT ... ORDER BY) just adds overhead.
> >
> > Long answer:
> >
> > #
> > # Filling the table
> > #
> > create table t0 (a int);
> > insert into t0 values
> > (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
> > # insert 100K rows
> > insert into sort_table select A.a, now(), B.a, C.a, D.a
> > from t0 A, t0
> > B, t0 C, t0 D, t0 E;
> >
> > #
> > # Pick the constants such that conditions on info(1,2,3)
> > match 100 records,
> > # and condition on sort matches nothing.
> > #
> > mysql> explain select DISTINCT(id) from (select id from
> > sort_table
> > FORCE INDEX (info1) where info1=7 AND info2=7 and info3=7
> > and (sort
> > >='2008-10-10' AND sort <=
> > '2008-11-11') ORDER BY sort DESC) t\G
> > *************************** 1. row
> > ***************************
> > id: 1
> > select_type: PRIMARY
> > table: <derived2>
> > type: system
> > possible_keys: NULL
> > key: NULL
> > key_len: NULL
> > ref: NULL
> > rows: 0
> > Extra: const row not found
> > *************************** 2. row
> > ***************************
> > id: 2
> > select_type: DERIVED
> > table: sort_table
> > type: ALL
> > possible_keys: info1
> > key: info1
> > key_len: 12
> > ref:
> > rows: 99
> > Extra: Using filesort
> > 2 rows in set (0.01 sec)
> >
> > #
> > # Now analyze the counters:
> > #
> > mysql> flush status;
> > Query OK, 0 rows affected (0.01 sec)
> >
> > mysql> select DISTINCT(id) from (select id from
> > sort_table FORCE INDEX
> > (info1) where info1=7 AND info2=7 and info3=7 and (sort
> > >='2008-10-10' AND sort <=
> > '2008-11-11') ORDER BY sort DESC) t\G
> > Empty set (0.01 sec)
> >
> > mysql> show status like 'handler%';
> > +----------------------------+-------+
> > | Variable_name | Value |
> > +----------------------------+-------+
> > | Handler_commit | 0 |
> > | Handler_delete | 0 |
> > | Handler_discover | 0 |
> > | Handler_prepare | 0 |
> > | Handler_read_first | 1 |
> > | Handler_read_key | 1 |
> > | Handler_read_next | 100 | ---(1)
> > | Handler_read_prev | 0 |
> > | Handler_read_rnd | 0 |
> > | Handler_read_rnd_next | 1 | ---(2)
> > | Handler_rollback | 0 |
> > | Handler_savepoint | 0 |
> > | Handler_savepoint_rollback | 0 |
> > | Handler_update | 0 |
> > | Handler_write | 14 |
> > +----------------------------+-------+
> > 15 rows in set (0.00 sec)
> >
> > When we look at (1) and (2) it is a apparent that
> > * Full table scan wasn't done. If it was, we'd see
> > 100K reads
> > * Instead it did an index scan, and got 100 records. This
> > is exactly how
> > many records the range scan over "info1=7 AND
> > info2=7 and info3=7" would
> > produce.
> > This leads us to conclude that this is an EXPLAIN bug.
> > I'm going to file
> > this.
> >
> >
> > Btw, what was your intent in having a query in form
> >
> > SELECT ... FROM (SELECT ... ORDER BY x) tbl
> >
> > With MySQL, the inner ORDER BY just adds overhead, because
> > execution
> > proceeds as follows:
> >
> > * Run the inner query, making it to produce result in the
> > specified
> > ordering. Dump the result into a temporaty table.
> > * Run the upper query, using the temporary table as a
> > regular base table,
> > in particular assuming that the data in the temporary
> > table is not
> > ordered.
> >
> > That is, the ORDER BY just causes overhead and no benefit.
> >
> > BR
> > Sergey
> > --
> > Sergey Petrunia, Lead Software Engineer
> > MySQL AB, www.mysql.com
> > Office: N/A
> > Blog: http://s.petrunia.net/blog
> >
> > --
> > MySQL Internals Mailing List
> > For list archives: http://lists.mysql.com/internals
> > To unsubscribe:
> > http://lists.mysql.com/internals?unsub=1
>
>
>
>