List:General Discussion« Previous MessageNext Message »
From:Rick James Date:September 15 2008 4:44pm
Subject:RE: SELECT DISTINCT with ORDER BY implementation
View as plain text  
Eh?  Don't you need two SELECTs to "Retrieve ... only one id ... and
list the unique ids ..."?

 
Rick James
MySQL Geeks - Consulting & Review

 

> -----Original Message-----
> From: Mariella Petrini [mailto:mariellapetrini@stripped] 
> Sent: Sunday, September 14, 2008 8:04 PM
> To: Sergey Petrunia; internals@stripped; Rick James; 
> mysql@stripped
> Subject: Re: SELECT DISTINCT with ORDER BY implementation
> 
> Hi,
> 
> Thanks a lot for your answers.
> 
> 
> > 
> >   SELECT ... FROM (SELECT ... ORDER BY x) tbl
> > 
> > With MySQL, the inner ORDER BY just adds overhead, because
> > execution
> > proceeds as follows:
> > 
> >  * Run the inner query, making it to produce result in the
> > specified
> >    ordering. Dump the result into a temporaty table.
> >  * Run the upper query, using the temporary table as a
> > regular base table,
> >    in particular assuming that the data in the temporary
> > table is not
> >    ordered.
> > 
> > That is, the ORDER BY just causes overhead and no benefit.
> 
> Did I understand correctly that the ORDER BY  executed in the 
> inner query could get lost when the "result set" is dumped 
> into the temporary table to be used by the upper query ? 
> 
> 
> 
> My original intent was the following:
> 
> Retrieve from the "sort" table only one id (several rows with 
> the same id match the WHERE condition)
> that would match the WHERE criteria and list the unique ids 
> sorted by the "sort" field in DESC mode
> 
> 
> WHERE condition is :
> WHERE info1=7 AND info2=7 and info3=7 and 
> (sort >='2008-10-10' AND sort <= '2008-11-11') .... ) 
> 
> SORT is :
> ORDER BY sort DESC
> 
> 
> 
> 
> 
> 
> 1)
> The following query does not retrieve what I would like:
> 
> SELECT DISTINCT (id) from sort_table WHERE info1=7 AND 
> info2=7 and info3=7 and  (sort >='2008-10-10' AND sort <= 
> '2008-11-11') ORDER BY sort DESC)
> because in the case of DISTINCT the execution of DISTINCT is 
> run before of the ORDER BY (so the unique rows are not 
> ordered by sort field)
> 
> 
> 2)
> The following query it seems to retrieve only one row  that 
> matches the WHERE condition and the unique rows (unique ids) 
> are ordered by the sort field DESC
> 
> 
> select DISTINCT(id) from (select id from
> sort_table FORCE INDEX
> (info1) where info1=7 AND info2=7 and info3=7 and  (sort 
> >='2008-10-10' AND sort <= '2008-11-11') ORDER BY sort DESC) as t
> 
> 
> 
> 3) If there is a better way (more efficient and possibly 
> without changing the structure of the table and possibly the index) 
> to get the list of the unique ids 
> that satisfy the WHERE condition and get them ordered by the 
> 'sort' field DESC, I would be happy to follow it
> 
> 
> P.S. The index "info" is a combined index on (info1,info2,info3)
> P.S. The table contains one extra field (info4) that is listed 
>      in the WHERE condition as well
>      So in reality the WHERE conditions is a little longer 
> and looks like:
> 
> 
> WHERE info1=7 AND info2=7 and info3=7 and  (sort 
> >='2008-10-10' AND sort <= '2008-11-11') and (info4 >= ... 
> and info4 <=...) ORDER BY sort DESC
> 
> 
> 
> Thanks a lot for your help,
> 
> Mariella
> 
> 
> --- On Fri, 9/12/08, Sergey Petrunia <spetrunia@stripped> wrote:
> 
> > From: Sergey Petrunia <spetrunia@stripped>
> > Subject: Re: SELECT DISTINCT with ORDER BY implementation
> > To: mariellapetrini@stripped
> > Cc: internals@stripped, "Rick James" 
> <rjames@stripped>, mysql@stripped
> > Date: Friday, September 12, 2008, 1:24 AM
> > Hi Mariella,
> > 
> > Short answers:
> >  * you're seeing an EXPLAIN bug
> >  * FROM (SELECT ... ORDER BY) just adds overhead.
> > 
> > Long answer:
> > 
> > #
> > #  Filling the table
> > #
> > create table t0 (a int);
> > insert into t0 values
> > (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
> > # insert 100K rows
> > insert into sort_table select A.a, now(), B.a, C.a, D.a
> > from t0 A, t0
> > B, t0 C, t0 D, t0 E;
> > 
> > #
> > # Pick the constants such that conditions on info(1,2,3)
> > match 100 records,
> > # and condition on sort matches nothing.
> > #
> > mysql> explain select DISTINCT(id) from (select id from
> > sort_table
> > FORCE INDEX (info1) where info1=7 AND info2=7 and info3=7
> > and  (sort
> > >='2008-10-10' AND sort <=
> > '2008-11-11') ORDER BY sort DESC) t\G
> > *************************** 1. row
> > ***************************
> >            id: 1
> >   select_type: PRIMARY
> >         table: <derived2>
> >          type: system
> > possible_keys: NULL
> >           key: NULL
> >       key_len: NULL
> >           ref: NULL
> >          rows: 0
> >         Extra: const row not found
> > *************************** 2. row
> > ***************************
> >            id: 2
> >   select_type: DERIVED
> >         table: sort_table
> >          type: ALL
> > possible_keys: info1
> >           key: info1
> >       key_len: 12
> >           ref:
> >          rows: 99
> >         Extra: Using filesort
> > 2 rows in set (0.01 sec)
> > 
> > #
> > # Now analyze the counters:
> > #
> > mysql> flush status;
> > Query OK, 0 rows affected (0.01 sec)
> > 
> > mysql> select DISTINCT(id) from (select id from
> > sort_table FORCE INDEX
> > (info1) where info1=7 AND info2=7 and info3=7 and  (sort
> > >='2008-10-10' AND sort <=
> > '2008-11-11') ORDER BY sort DESC) t\G
> > Empty set (0.01 sec)
> > 
> > mysql> show status like 'handler%';
> > +----------------------------+-------+
> > | Variable_name              | Value |
> > +----------------------------+-------+
> > | Handler_commit             | 0     |
> > | Handler_delete             | 0     |
> > | Handler_discover           | 0     |
> > | Handler_prepare            | 0     |
> > | Handler_read_first         | 1     |
> > | Handler_read_key           | 1     |
> > | Handler_read_next          | 100   | ---(1)
> > | Handler_read_prev          | 0     |
> > | Handler_read_rnd           | 0     |
> > | Handler_read_rnd_next      | 1     | ---(2)
> > | Handler_rollback           | 0     |
> > | Handler_savepoint          | 0     |
> > | Handler_savepoint_rollback | 0     |
> > | Handler_update             | 0     |
> > | Handler_write              | 14    |
> > +----------------------------+-------+
> > 15 rows in set (0.00 sec)
> > 
> > When we look at (1) and (2) it is a apparent that
> > * Full table scan wasn't done. If it was, we'd see
> > 100K reads
> > * Instead it did an index scan, and got 100 records. This
> > is exactly how
> >   many records the range scan over "info1=7 AND
> > info2=7 and info3=7" would
> >   produce.
> > This leads us to conclude that this is an EXPLAIN bug.
> > I'm going to file
> > this.
> > 
> > 
> > Btw, what was your intent in having a query in form
> > 
> >   SELECT ... FROM (SELECT ... ORDER BY x) tbl
> > 
> > With MySQL, the inner ORDER BY just adds overhead, because
> > execution
> > proceeds as follows:
> > 
> >  * Run the inner query, making it to produce result in the
> > specified
> >    ordering. Dump the result into a temporaty table.
> >  * Run the upper query, using the temporary table as a
> > regular base table,
> >    in particular assuming that the data in the temporary
> > table is not
> >    ordered.
> > 
> > That is, the ORDER BY just causes overhead and no benefit.
> > 
> > BR
> >  Sergey
> > --
> > Sergey Petrunia, Lead Software Engineer
> > MySQL AB, www.mysql.com
> > Office: N/A
> > Blog: http://s.petrunia.net/blog
> > 
> > -- 
> > MySQL Internals Mailing List
> > For list archives: http://lists.mysql.com/internals
> > To unsubscribe:   
> > http://lists.mysql.com/internals?unsub=1
> 
> 
>       
> 
Thread
SELECT DISTINCT with ORDER BY implementationMariella Petrini10 Sep
  • RE: SELECT DISTINCT with ORDER BY implementationRick James10 Sep
    • RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini10 Sep
RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini11 Sep
  • Re: SELECT DISTINCT with ORDER BY implementationSergey Petrunia12 Sep
RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini11 Sep