List:General Discussion« Previous MessageNext Message »
From:Sergey Petrunia Date:September 12 2008 8:24am
Subject:Re: SELECT DISTINCT with ORDER BY implementation
View as plain text  
Hi Mariella,

Short answers:
 * you're seeing an EXPLAIN bug
 * FROM (SELECT ... ORDER BY) just adds overhead.

Long answer:

#
#  Filling the table
#
create table t0 (a int);
insert into t0 values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
# insert 100K rows
insert into sort_table select A.a, now(), B.a, C.a, D.a from t0 A, t0
B, t0 C, t0 D, t0 E;

#
# Pick the constants such that conditions on info(1,2,3) match 100 records,
# and condition on sort matches nothing.
#
mysql> explain select DISTINCT(id) from (select id from sort_table
FORCE INDEX (info1) where info1=7 AND info2=7 and info3=7 and  (sort
>='2008-10-10' AND sort <= '2008-11-11') ORDER BY sort DESC) t\G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: system
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 0
        Extra: const row not found
*************************** 2. row ***************************
           id: 2
  select_type: DERIVED
        table: sort_table
         type: ALL
possible_keys: info1
          key: info1
      key_len: 12
          ref:
         rows: 99
        Extra: Using filesort
2 rows in set (0.01 sec)

#
# Now analyze the counters:
#
mysql> flush status;
Query OK, 0 rows affected (0.01 sec)

mysql> select DISTINCT(id) from (select id from sort_table FORCE INDEX
(info1) where info1=7 AND info2=7 and info3=7 and  (sort
>='2008-10-10' AND sort <= '2008-11-11') ORDER BY sort DESC) t\G
Empty set (0.01 sec)

mysql> show status like 'handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 0     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 1     |
| Handler_read_key           | 1     |
| Handler_read_next          | 100   | ---(1)
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 1     | ---(2)
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 14    |
+----------------------------+-------+
15 rows in set (0.00 sec)

When we look at (1) and (2) it is a apparent that
* Full table scan wasn't done. If it was, we'd see 100K reads
* Instead it did an index scan, and got 100 records. This is exactly how
  many records the range scan over "info1=7 AND info2=7 and info3=7" would
  produce.
This leads us to conclude that this is an EXPLAIN bug. I'm going to file
this.


Btw, what was your intent in having a query in form

  SELECT ... FROM (SELECT ... ORDER BY x) tbl

With MySQL, the inner ORDER BY just adds overhead, because execution
proceeds as follows:

 * Run the inner query, making it to produce result in the specified
   ordering. Dump the result into a temporaty table.
 * Run the upper query, using the temporary table as a regular base table,
   in particular assuming that the data in the temporary table is not
   ordered.

That is, the ORDER BY just causes overhead and no benefit.

BR
 Sergey
--
Sergey Petrunia, Lead Software Engineer
MySQL AB, www.mysql.com
Office: N/A
Blog: http://s.petrunia.net/blog
Thread
SELECT DISTINCT with ORDER BY implementationMariella Petrini10 Sep
  • RE: SELECT DISTINCT with ORDER BY implementationRick James10 Sep
    • RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini10 Sep
RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini11 Sep
  • Re: SELECT DISTINCT with ORDER BY implementationSergey Petrunia12 Sep
RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini11 Sep