MySQL Lists are EOL. Please join:

List:General Discussion« Previous MessageNext Message »
From:Mariella Petrini Date:September 10 2008 4:16am
Subject:SELECT DISTINCT with ORDER BY implementation
View as plain text  
Hello,

I would like to ask a couple of questions related to SELECT DISTINCT and ORDER BY in MySQL
5.0.x and 5.1.x

1)
Given a table 

CREATE TABLE `sort_table` (
  `id` int(10) unsigned NOT NULL,
  `sort` datetime NOT NULL,
  `info1` int(10) unsigned NOT NULL,
  `info2` int(10) unsigned NOT NULL,
  `info3` int(10) unsigned NOT NULL,
  KEY `info1` (`info1`,`info2`,`info3`),
  KEY `info4` (`info1`,`info2`,`info3`,`sort`)
) ENGINE=... DEFAULT CHARSET=utf8;

and executing 

SELECT DISTINCT (id) from sort_table WHERE info1=.. and info2=... and info3=...AND (sort
>= '...') and (sort <= '...') ORDER BY sort DESC

Is that query always implemented as (like )

SELECT id from sort_table WHERE info1=.. and info2=... and info3=...AND (sort >= '...')
and (sort <= '...') 
GROUP BY id ORDER BY sort DESC


2)
The sort_table stats are up-to-date (ANALYZE (and OPTIMIZE) command has been run)
When executing the following query

select DISTINCT(id)  from (select id from sort_table  where info1=.. AND info2=.. and
info3=.. and  (sort >='.....' AND sort <= '.....') ORDER BY sort DESC) t;

a)

If the index info4 (`info1`,`info2`,`info3`,`sort`) is chosen
then the bahaviour of the SIMPLE Query is identical to the DERIVED Query
(select_type=DERIVED)

explain select id from sort_table FORCE INDEX (info4)  where info1=... AND info2=... and
info3=... and  (sort >='.....' AND sort <= '......') ORDER BY sort DESC;
+----+-------------+------------+-------+---------------+-------+---------+------+------+-------------+
| id | select_type | table      | type  | possible_keys | key   | key_len | ref  | rows |
Extra       |
+----+-------------+------------+-------+---------------+-------+---------+------+------+-------------+
|  1 | SIMPLE      | sort_table | range | info4         | info4 | 20      | NULL |   16 |
Using where | 
+----+-------------+------------+-------+---------------+-------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select DISTINCT(id)  from (select id from sort_table FORCE INDEX (info4)
>  where info1=... AND info2=.. and info3=... and  (sort >='......' AND sort <=
> '......') ORDER BY sort DESC) t;
+----+-------------+------------+-------+---------------+-------+---------+------+------+-----------------+
| id | select_type | table      | type  | possible_keys | key   | key_len | ref  | rows |
Extra           |
+----+-------------+------------+-------+---------------+-------+---------+------+------+-----------------+
|  1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL  | NULL    | NULL |  
16 | Using temporary | 
|  2 | DERIVED     | sort_table | range | info4         | info4 | 20      | NULL |   16 |
Using where     | 
+----+-------------+------------+-------+---------------+-------+---------+------+------+-----------------+


b)
In the case the index "info1" is chosen
select DISTINCT(id)  from (select id from sort_table FORCE INDEX (info1)  where info1=..
AND info2=.. and info3=.. and  (sort >='.....' AND sort <= '.....') ORDER BY sort
DESC) t;

The EXPLAIN shows
+----+-------------+------------+------+---------------+-------+---------+------+------+-----------------+
| id | select_type | table      | type | possible_keys | key   | key_len | ref  | rows |
Extra           |
+----+-------------+------------+------+---------------+-------+---------+------+------+-----------------+
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL  | NULL    | NULL |  
16 | Using temporary | 
|  2 | DERIVED     | sort_table | ALL  | info1         | info1 | 12      |      |   16 |
Using filesort  | 
+----+-------------+------------+------+---------------+-------+---------+------+------+-----------------+

In this case the "key" shows up as info1, but the type="ALL"

while the behaviour in a SIMPLE QUERY would be (type=ref)

EXPLAIN select id from sort_table FORCE INDEX (info1)  where info1=.. AND info2=.. and
info3=.. and  (sort >='.....' AND sort <= '.....') ORDER BY sort DESC

+----+-------------+------------+------+---------------+-------+---------+-------------------+------+-----------------------------+
| id | select_type | table      | type | possible_keys | key   | key_len | ref            
  | rows | Extra                       |
+----+-------------+------------+------+---------------+-------+---------+-------------------+------+-----------------------------+
|  1 | SIMPLE      | sort_table | ref  | info1         | info1 | 12      |
const,const,const |   16 | Using where; Using filesort | 
+----+-------------+------------+------+---------------+-------+---------+-------------------+------+-----------------------------+


Is there any way to force in b) case the same behaviour (optimization) of the SIMPLE query
in the DERIVED query (select_type=DERIVED)?
Is there a better way of implementing case b) (without using case a)) ?

Thanks in advance for your help,

Mariella







      
Thread
SELECT DISTINCT with ORDER BY implementationMariella Petrini10 Sep
  • RE: SELECT DISTINCT with ORDER BY implementationRick James10 Sep
    • RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini10 Sep
RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini11 Sep
  • Re: SELECT DISTINCT with ORDER BY implementationSergey Petrunia12 Sep
RE: SELECT DISTINCT with ORDER BY implementationMariella Petrini11 Sep