List:General Discussion« Previous MessageNext Message »
From:Dan Nelson Date:June 19 2009 6:18am
Subject:Re: How to Optimize distinct with index
View as plain text  
Please don't change the subject on someone else's thread.  Next time, post a
new message instead of hitting "reply" on an unrelated message.

In the last episode (Jun 19):
> Hi,
> 	I have a sql : 
> 		Select distinct user_id from user where key1=value and
> key2=value2 and key3=value2;
> 
> I add index on (key1,key2,key3,user_id), this sql use temporary table
> however.  I have thousands of queries per second.  How to optimize it?

Because of the "distinct" clause, mysql has to remember all of the "user_id"
values during the query so it can remove duplicates.  You do have an index
containing all of your fields, which certainly helps performance, but
because the index is sorted with user_id last, mysql can't use that index to
perform the "distinct" operation.  Imagine your query returns 1000 rows with
998 unique usernames that happen to have key1 values from 2..999, and one
duplicate username that happens to have rows with key1=1 and key1=1000. 
Because it's using the index to fetch data, rows will be sorted by key1, and
the duplicate name will be in the first and last rows.  Mysql needs to store
the names in a temporary table to be able to remove the duplicates.

Now, if user_id were first, mysql could use it directly to remove duplicates
(since it would see duplicate names next to each other), but it wouldn't be
able to use that index in your where clause..  :( You can't win in this
case.

Luckily, temporary tables aren't bad as long as they are small and mysql
doesn't have to write them to disk.  To be sure, run "show status like
'created_%'" before and after a query and see if the Created_tmp_disk_tables
number increases.  As long as the temp tables stay in RAM, your query will
be efficient.

http://dev.mysql.com/doc/refman/5.1/en/server-status-variables.html#statvar_Created_tmp_disk_tables

> Anthoer question:
>
> Select * from user where user_id in(id1,id2,id3,id4,.....) order by
> user_id; I add index on user_id, but after in, order use temporary table,
> How to optimize it?

Mysql should have been able to use the index here, I think.  Please post the
output of "create table user", a sample query, and its EXPLAIN output.

-- 
	Dan Nelson
	dnelson@stripped
Thread
Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Matt Neimeyer19 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Johnny Withers19 Jun
    • How to Optimize distinct with index周彦伟19 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Dan Nelson19 Jun
    • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Matt Neimeyer19 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Brent Baisley19 Jun
    • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Matt Neimeyer22 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Peter Brawley20 Jun
    • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Walter Heck - OlinData.com20 Jun
Re: How to Optimize distinct with indexDan Nelson19 Jun
Re: How to Optimize distinct with indexDarryle Steplight19 Jun
  • Re: How to Optimize distinct with indexMoon's Father26 Jun