List:General Discussion« Previous MessageNext Message »
From:Daevid Vincent Date:February 23 2012 10:50pm
Subject:USING WHERE; USING TEMPORARY; USING filesort
View as plain text  
Anyone have any thoughts on how I might optimize this query? It takes about
2 seconds. I know that seems quick, but we get nearly 30,000 hits per second
and so if we can shave this down, it helps. Also we do use memcache, but
even with that, we still see this in the slow-log sometimes. I have indexes
on everything used in this query and even a compound one as you see in the
EXPLAIN.

I'm not going to lose sleep over it, but I thought if there was something
simple or a way to refactor I'd give it a shot. I thought changing the query
to use JOIN ON syntax would have helped, but it didn't do anything really?!
Also, this is all being used with PHP, so I'm fine with pulling things out
into two or three queries if you suggest it will make a faster difference.

-- old query:

SELECT sg.`scene_id`, COUNT(*) AS num FROM `scenes_list` AS s, `dvds` AS d,
`scenes_genres` AS sg, `scenes_downloads_new` AS sd WHERE sg.`genre_id` IN
('1','8','10','19','38','58','65') AND d.`dvd_id`=s.`dvd_id` AND
sg.`scene_id`=s.`scene_id` AND d.`status`='ok' AND
d.`date_release`!='0000-00-00' AND sd.`scene_id`=s.`scene_id` GROUP BY
sg.`scene_id` ORDER BY num DESC, sd.`count_dl_monthly` DESC LIMIT 30;

-- refactored: is this correct order?? Smallest table and most filters first
right to narrow the dataset as quick as possible?

-- EXPLAIN
SELECT 
        s.`scene_id`,
        COUNT(*) AS num 
FROM 	`dvds` AS d
	JOIN `scenes_list` AS s ON  d.`dvd_id` = s.`dvd_id`
		AND d.`date_release` != '0000-00-00' 
		AND d.`status` = 'ok' 
	JOIN `scenes_genres` AS sg ON sg.`scene_id` = s.`scene_id`
		AND sg.`genre_id` IN ('1', '8', '10', '19', '38', '58',
'65') 
	JOIN `scenes_downloads_new` AS sd ON sd.`scene_id` = s.`scene_id`
GROUP BY sg.`scene_id` 
ORDER BY num DESC, sd.`count_dl_monthly` DESC 
LIMIT 30
;

SELECT COUNT(*) FROM dvds; -- 12181
SELECT COUNT(*) FROM scenes_downloads_new; -- 66054
SELECT COUNT(*) FROM scenes_list; -- 67197
SELECT COUNT(*) FROM scenes_genres; -- 344272

-- why do I still hit the filesort and temp table? how can I get rid of
that?

    id  select_type  TABLE   TYPE    possible_keys
KEY             key_len  ref                          ROWS  Extra

------  -----------  ------  ------
------------------------------------------------------  --------------
-------  -------------------------  ------
----------------------------------------------
     1  SIMPLE       d       ref
PRIMARY,date_release,STATUS,status_release,dvd_release  status_release  1
const                        2436  USING WHERE; USING TEMPORARY; USING
filesort  
     1  SIMPLE       s       ref     PRIMARY,dvd_id_2,dvd_id
dvd_id          4        .d.dvd_id         6  USING WHERE

     1  SIMPLE       sd      eq_ref  PRIMARY
PRIMARY         3        .s.scene_id       1  USING WHERE

     1  SIMPLE       sg      ref     PRIMARY,scene_id,genre_id
scene_id        4        .s.scene_id       5  USING WHERE



If I take off the ORDER BY (just to see what happens), basically it's
exactly the same EXPLAIN output (wrong results of course), but it does add
"Using index" to the scene_id row (weird).

If I take off the GROUP BY (just to see what happens), basically it's
exactly the same EXPLAIN output (wrong results of course), but it does
remove "Using Filesort".

Taking them both off is optimal I suspect, and again, barring the fact the
results are wrong, it takes 1 second for the query. Should I read that to
mean, "it is what it is and that's the best I can expect from that
multi-join query"? Is there any benefit to splitting this up and if so, how
should I split it?

d

Thread
USING WHERE; USING TEMPORARY; USING filesortDaevid Vincent23 Feb
  • Re: USING WHERE; USING TEMPORARY; USING filesortJohan De Meersman24 Feb
  • Re: USING WHERE; USING TEMPORARY; USING filesortPerrin Harkins24 Feb