List:General Discussion« Previous MessageNext Message »
From:Jay Pipes Date:October 23 2007 12:50am
Subject:Re: Slow Subquery
View as plain text  
Indeed, as you say, Brent, correlated subqueries are not well-optimized 
in MySQL.  The specific subquery (the IN() subquery) demonstrated in the 
original post is, however, optimized in MySQL 6.0 :)

More comments inline.

Brent Baisley wrote:
> You are using a correlated subquery, which MySQL is terrible at. 
> Whenever you find yourself doing a correlated subquery, see if you can 
> switch it to a derived table with a join, which MySQL is far better at. 
> A derived table is like a "virtual" table you create on the fly. It's 
> very simple, just assign a name to your query and then treat it as if it 
> is a regular table.

Actually, in this case, no need for a derived table.  A simple join will 
suffice:

SELECT * FROM projects p
JOIN project_tags pt ON p.project_id = pt.project_id
JOIN tags t ON pt.tag_id = t.tag_id
WHERE tags.name='foo';

Make sure you've got indexes on p (project_id), pt (project_id, tag_id), 
t (name)

Cheers,

Jay

> So your query would look something like this:
> SELECT projects.* FROM projects
> JOIN (SELECT projects_tags.project_id FROM tags, projects_tags WHERE 
> tags.name='foo' AND projects_tags.project_id=projects.id) AS ptagids
> ON project.id=ptagids.project_id
> 
> Your IN has become a JOIN and mysql optimizes it far better.
> 
> On Oct 19, 2007, at 6:57 PM, Ryan Bates wrote:
> 
>> I'm trying to determine why a subquery is slower than running two 
>> separate queries. I have a simple many-to-many association using 3 
>> tables: projects, tags and projects_tags. Here's the query I'm using 
>> to find the projects with a given tag:
>>
>> SELECT * FROM projects WHERE id IN (SELECT projects_tags.project_id 
>> FROM tags, projects_tags WHERE tags.name='foo' AND 
>> projects_tags.project_id=projects.id);
>> (0.36 sec)
>>
>> Compare that with splitting it into two queries:
>>
>> SELECT projects_tags.project_id FROM tags, projects_tags WHERE 
>> tags.name='foo' AND projects_tags.project_id=projects.id
>> (0.00 sec) /* returns 1, 2, 3 */
>>
>> SELECT * FROM projects WHERE id IN (1, 2, 3);
>> (0.00 sec)
>>
>> Why is it so much faster? Looking at the explain statement (below) of 
>> the one with the subquery, it appears it's not using the primary key 
>> index on the projects table. Why is it that MySQL doesn't perform this 
>> simple optimization? And is there a solution that will allow me to 
>> still use a subquery?
>>
>> I realize I can use a join instead of a subquery, but this is a 
>> simplified example.
>>
>>
>> Here's the explain statement:
>>
>> *************************** 1. row ***************************
>>            id: 1
>>   select_type: PRIMARY
>>         table: projects
>>          type: ALL
>> possible_keys: NULL
>>           key: NULL
>>       key_len: NULL
>>           ref: NULL
>>          rows: 15433
>>         Extra: Using where
>> *************************** 2. row ***************************
>>            id: 2
>>   select_type: DEPENDENT SUBQUERY
>>         table: tags
>>          type: ref
>> possible_keys: PRIMARY,index_tags_on_name
>>           key: index_tags_on_name
>>       key_len: 258
>>           ref: const
>>          rows: 1
>>         Extra: Using where; Using index
>> *************************** 3. row ***************************
>>            id: 2
>>   select_type: DEPENDENT SUBQUERY
>>         table: projects_tags
>>          type: ref
>> possible_keys: tag_id
>>           key: tag_id
>>       key_len: 5
>>           ref: my_database.tags.id
>>          rows: 10
>>         Extra: Using where
>>
>>
>> Here's the table dumps:
>>
>> CREATE TABLE `projects` (
>>   `id` int(11) NOT NULL auto_increment,
>>   `name` varchar(255) default NULL,
>>   PRIMARY KEY  (`id`)
>> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
>>
>> CREATE TABLE `tags` (
>>   `id` int(11) NOT NULL auto_increment,
>>   `name` varchar(255) default NULL,
>>   PRIMARY KEY  (`id`),
>>   KEY `index_tags_on_name` (`name`)
>> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
>>
>> CREATE TABLE `projects_tags` (
>>   `project_id` int(11) default NULL,
>>   `tag_id` int(11) default NULL,
>>   KEY `tag_id` (`tag_id`)
>> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
>>
>>
>> I'm using MySQL 5.0.37. Thanks in advance.
>>
>> Ryan
>>
>> -- 
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:    http://lists.mysql.com/mysql?unsub=1
>>
> 
> 

Thread
Slow SubqueryRyan Bates20 Oct
  • Re: Slow SubqueryPeter Brawley20 Oct
  • Re: Slow SubqueryBaron Schwartz20 Oct
  • Re: Slow SubqueryBrent Baisley23 Oct
    • Re: Slow SubqueryJay Pipes23 Oct