| List: | General Discussion | « Previous MessageNext Message » | |
| From: | Shawn Green (MySQL) | Date: | January 21 2011 9:01pm |
| Subject: | Re: Slow query on MySQL4 server doing simple inner join of two InnoDB tables | ||
| View as plain text | |||
On 1/21/2011 14:21, Kendall Gifford wrote: > Hello everyone, I've got a database on an old Fedora Core 4 server running > MySQL 4 (mysql-server.x86_64 4.1.12-2.FC4.1). The database in question has > just two (InnoDB) tables: > > messages (approx 2.5 million records) > recipients (approx 6.5 million records) > > These track information about email messages. Each message "has many" > recipient records. The structure of the two tables (omitting irrelevant data > fields) are as follows: > > > +-------------+------------------+------+-----+---------------------+----------------+ > | Field | Type | Null | Key | Default | > Extra | > > +-------------+------------------+------+-----+---------------------+----------------+ > | id | int(10) unsigned | | PRI | NULL | > auto_increment | > | sent_at | datetime | | MUL | 0000-00-00 00:00:00 > | | > | ......................... OTHER FIELDS OMITTED FOR BREVITY > ....................... | > > +-------------+------------------+------+-----+---------------------+----------------+ > > > +-------------+------------------+------+-----+---------------------+----------------+ > | Field | Type | Null | Key | Default | > Extra | > > +-------------+------------------+------+-----+---------------------+----------------+ > | id | int(10) unsigned | | PRI | NULL | > auto_increment | > | message_id | int(10) unsigned | | MUL | 0 > | | > | employee_id | int(10) unsigned | YES | MUL | NULL > | | > | ......................... OTHER FIELDS OMITTED FOR BREVITY > ....................... | > > +-------------+------------------+------+-----+---------------------+----------------+ > > I have the following query that is just too slow: > >> SELECT messages.* FROM messages >> INNER JOIN recipients ON recipients.message_id = messages.id >> WHERE recipients.employee_id = X >> GROUP BY messages.id >> ORDER BY sent_at DESC >> LIMIT 0, 25; > > This takes about 44 seconds on average. The query explanation is as follows: > > > +----+-------------+------------+--------+--------------------------+--------------+---------+-------------------------------------+--------+----------------------------------------------+ > | id | select_type | table | type | possible_keys | > key | key_len | ref | rows | > Extra | > > +----+-------------+------------+--------+--------------------------+--------------+---------+-------------------------------------+--------+----------------------------------------------+ > | 1 | SIMPLE | recipients | ref | messages_fk,employee_idx | > employee_idx | 5 | const | 222640 | > Using where; Using temporary; Using filesort | > | 1 | SIMPLE | messages | eq_ref | PRIMARY | > PRIMARY | 4 | email_archive.recipients.message_id | 1 > | | > > +----+-------------+------------+--------+--------------------------+--------------+---------+-------------------------------------+--------+----------------------------------------------+ > > I've been doing some searching on the web and have no idea if/how this can > be sped up. Most searches these days reference MySQL 5.x which I'm just not > sure how much applies. I'm hoping that there is something obvious that I'm > missing, or that one of you experts knows what I might be able to change to > speed this query up. > > Anyhow, thanks in advance for even so much as reading my message, let alone > replying :). > You need to get rid of the GROUP BY to make this go faster. You can do that by running two queries, one to pick the list of unique recipients.message_id values that match your where condition then another to actually retrieve the message data. Something like this CREATE TEMPORARY TABLE tmpMessages (message_id INT UNSIGNED, PRIMARY KEY (message_id)) ENGINE=MEMORY; INSERT IGNORE tmpMessages SELECT message_id FROM recipients WHERE employee_id = X; SELECT messages.* FROM messages INNER JOIN tmpMessages ON tmpMessages.message_id = messages.id ORDER BY sent_at DESC LIMIT 0, 25; By pre-selecting a limited set of message_id values from the recipients table, you seriously reduce the number of rows that need to be scanned. Also, the INSERT IGNORE technique is faster than the GROUP BY because it uses an index to identify any duplicates instead of a scan of all previous unique values. Please let us all know if this is faster enough. (and don't forget to drop the temp table once you are through using it) -- Shawn Green MySQL Principal Technical Support Engineer Oracle USA, Inc. Office: Blountville, TN
