List:General Discussion« Previous MessageNext Message »
From:Ciaran Lee Date:September 24 2009 9:56pm
Subject:Re: query optimization question (my struggle against 'using
temporary; using filesort')
View as plain text  
Hi Gavin,Thanks very much, I'll implement as many of your suggestions as
possible. The varchar(255)'s are inexcusable and I feel suitably ashamed :)
The queries were generated by ActiveRecord (an ORM library for Ruby),
although even if I had written them myself they would probably not be much
better.
Regards,
Ciaran Lee

2009/9/24 Gavin Towey <gtowey@stripped>

> Hi Ciaran,
>
> So I think there's a couple things going on:
>
> 1. The explain plan for your "slow" query looks wrong, such as mysql is
> confused.  It's possible your index statistics are incorrect.  Try ANALYZE
> TABLE  on listings and addresses.
>
> I think a sure way to fix it is to add STRAIGHT_JOIN to force the join
> order.  That should get rid of the temp table and filesort operations and
> give faster results.
>
> SELECT
>  STRAIGHT_JOIN
>  listings.*, addresses.*
>  FROM
>    `listings`
>     JOIN
>      `addresses` ON `addresses`.addressable_id = `listings`.id
>      AND
>      `addresses`.addressable_type = 'Listing'
>  WHERE (addresses.parent_city_id = 3)
>  ORDER BY listings.id DESC LIMIT 1
>
>
>
> 2. I need to make some comments about your table design:
>
> This column is AWFUL:
> `addressable_type` varchar(255) DEFAULT NULL,
>
> Why have field that hold up to 255 characters and put a little string in it
> like "Listing?" Why does it matter?  Well it makes your indexes
> disasterously bloated:
>
> KEY `index_addresses_on_parent_city_id_and_addressable_type`
> (`parent_city_id`,`addressable_type`),
>
>
> If you noticed in the explain plan, that index is 733 bytes *per row*.
>  Especially using utf8 means each character takes 3 bytes in the index.
>  That's terrible. That type field should be a foreign key tinyint or at the
> very least be a much much shorter varchar field (such as 8 or 10)
>
> You have lots of those varchar(255) fields, which looks like lazy design --
> by not gathering correct requirements and designing accordingly you will
> hurt your database performance, waste disk space and cause yourself all
> kinds of future problems.
>
> 3.  Why are you using OUTER JOIN?
>
> It looks to me like you're using it because you don't know the difference,
> since you're not looking for NULL rows or anything.  In fact, it looks like
> mysql is smart enough to know that you've negated the OUTER JOIN by putting
> conditions on the joined tables in the WHERE clause, and convert then to
> INNER JOINS.  Don't rely on that!  Use the correct join type.
>
> Those queries
> Regards,
> Gavin Towey
>
> -----Original Message-----
> From: Ciaran Lee [mailto:ciaran.lee@stripped]
> Sent: Tuesday, September 22, 2009 1:32 PM
> To: mysql@stripped
> Subject: query optimization question (my struggle against 'using temporary;
> using filesort')
>
> Hi,
>
> I hope this is the right place to ask a question about query optimization.
>
> Background:
> I have a database which has events, which occur in places (listings).
> Places
> have addresses, and addresses belong to a city. I can select the latest
> event within a particular city very efficiently (less than 2ms), but
> selecting the latest listing within a city is REALLY slow (10-20 seconds)
> despite being almost a subset of the event query.
>
> I have been working on this for about a day, and have tried all sorts of
> tweaks to the indexes but to no avail. I always seem to end up with 'using
> temporary; using filesort' as the 'extra' content in the explain result. If
> anyone has a suggestion for what I might do to fix this, I'd really
> appreciate it. If not, I could further de-normalize the database for
> performance reasons, but I would feel dirty for doing so.
>
> Here is the fast query (select the latest event within a particular city),
> and it's explain.
> SELECT
>  events.*, listings.*, addresses.*
>  FROM
>    `events`
>    LEFT OUTER JOIN
>      `listings` ON `listings`.id = `events`.listing_id
>    LEFT OUTER JOIN
>      `addresses` ON `addresses`.addressable_id = `listings`.id
>      AND
>      `addresses`.addressable_type = 'Listing'
>  WHERE (addresses.parent_city_id = 3)
>  ORDER BY events.id DESC LIMIT 1
>
>
> 
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+
>  | id | select_type | table     | type   | possible_keys
>
>                           | key
>       | key_len | ref                          | rows | Extra       |
>
> 
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+
>  |  1 | SIMPLE      | events    | index  | index_events_on_listing_id
>
>                          | PRIMARY
>       | 4       | NULL                         |    1 |             |
>  |  1 | SIMPLE      | listings  | eq_ref | PRIMARY
>
>                           | PRIMARY
>       | 4       | ratemyarea.events.listing_id |    1 | Using where |
>  |  1 | SIMPLE      | addresses | ref    |
>
>
> index_addresses_on_parent_city_id_and_addressable_type,index_addresses_on_addressable_type_and_addressable_id,addressable_id_type_city
> | index_addresses_on_addressable_type_and_addressable_id | 773     | const,
> ratemyarea.listings.id |    1 | Using where |
>
> 
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+
>
>
> Here is the slow query (select the latest listing within a particular
> city),
> and it's explain
> SELECT
>  listings.*, addresses.*
>  FROM
>    `listings`
>    LEFT OUTER JOIN
>      `addresses` ON `addresses`.addressable_id = `listings`.id
>      AND
>      `addresses`.addressable_type = 'Listing'
>  WHERE (addresses.parent_city_id = 3)
>  ORDER BY listings.id DESC LIMIT 1
>
>
> 
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+
>  | id | select_type | table     | type   | possible_keys
>
>                           | key                      | key_len | ref
>                          | rows  | Extra
>     |
>
> 
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+
>  |  1 | SIMPLE      | addresses | ref    |
>
>
> index_addresses_on_parent_city_id_and_addressable_type,index_addresses_on_addressable_type_and_addressable_id,addressable_id_type_city
> | addressable_id_type_city | 773     | const,const
> |
> 25680 | Using where; Using temporary; Using filesort |
>  |  1 | SIMPLE      | listings  | eq_ref | PRIMARY
>
>                           | PRIMARY                  | 4       |
> ratemyarea.addresses.addressable_id |     1 |
>               |
>
> 
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+
>
>
> Here is the structure of the tables:
>
> CREATE TABLE `addresses` (
>  `id` int(11) NOT NULL AUTO_INCREMENT,
>  `postal_code` varchar(255) DEFAULT NULL,
>  `addressable_type` varchar(255) DEFAULT NULL,
>  `addressable_id` int(11) DEFAULT NULL,
>  `parent_city_id` int(11) DEFAULT NULL,
>  `lat` decimal(15,10) DEFAULT NULL,
>  `lng` decimal(15,10) DEFAULT NULL,
>  PRIMARY KEY (`id`),
>  KEY `index_addresses_on_lat_and_lng` (`lat`,`lng`),
>  KEY `index_addresses_on_parent_city_id_and_addressable_type`
> (`parent_city_id`,`addressable_type`),
>  KEY `index_addresses_on_addressable_type_and_addressable_id`
> (`addressable_type`,`addressable_id`)
> ) ENGINE=InnoDB AUTO_INCREMENT=120513 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table cities
> # ------------------------------------------------------------
>
> CREATE TABLE `cities` (
>  `id` int(11) NOT NULL AUTO_INCREMENT,
>  `name` varchar(255) NOT NULL,
>  `country_id` int(11) NOT NULL,
>  `lat` decimal(15,10) NOT NULL,
>  `lng` decimal(15,10) NOT NULL,
>  PRIMARY KEY (`id`),
>  KEY `index_cities_on_name` (`name`),
>  KEY `index_cities_on_country_id` (`country_id`),
>  KEY `index_cities_on_lat_and_lng` (`lat`,`lng`)
> ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table events
> # ------------------------------------------------------------
>
> CREATE TABLE `events` (
>  `id` int(11) NOT NULL AUTO_INCREMENT,
>  `listing_id` int(11) NOT NULL,
>  `description` text,
>  `title` varchar(150) NOT NULL,
>  PRIMARY KEY (`id`),
>  KEY `index_events_on_title` (`title`),
>  KEY `index_events_on_listing_id` (`listing_id`)
> ) ENGINE=InnoDB AUTO_INCREMENT=6359 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table listings
> # ------------------------------------------------------------
>
> CREATE TABLE `listings` (
>  `id` int(11) NOT NULL AUTO_INCREMENT,
>  `name` varchar(255) DEFAULT NULL,
>  `latitude` decimal(15,10) DEFAULT NULL,
>  `longitude` decimal(15,10) DEFAULT NULL,
>  `description` text,
>  PRIMARY KEY (`id`),
>  KEY `index_listings_on_place` (`name`),
>  KEY `index_listings_on_latitude_and_longitude` (`latitude`,`longitude`)
> ) ENGINE=InnoDB AUTO_INCREMENT=109358 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table users
> # ------------------------------------------------------------
>
> CREATE TABLE `users` (
>  `id` int(11) NOT NULL AUTO_INCREMENT,
>  `email` varchar(255) DEFAULT NULL,
>  PRIMARY KEY (`id`),
>  KEY `index_users_on_email` (`email`),
>  KEY `index_users_on_latitude_and_longitude` (`latitude`,`longitude`)
> ) ENGINE=InnoDB AUTO_INCREMENT=1972 DEFAULT CHARSET=utf8;
>
> The information contained in this transmission may contain privileged and
> confidential information. It is intended only for the use of the person(s)
> named above. If you are not the intended recipient, you are hereby notified
> that any review, dissemination, distribution or duplication of this
> communication is strictly prohibited. If you are not the intended recipient,
> please contact the sender by reply email and destroy all copies of the
> original message.
>

Thread
query optimization question (my struggle against 'using temporary; using filesort')Ciaran Lee22 Sep
  • RE: query optimization question (my struggle against 'usingtemporary; using filesort')Gavin Towey24 Sep
    • Re: query optimization question (my struggle against 'using temporary; using filesort')Ciaran Lee24 Sep