Hi Gavin,Thanks very much, I'll implement as many of your suggestions as
possible. The varchar(255)'s are inexcusable and I feel suitably ashamed :)
The queries were generated by ActiveRecord (an ORM library for Ruby),
although even if I had written them myself they would probably not be much
better.
Regards,
Ciaran Lee
2009/9/24 Gavin Towey <gtowey@stripped>
> Hi Ciaran,
>
> So I think there's a couple things going on:
>
> 1. The explain plan for your "slow" query looks wrong, such as mysql is
> confused. It's possible your index statistics are incorrect. Try ANALYZE
> TABLE on listings and addresses.
>
> I think a sure way to fix it is to add STRAIGHT_JOIN to force the join
> order. That should get rid of the temp table and filesort operations and
> give faster results.
>
> SELECT
> STRAIGHT_JOIN
> listings.*, addresses.*
> FROM
> `listings`
> JOIN
> `addresses` ON `addresses`.addressable_id = `listings`.id
> AND
> `addresses`.addressable_type = 'Listing'
> WHERE (addresses.parent_city_id = 3)
> ORDER BY listings.id DESC LIMIT 1
>
>
>
> 2. I need to make some comments about your table design:
>
> This column is AWFUL:
> `addressable_type` varchar(255) DEFAULT NULL,
>
> Why have field that hold up to 255 characters and put a little string in it
> like "Listing?" Why does it matter? Well it makes your indexes
> disasterously bloated:
>
> KEY `index_addresses_on_parent_city_id_and_addressable_type`
> (`parent_city_id`,`addressable_type`),
>
>
> If you noticed in the explain plan, that index is 733 bytes *per row*.
> Especially using utf8 means each character takes 3 bytes in the index.
> That's terrible. That type field should be a foreign key tinyint or at the
> very least be a much much shorter varchar field (such as 8 or 10)
>
> You have lots of those varchar(255) fields, which looks like lazy design --
> by not gathering correct requirements and designing accordingly you will
> hurt your database performance, waste disk space and cause yourself all
> kinds of future problems.
>
> 3. Why are you using OUTER JOIN?
>
> It looks to me like you're using it because you don't know the difference,
> since you're not looking for NULL rows or anything. In fact, it looks like
> mysql is smart enough to know that you've negated the OUTER JOIN by putting
> conditions on the joined tables in the WHERE clause, and convert then to
> INNER JOINS. Don't rely on that! Use the correct join type.
>
> Those queries
> Regards,
> Gavin Towey
>
> -----Original Message-----
> From: Ciaran Lee [mailto:ciaran.lee@stripped]
> Sent: Tuesday, September 22, 2009 1:32 PM
> To: mysql@stripped
> Subject: query optimization question (my struggle against 'using temporary;
> using filesort')
>
> Hi,
>
> I hope this is the right place to ask a question about query optimization.
>
> Background:
> I have a database which has events, which occur in places (listings).
> Places
> have addresses, and addresses belong to a city. I can select the latest
> event within a particular city very efficiently (less than 2ms), but
> selecting the latest listing within a city is REALLY slow (10-20 seconds)
> despite being almost a subset of the event query.
>
> I have been working on this for about a day, and have tried all sorts of
> tweaks to the indexes but to no avail. I always seem to end up with 'using
> temporary; using filesort' as the 'extra' content in the explain result. If
> anyone has a suggestion for what I might do to fix this, I'd really
> appreciate it. If not, I could further de-normalize the database for
> performance reasons, but I would feel dirty for doing so.
>
> Here is the fast query (select the latest event within a particular city),
> and it's explain.
> SELECT
> events.*, listings.*, addresses.*
> FROM
> `events`
> LEFT OUTER JOIN
> `listings` ON `listings`.id = `events`.listing_id
> LEFT OUTER JOIN
> `addresses` ON `addresses`.addressable_id = `listings`.id
> AND
> `addresses`.addressable_type = 'Listing'
> WHERE (addresses.parent_city_id = 3)
> ORDER BY events.id DESC LIMIT 1
>
>
>
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+
> | id | select_type | table | type | possible_keys
>
> | key
> | key_len | ref | rows | Extra |
>
>
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+
> | 1 | SIMPLE | events | index | index_events_on_listing_id
>
> | PRIMARY
> | 4 | NULL | 1 | |
> | 1 | SIMPLE | listings | eq_ref | PRIMARY
>
> | PRIMARY
> | 4 | ratemyarea.events.listing_id | 1 | Using where |
> | 1 | SIMPLE | addresses | ref |
>
>
> index_addresses_on_parent_city_id_and_addressable_type,index_addresses_on_addressable_type_and_addressable_id,addressable_id_type_city
> | index_addresses_on_addressable_type_and_addressable_id | 773 | const,
> ratemyarea.listings.id | 1 | Using where |
>
>
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------+---------+------------------------------+------+-------------+
>
>
> Here is the slow query (select the latest listing within a particular
> city),
> and it's explain
> SELECT
> listings.*, addresses.*
> FROM
> `listings`
> LEFT OUTER JOIN
> `addresses` ON `addresses`.addressable_id = `listings`.id
> AND
> `addresses`.addressable_type = 'Listing'
> WHERE (addresses.parent_city_id = 3)
> ORDER BY listings.id DESC LIMIT 1
>
>
>
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+
> | id | select_type | table | type | possible_keys
>
> | key | key_len | ref
> | rows | Extra
> |
>
>
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+
> | 1 | SIMPLE | addresses | ref |
>
>
> index_addresses_on_parent_city_id_and_addressable_type,index_addresses_on_addressable_type_and_addressable_id,addressable_id_type_city
> | addressable_id_type_city | 773 | const,const
> |
> 25680 | Using where; Using temporary; Using filesort |
> | 1 | SIMPLE | listings | eq_ref | PRIMARY
>
> | PRIMARY | 4 |
> ratemyarea.addresses.addressable_id | 1 |
> |
>
>
> +----+-------------+-----------+--------+----------------------------------------------------------------------------------------------------------------------------------------+--------------------------+---------+-------------------------------------+-------+----------------------------------------------+
>
>
> Here is the structure of the tables:
>
> CREATE TABLE `addresses` (
> `id` int(11) NOT NULL AUTO_INCREMENT,
> `postal_code` varchar(255) DEFAULT NULL,
> `addressable_type` varchar(255) DEFAULT NULL,
> `addressable_id` int(11) DEFAULT NULL,
> `parent_city_id` int(11) DEFAULT NULL,
> `lat` decimal(15,10) DEFAULT NULL,
> `lng` decimal(15,10) DEFAULT NULL,
> PRIMARY KEY (`id`),
> KEY `index_addresses_on_lat_and_lng` (`lat`,`lng`),
> KEY `index_addresses_on_parent_city_id_and_addressable_type`
> (`parent_city_id`,`addressable_type`),
> KEY `index_addresses_on_addressable_type_and_addressable_id`
> (`addressable_type`,`addressable_id`)
> ) ENGINE=InnoDB AUTO_INCREMENT=120513 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table cities
> # ------------------------------------------------------------
>
> CREATE TABLE `cities` (
> `id` int(11) NOT NULL AUTO_INCREMENT,
> `name` varchar(255) NOT NULL,
> `country_id` int(11) NOT NULL,
> `lat` decimal(15,10) NOT NULL,
> `lng` decimal(15,10) NOT NULL,
> PRIMARY KEY (`id`),
> KEY `index_cities_on_name` (`name`),
> KEY `index_cities_on_country_id` (`country_id`),
> KEY `index_cities_on_lat_and_lng` (`lat`,`lng`)
> ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table events
> # ------------------------------------------------------------
>
> CREATE TABLE `events` (
> `id` int(11) NOT NULL AUTO_INCREMENT,
> `listing_id` int(11) NOT NULL,
> `description` text,
> `title` varchar(150) NOT NULL,
> PRIMARY KEY (`id`),
> KEY `index_events_on_title` (`title`),
> KEY `index_events_on_listing_id` (`listing_id`)
> ) ENGINE=InnoDB AUTO_INCREMENT=6359 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table listings
> # ------------------------------------------------------------
>
> CREATE TABLE `listings` (
> `id` int(11) NOT NULL AUTO_INCREMENT,
> `name` varchar(255) DEFAULT NULL,
> `latitude` decimal(15,10) DEFAULT NULL,
> `longitude` decimal(15,10) DEFAULT NULL,
> `description` text,
> PRIMARY KEY (`id`),
> KEY `index_listings_on_place` (`name`),
> KEY `index_listings_on_latitude_and_longitude` (`latitude`,`longitude`)
> ) ENGINE=InnoDB AUTO_INCREMENT=109358 DEFAULT CHARSET=utf8;
>
>
>
> # Dump of table users
> # ------------------------------------------------------------
>
> CREATE TABLE `users` (
> `id` int(11) NOT NULL AUTO_INCREMENT,
> `email` varchar(255) DEFAULT NULL,
> PRIMARY KEY (`id`),
> KEY `index_users_on_email` (`email`),
> KEY `index_users_on_latitude_and_longitude` (`latitude`,`longitude`)
> ) ENGINE=InnoDB AUTO_INCREMENT=1972 DEFAULT CHARSET=utf8;
>
> The information contained in this transmission may contain privileged and
> confidential information. It is intended only for the use of the person(s)
> named above. If you are not the intended recipient, you are hereby notified
> that any review, dissemination, distribution or duplication of this
> communication is strictly prohibited. If you are not the intended recipient,
> please contact the sender by reply email and destroy all copies of the
> original message.
>