Hi Simon,
once you apply functions to a field, an index on that field is pretty
much useless. For this particular query, I would be tempted to create
additional fields to store the values of MONTH(articles.created_at)
and DAY(articles.created_at). Create an index on (month_created,
day_created) You could just sort by articles.created_at; no need for
the YEAR function, the result will be the same given your other
selectors.
Given those adjustments, the query looks righteous enough.
- michael dykman
On Tue, Jan 18, 2011 at 12:22 PM, Simon Wilkinson
<simon.wilkinson@stripped> wrote:
> Hi,
>
> I am trying to optimize the following query:
>
> SELECT articles.* FROM articles INNER JOIN newsletters ON
> articles.newsletter_id = newsletters.id INNER JOIN users ON users.id =
> newsletters.user_id WHERE users.id =12 AND MONTH(articles.created_at) = '12'
> AND DAY(articles.created_at) = '5' ORDER BY YEAR(articles.created_at),
> LENGTH(articles.body);
>
> I am trying to retrieve all the articles created on a specific day of a
> specific month that belong to a user, ordered by the oldest and then longest
> article.
>
> I have run explain on the query, and get the following:
>
>
> +----+-------------+----------+-------+------------------------------------------------------------------------+--------------------------------------------+---------+---------------------+------+----------------------------------------------+
> | id | select_type | table | type |
> possible_keys
>
> |
> key
> | key_len | ref
> |
> rows | Extra
> |
>
> +----+-------------+----------+-------+------------------------------------------------------------------------+--------------------------------------------+---------+---------------------+------+----------------------------------------------+
> | 1 | SIMPLE | users | const |
> PRIMARY
>
> |
> PRIMARY
> | 4 | const
> | 1 | Using index; Using temporary; Using filesort |
> | 1 | SIMPLE | newsletters | ref |
> PRIMARY,index_newsletters_on_user_id
> |
> index_newsletters_on_user_id
> | 4 |
> const | 1 | Using index
>
> |
> | 1 | SIMPLE | articles | ref |
> index_articles_on_newsletter_id,index_articles_on_newsletter_id_and_created_at
> | index_articles_on_newsletter_id_and_created_at | 4 |
> my_db.newsletters.id | 3 | Using where
>
> |
>
>
> +----+-------------+----------+-------+------------------------------------------------------------------------+--------------------------------------------+---------+---------------------+------+----------------------------------------------+
> 3 rows in set (0.00 sec)
>
> This seems pretty decent, and does perform pretty well for some users (~0.5
> - 1 sec), but for some users (seemingly those with large numbers of
> articles) the query can take 20 - 30 seconds to run. This seems really slow
> to me. I tried adding in the index
> 'index_articles_on_newsletter_id_and_created_at' but the performance doesn't
> seem to be any different then when it uses just the
> 'index_articles_on_newsletter_id' index. I think this might be because of
> the functions I am using on the created_at column to get the day and month
> from it, making an index on created_at useless in this instance.
>
> Running both an 'optimize table entries' and 'analyze table entries' also
> didn't seem to have any real impact on the performance.
>
> I was wondering if anybody had any suggestions for what else I might be able
> to try, or if there is a better way to search on dates in this manner. Any
> ideas would be greatly appreciated.
>
> Thanks,
>
> Simon
>
--
- michael dykman
- mdykman@stripped
May the Source be with you.