List:General Discussion« Previous MessageNext Message »
From:Christian Mack Date:May 7 1999 5:17pm
Subject:Re: How to optimize this query for MySQL?
View as plain text  
Matthias Pigulla wrote:
> 
> Hi folks,
> 
> <parental advisory - explicit description ;->
> 
> I admit I'm a database novice, so maybe I've got a wrong idea of how
> queries are executed and how the necessary rows can be retreived.
> Anyhow, would you mind explaining me why I have such - on my opinion
> strange - results in EXPLAIN commands and how I can work around them?
> 
> I have a database storing discussion threads. There are several groups,
> each containing a lot of threads, once again containing much more
> postings. Each posting is written by a user. With my humble knowledge of
> db design and normalization, I desgined the db as follows:
> 
>  table group       table thread      table posting      table user
>          *id <-+            *id <-+            *id  +-->       *id
>  other stuff   +--- +belongs to   +--- +belongs to  |     nickname
>                       topic etc           text etc  |  other stuff
>                                         written by -+
>                                          timestamp
> 
> In other words, each entity (posting, thread) references the superior
> one. * is a primary key, + a "normal" one. The table index for
> "postings" is built as (belongs to, timestamp), for I thought this way
> it allowed quick access to MAX(timestamp) when having a given thread ID
> for "belongs to".
> 
> Now, I would like to SELECT a list of all threads for a certain group.
> This list ought to contain the date of the last posting and the number
> of postings for each thread, along with the authoring users name,
> ordered by lastpost.
> 
> Here's my query. There's still a bug, for the selection of "users.nick"
> along with the GROUP BY clause does not alway give me the author of the
> oldest (first) posting (which is, of course, the user who started the
> thread), but that's not important at the moment.
> 
> SELECT
> 
> threads.id,threads.topic,users.nick,COUNT(threads.id)-1,MAX(postings.post_date)
> as lastpost
> FROM
>  threads,postings,users
> WHERE
>  threads.forum=(any valid id)
>  AND postings.thread=threads.id
>  AND users.id=postings.user
> GROUP BY threads.id
> ORDER BY lastpost DESC
> 
> In my naive idea of "database internals" I expected MySQL to
> - fetch all threads for the group/forum
> - for each thread, quickly retrieve MAX(post_date) with the compound
> index described above
> - add the author
> 
> Rather unexpectedly, I get the following EXPLAIN for the above query:
> 
>
> +----------+--------+---------------+---------+-----------------+-------+------------+
> | table    | type   | possible_keys | key     | ref             | rows
> | Extra      |
>
> +----------+--------+---------------+---------+-----------------+-------+------------+
> | postings | ALL    | thread        | NULL    | NULL            | 52898
> |            |
> | threads  | eq_ref | PRIMARY,forum | PRIMARY | postings.thread |     1
> | where used |
> | users    | eq_ref | PRIMARY       | PRIMARY | postings.user   |     1
> |            |
>
> +----------+--------+---------------+---------+-----------------+-------+------------+
> 
> The "ALL" type in the first row is - as you can easily imagine - a real
> drain, even on a cool tuned up system. I tried using "threads
> STRAIGHT_JOIN postings", but the product of the rows was even worse, so
> the optimizer's desicison seems to be allright.
> 
> Now I wonder what to do. Is there a flaw in my database design? Is it an
> unwise decision to have COUNT, MAX and GROUP BY in my query - or more in
> general, are these functions indicating design flaws like NULL values
> are?
> 
> For example, I might store the date of the last posting in the thread
> table and save the MAX() call, but that seems to be unclean, for in fact
> the latest posting contains the latest date, and anything else would be
> redundancy and might introduce integrity problems.
> 
> Or should I store the group/forum id in each posting, not each thread?
> Also unclean, for threads belong to forums and postings belong to
> threads - not postings and threads to forums at the same time.
> 
> Is there any special (MySQL) thing to think of? Does anyone of you have
> an idea how to optimize this?
> 
> Any hints are appreciated, thanks a lot for the time!
> 
> Matthias

Hi Matthias

If you want to use KEY's , you have to provide one spanning all AND parts in your WHERE
part.
In your Example:
For table 'threads' use 'KEY (forum,id)'

Hope this helps
Christian

Thread
How to optimize this query for MySQL?Matthias Pigulla7 May
  • Re: How to optimize this query for MySQL?Christian Mack7 May
  • How to optimize this query for MySQL?Michael Widenius11 May
Re: MyODBCEIN)28 Mar