Matthias Pigulla wrote:
> Hi folks,
> <parental advisory - explicit description ;->
> I admit I'm a database novice, so maybe I've got a wrong idea of how
> queries are executed and how the necessary rows can be retreived.
> Anyhow, would you mind explaining me why I have such - on my opinion
> strange - results in EXPLAIN commands and how I can work around them?
> I have a database storing discussion threads. There are several groups,
> each containing a lot of threads, once again containing much more
> postings. Each posting is written by a user. With my humble knowledge of
> db design and normalization, I desgined the db as follows:
> table group table thread table posting table user
> *id <-+ *id <-+ *id +--> *id
> other stuff +--- +belongs to +--- +belongs to | nickname
> topic etc text etc | other stuff
> written by -+
> In other words, each entity (posting, thread) references the superior
> one. * is a primary key, + a "normal" one. The table index for
> "postings" is built as (belongs to, timestamp), for I thought this way
> it allowed quick access to MAX(timestamp) when having a given thread ID
> for "belongs to".
> Now, I would like to SELECT a list of all threads for a certain group.
> This list ought to contain the date of the last posting and the number
> of postings for each thread, along with the authoring users name,
> ordered by lastpost.
> Here's my query. There's still a bug, for the selection of "users.nick"
> along with the GROUP BY clause does not alway give me the author of the
> oldest (first) posting (which is, of course, the user who started the
> thread), but that's not important at the moment.
> as lastpost
> threads.forum=(any valid id)
> AND postings.thread=threads.id
> AND users.id=postings.user
> GROUP BY threads.id
> ORDER BY lastpost DESC
> In my naive idea of "database internals" I expected MySQL to
> - fetch all threads for the group/forum
> - for each thread, quickly retrieve MAX(post_date) with the compound
> index described above
> - add the author
> Rather unexpectedly, I get the following EXPLAIN for the above query:
> | table | type | possible_keys | key | ref | rows
> | Extra |
> | postings | ALL | thread | NULL | NULL | 52898
> | |
> | threads | eq_ref | PRIMARY,forum | PRIMARY | postings.thread | 1
> | where used |
> | users | eq_ref | PRIMARY | PRIMARY | postings.user | 1
> | |
> The "ALL" type in the first row is - as you can easily imagine - a real
> drain, even on a cool tuned up system. I tried using "threads
> STRAIGHT_JOIN postings", but the product of the rows was even worse, so
> the optimizer's desicison seems to be allright.
> Now I wonder what to do. Is there a flaw in my database design? Is it an
> unwise decision to have COUNT, MAX and GROUP BY in my query - or more in
> general, are these functions indicating design flaws like NULL values
> For example, I might store the date of the last posting in the thread
> table and save the MAX() call, but that seems to be unclean, for in fact
> the latest posting contains the latest date, and anything else would be
> redundancy and might introduce integrity problems.
> Or should I store the group/forum id in each posting, not each thread?
> Also unclean, for threads belong to forums and postings belong to
> threads - not postings and threads to forums at the same time.
> Is there any special (MySQL) thing to think of? Does anyone of you have
> an idea how to optimize this?
> Any hints are appreciated, thanks a lot for the time!
If you want to use KEY's , you have to provide one spanning all AND parts in your WHERE
In your Example:
For table 'threads' use 'KEY (forum,id)'
Hope this helps