List:General Discussion« Previous MessageNext Message »
From:Jon Drukman Date:October 6 1999 11:33pm
Subject:slow joins
View as plain text  
I'm trying to join three tables, and the time to do it is way more than I
would expect, given that joining two tables is so quick.  I believe I have
indexed everything appropriately.  Here's an example:

I want to find all the users in a certain zip code on a certain mailing
list we have.

To find how many users are in the zip code is quick:

select count(1) from userdata where zip like '8%';
+----------+
| count(1) |
+----------+
|    12703 |
+----------+
1 row in set (1.53 sec)


to find how many of them are on the given mailing list is quick:

select count(1) from userdata,elist where zip like '8%' and listid=2 and
elist.id=userdata.id;
+----------+
| count(1) |
+----------+
|     2604 |
+----------+
1 row in set (2.98 sec)

but to find their actual addresses takes way longer:

select email from authdata,userdata,elist where zip like '8%' and listid=2
and elist.id=userdata.id and elist.id=authdata.id;

(list snipped to save space)

2603 rows in set (15.47 sec)

also the number is off by one.  very odd.  (this is not a live database, so
someone didn't just delete a record.)

15 seconds might not seem like much, but on our live system, it's actually
more like 10 seconds for the quick queries and 3.5 minutes for the last
one.  if i'm retrieving a lot of rows (say, 5000 instead of 2500), it's
even slower.  here's explain on the slow query:

+----------+--------+---------------+---------+---------+-------------+-----
-+-------+
| table    | type   | possible_keys | key     | key_len | ref         |
rows | Extra |
+----------+--------+---------------+---------+---------+-------------+-----
-+-------+
| userdata | range  | PRIMARY,zip   | zip     |    NULL | NULL        |
9994 |       |
| elist    | ref    | id,listid     | id      |       4 | userdata.id |
10 |       |
| authdata | eq_ref | PRIMARY       | PRIMARY |       4 | elist.id    |
1 |       |
+----------+--------+---------------+---------+---------+-------------+-----
-+-------+

i've played around with moving the WHERE clauses, order of tables, etc...
i'm just wondering if there is a scientific way of determining the optimal
order of WHERE clauses and FROM tables rather than just blind guessing.


Thread
adding data is slowJon Drukman24 Jun
  • Re: adding data is slowSasha Pachev24 Jun
    • Re: adding data is slowJon Drukman25 Jun
      • Re: adding data is slowBenjamin Pflugmann25 Jun
        • Re: adding data is slowJon Drukman25 Jun
        • Yet Another Optimization QuestionJon Drukman28 Jun
  • Re: adding data is slowDaniel Koch25 Jun
  • adding data is slowMichael Widenius29 Jun
Re: Yet Another Optimization QuestionKevin Smith29 Jun
Re: Yet Another Optimization QuestionJon Drukman29 Jun
  • Re: Yet Another Optimization QuestionBenjamin Pflugmann30 Jun
    • Re: Yet Another Optimization QuestionJon Drukman30 Jun
Re: Yet Another Optimization QuestionKevin Smith29 Jun
Re: Yet Another Optimization QuestionJon Drukman29 Jun
  • Re: Yet Another Optimization QuestionTõnu Samuel29 Jun
    • Re: Yet Another Optimization QuestionJon Drukman29 Jun
      • Re: Yet Another Optimization QuestionMichael Widenius30 Jun
        • Re: Yet Another Optimization QuestionJon Drukman30 Jun
RE: Yet Another Optimization QuestionNigel Parker30 Jun
  • RE: Yet Another Optimization QuestionMichael Widenius30 Jun
Re: Yet Another Optimization Question(David Sklar)1 Jul
RE: Yet Another Optimization Questiontony2 Jul
  • RE: Yet Another Optimization QuestionMichael Widenius2 Jul
  • slow joinsJon Drukman7 Oct