List:General Discussion« Previous MessageNext Message »
From:Michael Widenius Date:March 15 1999 12:14am
Subject:MySQL speed.
View as plain text  
Hi!

(I am cross-posting this to mysql@stripped as the result may be 
interesting for others that wants to optimize queries)

I am now testing your query.

First a couple of comments about your test:

- You can easily get a dump of the database structure with:

mysqldump --no-data media

- When running 'smbentry.pl 0308.log' I got MANY entries of type:

DBD::mysql::st execute failed: Duplicate entry 'goo-6162' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 34832.
DBD::mysql::st execute failed: Duplicate entry 'you-6169' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 34839.
DBD::mysql::st execute failed: Duplicate entry 'goo-6196' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 34867.
DBD::mysql::st execute failed: Duplicate entry '1-5736' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35822.
DBD::mysql::st execute failed: Duplicate entry '1-5746' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35833.
DBD::mysql::st execute failed: Duplicate entry '1-5749' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35837.
DBD::mysql::st execute failed: Duplicate entry '2-5793' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35898.
DBD::mysql::st execute failed: Duplicate entry '3-5799' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35907.
DBD::mysql::st execute failed: Duplicate entry '5-5816' for key 1 at
/usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35930.
 
Can I assume this is normal ?

The result table sizes are:

---------
(/my/data/media) isamchk -ds *.ISM

ISAM file:     AFileInfo.ISM
Data records:             22203  Deleted blocks:              0

ISAM file:     ALookup.ISM
Data records:            119275  Deleted blocks:              0

ISAM file:     HostInfo.ISM
Data records:               246  Deleted blocks:              0

ISAM file:     IFileInfo.ISM
Data records:             14426  Deleted blocks:              0

ISAM file:     ILookup.ISM
Data records:             69141  Deleted blocks:              0

ISAM file:     VFileInfo.ISM
Data records:              1229  Deleted blocks:              0

ISAM file:     VLookup.ISM
Data records:              4718  Deleted blocks:              0

Total of all 7 ISAM-files:
Data records:   231238   Deleted blocks:        0

----------

- What problems did you have with DBI;  I have found the 1.2x series
to be very reliable!

***********************************************************

Testing:

Testing with MySQL 3.22.20 (compiled with debugging):

mysql> explain SELECT filename,size,ip,share,path,time,type,f.fid,pingtime,pingres FROM
> HostInfo h, AFileInfo f ,ALookup a0 WHERE a0.fid = f.fid and a0.keyword= 'beatles' and
> f.hostnum = h.hostnum order by time DESC;
+-------+--------+---------------------+---------+---------+---------------+------+------------+
| table | type   | possible_keys       | key     | key_len | ref           | rows | Extra 
    |
+-------+--------+---------------------+---------+---------+---------------+------+------------+
| h     | ALL    | hostnum             | NULL    |    NULL | NULL          |  246 |       
    |
| f     | ref    | PRIMARY,fid,hostnum | hostnum |       8 | h.hostnum     |  174 |       
    |
| a0    | eq_ref | PRIMARY,fid,keyword | PRIMARY |      28 | beatles,f.fid |    1 | where
used |
+-------+--------+---------------------+---------+---------+---------------+------+------------+
3 rows in set (0.00 sec)

(The query took 3.85 seconds).

With MySQL 3.21:

mysql> explain SELECT filename,size,ip,share,path,time,type,f.fid,pingtime,pingr
es FROM HostInfo h, AFileInfo f ,ALookup a0 WHERE a0.fid = f.fid and a0.keyword
= 'beatles' and f.hostnum = h.hostnum order by time DESC;
+-------+------+---------------------+---------+---------+-----------+------+---------------------------+
| table | type | possible_keys       | key     | key_len | ref       | rows | Extra       
             |
+-------+------+---------------------+---------+---------+-----------+------+---------------------------+
| a0    | ref  | PRIMARY,fid,keyword | keyword |      10 | beatles   |   16 | range used
on key keyword |
| f     | ref  | PRIMARY,fid,hostnum | fid     |       8 | a0.fid    |  174 |             
             |
| h     | ref  | hostnum             | hostnum |       8 | f.hostnum |   20 |             
             |
+-------+------+---------------------+---------+---------+-----------+------+---------------------------+

(The query took 0.05 seconds)

As you can see, the EXPLAIN is quite different.

The problem is that MySQL guesses wrong the number of rows that will
match the query.  It has simple not enough information to guess right
in this case; In some queries that are very similar to the above MySQL
3.22 will perform much better than MySQL 3.21, for other queries there
may be a speed penalty, as happens for you in this case.

One can force MySQL 3.22 to run the query exactly like MySQL 3.21:

mysql> explain SELECT straight_join filename,size,ip,share,path,time,type,f.fid,
pingtime,pingres FROM  ALookup a0, AFileInfo f, HostInfo h WHERE a0.fid = f.fid
and a0.keyword = 'beatles' and f.hostnum = h.hostnum order by time DESC;
+-------+------+---------------------+---------+---------+-----------+------+------------+
| table | type | possible_keys       | key     | key_len | ref       | rows | Extra      |
+-------+------+---------------------+---------+---------+-----------+------+------------+
| a0    | ref  | PRIMARY,fid,keyword | keyword |      10 | beatles   |  185 | where used |
| f     | ref  | PRIMARY,fid,hostnum | fid     |       8 | a0.fid    |  174 |            |
| h     | ref  | hostnum             | hostnum |       8 | f.hostnum |   20 |            |
+-------+------+---------------------+---------+---------+-----------+------+------------+

(If you remove the explain, the above query will be as fast as the
original query in MySQL 3.21)

As you can see, the main difference in MySQL 3.22 is that rows for a0
(ALookup) is 185 and not 16.

Doing a:

mysql> select count(*) from ALookup where keyword like 'beatles';
+----------+
| count(*) |
+----------+
|      249 |
+----------+

Shows that MySQL 3.22 actually has a much better approximation of the
number of rows that matches 'beatles'

This means that the reason that the query goes faster in MySQL 3.21 is 
that there was a bug in the module that calculates the number of
rows that matches a constant keyword.  This together with the fact
that 'AFileInfo' rows was wrongly guessed in both version resulted in
that MySQL 3.21 happened to use the optimal table join.

You can however tell MySQL 3.22 how the data is distributed by doing:

isamchk -a *.ISM

(The above saves the mean number of duplicates there are
for each key;  As this number doesn't vary that much over time, you
probably never have to do it again, if you don't change your table
with ALTER TABLE, in which case the analyze results are forgotten)

After this:

mysql> explain SELECT straight_join
> filename,size,ip,share,path,time,type,f.fid,pingtime,pingres FROM  ALookup a0, AFileInfo
> f, HostInfo h WHERE a0.fid = f.fid and a0.keyword = 'beatles' and f.hostnum = h.hostnum
> order by time DESC;
+-------+------+---------------------+---------+---------+-----------+------+------------+
| table | type | possible_keys       | key     | key_len | ref       | rows | Extra      |
+-------+------+---------------------+---------+---------+-----------+------+------------+
| a0    | ref  | PRIMARY,fid,keyword | keyword |      10 | beatles   |  185 | where used |
| f     | ref  | PRIMARY,fid,hostnum | fid     |       8 | a0.fid    |    1 |            |
| h     | ref  | hostnum             | hostnum |       8 | f.hostnum |    1 |            |
+-------+------+---------------------+---------+---------+-----------+------+------------+
3 rows in set (0.02 sec)

And doing the full select now takes:

249 rows in set (0.07 sec)
 

I quickly found the problem by testing EXPLAIN and timing the
following 'simpler' version of your query:

SELECT straight_join count(*) FROM ALookup a0, AFileInfo f, HostInfo h WHERE a0.fid =
f.fid and a0.keyword = 'beatles' and f.hostnum = h.hostnum;


In other words:

Do:

isamchk -a */*.ISM ; mysqladmin flush-tables

and try again.

You can of course also use STRAIGHT_JOIN to force MySQL to join the
tables in the specified order, but normally this isn't necessary.

Regards,
Monty
Thread
MySQL speed.Michael Widenius15 Mar
  • Re: MySQL speed.Naveen Nalam17 Mar