From: Michael Widenius Date: March 15 1999 12:14am Subject: MySQL speed. List-Archive: http://lists.mysql.com/mysql/256 Message-Id: <14060.17966.420792.918276@monty.pp.sci.fi> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Hi! (I am cross-posting this to mysql@stripped as the result may be interesting for others that wants to optimize queries) I am now testing your query. First a couple of comments about your test: - You can easily get a dump of the database structure with: mysqldump --no-data media - When running 'smbentry.pl 0308.log' I got MANY entries of type: DBD::mysql::st execute failed: Duplicate entry 'goo-6162' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 34832. DBD::mysql::st execute failed: Duplicate entry 'you-6169' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 34839. DBD::mysql::st execute failed: Duplicate entry 'goo-6196' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 34867. DBD::mysql::st execute failed: Duplicate entry '1-5736' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35822. DBD::mysql::st execute failed: Duplicate entry '1-5746' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35833. DBD::mysql::st execute failed: Duplicate entry '1-5749' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35837. DBD::mysql::st execute failed: Duplicate entry '2-5793' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35898. DBD::mysql::st execute failed: Duplicate entry '3-5799' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35907. DBD::mysql::st execute failed: Duplicate entry '5-5816' for key 1 at /usr/lib/perl5/site_perl/Mysql.pm line 162, <> chunk 35930. Can I assume this is normal ? The result table sizes are: --------- (/my/data/media) isamchk -ds *.ISM ISAM file: AFileInfo.ISM Data records: 22203 Deleted blocks: 0 ISAM file: ALookup.ISM Data records: 119275 Deleted blocks: 0 ISAM file: HostInfo.ISM Data records: 246 Deleted blocks: 0 ISAM file: IFileInfo.ISM Data records: 14426 Deleted blocks: 0 ISAM file: ILookup.ISM Data records: 69141 Deleted blocks: 0 ISAM file: VFileInfo.ISM Data records: 1229 Deleted blocks: 0 ISAM file: VLookup.ISM Data records: 4718 Deleted blocks: 0 Total of all 7 ISAM-files: Data records: 231238 Deleted blocks: 0 ---------- - What problems did you have with DBI; I have found the 1.2x series to be very reliable! *********************************************************** Testing: Testing with MySQL 3.22.20 (compiled with debugging): mysql> explain SELECT filename,size,ip,share,path,time,type,f.fid,pingtime,pingres FROM HostInfo h, AFileInfo f ,ALookup a0 WHERE a0.fid = f.fid and a0.keyword= 'beatles' and f.hostnum = h.hostnum order by time DESC; +-------+--------+---------------------+---------+---------+---------------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+--------+---------------------+---------+---------+---------------+------+------------+ | h | ALL | hostnum | NULL | NULL | NULL | 246 | | | f | ref | PRIMARY,fid,hostnum | hostnum | 8 | h.hostnum | 174 | | | a0 | eq_ref | PRIMARY,fid,keyword | PRIMARY | 28 | beatles,f.fid | 1 | where used | +-------+--------+---------------------+---------+---------+---------------+------+------------+ 3 rows in set (0.00 sec) (The query took 3.85 seconds). With MySQL 3.21: mysql> explain SELECT filename,size,ip,share,path,time,type,f.fid,pingtime,pingr es FROM HostInfo h, AFileInfo f ,ALookup a0 WHERE a0.fid = f.fid and a0.keyword = 'beatles' and f.hostnum = h.hostnum order by time DESC; +-------+------+---------------------+---------+---------+-----------+------+---------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------------+---------+---------+-----------+------+---------------------------+ | a0 | ref | PRIMARY,fid,keyword | keyword | 10 | beatles | 16 | range used on key keyword | | f | ref | PRIMARY,fid,hostnum | fid | 8 | a0.fid | 174 | | | h | ref | hostnum | hostnum | 8 | f.hostnum | 20 | | +-------+------+---------------------+---------+---------+-----------+------+---------------------------+ (The query took 0.05 seconds) As you can see, the EXPLAIN is quite different. The problem is that MySQL guesses wrong the number of rows that will match the query. It has simple not enough information to guess right in this case; In some queries that are very similar to the above MySQL 3.22 will perform much better than MySQL 3.21, for other queries there may be a speed penalty, as happens for you in this case. One can force MySQL 3.22 to run the query exactly like MySQL 3.21: mysql> explain SELECT straight_join filename,size,ip,share,path,time,type,f.fid, pingtime,pingres FROM ALookup a0, AFileInfo f, HostInfo h WHERE a0.fid = f.fid and a0.keyword = 'beatles' and f.hostnum = h.hostnum order by time DESC; +-------+------+---------------------+---------+---------+-----------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------------+---------+---------+-----------+------+------------+ | a0 | ref | PRIMARY,fid,keyword | keyword | 10 | beatles | 185 | where used | | f | ref | PRIMARY,fid,hostnum | fid | 8 | a0.fid | 174 | | | h | ref | hostnum | hostnum | 8 | f.hostnum | 20 | | +-------+------+---------------------+---------+---------+-----------+------+------------+ (If you remove the explain, the above query will be as fast as the original query in MySQL 3.21) As you can see, the main difference in MySQL 3.22 is that rows for a0 (ALookup) is 185 and not 16. Doing a: mysql> select count(*) from ALookup where keyword like 'beatles'; +----------+ | count(*) | +----------+ | 249 | +----------+ Shows that MySQL 3.22 actually has a much better approximation of the number of rows that matches 'beatles' This means that the reason that the query goes faster in MySQL 3.21 is that there was a bug in the module that calculates the number of rows that matches a constant keyword. This together with the fact that 'AFileInfo' rows was wrongly guessed in both version resulted in that MySQL 3.21 happened to use the optimal table join. You can however tell MySQL 3.22 how the data is distributed by doing: isamchk -a *.ISM (The above saves the mean number of duplicates there are for each key; As this number doesn't vary that much over time, you probably never have to do it again, if you don't change your table with ALTER TABLE, in which case the analyze results are forgotten) After this: mysql> explain SELECT straight_join filename,size,ip,share,path,time,type,f.fid,pingtime,pingres FROM ALookup a0, AFileInfo f, HostInfo h WHERE a0.fid = f.fid and a0.keyword = 'beatles' and f.hostnum = h.hostnum order by time DESC; +-------+------+---------------------+---------+---------+-----------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------------+---------+---------+-----------+------+------------+ | a0 | ref | PRIMARY,fid,keyword | keyword | 10 | beatles | 185 | where used | | f | ref | PRIMARY,fid,hostnum | fid | 8 | a0.fid | 1 | | | h | ref | hostnum | hostnum | 8 | f.hostnum | 1 | | +-------+------+---------------------+---------+---------+-----------+------+------------+ 3 rows in set (0.02 sec) And doing the full select now takes: 249 rows in set (0.07 sec) I quickly found the problem by testing EXPLAIN and timing the following 'simpler' version of your query: SELECT straight_join count(*) FROM ALookup a0, AFileInfo f, HostInfo h WHERE a0.fid = f.fid and a0.keyword = 'beatles' and f.hostnum = h.hostnum; In other words: Do: isamchk -a */*.ISM ; mysqladmin flush-tables and try again. You can of course also use STRAIGHT_JOIN to force MySQL to join the tables in the specified order, but normally this isn't necessary. Regards, Monty