List:General Discussion« Previous MessageNext Message »
From:Dan Nelson Date:June 19 2009 6:55am
Subject:Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?
View as plain text  
In the last episode (Jun 18), Matt Neimeyer said:
> I'm converting a PHP app from using Visual FoxPro as the database backend
> to using MySQL as the backend.  I'm testing on MySQL 4.1.22 on Mac OSX
> 10.4.  The end application will be deployed cross platform and to both 4.x
> and 5.x MySQL servers.
> 
> This query returned 21 records in .27 seconds:
> 
> SELECT zip FROM zipcodes WHERE
> degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+
>
> cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude))))*60*1.1515
> < 5

Ouch.  You might want to calculate the rectange enclosing your target
distance, add an index on lat (and/or long), and add the rectangle check to
your where clause: WHERE latitude BETWEEN lt1 AND lt2 AND longitude BETWEEN
ln2 AND ln2.  That way mysql can use the index to pare down most of the rows
without having to call all those trig functions for every zipcode.
 
> This query returned 21442 records in 1.08 seconds:
> 
>    SELECT custzip FROM customers
> 
> This query is still running half an hour later, with a Time of 2167
> and a State of "Sending Data" (according to the mysql process list)
> 
>    SELECT custzip FROM customers WHERE custzip IN (SELECT zip FROM
> zipcodes WHERE degrees(acos(sin(radians(39.0788994))*sin(radians(latitude))+
>
> cos(radians(39.0788994))*cos(radians(latitude))*cos(radians(-77.1227036-longitude))))*60*1.1515
> < 5)
> 
> When I try to EXPLAIN the query it gives me the following...
> 
> id,select_type,table,type,possible_keys,key,key_len,ref,rows,Extra
> 1,PRIMARY,customers,index,NULL,cw_custzip,30,NULL,21226,Using where; Using index
> 2,DEPENDENT SUBQUERY,zipcodes,ALL,NULL,NULL,NULL,NULL,42144,Using where

Neither mysql 4 or 5 are very smart when it comes to subqueries.  Unless
your inner query is dead simple, mysql assumes it's a dependent subquery and
runs it once per row in your outer query.  You might want to try mysql 6 and
see if it does any better.  For example, here are explain plans for mysql 5
and 6 for the following query on the famous Oracle "emp" sample table:

select ename from emp where mgr in 
 (select empno from emp where ename in ('scott'));
+-------+
| ename |
+-------+
| ADAMS | 
+-------+
1 row in set (0.00 sec)

Mysql 5.1.30:
+----+--------------------+-------+-----------------+---------------+---------+---------+------+------+-------------+
| id | select_type        | table | type            | possible_keys | key     | key_len |
ref  | rows | Extra       |
+----+--------------------+-------+-----------------+---------------+---------+---------+------+------+-------------+
|  1 | PRIMARY            | emp   | ALL             | NULL          | NULL    | NULL    |
NULL |   14 | Using where |
|  2 | DEPENDENT SUBQUERY | emp   | unique_subquery | PRIMARY,ENAME | PRIMARY | 4       |
func |    1 | Using where |
+----+--------------------+-------+-----------------+---------------+---------+---------+------+------+-------------+

Note that it didn't use an index on the outer query, and had to examine all
14 rows.  It even used the wrong index on the inner query :)

Mysql 6.0.11:
+----+-------------+-------+------+---------------+-------+---------+----------------+------+-----------------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref            |
rows | Extra                 |
+----+-------------+-------+------+---------------+-------+---------+----------------+------+-----------------------+
|  1 | PRIMARY     | emp   | ref  | PRIMARY,ENAME | ENAME | 13      | const          |   
1 | Using index condition |
|  1 | PRIMARY     | emp   | ref  | MGR           | MGR   | 5       | test.emp.EMPNO |   
2 |                       |
+----+-------------+-------+------+---------------+-------+---------+----------------+------+-----------------------+

Note that the queries have flipped and aren't nested anymore (id is 1 on
both queries).  The first query uses the ename index and estimates it will
return one row.  The second query uses the mgr index based on the empno
value returned by the first query and estimates it will return 2 rows.  Much
better :)
 
-- 
	Dan Nelson
	dnelson@stripped
Thread
Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Matt Neimeyer19 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Johnny Withers19 Jun
    • How to Optimize distinct with index周彦伟19 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Dan Nelson19 Jun
    • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Matt Neimeyer19 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Brent Baisley19 Jun
    • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Matt Neimeyer22 Jun
  • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Peter Brawley20 Jun
    • Re: Half Hour Sub-query in MySQL vs. 5 Seconds in VFP?Walter Heck - OlinData.com20 Jun
Re: How to Optimize distinct with indexDan Nelson19 Jun
Re: How to Optimize distinct with indexDarryle Steplight19 Jun
  • Re: How to Optimize distinct with indexMoon's Father26 Jun