List:General Discussion« Previous MessageNext Message »
From:Travis Ard Date:August 10 2010 10:52pm
Subject:RE: Slow query using string operator
View as plain text  
Can you create a second, indexed column in your feed_new temp table that
includes the title without the year appended?  That might allow you to get
by with a single pass through the larger prod table and avoid reading rows
from the feed_new table. 

-Travis

-----Original Message-----
From: Jerry Schwartz [mailto:jerry@stripped] 
Sent: Tuesday, August 10, 2010 3:39 PM
To: mysql@stripped
Subject: Slow query using string operator

I'm running a set of queries that look like this:

===============
SET @PUBID = (SELECT pub_id FROM pub WHERE pub_code = 'DC');

DROP TEMPORARY TABLE IF EXISTS feed_new;

CREATE TEMPORARY TABLE feed_new (
	new_title VARCHAR(255), INDEX (new_title)
	);

INSERT INTO feed_new
VALUES

('UK Investment Bonds 2010'),
('UK Protection 2010'),
('UK Personal Insurance Distribution 2010'),
('UK Private Medical Insurance 2010'),
...
('UK Private Motor Insurance 2010'),
('Wealth Management for Non-Resident Indians 2010'),
('Middle Eastern Cards Database 2010')
;

SELECT
	feed_new.new_title AS `New Title FROM Feed`,
	prod.prod_pub_prod_id AS `Lib Code FROM DB`,
	prod.prod_title AS `Title FROM DB`,
	prod.prod_num AS `Prod Num`,
	prod.prod_published AS `Published FROM DB`
FROM feed_new JOIN prod
ON LEFT(feed_new.new_title, LENGTH(feed_new.new_title) - 5) = 
LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5)
WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
ORDER BY feed_new.new_title;

============

With a relatively small number of rows in `feed_new`, this can take many 
seconds. With 163 rows in `feed_new`, compared against 11234 eligible rows
in 
prod, it took about 28 seconds. Here's what an EXPLAIN looks like:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: feed_new
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 767
          ref: NULL
         rows: 1
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: prod
         type: ref
possible_keys: pub_id,pub_id_2
          key: pub_id
      key_len: 48
          ref: const
         rows: 11040
        Extra: Using where

=========

prod.pub_id is an indexed VARCHAR(15).

If I remove the string functions, I don't get what I want -- but the
remaining 
query runs in .05 seconds. Here's an EXPLAIN of that one:

===============
us-gii >EXPLAIN
    -> SELECT
    ->  feed_new.new_title AS `New Title FROM Feed`,
    ->  prod.prod_pub_prod_id AS `Lib Code FROM DB`,
    ->  prod.prod_title AS `Title FROM DB`,
    ->  prod.prod_num AS `Prod Num`,
    ->  prod.prod_published AS `Published FROM DB`
    -> FROM feed_new JOIN prod
    -> ON feed_new.new_title = prod.prod_title
    -> WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
    -> ORDER BY feed_new.new_title\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: feed_new
         type: index
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 767
          ref: NULL
         rows: 163
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: prod
         type: ref
possible_keys: pub_id,prod_title,pub_id_2,prod_title_fulltext
          key: prod_title
      key_len: 768
          ref: giiexpr_db.feed_new.new_title
         rows: 1
        Extra: Using where
================

Obviously the string manipulation is keeping MySQL from using `prod_title`
as 
a key, but I wouldn't have thought that using `pub_id` instead would be that

horrific.

Does anyone have any suggestions as to how to speed this business up? I
can't 
get away without some string manipulation, because I'm looking for "near 
matches" by ignoring the year at the end of the title.

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032

860.674.8796 / FAX: 860.674.8341
E-mail: jerry@stripped
Web site: www.the-infoshop.com





-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=1


Thread
Slow query using string operatorJerry Schwartz10 Aug
  • RE: Slow query using string operatorTravis Ard11 Aug
    • Re: Slow query using string operatorJoerg Bruehe11 Aug
    • RE: Slow query using string operatorJerry Schwartz11 Aug
    • RE: Slow query using string operatorJerry Schwartz11 Aug