List:General Discussion« Previous MessageNext Message »
From:Jerry Schwartz Date:August 11 2010 1:07pm
Subject:RE: Slow query using string operator
View as plain text  
>-----Original Message-----
>From: Travis Ard [mailto:travis_ard@stripped]
>Sent: Tuesday, August 10, 2010 6:53 PM
>To: 'Jerry Schwartz'; mysql@stripped
>Subject: RE: Slow query using string operator
>
>Can you create a second, indexed column in your feed_new temp table that
>includes the title without the year appended?  That might allow you to get
>by with a single pass through the larger prod table and avoid reading rows
>from the feed_new table.
>
[JS] I have thought about that, but haven't tried it yet.

I'll let you know.

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032

860.674.8796 / FAX: 860.674.8341
E-mail: jerry@stripped
Web site: www.the-infoshop.com



>-Travis
>
>-----Original Message-----
>From: Jerry Schwartz [mailto:jerry@stripped]
>Sent: Tuesday, August 10, 2010 3:39 PM
>To: mysql@stripped
>Subject: Slow query using string operator
>
>I'm running a set of queries that look like this:
>
>===============
>SET @PUBID = (SELECT pub_id FROM pub WHERE pub_code = 'DC');
>
>DROP TEMPORARY TABLE IF EXISTS feed_new;
>
>CREATE TEMPORARY TABLE feed_new (
>	new_title VARCHAR(255), INDEX (new_title)
>	);
>
>INSERT INTO feed_new
>VALUES
>
>('UK Investment Bonds 2010'),
>('UK Protection 2010'),
>('UK Personal Insurance Distribution 2010'),
>('UK Private Medical Insurance 2010'),
>...
>('UK Private Motor Insurance 2010'),
>('Wealth Management for Non-Resident Indians 2010'),
>('Middle Eastern Cards Database 2010')
>;
>
>SELECT
>	feed_new.new_title AS `New Title FROM Feed`,
>	prod.prod_pub_prod_id AS `Lib Code FROM DB`,
>	prod.prod_title AS `Title FROM DB`,
>	prod.prod_num AS `Prod Num`,
>	prod.prod_published AS `Published FROM DB`
>FROM feed_new JOIN prod
>ON LEFT(feed_new.new_title, LENGTH(feed_new.new_title) - 5) =
>LEFT(prod.prod_title, LENGTH(prod.prod_title) - 5)
>WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
>ORDER BY feed_new.new_title;
>
>============
>
>With a relatively small number of rows in `feed_new`, this can take many
>seconds. With 163 rows in `feed_new`, compared against 11234 eligible rows
>in
>prod, it took about 28 seconds. Here's what an EXPLAIN looks like:
>
>*************************** 1. row ***************************
>           id: 1
>  select_type: SIMPLE
>        table: feed_new
>         type: index
>possible_keys: NULL
>          key: PRIMARY
>      key_len: 767
>          ref: NULL
>         rows: 1
>        Extra: Using index
>*************************** 2. row ***************************
>           id: 1
>  select_type: SIMPLE
>        table: prod
>         type: ref
>possible_keys: pub_id,pub_id_2
>          key: pub_id
>      key_len: 48
>          ref: const
>         rows: 11040
>        Extra: Using where
>
>=========
>
>prod.pub_id is an indexed VARCHAR(15).
>
>If I remove the string functions, I don't get what I want -- but the
>remaining
>query runs in .05 seconds. Here's an EXPLAIN of that one:
>
>===============
>us-gii >EXPLAIN
>    -> SELECT
>    ->  feed_new.new_title AS `New Title FROM Feed`,
>    ->  prod.prod_pub_prod_id AS `Lib Code FROM DB`,
>    ->  prod.prod_title AS `Title FROM DB`,
>    ->  prod.prod_num AS `Prod Num`,
>    ->  prod.prod_published AS `Published FROM DB`
>    -> FROM feed_new JOIN prod
>    -> ON feed_new.new_title = prod.prod_title
>    -> WHERE prod.pub_id = @PUBID AND prod.prod_discont = 0
>    -> ORDER BY feed_new.new_title\G
>*************************** 1. row ***************************
>           id: 1
>  select_type: SIMPLE
>        table: feed_new
>         type: index
>possible_keys: PRIMARY
>          key: PRIMARY
>      key_len: 767
>          ref: NULL
>         rows: 163
>        Extra: Using index
>*************************** 2. row ***************************
>           id: 1
>  select_type: SIMPLE
>        table: prod
>         type: ref
>possible_keys: pub_id,prod_title,pub_id_2,prod_title_fulltext
>          key: prod_title
>      key_len: 768
>          ref: giiexpr_db.feed_new.new_title
>         rows: 1
>        Extra: Using where
>================
>
>Obviously the string manipulation is keeping MySQL from using `prod_title`
>as
>a key, but I wouldn't have thought that using `pub_id` instead would be that
>
>horrific.
>
>Does anyone have any suggestions as to how to speed this business up? I
>can't
>get away without some string manipulation, because I'm looking for "near
>matches" by ignoring the year at the end of the title.
>
>Regards,
>
>Jerry Schwartz
>Global Information Incorporated
>195 Farmington Ave.
>Farmington, CT 06032
>
>860.674.8796 / FAX: 860.674.8341
>E-mail: jerry@stripped
>Web site: www.the-infoshop.com
>
>
>
>
>
>--
>MySQL General Mailing List
>For list archives: http://lists.mysql.com/mysql
>To unsubscribe:    http://lists.mysql.com/mysql?unsub=1
>




Thread
Slow query using string operatorJerry Schwartz10 Aug
  • RE: Slow query using string operatorTravis Ard11 Aug
    • Re: Slow query using string operatorJoerg Bruehe11 Aug
    • RE: Slow query using string operatorJerry Schwartz11 Aug
    • RE: Slow query using string operatorJerry Schwartz11 Aug