List:General Discussion« Previous MessageNext Message »
From:Baron Schwartz Date:May 4 2007 8:45pm
Subject:Re: How do I find products when a user types freeform strings like
'Sony 20" TV' or '20" Sony TV'?
View as plain text  
Hi Daevid,

Daevid Vincent wrote:
>> -----Original Message-----
>> From: Baron Schwartz [mailto:baron@stripped] 
>>
>> Daevid Vincent wrote:
>>> I'm having trouble figuring out the logic/query I want.
>>> I know that all those "OR"s are not right.
>>> I'm doing this in PHP and mySQL (of course), 
>>> so if it can't be done with a single query, I can split it up.
>>>
>>> Here's the challenge, given a text field search box, someone enters:
>>>
>>> 	Sony 20" TV
>>>
>>> How do I search for that, not knowing which fields are which?
>>> For example, they could have also entered:
>>>
>>> 	20" Sony TV
>> I think you're describing full-text indexing.  MySQL supports 
>> it but only on MyISAM 
>> tables.  If you don't want to use MyISAM, full-text search 
>> engines like Lucene or Sphinx may be worth looking at.
> 
> I don't think I am. While full-text indexing might help since 
> the indexes would be faster. I think this is a logic issue.
> 
> The full-text index would be useful on a TEXT or BLOB or some 
> long varchar field, but it doesn't solve that I'm trying to 
> pull from two different tables, Product and Company and mapping 
> the free-form string to fields that could be one of several.
> 
> I think my first attempt is close, but it's something to do with 
> all the AND and OR combinations that's not right.
> 
> My version gives many results because it matches ("SONY" OR "TV" OR "20"). 
> I need it to match ("SONY" AND "TV" AND "20")
> 
> But this isn't it either (returns 0 results) because some fields, 
> like the categories.name, products.upc and products.model don't match 
> so the entire condition fails.
> 
> SELECT products.*, companies.name AS company_name, categories.name AS category_name 
> FROM 	 products 
> 	 LEFT JOIN companies ON company_id = companies.id 
> 	 LEFT JOIN categories ON category_id = categories.id 
> WHERE  products.enabled = 1 
> 	AND( 
> 	     (products.model LIKE 'sony%'   OR products.model LIKE '20%'  OR products.model
> LIKE 'tv%') 
> 	 AND (products.upc LIKE 'sony'      OR products.upc LIKE '20'     OR products.upc
> LIKE 'tv') 
> 	 AND (products.name LIKE '%sony%'   OR products.name LIKE '20%'   OR products.name
> LIKE '%tv%') 
> 	 AND (companies.name LIKE 'sony%'   OR companies.name LIKE '20%'  OR companies.name
> LIKE 'tv%') 
> 	 AND (categories.name LIKE '%sony%' OR categories.name LIKE '20%' OR categories.name
> LIKE '%tv%') 
> 	) 
> ORDER BY categories.name DESC, products.name ASC, companies.name ASC;
> 
> Also, the 'SONY' is really the companies.name, 
> while the '20"' _AND_ 'TV' together form '20" TV' to make the products.name.
> 
> +------------------+--------------+--------------+--------------+
> | name             | model        | upc          | company_name |
> +------------------+--------------+--------------+--------------+
> | 20" TV           | STV20-KVR-HD | 097855008633 | Sony         | 
> | 36" TV           | STV36-KVR-HD | 087452047023 | Sony         | 
> ....
> +------------------+--------------+--------------+--------------+
> 
> One way might be to do three separate queries, one for each word.
> Then store them in an array and compare the overlaps, removing
> any that aren't shared. Then a final query where product.id IN(array)
> 
> That seems extremely inefficient and hackish though.

I misunderstood what you meant at first.

So, you want rows where all words appear in the row, rather than where all words appear 
in any one column.  How about this: every time you get a match on a term in any column, 
count it as 1.  Then add all these and compare to the number of words in your search 
input, which you can determine either in SQL with a little text wrangling, or probably 
more easily in whatever the client code is with a split() or similar.

SELECT products.*, companies.name AS company_name, categories.name AS category_name
FROM 	 products
	 LEFT JOIN companies ON company_id = companies.id
	 LEFT JOIN categories ON category_id = categories.id
WHERE  products.enabled = 1
	AND(
	     (products.model LIKE 'sony%'   + products.model LIKE '20%'  products.model LIKE 
'tv%')
	 + (products.upc LIKE 'sony'      + products.upc LIKE '20'     + products.upc LIKE 'tv')
	 + (products.name LIKE '%sony%'   + products.name LIKE '20%'   + products.name LIKE 
'%tv%')
	 + (companies.name LIKE 'sony%'   + companies.name LIKE '20%'  + companies.name LIKE 
'tv%')
	 + (categories.name LIKE '%sony%' + categories.name LIKE '20%' + categories.name LIKE 
'%tv%')
	)
         >= [$number_of_words_in_input]
ORDER BY categories.name DESC, products.name ASC, companies.name ASC;

This will be ugly and inefficient though.  It might be better to build and maintain a 
separate table with the concatenation of all the fields, and fulltext index that.

Baron
Thread
How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'?Daevid Vincent4 May
  • Re: How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'?Mogens Melander4 May
  • Re: How do I find products when a user types freeform strings like'Sony 20" TV' or '20" Sony TV'?Baron Schwartz4 May
    • RE: How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'?Daevid Vincent4 May
      • Re: How do I find products when a user types freeform strings like'Sony 20" TV' or '20" Sony TV'?Baron Schwartz4 May
      • Re: How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'?Mayssam Sayyadian7 May
  • Re: How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'?Iain Alexander10 May
    • RE: How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'?Daevid Vincent10 May
  • RE: How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'? [SOLVED]Daevid Vincent11 May