List: General Discussion
From: Michael Stassen
Date: July 25 2004 5:15am
Subject: Re: Fulltext boolean search results
leegold wrote:

> <Michael.Stassen@stripped> said:
> 
>> From the manual
>> <http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html>:
>>
>>
>>>MySQL uses a very simple parser to split text into words. A word is any
>>>sequence of characters consisting of letters, digits, ', or _. Some
>>>words are ignored in full-text searches:
>>>
>>> Any word that is too short is ignored...
>>
>>'.' and '-' are non-word characters, so they are treated as word separators.
>>Hence, your query is asking for documents containing 'BT', '1034', and
>>'06'.  The first and last are too short, so they are dropped, resulting in
>>a search for just '1034'.  Documents are indexed similarly, so each of the
>>examples you give is indexed as '1034' only; the parts before and after
>>are too short and not indexed.  So you get the results you indicate.
>>
>>When you add the quotes, results which match '1034' are then filtered for 
>>matches containing the exact text in quotes, "BT-1034.06" in this case, 
>>yielding the result you want.
> 
> 
> Thanks for the explanation.
> BT-1034.02 is one of many primary keys. I suppose
> Fulltext is a natural language search and not the
> tool to use when searching for specific primary keys.
> 
> I could regex user search input and if I see anything
> between a '-' or '.' that's less than 4 chars I could wrap
> the whole string in double quotes(?)
> 
> Wonder what Google or Yahoo do?
> 
> Is there a way around this? Why is it the default?
> 
> Lee

It's the default because that's the purpose of the full-text index. 
Full-text matching is designed to find words in text.  It isn't designed to 
find serial numbers and the like.  Those should go in their own columns, if 
at all possible, where they can be searched directly.
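
To make the effect concrete, here is a rough Python sketch of the word splitting described in the manual excerpt quoted above. It is only an approximation, not MySQL's actual parser: the function name is made up for illustration, the minimum length of 4 is assumed to match the default ft_min_word_len setting, and stopwords are ignored entirely.

```python
import re

# Assumption: minimum indexed word length of 4 (MySQL's default ft_min_word_len).
MIN_WORD_LEN = 4

# Word characters per the manual excerpt: letters, digits, ' and _.
# Everything else, including '-' and '.', acts as a separator.
WORD_RE = re.compile(r"[A-Za-z0-9'_]+")

def fulltext_words(text, min_len=MIN_WORD_LEN):
    """Approximate which words a full-text index would keep from text.

    Hypothetical helper for illustration; stopword handling is omitted.
    """
    return [w for w in WORD_RE.findall(text) if len(w) >= min_len]

print(fulltext_words("BT-1034.06"))  # ['1034']
print(fulltext_words("BT-1034.02"))  # ['1034']
```

Both keys reduce to the same single word, which is why an unquoted search cannot tell them apart.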

I'm not sure what you mean by primary keys, as a table can have only one 
*primary* key.  In any case, it's hard to advise you on a solution without 
knowing more about your situation.  In general, I'd probably suggest that 
keys like BT-1034.02 should be in their own indexed column, and should have 
a corresponding separate input box on the search form, rather than being 
part of the full-text search.
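
As a rough illustration of the two-input-box idea, the Python sketch below builds a parameterized query from a key field and a free-text field. Everything named here is an assumption for the example: the table `parts`, the columns `part_no` and `description`, and the helper `build_search` are hypothetical, and the `%s` placeholders follow the style commonly used by Python MySQL drivers.

```python
def build_search(part_no=None, text=None):
    """Return (sql, params) for a search form with two separate inputs.

    Hypothetical schema: table `parts` with an indexed `part_no` column
    and a FULLTEXT index on `description`.
    """
    clauses, params = [], []
    if part_no:
        # Exact lookup on the indexed key column; no full-text parsing involved.
        clauses.append("part_no = %s")
        params.append(part_no.strip())
    if text:
        # Free text goes to the full-text index in boolean mode.
        clauses.append("MATCH(description) AGAINST (%s IN BOOLEAN MODE)")
        params.append(text)
    if not clauses:
        raise ValueError("nothing to search for")
    sql = "SELECT * FROM parts WHERE " + " AND ".join(clauses)
    return sql, tuple(params)

sql, params = build_search(part_no="BT-1034.06")
print(sql)
print(params)
```

With the key in its own column, the lookup is an ordinary indexed equality match, so word length and separator rules never come into play.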

Michael
