List: General Discussion
From: Dan Nelson
Date: January 31 2007 10:16pm
Subject: Re: Fulltext relevance and weighting....
In the last episode (Jan 31), Mike Morton said:
> Mike:
> 
> :)  I wish!  Free ham for everyone!
> 
> I had already changed the min length to 2 actually - so that is not the
> deciding factor...
> 
> It is more an issue of prioritizing fields for relevance, and whether it
> is possible to do this within a fulltext query, or whether it needs to be
> done through multiple queries, and then processing those query results
> "outside" in PHP....

You should be able to do what you need by making your 'score'
expression something like this:

select *,
 match(code) against ('ham*' in boolean mode) * 8 +
 match(name) against ('ham*' in boolean mode) * 4 +
 match(small_desc) against ('ham*' in boolean mode) * 2 +
 match(large_desc) against ('ham*' in boolean mode)
 as score
from products
where active='y' and site like '%,1,%' and
 match(code,name,small_desc,large_desc) against ('ham*' in boolean mode)
order by score desc;

This takes advantage of the fact that a boolean mode match on a single
term like 'ham*' returns 1 or 0, so a record matching in the "code"
field (score of at least 8) will sort higher than a record with "ham"
in all 3 of the others but not in "code" (score 4 + 2 + 1 = 7).

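To check the arithmetic without a database, here is a minimal sketch in
Python. It assumes each per-field match contributes exactly 1 or 0, as
described above; the helper names weights() and score() are made up for
illustration and are not part of MySQL or PHP.

```python
# Field names taken from the query above, highest priority first.
FIELDS = ["code", "name", "small_desc", "large_desc"]


def weights(fields):
    """Give each field double the weight of the next one (8, 4, 2, 1)."""
    n = len(fields)
    return {f: 2 ** (n - 1 - i) for i, f in enumerate(fields)}


def score(matched, fields=FIELDS):
    """Sum the weights of the fields that matched.

    Assumption: every per-field match counts as exactly 1, so the score
    is just the sum of the weights of the matching fields.
    """
    w = weights(fields)
    return sum(w[f] for f in matched)


# A match in "code" alone beats a match in all three other fields.
print(score(["code"]))                              # 8
print(score(["name", "small_desc", "large_desc"]))  # 7
```

Because each weight is larger than the sum of all the weights below it,
a match in a higher-priority field can never be outvoted by matches in
lower-priority ones. Adding a fifth field would mean starting at 16.
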
> >> Does anyone have any suggestions on how to solve the result
> >> weighting problem? I have a client whose search results are
> >> becoming more and more important, and the relevance demands on the
> >> results are not entirely satisfactory...
> >> 
> >> The fields that are searched are code, name, small description and
> >> large description, and are ranked in relevance in that order.
> >> 
> >> For example, a product with the name: "Bone-In Serrano Ham" should
> >> ALWAYS outweigh the product with the name of "Boneless Jamon
> >> Iberico", even if the Jamon Iberico has the word "ham" in the
> >> description 20 times more than the Serrano product...
> >> 
> >> The query that is being run is: select
> >> *,match(code,name,small_desc,large_desc) against ('ham*') as score
> >> from products where active='y' and site like '%,1,%' and
> >> match(code,name,small_desc,large_desc) against ('ham*' IN BOOLEAN
> >> MODE) order by score desc
> >> 
> >> It returns some good relevant matches, but then in the middle of
> >> the product names with "ham" in them, it returns one without....
> >> 
> >> Does this require a complete logic switch, or is there a way to
> >> build a query to do this?
> >> 
> >> Obviously the actual build of the query is more complex, and there
> >> are other rules that need to be applied to the user submitted
> >> query, but this is the basics...
> >> 
> >> If there is a Fulltext search relevance expert out there in list
> >> land, I am at my wits' end trying to make the results the most
> >> relevant that they can be - I am willing to work closely with
> >> (pay) someone with the knowledge and expertise to assist in this. 
> >> (using PHP)

-- 
	Dan Nelson
	dnelson@stripped