On 22-Apr-99 chas wrote:
> Don Read wrote :
>>select a.docid from search as a, search as b
>> where a.docid=b.docid and (a.wordid=45 and b.wordid=49);
>
> wow... I'm pretty much speechless Don - that worked a treat
> and was so succinct; you just made several script functions
> redundant :)
>
> however, I tried using this technique for intersecting 3 or 4
> words and kept getting empty sets. (I'll have to play around with
> it a little more).
>
> That's when I also saw the mail from Robin Bowes :
>>The usual way of doing this is:
>>
>>select
>> s1.docid
>>from
>> search s1
>> join search s2
>>where
>> s1.docid = s2.docid
>>and
>> s1.wordid = 49
>>and
>> s2.wordid = 45
>
> This is easier for me to get my head around and seems to work
> well for the 3 word intersection :
>
> select s1.docid
> from
> search s1
> join search s2
> join search s3
> where
> s1.docid = s2.docid
> and
> s2.docid = s3.docid
> and
> s1.wordid = 49
> and
> s2.wordid = 45
> and
> s3.wordid = 1
> ;
>
> Or perhaps I've just been lucky so far and got coincidental results
> so far.
>
Whatever your eyeball can grok:
select a.docid from search as a,search as b,search as c
where a.docid=b.docid
and b.docid=c.docid
and a.wordid=49
and b.wordid=45
and c.wordid=1;
>
> Also from Don Read :
>>Also you might want to checkout the SOUNDEX() functions for your
>>word-id's.
>
> This was the first time I'd even heard of SOUNDEX() - sorry, I'm
> not strong at DB's - so i dug through the archives; also, using
> the exceptionally-accurate search engine infoseek, i found :
>
> http://www.hpl.lib.tx.us/clayton/soundex.html
> http://physics.nist.gov/cuu/Reference/soundex.html
> http://www.outfitters.com/genealogy/what-soundex.html
>
> Now, since my search engine is for Chinese as well as English text,
> I guess this is probably not appropriate. Right ?
>
http://www.mysql.com/Manual_chapter/manual_Reference.html#String_functions
> But, even if it were all English, would it really be advisable to
> use soundex() ? Doesn't it lead to inaccurate results due to the
> number of words that could be soundex'ed to the same value ?
>
Depends on the application; in my case I update a couple~4 tables of addresses;
SOUNDEX'ing street,sub-divisions and city-names for order routing and
delivery stats.
Some of our data entry folks can't type (or spell).
all the same sub-division :
RIVERPLACE - RIVER PLACE - RIVVER PL - RIVER PALCE - ...
'tis a major pain to exact match the possibilities, so SOUNDEX() gives a bit
of fuzziness to the help automate the search. I only put in a hour or so
each week to fix-up the truly fubar orders.
Just a thought. (YMMV).
Regards,
---
Don Read sysop@stripped
EDP Manager dread@stripped
Calcasieu Lumber Co. Austin TX
-- Any clod can have the facts, but having an opinion is an art.