On 22-Apr-99 chas wrote:
<snip>
>
> And then the occurences of each word in each document (and the
> number of occurences, used for accuracy purposes) are stored in
> a third 'search' table :
>
> Table : "search"
>
> wordid | docid | numtimes
> ------------------------------
> 367 | 47 | 8
> 368 | 47 | 2
> 367 | 45 | 4
>
>
>
<snip again>
>
> Now for my problem :
> --------------------
> Now, if the user searches for "Happy world", I can find all the
> document IDs where the wordid is 369 (ie. "happy") and then
> all the document IDs where the wordid is 368 (ie. "world")
> but is it possible to use SQL to return the intersection of the
> two ? (At the moment, I'm using Python but this doesn't seem
> scalable)
>
> For example, this is taken from my current database with
> some common words whose word ID's are 49 and 45 :
>
>
> Assuming that documents which have both words are more likely to be correct,
> how can I find the intersection ? I can't use either :
> 'select search.docid where wordid = 45 or wordid = 49' ... produces a UNION
> or
> 'select search.docid where wordid = 45 and wordid = 49' ... always EMPTY
>
select a.docid from search as a, search as b
where a.docid=b.docid and (a.wordid=45 and b.wordid=49);
Also you might want to checkout the SOUNDEX() functions for your
word-id's.
Regards,
---
Don Read sysop@stripped
EDP Manager dread@stripped
Calcasieu Lumber Co. Austin TX
-- Any clod can have the facts, but having an opinion is an art.