David Lennartsson wrote:
> I'm trying to build a small searchengine for documents.
> First I tried to
> build it with subselects, but we all know now that they aren't supported.
> After that I did it as in yesterday thread ( I stored the lWordRef in an
> variablelength string and did an select lDocref from doclink where
> lWordRef IN (...) ). It works well but an simple join may be nicer, BUT
> the joins turn out to join the whole tables if I do an szWord LIKE ... OR
> szWord LIKE ... .(takes about 20 seconds, toooo loong).. the join do space
> out totally if i search on szWord LIKE "%sql" (no index).. Doing it with
> select ... ; store in string select ...IN (...) makes it in about
> 10seconds when the join took about 3,5 minutes ;)
> I can't find a way around that except using the string and make a
> new select using IN (...). It's only one or two seconds slower on ordinary
> queries and it never fails. Can we speed it up?
> But my real question is about the next step.
> I do queries like to search for the word linux and mysql. It also returns
> the number of hits in each document. See below.
> mysql> select lDocRef,count(lDocRef) as freq from doclink,words where
> doclink.lWordRef = words.lWordRef and szWord LIKE "linux" or szWord LIKE
> "mysql" GROUP BY lDocRef;
> Now I also want to want to do queries like Find: (linux or mysql) and
> (answer or question).
> Right now i'm doing two selects like above, store the results in an linked
> list. Then I make an "itemwise"-AND on the two lists. In the new linked
> list I only have documents existing in both queries resultset. Of course I
> do add the total freq to the new list's datapart.
> Can I do all of the above in one nice query with joins, subselects or
> whatever? And not that slow either, it moves quite fast already..
> Thank you all for good discussions!
> /dle - there has to be a way to do joins with OR, or?
< cut >
JOINs are not always the best solution!
In your case the IN(...) method seems to be the best I can think of.
If you want to do the (linux or mysql) and (answer or question) thing with JOINs, then you
have to do a selfjoin of words under another alias.
Instead of using linked lists, you could use a temporary table within mysql.