On Mon, May 31, 1999 at 04:34:28PM +0200, Stefan Möhl wrote:
> Hello SQL-wizards!
> I have problems with building a query while doing free-text search. I
> have a database of documents stored in a table, each with a docID.
> Each word in a document is also stored in a word list. The word list
> consists of word-docID pairs, so that I can perform fast free-text
> searches for documents. I am having problems when constructing a
> query where I try to find all documents that do NOT match a keyword.
i'm not an sql wizard .. but the job you want to do is not a job
that mysql, postgresql, or any other database can do well, if at
all. i have been trying to bend mysql (postgresql has better
support for this kind of work, but mysql is faster, because it
lacks the inherent support required .. so take your pick) to make
this a reality .. and failed miserably.
as i said, mysql (*sql) is not the hammer to use for this nail.
better support can be had by setting up a wais database or by
using a tool called isearch that is specifically designed to do
this sort of mangling.
isearch is stable and works on freebsd, hence reliability
under load is not a problem.
> Is there a way of doing what I want in a single query?
not with the tools that you are using.
we have a small text pool here (about 4 gb) that requires
frequent repairs (out of sync real-world data and static text
descriptions). we have tried several ways of doing this using
current database technology. i (we) have come to the conclusion
that databases are good at juggling data, not words.
> my technique for free-text search is quite common, so someone must
> have encountered this problem before!
yes, the problem is common, the solution is also common, but
only if you use the right tools ... isearch, isite, or set up a
wais database and use the supplied clients to do the searching.
i found the hardest part was the decision to leave behind the
world of database building. we are, after all, talking about two
disparate worlds here.
> If this can not be done in one query it can of course easily be done
> in several: first I query to see if the keyword exists in the word
> list. If it does I pick out the documents that do not contain the
> keyword, otherwise I pick out all documents.
> Here I get a second problem: If I have many negative keywords, how do
> I in one query find out which keywords exist in the word table and
> which ones do not?
this is not a problem that needs solving, if the appropriate
tools are used.
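For readers who stay with SQL anyway, the several-query approach described in the quoted question can be sketched roughly as follows. This is only an illustration under assumptions: it uses Python's sqlite3 module as a stand-in for MySQL, and the table names `documents` and `wordlist`, the column names, and the helper `docs_without` are hypothetical, not taken from the original post. One query fetches every word-docID pair for the negative keywords, which also shows which keywords exist in the word list at all; the exclusion itself is done in application code.

```python
import sqlite3


def docs_without(conn, negative_keywords):
    """Return (docIDs containing none of the keywords,
    keywords that actually occur in the word list)."""
    # First query: every known document.
    all_docs = {row[0] for row in conn.execute("SELECT docID FROM documents")}
    if not negative_keywords:
        return all_docs, set()

    # Second query: all word-docID pairs for the negative keywords.
    # Keywords missing from the word list simply return no rows.
    placeholders = ",".join("?" * len(negative_keywords))
    rows = conn.execute(
        f"SELECT word, docID FROM wordlist WHERE word IN ({placeholders})",
        list(negative_keywords),
    ).fetchall()

    present = {word for word, _ in rows}        # keywords that exist
    excluded = {doc_id for _, doc_id in rows}   # documents to drop
    return all_docs - excluded, present


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (docID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE wordlist (word TEXT, docID INTEGER);
""")
conn.executemany("INSERT INTO documents VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])
conn.executemany("INSERT INTO wordlist VALUES (?, ?)",
                 [("apple", 1), ("pear", 1), ("apple", 2), ("plum", 3)])

docs, present = docs_without(conn, ["apple", "kiwi"])
```

Because a keyword that never occurs returns no rows, the "otherwise pick out all documents" case from the question falls out of the set difference without a separate existence check.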
please overlook my tone and spelling, i am an asocial, 35 percent
disabled person. i make no apologies for my demeanour or
spelling. that's the hand life dealt me at birth.
PO Box 144, Rosebery, NSW 1445 Australia