At 09:08, 19990818, Don Philtrodt wrote:
>Questions: What exactly (if anything) was "promised" by TcX regarding
>indexing of TEXT fields? (I see mention in the change log of indexing
>on T/B fields, but what gives?)
Indexing of TEXT fields is already in MySQL 3.23. This is a normal
index (key), nothing special. You have to specify how much of the
field you want to index (i.e., the first N characters). I don't
think this is what you're really trying to get at, though.
>Will the entire TEXT field be indexed, or just a portion? Will
>extensions be made to the language that allow for "fuzzy" stuff like
>"NEAR", automagic word-stemming, plural-matching, exact matches and
>boolean operators?
SerG (not at TcX) had plans to do full-text indexing on TEXT fields a
while ago. Things were delayed until 3.23 came out, because there was
some work in 3.23 that was needed for a good implementation. Now SerG
is delayed with other things, but still has plans to work on this. I
believe it will have NEAR and boolean operators. I don't know what
other features are planned, but I'm sure that if there's no stemming,
etc., then you'd be welcome to hack it in yourself.
>On a related note... has anybody used/implemented a do-it-yourself
>word searcher using SOUNDEX()? Are SOUNDEX searches remotely
>intuitive or more like "why the heck did THAT show up?!"
I think the matches are usually pretty far off for most applications.
For some things (like looking up a name or address in a list of
customers) it can be useful. That's just my experience.
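
To get a feel for why the matches can be surprising, here is a rough
sketch of the classic four-character American Soundex in Python. This
is my own illustration, not MySQL's code, and MySQL's SOUNDEX() may
differ in detail (I believe it can return more than four characters).
The function name soundex() is just something I made up for the
example.

```python
def soundex(word):
    """Classic 4-character American Soundex (illustrative sketch only)."""
    # Consonant groups that share a digit; vowels, Y, H and W get no digit.
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    # Keep only ASCII letters, uppercased.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]  # pad or truncate to letter + 3 digits


for name in ("Robert", "Rupert", "Smith", "Schmidt"):
    print(name, soundex(name))
```

Robert and Rupert come out with the same code, and so do Smith and
Schmidt. That is handy for a customer lookup, but it is also where the
"why did THAT show up?!" results come from.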
><think mode=out_loud>
>I wonder if there's another model that, like Soundex, converts a word
>to a value, but perhaps has accommodations for msipelings, plurals, and
>tenses.
></think>
I think you could do it, but it'd have to be language-specific (which
might not be a big deal to you, if you're only using English and it's
written for English). It might help for it to be domain-specific,
too (e.g., I'm looking for a surname). I haven't read any papers on
this, though. If you find a good one, let me know.
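
Just to show the sort of thing I mean, here is a very crude,
English-only sketch in Python. It is not a real stemmer or any
published algorithm; the suffix list, the 3-letter minimum stem, and
the name word_key() are all assumptions I picked for the example. It
strips one common plural/tense ending, drops vowels after the first
letter, and collapses doubled letters.

```python
# Hypothetical suffix list; checked in order, longest first.
SUFFIXES = ("ings", "ing", "ies", "ed", "es", "s")
VOWELS = set("aeiou")


def word_key(word):
    """Map a word to a rough key that ignores endings, vowels, doubles."""
    w = "".join(c for c in word.lower() if c.isalpha())

    # Strip at most one suffix, and only if at least 3 letters remain.
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            w = w[:-len(suffix)]
            break
    if not w:
        return ""

    # Keep the first letter, then consonants only, skipping repeats.
    key = w[0]
    for c in w[1:]:
        if c in VOWELS:
            continue
        if c != key[-1]:
            key += c
    return key


for w in ("misspellings", "msipelings", "search", "searches", "searched"):
    print(w, word_key(w))
```

With these rules "misspellings" and "msipelings" get the same key, as
do "search", "searches" and "searched". It will also lump together
plenty of words that have nothing to do with each other, which is the
same trade-off as Soundex.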
Tim