Robb Kerr <rkerr.news@stripped> wrote on 11/25/2005 11:59:48 AM:
> I'm building a new clipart site. I need to have keyword searching. I'm
> seeking opinions about table design. Here are my proposed options. If
> anyone has any other suggestions, please make them.
> Option One
> Related tables. Table one (clipart pieces) contains ClipartID and
> ClipartName fields. Table two (keywords) contains KeywordID, ClipartID
> Keyword fields. This option will create an incredibly large related
> (keywords) with each piece of clipart having tens of related fields in
> keyword table. But, searching ought to be fast.
> Option Two
> Single table. Table one (clipart pieces) contains ClipartID, ClipartName
> and Keywords fields. The Keywords field would be a long text field that
> would be searched with a full-text search. Searching and maintenance
> be easier but would searching be slowed down significantly?
> Please provide any input you have and make any alternate suggestions.
> Robb Kerr
> Digital IGUANA
If speed and flexibility is your priority, use option one.
Remember: FT indexing will skip all stopwords and words smaller than the
minimum FT length. By default, that length is set to 4 but there are ways
to make it smaller and you can also provide an empty stopword list so you
can get around that, too.
For option 1 I think you need a third table that maps keywords to clipart.
That way you only need to store the string value of a keyword once. It
will keep your keywords table smaller and make lookups faster. Technically
speaking, doing it this way "normalizes" your keyword data. Normalized
databases most often perform much better than denomalized databases. There
are some notable exceptions but as a general rule this is true.
Making a separate keyword table makes it easier to search for a list of
keywords and locate not just full matches but partial matches, too
(matched 6 of 8 search terms). The FT search returns a relevance number
but as applied to "smaller" chunks of text (a list of 20 or 30 keywords or
key phrases), that may not be very useful.
Search the archives for various techniques of finding lists of values from
normalized data. This type of question has appeared frequently.
Unimin Corporation - Spruce Pine