After hearing both sides being suggested, I think I would recommend the
more normalized approach of having one or more separate tables for the
names (one name per row) and a mapping table that associates a name field
(mother's maiden name, decedent's given name, decedent's surname, etc.) on
a record (marriage, birth, death, etc.) to the possible names the
transcriptionist thinks were written on the record. A simple index would
very quickly find all full matches and partial beginning matches. With
mid-word or word-ending matches**, the engine will still need to scan the
entire list of names (not every name in the database, just the unique list
of spellings which will be much smaller) .
This (the multi-table design) may add a little complexity to your GUI code
but I think that the performance improvement on the searches will more
than offset a little extra effort added to the front end and data entry
portions of your process.
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
** You could optimize word-ending searches by storing a copy of the name
reversed. Then a regular index could also be used to quickly find the end
of a name.
Lets say you want to find all of the names that end in "esh"... You have
names like "Abesh", "Janesh", and "Phresh". If you added another column
(indexed of course) to your names table that stored the names as "hsebA",
"hsenaJ", and "hserhP" a search using LIKE 'hse%' would use that index
where a search using LIKE '%esh' could not.
The trade off is that you nearly double the storage size of your names
tables (one extra column plus its index) but you seriously improve search
performance for an entire class of queries (names ending with ....).
"Jigal van Hemert" <jigal@stripped> wrote on 01/06/2005 09:30:08 AM:
> From: "Kentucky Families"
>
> > A full-text search won't work. Too many records -- at least 8 million.
> From what I've read, full-text won't pick up the individual initials or
very
> short names, like Vu, Lee, Doe, etc.
>
> You can set the ft_min_word_len in an option file in the [mysqld]
section to
> change the minimum word length (rebuild the full text indexes
afterwards!)
>
> Maybe alter the data a bit?
> If you store only firstnames you can search for J* to pick up an
initial.
> If you only know initials simply use Jxzx as a first name (and let your
> application filter out the xzx part)
> If a name is shorter than four letters add xzx to the name and filter
out in
> the application that shows the data.
>
> Regards, Jigal.
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=1
>