What is the best way to match German umlauts like 'ä' together with
their alternative spelling 'ae'?
For example, when I search for "übersee" I also want to find the word
"uebersee" in the database. The "words" are actually names of people.
One possibility is to dynamically expand the SQL statement whenever such
special characters are found. The search term "übersee" would then be
expanded to "SELECT * FROM person WHERE name LIKE 'übersee%' OR name
LIKE 'uebersee%'". This gets messy and very long when the term contains
several umlauts, because every combination of spellings has to be
covered: n umlauts need 2^n LIKE clauses.
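
To make the growth concrete, here is a minimal Python sketch of the
expansion approach. The names ALTERNATIVES, expand_variants and
build_query are made up for illustration, the mapping (including 'ß' to
'ss') is my assumption, and the %s placeholders assume a MySQL driver
that uses that parameter style.

```python
from itertools import product

# Assumed mapping from special characters to their alternative spelling.
ALTERNATIVES = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

def expand_variants(term):
    """Return every spelling of term, keeping or replacing each umlaut."""
    choices = [
        (ch, ALTERNATIVES[ch]) if ch in ALTERNATIVES else (ch,)
        for ch in term
    ]
    return ["".join(parts) for parts in product(*choices)]

def build_query(term):
    """Build a parameterized prefix search with one LIKE per variant.

    Note: LIKE wildcards (% and _) inside term are not escaped here.
    """
    variants = expand_variants(term.lower())
    where = " OR ".join(["name LIKE %s"] * len(variants))
    sql = "SELECT * FROM person WHERE " + where
    params = [variant + "%" for variant in variants]
    return sql, params
```

A term with two umlauts such as "überseemöbel" already yields four
variants, and each further umlaut doubles the count.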
The other idea is to store the name twice for every person, where the
second "version" of the name is a normalized form in which all special
characters are replaced with their alternative spelling. E.g. I store
the field name "übersee" and also name2 "uebersee", and when matching I
match against name2. If the field contains more special characters, it
still works without any extra effort: if name is "überseemöbel", then
name2 is "ueberseemoebel", and when the term "überseemö" is entered it
is also normalized to "ueberseemoe", so the LIKE statement still
matches. Basically this is a primitive kind of normalization, similar
in spirit to the stemming Lucene does.
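
A rough sketch of the second idea in Python. I use an in-memory SQLite
database here only so the example runs on its own; it stands in for
MySQL, and the helper names normalize and find_people as well as the
character mapping are assumptions of mine. The table and column names
(person, name, name2) are the ones described above.

```python
import sqlite3

# Assumed mapping from special characters to their alternative spelling.
ALTERNATIVES = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

def normalize(text):
    """Lowercase text and replace each special character."""
    return "".join(ALTERNATIVES.get(ch, ch) for ch in text.lower())

def find_people(conn, term):
    """Prefix search against the normalized column name2."""
    pattern = normalize(term) + "%"
    rows = conn.execute(
        "SELECT name FROM person WHERE name2 LIKE ?", (pattern,)
    )
    return [row[0] for row in rows]

# SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, name2 TEXT)")
for name in ["Überseemöbel", "Ueberseemoebel", "Unterseeboot"]:
    # name2 is computed once, when the row is written.
    conn.execute(
        "INSERT INTO person (name, name2) VALUES (?, ?)",
        (name, normalize(name)),
    )
```

With this, find_people(conn, "überseemö") and
find_people(conn, "ueberseemoe") run the same single LIKE against name2,
no matter how many umlauts the term contains.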
Does MySQL have any built-in support for such special cases?
Thanks for any pointers,
- Markus