List: General Discussion
From: Markus Fischer
Date: March 22 2006 2:35pm
Subject: Matching of german umlauts with LIKE

Hi,

what is the best way to match German umlauts like 'ä' and also their
alternative writing 'ae'?

For example I'm searching for "übersee" and I also want to find the word
"uebersee" in the database. The "words" are actually names of persons.

One possibility is to dynamically expand the SQL statement if such
special characters are found. So the search term "übersee" will be
expanded to "SELECT * FROM person WHERE name LIKE 'übersee%' OR name
LIKE 'uebersee%'", but this gets dirty and very long if the term
contains multiple umlauts and all combinations have to be covered ...
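
A minimal Python sketch of this expansion, only to illustrate how fast it
grows. The mapping table, the function names and the "%s" placeholder
style (as used by common MySQL drivers for Python) are assumptions, not
something MySQL provides. The variants are joined with OR, since a stored
name can only match one spelling at a time.

```python
from itertools import product

# Assumed mapping from lowercase umlauts to their two-letter spelling.
ALTERNATIVES = {"ä": "ae", "ö": "oe", "ü": "ue"}


def expand_variants(term):
    """Return every spelling of term, each umlaut either kept or replaced."""
    choices = [
        (ch, ALTERNATIVES[ch]) if ch in ALTERNATIVES else (ch,)
        for ch in term
    ]
    return ["".join(parts) for parts in product(*choices)]


def build_query(term):
    """Build a parameterized prefix search with one LIKE per variant."""
    variants = expand_variants(term)
    where = " OR ".join(["name LIKE %s"] * len(variants))
    sql = "SELECT * FROM person WHERE " + where
    params = [variant + "%" for variant in variants]
    return sql, params
```

A term with n umlauts produces 2^n LIKE conditions, so "überseemöbel"
already needs four of them.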

So the other idea is to have the name twice in the database for every
person, where the second "version" of the name is a normalized form in
which all special characters are replaced with their alternative writing.
E.g. I store "übersee" in the field name and also "uebersee" in name2,
and when matching I match against name2. If the field contained more
special characters it would still work without much more work, e.g. if
name is "überseemöbel" then name2 would be "ueberseemoebel", and when the
term "überseemö" is entered it's also normalized to "ueberseemoe" and the
LIKE statement will still match. Basically this is some kind of
primitive stemming, like Lucene does.
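
A rough Python sketch of the normalized-column idea. The table and field
names (person, name, name2) are the ones used above; the helper names are
made up for illustration, and an in-memory SQLite database stands in for
MySQL only so that the example runs on its own.

```python
import sqlite3

# Assumed replacement table; extend it for other special characters.
UMLAUTS = str.maketrans(
    {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}
)


def normalize(text):
    """Replace umlauts with their alternative writing and lowercase."""
    return text.translate(UMLAUTS).lower()


def add_person(conn, name):
    """Store the original name plus its normalized form in name2."""
    conn.execute(
        "INSERT INTO person (name, name2) VALUES (?, ?)",
        (name, normalize(name)),
    )


def search(conn, term):
    """Prefix search: normalize the term the same way, match on name2."""
    rows = conn.execute(
        "SELECT name FROM person WHERE name2 LIKE ?",
        (normalize(term) + "%",),
    )
    return [row[0] for row in rows]


# SQLite is only a stand-in here so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, name2 TEXT)")
for person_name in ("überseemöbel", "uebersee", "übersee", "meier"):
    add_person(conn, person_name)
```

With this data, search(conn, "überseemö") finds "überseemöbel", and a
search for either "übersee" or "uebersee" finds all three matching names
with a single LIKE condition.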

Is there maybe some built-in support from MySQL for such special cases?

thanks for any pointers,
- Markus