List: General Discussion
From: Waynn Lue
Date: May 10 2008 1:36pm
Subject: latin1 vs UTF-8

I'm trying to store the symbol (R) (that's the registered trademark
symbol) in my database, but I get a weird Ctrl-A (^A) character
whenever I try.  At first, I thought it was because I was calling
htmlentities without passing in "UTF-8" as the last argument, but that
only solved one of my problems.  Then I spent some time looking at
encodings, and I'm trying to figure out if the fact that the charset
is set to latin1 is the reason why.
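
For reference, here is a minimal Python sketch of how the registered trademark sign is represented under each encoding, and what happens when UTF-8 bytes are read back as latin1. It only illustrates the general mismatch; it does not by itself account for the ^A character, and the variable names are made up for the example.

```python
# The registered trademark sign, U+00AE.
symbol = "\u00ae"

# latin1 stores it as a single byte; UTF-8 needs two bytes.
latin1_bytes = symbol.encode("latin-1")
utf8_bytes = symbol.encode("utf-8")

# If UTF-8 bytes are interpreted as latin1, each byte becomes its own
# character, so one symbol turns into two.
misread = utf8_bytes.decode("latin-1")

print(latin1_bytes, utf8_bytes, repr(misread))
```

If the client sends UTF-8 bytes while the connection or column is treated as latin1 (or the other way around), this kind of mismatch is one place where stored text can end up garbled.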

Assuming it is, is there anything I can do to avoid having to dump the
database and recreate it with the other encoding?  I've spent some
time tonight looking on the web and at MySQL's documentation on
charsets, and these are the options I've come up with.

alter table TABLE_NAME convert to character set utf8;
I'm assuming this is just a regular alter table, which means I'm going
to have to take down my server for the duration of the change, which
could take a long time.

http://www.oreillynet.com/onlamp/blog/2006/01/turning_mysql_data_in_latin1_t.html
This is essentially dumping and recreating the db.

iconv
This was mentioned somewhere, but no one had a concrete implementation.
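
As a rough idea of what the iconv route amounts to, here is a small Python sketch that transcodes a dump file from latin1 to UTF-8, roughly what `iconv -f latin1 -t utf-8` does. The function names and file paths are hypothetical, and it assumes the bytes in the dump really are latin1. Any charset declarations inside the dump (for example in the CREATE TABLE statements) would still have to be changed separately.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode latin1 bytes as UTF-8 (hypothetical helper)."""
    # Assumption: Python's "latin-1" codec (ISO-8859-1) matches the data.
    # MySQL's latin1 is closer to Windows-1252, so "cp1252" may be the
    # better choice if the data uses bytes in the 0x80-0x9F range.
    return data.decode("latin-1").encode("utf-8")


def convert_dump(src_path: str, dst_path: str, chunk_size: int = 1 << 20) -> None:
    """Transcode a whole dump file in chunks (hypothetical helper)."""
    # latin1 is one byte per character, so splitting the input at
    # arbitrary chunk boundaries cannot cut a character in half.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(latin1_to_utf8(chunk))
```

Usage would be something like `convert_dump("dump_latin1.sql", "dump_utf8.sql")`, with both file names made up here.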

Also, are there any gotchas in doing this?  I assume I should check if
my mysql has support for UTF-8, that I need to issue SET NAMES 'utf8';
or put it into my.cnf, and that my php code needs to output the
headers with UTF-8 as well.

Thanks,
Waynn