List: General Discussion
From: Mark Phillips
Date: September 24 2012 11:28pm
Subject: Need Help Converting Character Sets
I have a table, Articles, of news articles (in English) with three text
columns for the intro, body, and caption. The data came from a web page,
and the content was cut and pasted from other sources. I am finding that
there are some non-UTF-8 characters in these three text columns. I would
like to (1) convert these text fields to strict UTF-8 and then (2) fix
the input page so that all new submissions are stored as UTF-8.

(1) For the first step, fixing the current database, I tried:

update Articles set body = CONVERT(body USING ASCII);

However, when I checked one of the articles I found that an apostrophe had
been converted into a question mark. (FWIW, the apostrophe was one of the
offending non-UTF-8 characters):

Before conversion: "I stepped into the observatory’s control room ..."

After conversion: "I stepped into the observatory?s control room..."

Is there a better way to accomplish my first goal, without reading each
article and manually making the changes?
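
To make the symptom above concrete, here is a minimal sketch that reproduces the same substitution outside the database. It uses Python's codecs purely as a stand-in for the CONVERT(... USING ...) call, and it assumes (this is not confirmed by the data) that the pasted apostrophe is the typographic character U+2019, which Windows-1252 stores as the single byte 0x92.

```python
# Illustration only: Python codecs stand in for MySQL's CONVERT(... USING ...).
text = "I stepped into the observatory\u2019s control room ..."

# ASCII has no code point for U+2019, so a lossy conversion substitutes "?".
as_ascii = text.encode("ascii", errors="replace").decode("ascii")

# UTF-8 can represent the character, so a UTF-8 round trip keeps it intact.
as_utf8 = text.encode("utf-8").decode("utf-8")

# Assumption: the source text was Windows-1252, where the curly apostrophe
# is one byte (0x92). Decoding with the right source charset recovers it.
cp1252_bytes = text.encode("cp1252")
repaired = cp1252_bytes.decode("cp1252")

print(as_ascii)
print(as_utf8 == text, repaired == text)
```

The point of the sketch is that a conversion to ASCII is lossy by design, whereas a conversion that names the true source encoding and targets UTF-8 should keep the character.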

(2) For the second goal, ensuring that all future articles are UTF-8, do I
need to change the table structure or the insert query to ensure I get the
correct UTF-8 characters into the database?
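
One possible shape for the input-page side of this, sketched under assumptions: the helper name to_utf8_text and the Windows-1252 fallback are hypothetical and are not taken from the application described here. The idea is to accept bytes that already decode as strict UTF-8 and to reinterpret anything else with an assumed legacy charset before it reaches the insert query.

```python
def to_utf8_text(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode submitted bytes as strict UTF-8, else fall back to a legacy charset.

    Hypothetical helper: the fallback charset is an assumption about where
    the pasted content came from.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes undefined in the fallback charset become U+FFFD instead of failing.
        return raw.decode(fallback, errors="replace")


# Usage: both inputs end up as the same Unicode string.
already_utf8 = "observatory\u2019s".encode("utf-8")
legacy_bytes = b"observatory\x92s"  # curly apostrophe as a Windows-1252 byte
print(to_utf8_text(already_utf8) == to_utf8_text(legacy_bytes))
```

This only covers the application side; whether the table and connection character sets also need changing is the open question in (2).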

Thanks,

Mark

Thread
Need Help Converting Character Sets - Mark Phillips, 24 Sep
  • RE: Need Help Converting Character Sets - Rick James, 24 Sep
    • Re: Need Help Converting Character Sets - Derek Downey, 28 Sep
  • Re: Need Help Converting Character Sets - hsv, 28 Sep
    • RE: Need Help Converting Character Sets - Rick James, 28 Sep
      • Re: Need Help Converting Character Sets - Mark Phillips, 30 Sep
        • RE: Need Help Converting Character Sets - Rick James, 1 Oct
        • Re: Need Help Converting Character Sets - hsv, 2 Oct