List:Commits
From:stefan Date:June 26 2006 4:07pm
Subject:svn commit - mysqldoc@docsrva: r2517 - trunk/refman-common
Author: shinz
Date: 2006-06-26 18:07:34 +0200 (Mon, 26 Jun 2006)
New Revision: 2517

Log:
Attached PeterG's CJK FAQ to the Charset chapter (the actual FAQ was missing in my last commit ;-)

Added:
   trunk/refman-common/cjk-faq.en.xml

Added: trunk/refman-common/cjk-faq.en.xml
===================================================================
--- trunk/refman-common/cjk-faq.en.xml	                        (rev 0)
+++ trunk/refman-common/cjk-faq.en.xml	2006-06-26 16:07:34 UTC (rev 2517)
@@ -0,0 +1,1353 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
+          "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"
+          [
+          <!ENTITY % fixedchars.entities  SYSTEM "fixedchars.ent">
+          %fixedchars.entities;
+          <!ENTITY % title.entities       SYSTEM "titles.en.ent">
+          %title.entities;
+          ]>
+<section id="cjk-faq">
+
+  <title>&title-cjk-faq;</title>
+
+  <indexterm type="concept">
+    <primary>CJK</primary>
+    <secondary>FAQ</secondary>
+  </indexterm>
+
+  <indexterm type="concept">
+    <primary>Chinese, Japanese, Korean character sets</primary>
+    <secondary>frequently asked questions</secondary>
+  </indexterm>
+
+  <indexterm type="concept">
+    <primary>Japanese, Korean, Chinese character sets</primary>
+    <secondary>frequently asked questions</secondary>
+  </indexterm>
+
+  <indexterm type="concept">
+    <primary>Korean, Chinese, Japanese character sets</primary>
+    <secondary>frequently asked questions</secondary>
+  </indexterm>
+
+  <para>
+    This Frequently-Asked-Questions section comes from the experiences
+    of MySQL's Support and Development groups, after handling many
+    enquiries about CJK (Chinese Japanese Korean) issues.
+  </para>
+
+<!-- This list can be removed if TOC shows up correctly
+  <para>
+    Contents:
+
+    <itemizedlist>
+
+      <listitem>
+        <para></para>
+      </listitem>
+
+      <listitem>
+        <para>
+          SELECT shows non-Latin characters as "?"s. Why?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Troubles with GB character sets (Chinese)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Troubles with Big5 character set (Chinese)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Troubles with character-set conversions (Japanese)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          The Great Yen Sign Problem (Japanese)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Troubles with euckr character set (Korean)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          The "Data truncated" message
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Troubles with Access (or Perl) (or PHP) (etc.)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          How can I get old MySQL-4.0 behaviour back?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Why do some LIKE and FULLTEXT searches fail?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          What CJK character sets are available?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Is character X available in all character sets?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Strings Don't Sort Correctly in Unicode (I)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Strings Don't Sort Correctly in Unicode (II)
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          My supplementary characters get rejected
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Shouldn't it be CJKV (V for Vietnamese)?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Will MySQL fix any CJK problems in version 5.1?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          When will MySQL translate the manual again?
+        </para>
+      </listitem>
+
+      <listitem>
+        <para>
+          Whom can I talk to?
+        </para>
+      </listitem>
+
+    </itemizedlist>
+  </para>
+-->
+
+  <section id="cjk-faq-question-marks">
+
+    <title>SELECT shows non-Latin characters as "?"s. Why?</title>
+
+    <para>
+      You inserted CJK characters with <literal>INSERT</literal>, but
+      when you do a <literal>SELECT</literal>, they all look like
+      <quote>?</quote>. The cause is usually a setting in MySQL that
+      doesn't match the settings for the application program or the
+      operating system. These are common troubleshooting steps:
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Find out: what version do you have? The statement
+            <literal>SELECT VERSION();</literal> will tell you. This FAQ
+            is for MySQL version 5, so some of the answers here will not
+            apply to you if you have version 4.0 or 4.1.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Find out: what character set is the database column really
+            in? Too frequently, people think that the character set will
+            be the same as the server's set (false), or the set used for
+            display purposes (false). Make sure, by saying <literal>SHOW
+            CREATE TABLE tablename</literal>, or better yet by saying
+            this:
+
+<programlisting>
+        SELECT character_set_name, collation_name
+        FROM   information_schema.columns
+        WHERE  table_schema = 'your_database_name'
+        AND    table_name = 'your_table_name'
+        AND    column_name = 'your_column_name';
+      </programlisting>
+          </para>
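+
+          <para>
+            As a quicker cross-check, <literal>SHOW FULL
+            COLUMNS</literal> lists the collation of every column in a
+            table, and the collation name begins with the character set
+            name. A minimal sketch, assuming a hypothetical table named
+            <literal>your_table_name</literal>:
+
+<programlisting>
+SHOW FULL COLUMNS FROM your_table_name;
+</programlisting>
+          </para>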
+        </listitem>
+
+        <listitem>
+          <para>
+            Find out: what is the hexadecimal value?
+
+<programlisting>
+          SELECT HEX(your_column_name)
+          FROM your_table_name;
+        </programlisting>
+
+            If you see <literal>3F</literal>, then that really is the
+            encoding for <literal>?</literal>, so no wonder you see
+            <quote>?</quote>. Probably this happened because of a
+            problem converting a particular character from your client
+            character set to the target character set.
+          </para>
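+
+          <para>
+            To see what a correctly stored value should look like, you
+            can build a reference value from a known code point and
+            compare its hexadecimal form with what your column holds.
+            A sketch, assuming the column is <literal>utf8</literal>
+            and using Katakana Letter Pe (U+30DA), whose
+            <literal>utf8</literal> encoding is the three bytes
+            <literal>E3 83 9A</literal>:
+
+<programlisting>
+SELECT HEX(CONVERT(_ucs2 0x30DA USING utf8));
+</programlisting>
+
+            If the column holds <literal>3F</literal> where this
+            statement shows three bytes, the character was lost before
+            it was stored.
+          </para>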
+        </listitem>
+
+        <listitem>
+          <para>
+            Find out: is a literal round trip possible, that is, if you
+            select <quote>literal</quote> (or <quote>_introducer
+            hexadecimal-value</quote>) do you get <quote>literal</quote>
+            as a result? For example, with the Japanese Katakana Letter
+            Pe, which looks like <literal>ペ</literal>, and which
+            exists in all CJK character sets, and which has the code
+            point value (hexadecimal coding) <literal>0x30da</literal>,
+            enter:
+
+<programlisting>
+SELECT 'ペ' AS `ペ`;         /* or SELECT _ucs2 0x30da; */
+</programlisting>
+
+            If the result doesn't look like <literal>ペ</literal>, a
+            round trip failed. For bug reports, we might ask people to
+            follow up with <literal>SELECT hex('ペ');</literal>. Then
+            we can see whether the client encoding is right.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Find out: is it the browser or application? Just use
+            <command>mysql</command> (the MySQL client program, which on
+            Windows will be <command>mysql.exe</command>). If
+            <command>mysql</command> displays correctly but your
+            application doesn't, then your problem is probably
+            <quote>Settings</quote>, but consult also the question about
+            <quote>Troubles with Access (or Perl) (or PHP)
+            (etc.)</quote> much later in this FAQ.
+          </para>
+
+          <para>
+            To find your settings, the statement you need here is
+            <literal>SHOW VARIABLES</literal>. For example:
+
+<programlisting>
+mysql> <userinput>SHOW VARIABLES LIKE 'char%';</userinput>
++--------------------------+----------------------------------------+
+| Variable_name            | Value                                  |
++--------------------------+----------------------------------------+
+| character_set_client     | utf8                                   |
+| character_set_connection | utf8                                   |
+| character_set_database   | latin1                                 |
+| character_set_filesystem | binary                                 |
+| character_set_results    | utf8                                   |
+| character_set_server     | latin1                                 |
+| character_set_system     | utf8                                   |
+| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
++--------------------------+----------------------------------------+
+8 rows in set (0.03 sec)
+</programlisting>
+
+            The above are typical character-set settings for an
+            international-oriented client (notice the use of
+            <literal>utf8</literal> Unicode) connected to a server in
+            the West (<literal>latin1</literal> is a West Europe
+            character set and a default for MySQL).
+          </para>
+
+          <para>
+            Although Unicode (usually the <literal>utf8</literal>
+            variant on Unix, usually the <literal>ucs2</literal> variant
+            on Windows) is better than <quote>latin</quote>, it's often
+            not what your operating system utilities support best. Many
+            Windows users find that a Microsoft character set, such as
+            <literal>cp932</literal> for Japanese Windows, is what's
+            suitable.
+          </para>
+
+          <para>
+            If you can't control the server settings, and you have no
+            idea what your underlying computer is about, then try
+            changing to a common character set for the country that
+            you're in (<literal>euckr</literal> = Korea,
+            <literal>gb2312</literal> or <literal>gbk</literal> =
+            People's Republic of China, <literal>big5</literal> = other
+            China, <literal>sjis</literal> or <literal>ujis</literal> or
+            <literal>cp932</literal> or <literal>eucjpms</literal> =
+            Japan, <literal>ucs2</literal> or <literal>utf8</literal> =
+            anywhere). Usually it is only necessary to change the client
+            and connection and results settings, and there is a simple
+            statement which changes all three at once, namely
+            <literal>SET NAMES</literal>. For example:
+
+<programlisting>
+SET NAMES 'big5';
+</programlisting>
+
+            Once you get the correct setting, you can make it permanent
+            by editing <filename>my.cnf</filename> or
+            <filename>my.ini</filename>. For example you might add lines
+            looking like this:
+
+<programlisting>
+[mysqld]
+character-set-server=big5
+[client]
+default-character-set=big5
+</programlisting>
+          </para>
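+
+          <para>
+            For reference, <literal>SET NAMES</literal> is shorthand
+            for setting the three session variables individually. The
+            following sketch (using <literal>big5</literal> only as an
+            example value) has the same effect as <literal>SET NAMES
+            'big5'</literal>:
+
+<programlisting>
+SET character_set_client = big5;
+SET character_set_results = big5;
+SET character_set_connection = big5;
+</programlisting>
+          </para>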
+        </listitem>
+
+      </itemizedlist>
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-gb-charset-problems">
+
+    <title>Troubles with GB character sets (Chinese)</title>
+
+    <para>
+      <remark role="update">
+        [SH] References to d.udm.net (Bar's pages) need to be changed
+        once we've moved those pages to the Reference Manual.
+      </remark>
+
+      MySQL supports the two common variants of the GB (<quote>Guojia
+      Biaozhun</quote> or <quote>National Standard</quote>) character
+      sets which are official in the People's Republic of China:
+      <literal>gb2312</literal> and <literal>gbk</literal>. Sometimes
+      people try to insert <literal>gbk</literal> characters into
+      <literal>gb2312</literal>, and it works most of the time because
+      <literal>gbk</literal> is a superset of <literal>gb2312</literal>.
+      But eventually they try to insert a rarer Chinese character and it
+      doesn't work. (Example: bug #16072 in our bugs database,
+      <ulink url="http://bugs.mysql.com/bug.php?id=16072"/>). So we'll
+      try to clarify here exactly what characters are legitimate in
+      <literal>gb2312</literal> or <literal>gbk</literal>, with
+      reference to the official documents. Please check these references
+      before reporting <literal>gb2312</literal> or
+      <literal>gbk</literal> bugs. We now have a graphic listing of the
+      <literal>gb2312</literal> characters, currently on the site of Mr
+      Alexander Barkov (MySQL's principal programmer for character set
+      issues). The chart is in order according to the
+      <literal>gb2312_chinese_ci</literal> collation:
+      <ulink url="http://d.udm.net/bar/~bar/charts/gb2312_chinese_ci.html"/>.
+      MySQL's <literal>gbk</literal> is in reality <quote>Microsoft code
+      page 936</quote>. This differs from the official
+      <literal>gbk</literal> for characters <literal>A1A4</literal>
+      (middle dot), <literal>A1AA</literal> (em dash),
+      <literal>A6E0-A6F5</literal>, and <literal>A8BB-A8C0</literal>.
+      For a listing of the differences, see
+      <ulink url="http://recode.progiciels-bpi.ca/showfile.html?name=dist/libiconv/gbk.h"/>.
+      For a listing of gbk/Unicode mappings, see
+      <ulink url="http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT"/>.
+      For MySQL's listing of gbk characters, see
+      <ulink url="http://d.udm.net/bar/~bar/charts/gbk_chinese_ci.html"/>.
+    </para>
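+
+    <para>
+      One way to check whether a particular character is legitimate in
+      <literal>gb2312</literal> or only in <literal>gbk</literal> is to
+      convert it from Unicode to each character set and look at the
+      hexadecimal result. A sketch, using U+6C4C as an assumed example
+      of a rarer character; a result of <literal>3F</literal> means
+      that the target character set has no such character:
+
+<programlisting>
+SELECT HEX(CONVERT(_ucs2 0x6C4C USING gb2312)) AS in_gb2312,
+       HEX(CONVERT(_ucs2 0x6C4C USING gbk))    AS in_gbk;
+</programlisting>
+    </para>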
+
+  </section>
+
+  <section id="cjk-faq-big5-charset-problems">
+
+    <title>Troubles with big5 character set (Chinese)</title>
+
+    <para>
+      MySQL supports the Big5 character set which is common in Hong Kong
+      and the Republic of China (Taiwan). MySQL's
+      <literal>big5</literal> is in reality <quote>Microsoft code page
+      950</quote>, which is very similar to the original
+      <literal>big5</literal> character set. This is a recent change,
+      starting with MySQL version 4.1.16 / 5.0.16. We made the change as
+      a result of a bug report, bug #12476 in our bugs database,
+      <ulink url="http://bugs.mysql.com/bug.php?id=12476"/> (title:
+      <quote>Some big5 codes are still missing ...</quote>). For
+      example, the following statements work in the current version of
+      MySQL, but not in old versions:
+
+<programlisting>
+mysql> <userinput>create table big5 (big5 char(1) character set big5);</userinput>
+Query OK, 0 rows affected (0.13 sec)
+
+mysql> <userinput>insert into big5 values (0xf9dc);</userinput>
+Query OK, 1 row affected (0.00 sec)
+
+mysql> <userinput>select * from big5;</userinput>
++------+
+| big5 |
++------+
+| 嫺  |
++------+
+1 row in set (0.02 sec)
+</programlisting>
+
+      There is a feature request for adding HKSCS extensions (bug #13577
+      in our bugs database,
+      <ulink url="http://bugs.mysql.com/bug.php?id=13577"/>). People who
+      need the extension may find the suggested patch for bug #13577 is
+      of interest.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-charset-conversion-problems">
+
+    <title>Troubles with character-set conversions (Japanese)</title>
+
+    <para>
+      MySQL supports the <literal>sjis</literal>,
+      <literal>ujis</literal>, <literal>cp932</literal>, and
+      <literal>eucjpms</literal> character sets, as well as Unicode. A
+      common need is to convert between character sets. For example,
+      there might be a Unix server
+      (typically with <literal>sjis</literal> or
+      <literal>ujis</literal>) and a Windows client (typically with
+      <literal>cp932</literal>). But conversions can seem to fail.
+      Here's why. In this conversion table, the <literal>ucs2</literal>
+      column is the source, and the
+      <literal>sjis</literal>/<literal>cp932</literal>/<literal>ujis</literal>/<literal>eucjpms</literal>
+      columns are the destination, that is, what the hexadecimal result
+      would be if we used <literal>CONVERT(ucs2)</literal> or if we
+      assigned a <literal>ucs2</literal> column containing the value to
+      an
+      <literal>sjis</literal>/<literal>cp932</literal>/<literal>ujis</literal>/<literal>eucjpms</literal>
+      column.
+
+<programlisting>
+character name         ucs2 sjis  cp932 ujis   eucjpms
+--------------         ---- ----  ----  ----   -------
+
+BROKEN BAR             00A6   3F    3F  8FA2C3   3F
+FULLWIDTH BROKEN BAR   FFE4   3F  FA55    3F   8FA2
+
+YEN SIGN               00A5   3F    3F    20     3F
+FULLWIDTH YEN SIGN     FFE5 818F  818F  A1EF     3F
+
+TILDE                  007E   7E    7E    7E     7E
+OVERLINE               203E   3F    3F    20     3F
+
+HORIZONTAL BAR         2015 815C  815C  A1BD   A1BD
+EM DASH                2014   3F    3F    3F     3F
+
+REVERSE SOLIDUS        005C 815F    5C    5C     5C
+FULLWIDTH ""           FF3C   3F  815F    3F   A1C0
+
+WAVE DASH              301C 8160    3F  A1C1     3F
+FULLWIDTH TILDE        FF5E   3F  8160    3F   A1C1
+
+DOUBLE VERTICAL LINE   2016 8161    3F  A1C2     3F
+PARALLEL TO            2225   3F  8161    3F   A1C2
+
+MINUS SIGN             2212 817C    3F  A1DD     3F
+FULLWIDTH HYPHEN-MINUS FF0D   3F  817C    3F   A1DD
+
+CENT SIGN              00A2 8191    3F  A1F1     3F
+FULLWIDTH CENT SIGN    FFE0   3F  8191    3F   A1F1
+
+POUND SIGN             00A3 8192    3F  A1F2     3F
+FULLWIDTH POUND SIGN   FFE1   3F  8192    3F   A1F2
+
+NOT SIGN               00AC 81CA    3F  A2CC     3F
+FULLWIDTH NOT SIGN     FFE2   3F  81CA    3F   A2CC
+</programlisting>
+
+      For example, consider this extract from the table:
+
+<programlisting>
+                       ucs2 sjis cp932
+                       ---- ---- -----
+NOT SIGN               00AC 81CA    3F
+FULLWIDTH NOT SIGN     FFE2   3F  81CA
+</programlisting>
+
+      It means <quote>for NOT SIGN which is Unicode U+00AC, MySQL
+      converts to sjis code point 0x81CA and to cp932 code point
+      3F</quote>. (<literal>3F</literal> is question mark
+      (<quote>?</quote>) and is what we always use when we can't
+      convert.) Now, what should we do if we want to convert
+      <literal>sjis 81CA</literal> to <literal>cp932</literal>? Our
+      answer is: <quote>?</quote>. There are serious complaints about
+      this: many people would prefer a <quote>loose</quote> conversion,
+      so that <literal>81CA (NOT SIGN)</literal> in
+      <literal>sjis</literal> becomes <literal>81CA (FULLWIDTH NOT
+      SIGN)</literal> in <literal>cp932</literal>. We are considering
+      a change to this behaviour.
+    </para>
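+
+    <para>
+      The rows in the extract can be reproduced with
+      <literal>CONVERT ... USING</literal> and character set
+      introducers. A sketch; according to the table, the first
+      statement should return <literal>81CA</literal> and
+      <literal>3F</literal>, and according to the rule just described
+      the second should return <literal>3F</literal>:
+
+<programlisting>
+SELECT HEX(CONVERT(_ucs2 0x00AC USING sjis))  AS to_sjis,
+       HEX(CONVERT(_ucs2 0x00AC USING cp932)) AS to_cp932;
+
+SELECT HEX(CONVERT(_sjis 0x81CA USING cp932)) AS sjis_to_cp932;
+</programlisting>
+    </para>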
+
+  </section>
+
+  <section id="cjk-faq-great-yen-sign-problem">
+
+    <title>The Great Yen Sign Problem (Japanese)</title>
+
+    <para>
+      In SJIS the code for Yen Sign (<literal>¥</literal>) is
+      <literal>5C</literal>. In SJIS the code for Reverse Solidus
+      (<literal>\</literal>) is <literal>5C</literal>. Since the above
+      statements are contradictory, confusion often results. Well, to
+      put it more seriously, some versions of Japanese character sets
+      (both <literal>sjis</literal> and <literal>euc</literal>) have
+      treated <literal>5C</literal> as a reverse solidus, also known as
+      a backslash, and others have treated it as a yen sign. There's
+      nothing we can do, except take sides: MySQL follows only one
+      version of the JIS (Japanese Industrial Standards) standard
+      description, and <emphasis>5C is Reverse Solidus</emphasis>,
+      always. Should we make a separate character set where
+      <literal>5C</literal> is Yen Sign, as another DBMS (Oracle) does?
+      We haven't decided. Certainly not in version 5.1 or 5.2. But if
+      people keep complaining about The Great Yen Sign Problem, that's
+      one possible solution.
+    </para>
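+
+    <para>
+      A sketch of how to see which side MySQL takes on your own
+      server: convert the <literal>sjis</literal> byte
+      <literal>5C</literal> to Unicode and look at the code point that
+      comes back (<literal>005C</literal> is Reverse Solidus,
+      <literal>00A5</literal> is Yen Sign). The column alias is only
+      illustrative:
+
+<programlisting>
+SELECT HEX(CONVERT(_sjis 0x5C USING ucs2)) AS sjis_5c_as_ucs2;
+</programlisting>
+    </para>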
+
+  </section>
+
+  <section id="cjk-faq-euckr-charset-problems">
+
+    <title>Troubles with euckr character set (Korean)</title>
+
+    <para>
+      MySQL supports the <literal>euckr</literal> (Extended Unix Code
+      Korea) character set which is common in South Korea. In theory,
+      problems could arise because there have been several versions of
+      this character set. So far, only one problem has been noted, for
+      Korea's currency symbol. We use the <quote>ASCII</quote> variant
+      of EUC-KR, in which the code point <literal>0x5c</literal> is
+      REVERSE SOLIDUS, that is <literal>\</literal>, instead of the
+      <quote>KS-Roman</quote> variant of EUC-KR, in which the code point
+      <literal>0x5c</literal> is WON SIGN, that is <quote>₩</quote>.
+      You can't convert Unicode <literal>U+20A9</literal> WON SIGN to
+      <literal>euckr</literal>:
+
+<programlisting>
+mysql> <userinput>SELECT CONVERT('₩' USING euckr) AS euckr,</userinput>
+-> <userinput>HEX(CONVERT('₩' USING euckr)) AS hexeuckr;</userinput>
++-------+----------+
+| euckr | hexeuckr |
++-------+----------+
+| ?     | 3F       |
++-------+----------+
+1 row in set (0.00 sec)
+</programlisting>
+
+      MySQL's graphic Korean chart is here:
+      <ulink url="http://d.udm.net/bar/~bar/charts/euckr_korean_ci.html"/>.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-data-truncated">
+
+    <title>The <quote>Data truncated</quote> message</title>
+
+    <para>
+      For illustration, we'll make a table with one Unicode
+      (<literal>ucs2</literal>) column and one Chinese
+      (<literal>gb2312</literal>) column.
+
+<programlisting>
+mysql> <userinput>CREATE TABLE ch</userinput>
+    -> <userinput>(ucs2 CHAR(3) CHARACTER SET ucs2,</userinput>
+    -> <userinput>gb2312 CHAR(3) CHARACTER SET gb2312);</userinput>
+Query OK, 0 rows affected (0.05 sec) 
+</programlisting>
+
+      We'll try to place the rare character <literal>汌</literal> in
+      both columns.
+
+<programlisting>
+mysql> <userinput>INSERT INTO ch VALUES ('A汌B','A汌B');</userinput>
+Query OK, 1 row affected, 1 warning (0.00 sec) 
+</programlisting>
+
+      Ah, there's a warning. Let's see what it is.
+
+<programlisting>
+mysql> <userinput>SHOW WARNINGS;</userinput>
++---------+------+---------------------------------------------+
+| Level   | Code | Message                                     |
++---------+------+---------------------------------------------+
+| Warning | 1265 | Data truncated for column 'gb2312' at row 1 |
++---------+------+---------------------------------------------+
+1 row in set (0.00 sec)
+</programlisting>
+
+      So it's a warning about the gb2312 column only.
+
+<programlisting>
+mysql> <userinput>SELECT ucs2,HEX(ucs2),gb2312,HEX(gb2312) FROM ch;</userinput>
++-------+--------------+--------+-------------+
+| ucs2  | HEX(ucs2)    | gb2312 | HEX(gb2312) |
++-------+--------------+--------+-------------+
+| A汌B | 00416C4C0042 | A?B    | 413F42      |
++-------+--------------+--------+-------------+
+1 row in set (0.00 sec)
+</programlisting>
+
+      There are several things that need explanation here.
+
+      <orderedlist>
+
+        <listitem>
+          <para>
+            The fact that it's a <quote>warning</quote> rather than an
+            <quote>error</quote> is characteristic of MySQL. We like to
+            try to do what we can, to get the best fit, rather than give
+            up.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            The <literal>汌</literal> character isn't in the
+            <literal>gb2312</literal> character set. We described that
+            problem earlier.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Admittedly the message is misleading. We didn't
+            <quote>truncate</quote> in this case; we replaced with a
+            question mark. We've had a complaint about this message (bug
+            #9337). But until we come up with something better, just
+            accept that error/warning code 1265 can mean a variety of
+            things.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            With <literal>SQL_MODE=TRADITIONAL</literal>, there would be
+            an error message, but instead of error 1265 you would see:
+            <literal>ERROR 1406 (22001): Data too long for column
+            'gb2312' at row 1</literal>.
+          </para>
+        </listitem>
+
+      </orderedlist>
+    </para>
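+
+    <para>
+      To have such inserts rejected instead of stored with a question
+      mark, the session can be switched to strict behaviour before
+      inserting. A sketch that reuses the <literal>ch</literal> table
+      from above; per the last item in the list, the
+      <literal>INSERT</literal> is then expected to fail with an error
+      rather than produce a warning:
+
+<programlisting>
+SET SESSION sql_mode = 'TRADITIONAL';
+INSERT INTO ch VALUES ('A汌B','A汌B');
+</programlisting>
+    </para>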
+
+  </section>
+
+  <section id="cjk-faq-access-perl-php-troubles">
+
+    <title>Troubles with Access, Perl, PHP, etc.</title>
+
+    <para>
+      You can't get things to look right with your special program for a
+      GUI front end or browser? Get a direct connection to the server
+      (with <command>mysql</command> on Unix or with
+      <command>mysql.exe</command> on Windows) and try the same query
+      there. If <command>mysql</command> is okay, then the trouble is probably that your
+      application interface needs some initializing. Use
+      <command>mysql</command> to tell you what character set(s) it
+      uses, by saying <literal>SHOW VARIABLES LIKE 'char%';</literal>.
+      If it's Access, you're probably connecting with MyODBC. So you'll
+      want to check out the Reference Manual page for
+      <xref linkend="myodbc-configuration-dsn-windows"/>, and pay
+      attention particularly to the illustrations for <quote>SQL command
+      on connect</quote>. You should enter <literal>SET NAMES
+      'big5'</literal> (supposing that you use <literal>big5</literal>)
+      (you don't need a <literal>;</literal> here). If it's ASP, you
+      might need to add <literal>SET NAMES</literal> in the code. Here
+      is an example that has worked in the past:
+
+<programlisting>
+&lt;%
+Session.CodePage=0
+Dim strConnection
+Dim Conn
+strConnection="driver={MySQL ODBC 3.51 Driver};server=yourserver;uid=yourusername;pwd=yourpassword;database=yourdatabase;stmt=SET NAMES 'big5';"
+Set Conn = Server.CreateObject("ADODB.Connection")
+Conn.Open strConnection
+%> 
+</programlisting>
+
+      If it's PHP, here's a slightly different user suggestion:
+
+<programlisting>
+&lt;?php 
+  $link = mysql_connect($host,$usr,$pwd); 
+  mysql_select_db($db); 
+  if (mysql_error()) { print "Database ERROR: " . mysql_error(); } 
+  mysql_query("SET CHARACTER SET utf8", $link); 
+  mysql_query("SET NAMES 'utf8'", $link); 
+?>
+</programlisting>
+
+      In this case, the tipper used the <literal>SET CHARACTER
+      SET</literal> statement to change
+      <literal>character_set_client</literal> and
+      <literal>character_set_results</literal>, and used <literal>SET
+      NAMES</literal> to change <literal>character_set_client</literal>
+      and <literal>character_set_connection</literal> and
+      <literal>character_set_results</literal>. (Incidentally, MySQL
+      people encourage the use of the <literal>mysqli</literal>
+      extension, rather than the <literal>mysql</literal> extension that
+      this example uses.) Another thing to check with PHP is the browser
+      assumptions. Sometimes a meta tag change in the heading area
+      suffices, for example: <literal>&lt;meta http-equiv="Content-Type"
+      content="text/html; charset=utf-8"></literal>
+    </para>
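+
+    <para>
+      Whatever the interface, you can confirm that the initializing
+      statement took effect by issuing statements like the following
+      on the same connection that the application uses. A sketch; the
+      <literal>utf8</literal> value is only an example:
+
+<programlisting>
+SET NAMES 'utf8';
+SHOW VARIABLES LIKE 'character_set_%';
+</programlisting>
+
+      The client, connection, and results variables should all show
+      the name that was given to <literal>SET NAMES</literal>.
+    </para>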
+
+  </section>
+
+  <section id="cjk-faq-restore-mysql40-behavior">
+
+    <title>How can I get old MySQL 4.0 behaviour back?</title>
+
+    <para>
+      In the old days, with MySQL Version 4.0, there was a single
+      <quote>global</quote> character set for both server and client
+      sides, and the decision was made by the server administrator. We
+      changed that starting with MySQL Version 4.1. What happens now is
+      a <quote>handshake</quote>. The MySQL Reference Manual describes
+      it thus:
+
+      <blockquote>
+
+        <para>
+          When a client connects, it sends to the server the name of the
+          character set that it wants to use. The server uses the name
+          to set the <literal>character_set_client</literal>,
+          <literal>character_set_results</literal>, and
+          <literal>character_set_connection</literal> system variables.
+          In effect, the server performs a <literal>SET NAMES</literal>
+          operation using the character set name.
+        </para>
+
+      </blockquote>
+
+      The effect of this is: you can't control the client character set
+      by saying <literal>mysqld --character-set-server=utf8</literal>.
+      But some Asian customers said that they don't like that; they want
+      the MySQL 4.0 behaviour. So we added a <command>mysqld</command>
+      switch, <option>--character-set-client-handshake</option>, which
+      (and this is the interesting part) can be turned off with
+      <option>--skip-character-set-client-handshake</option>. If you
+      start mysqld with
+      <option>--skip-character-set-client-handshake</option>, then the
+      behaviour is like this: When a client connects, it sends to the
+      server the name of the character set that it wants to use. The
+      server ignores it! Here is an illustration with the handshake
+      switch on or off. Pretend that your favourite server character set
+      is <literal>latin1</literal> (of course that's unlikely in a CJK
+      area but it's MySQL's default if there's no
+      <filename>my.ini</filename> or <filename>my.cnf</filename> file).
+      Pretend that the client operates with <literal>utf8</literal>
+      because that's what the client's operating system supports. Start
+      the server with a default character set,
+      <literal>latin1</literal>:
+
+<programlisting>
+mysqld --character-set-server=latin1
+</programlisting>
+
+      Start the client with a default character set,
+      <literal>utf8</literal>:
+
+<programlisting>
+mysql --default-character-set=utf8
+</programlisting>
+
+      Show what the current settings are:
+
+<programlisting>
+mysql> <userinput>SHOW VARIABLES LIKE 'char%';</userinput>
++--------------------------+----------------------------------------+
+| Variable_name            | Value                                  |
++--------------------------+----------------------------------------+
+| character_set_client     | utf8                                   |
+| character_set_connection | utf8                                   |
+| character_set_database   | latin1                                 |
+| character_set_filesystem | binary                                 |
+| character_set_results    | utf8                                   |
+| character_set_server     | latin1                                 |
+| character_set_system     | utf8                                   |
+| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
++--------------------------+----------------------------------------+
+8 rows in set (0.01 sec)
+</programlisting>
+
+      Stop the client. Stop the server with
+      <command>mysqladmin</command>. Start the server again but this
+      time say <quote>skip the handshake</quote>:
+
+<programlisting>
+mysqld --character-set-server=latin1 --skip-character-set-client-handshake
+</programlisting>
+
+      Start the client with a default character set,
+      <literal>utf8</literal>, again. Show what the current settings
+      are, again:
+
+<programlisting>
+mysql> <userinput>SHOW VARIABLES LIKE 'char%';</userinput>
++--------------------------+----------------------------------------+
+| Variable_name            | Value                                  |
++--------------------------+----------------------------------------+
+| character_set_client     | latin1                                 |
+| character_set_connection | latin1                                 |
+| character_set_database   | latin1                                 |
+| character_set_filesystem | binary                                 |
+| character_set_results    | latin1                                 |
+| character_set_server     | latin1                                 |
+| character_set_system     | utf8                                   |
+| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
++--------------------------+----------------------------------------+
+8 rows in set (0.01 sec)
+</programlisting>
+
+      As you can see by comparing the <literal>SHOW VARIABLES</literal>
+      results, the server ignores the client's initial settings if the
+      <option>--skip-character-set-client-handshake</option> option is
+      used.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-fulltext-searches">
+
+    <title>Why do some LIKE and FULLTEXT searches fail?</title>
+
+    <para>
+      There is a simple problem with <literal>LIKE</literal> searches on
+      <literal>BINARY</literal> and <literal>BLOB</literal> columns: we
+      need to know the end of a character. With multi-byte character
+      sets, different characters might have different octet lengths. For
+      example, in <literal>utf8</literal>, <literal>A</literal> requires
+      one byte but <literal>ペ</literal> requires three bytes.
+      Illustration:
+
+<programlisting>
+mysql> <userinput>SELECT octet_length(_utf8 'A'), octet_length(_utf8 'ペ');</userinput>
++-------------------------+---------------------------+
+| octet_length(_utf8 'A') | octet_length(_utf8 'ペ') |
++-------------------------+---------------------------+
+|                       1 |                         3 |
++-------------------------+---------------------------+
+1 row in set (0.00 sec)
+</programlisting>
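The same byte counts can be reproduced outside MySQL. The following is a minimal Python sketch, not MySQL code, and the helper names in it are made up for this illustration. It shows that the second character of a string and the second byte of its encoded form stop being the same thing as soon as the first character occupies more than one byte.

```python
# Illustration only: a Python str is a sequence of characters, while
# bytes is a sequence of octets, which is roughly how a BINARY or BLOB
# column sees the data.

def octet_length(text, encoding="utf-8"):
    """Number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))

def second_character(text):
    """Second character, found by counting characters."""
    return text[1]

def second_octet(text, encoding="utf-8"):
    """Second byte of the encoded text, found by counting bytes."""
    return text.encode(encoding)[1:2]

sample = "ペA"
print(octet_length("A"), octet_length("ペ"))
print(second_character(sample))      # the letter A
print(second_octet(sample).hex())    # a byte from the middle of ペ
```

Counting characters finds the A; counting bytes lands inside the three-byte sequence for ペ.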
+
+      If we don't know where the first character ends, then we don't
+      know where the second character begins, and even simple-looking
+      searches like <literal>LIKE '_A%'</literal> will fail. The
+      solution is to use a regular CJK character set in the first place,
+      or to convert to a CJK character set before comparing.
+      Incidentally, this is one reason why MySQL cannot allow encodings
+      of nonexistent characters: it must be strict about rejecting bad
+      input, or it won't know where characters end.
+    </para>
+
+    <para>
+      There is also a simple problem with <literal>FULLTEXT</literal>
+      searches: we need to know the end of a word. With Western writing
+      this is rarely a problem because there are spaces between words.
+      With Asian writing this is not the case. We could use half-good
+      solutions, like saying that all Han characters represent words,
+      or depending on (Japanese) changes from Katakana to Hiragana
+      which are due to grammatical endings. But the only good solution
+      requires a dictionary, and we haven't found a good open-source
+      dictionary.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-available-cjk-charsets">
+
+    <title>What CJK character sets are available?</title>
+
+    <para>
+      The list of CJK character sets may vary depending on version. For
+      example, the <literal>eucjpms</literal> character set is a recent
+      addition. But the language name appears in the
+      <literal>DESCRIPTION</literal> column for every entry in
+      <literal>information_schema.character_sets</literal>. Therefore,
+      to get a current list of all the non-Unicode CJK character sets,
+      say:
+
+<programlisting>
+mysql> <userinput>SELECT character_set_name, description</userinput>
+    -> <userinput>FROM information_schema.character_sets</userinput>
+    -> <userinput>WHERE description LIKE '%Chinese%'</userinput>
+    -> <userinput>OR    description LIKE '%Japanese%'</userinput>
+    -> <userinput>OR    description LIKE '%Korean%'</userinput>
+    -> <userinput>ORDER BY character_set_name;</userinput>
++--------------------+---------------------------+
+| character_set_name | description               |
++--------------------+---------------------------+
+| big5               | Big5 Traditional Chinese  |
+| cp932              | SJIS for Windows Japanese |
+| eucjpms            | UJIS for Windows Japanese |
+| euckr              | EUC-KR Korean             |
+| gb2312             | GB2312 Simplified Chinese |
+| gbk                | GBK Simplified Chinese    |
+| sjis               | Shift-JIS Japanese        |
+| ujis               | EUC-JP Japanese           |
++--------------------+---------------------------+
+8 rows in set (0.01 sec)
+</programlisting>
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-character-x-availability">
+
+    <title>Is character X available in all character sets?</title>
+
+    <para>
+      The majority of everyday-use Chinese/Japanese characters
+      (simplified Chinese and basic non-halfwidth Kana Japanese) appear
+      in all CJK character sets. Here is a stored procedure which
+      accepts a UCS-2 Unicode character, converts it to all other
+      character sets, and displays the results in hexadecimal.
+
+<programlisting>
+DELIMITER //
+
+CREATE PROCEDURE p_convert (ucs2_char CHAR(1) CHARACTER SET ucs2)
+BEGIN
+
+CREATE TABLE tj
+             (ucs2 CHAR(1) character set ucs2,
+              utf8 CHAR(1) character set utf8,
+              big5 CHAR(1) character set big5,
+              cp932 CHAR(1) character set cp932,
+              eucjpms CHAR(1) character set eucjpms,
+              euckr CHAR(1) character set euckr,
+              gb2312 CHAR(1) character set gb2312,
+              gbk CHAR(1) character set gbk,
+              sjis CHAR(1) character set sjis,
+              ujis CHAR(1) character set ujis);
+
+INSERT INTO tj (ucs2) VALUES (ucs2_char);
+
+UPDATE tj SET utf8=ucs2,
+              big5=ucs2,
+              cp932=ucs2,
+              eucjpms=ucs2,
+              euckr=ucs2,
+              gb2312=ucs2,
+              gbk=ucs2,
+              sjis=ucs2,
+              ujis=ucs2;
+
+/* If there's a conversion problem, UPDATE will produce a warning. */
+
+SELECT hex(ucs2) AS ucs2,
+       hex(utf8) AS utf8,
+       hex(big5) AS big5,
+       hex(cp932) AS cp932,
+       hex(eucjpms) AS eucjpms,
+       hex(euckr) AS euckr,
+       hex(gb2312) AS gb2312,
+       hex(gbk) AS gbk,
+       hex(sjis) AS sjis,
+       hex(ujis) AS ujis
+FROM tj;
+
+DROP TABLE tj;
+
+END//
+</programlisting>
+
+      The input can be any single <literal>ucs2</literal> character, or
+      it can be the code point value (hexadecimal representation) of
+      that character. Here's an example of what
+      <function>P_CONVERT()</function> can do. An earlier answer said
+      that the character <quote>Katakana Letter Pe</quote> appears in
+      all CJK character sets. We know that the code point value of
+      Katakana Letter Pe is <literal>0x30da</literal>. (By the way, we
+      got the name from Unicode's list of ucs2 encodings and names:
+      <ulink url="http://www.unicode.org/Public/UNIDATA/UnicodeData.txt"/>.)
+      So we'll say:
+
+<programlisting>
+mysql> <userinput>CALL P_CONVERT(0x30da)//</userinput>
++------+--------+------+-------+---------+-------+--------+------+------+------+
+| ucs2 | utf8   | big5 | cp932 | eucjpms | euckr | gb2312 | gbk  | sjis | ujis |
++------+--------+------+-------+---------+-------+--------+------+------+------+
+| 30DA | E3839A | C772 | 8379  | A5DA    | ABDA  | A5DA   | A5DA | 8379 | A5DA |
++------+--------+------+-------+---------+-------+--------+------+------+------+
+1 row in set (0.04 sec)
+</programlisting>
+
+      Since none of the column values is <literal>3F</literal> (the
+      question mark character, which is what a failed conversion
+      produces), we know that every conversion worked.
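For a quick check without a server, the same idea can be sketched in Python. This is only an approximation under stated assumptions: the mapping from MySQL character set names to Python codec names is made up for this sketch, Python's conversion tables are not guaranteed to match MySQL's, and big5, cp932 and eucjpms are left out for that reason. As in the procedure above, a character that cannot be converted shows up as 3F.

```python
# Illustration only: Python codec names stand in for the MySQL character
# set names. The mapping is an assumption made for this sketch, and
# Python's tables can differ from MySQL's.
CODECS = {
    "utf8": "utf_8",
    "sjis": "shift_jis",
    "ujis": "euc_jp",
    "euckr": "euc_kr",
    "gb2312": "gb2312",
    "gbk": "gbk",
}

def convert_hex(char, codec):
    """Encode one character; an unconvertible character becomes 3F."""
    return char.encode(codec, errors="replace").hex().upper()

def p_convert(code_point):
    """Return {character set name: hex string} for one code point."""
    char = chr(code_point)
    return {name: convert_hex(char, codec) for name, codec in CODECS.items()}

for name, value in p_convert(0x30DA).items():
    print(name, value)
```

Calling convert_hex() with a Hangul syllable and the shift_jis codec returns 3F, because that character does not exist in the target set.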
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-sorting-problems-unicode-1">
+
+    <title>Strings don't sort correctly in Unicode (I)</title>
+
+    <para>
+      Sometimes people observe that the result of a
+      <literal>utf8_unicode_ci</literal> or
+      <literal>ucs2_unicode_ci</literal> search or <literal>ORDER
+      BY</literal> sort is not what they think a native speaker would
+      expect. Although we never rule out the chance that there is a bug,
+      we have found in the past that people are not correctly reading
+      the standard table of weights for the Unicode Collation Algorithm.
+      So, here's how to check whether we're using the right collation.
+      The correct table for MySQL is this one:
+      <ulink url="http://www.unicode.org/Public/UCA/4.0.0/allkeys-4.0.0.txt"/>.
+      This is different from the first table you will find by navigating
+      from the <literal>unicode.org</literal> home page. MySQL
+      deliberately uses the older 4.0.0 <quote>allkeys</quote> table,
+      instead of the current 4.1.0 table, because we are very wary
+      about changing an ordering which affects indexes. Here is an
+      example of a problem that we handled recently, for a complaint in
+      our bugs database,
+      <ulink url="http://bugs.mysql.com/bug.php?id=16526"/>:
+
+<programlisting>
+mysql> <userinput>CREATE TABLE tj (s1 CHAR(1) CHARACTER SET utf8 COLLATE utf8_unicode_ci);</userinput>
+Query OK, 0 rows affected (0.05 sec)
+
+mysql> <userinput>INSERT INTO tj VALUES ('が'),('か');</userinput>
+Query OK, 2 rows affected (0.00 sec)
+Records: 2  Duplicates: 0  Warnings: 0
+
+mysql> <userinput>SELECT * FROM tj WHERE s1 = 'か';</userinput>
++------+
+| s1   |
++------+
+| が  |
+| か  |
++------+
+2 rows in set (0.00 sec)
+</programlisting>
+
+      If your eyes are sharp, you'll see that the character in the first
+      result row isn't the one that we searched for. Why did MySQL
+      retrieve it? First we look for the Unicode code point value, which
+      is possible by reading the hexadecimal number for the
+      <literal>ucs2</literal> version of the characters:
+
+<programlisting>
+mysql> <userinput>SELECT s1,HEX(CONVERT(s1 USING ucs2)) FROM tj;</userinput>
++------+-----------------------------+
+| s1   | HEX(CONVERT(s1 USING ucs2)) |
++------+-----------------------------+
+| が  | 304C                        |
+| か  | 304B                        |
++------+-----------------------------+
+2 rows in set (0.03 sec)
+</programlisting>
+
+      Now let's search for <literal>304B</literal> and
+      <literal>304C</literal> in the 4.0.0 allkeys table. We'll find
+      these lines:
+
+<programlisting>
+304B  ; [.1E57.0020.000E.304B] # HIRAGANA LETTER KA
+304C  ; [.1E57.0020.000E.304B][.0000.0140.0002.3099] # HIRAGANA LETTER GA; QQCM
+</programlisting>
+
+      The official Unicode names (following the <quote>#</quote> mark)
+      are informative; they tell us the Japanese syllabary (Hiragana),
+      the informal classification (letter instead of digit or
+      punctuation), and the Western identifier (<literal>KA</literal> or
+      <literal>GA</literal>, which happen to be the unvoiced and voiced
+      members of the same letter pair). More importantly, the Primary
+      Weight (the first hexadecimal number inside the square brackets)
+      is <literal>1E57</literal> on both lines. For comparisons in both
+      searching and sorting, MySQL pays attention only to the Primary
+      Weight; it ignores all the other numbers. So now we know that
+      we're sorting <literal>が</literal> and <literal>か</literal>
+      correctly according to the Unicode specification. If we wanted to
+      distinguish them, we'd have to use a
+      non-Unicode-Collation-Algorithm collation
+      (<literal>utf8_bin</literal> or
+      <literal>utf8_general_ci</literal>), or compare the
+      <function>HEX()</function> values, or say <literal>ORDER BY
+      CONVERT(s1 USING sjis)</literal>.
+    </para>
+
+    <para>
+      Being correct <quote>according to Unicode</quote> isn't enough,
+      of course: the person who submitted the bug was equally correct.
+      We plan to add another collation for Japanese according to the
+      JIS X 4061 standard, where voiced/unvoiced letters like KA/GA are
+      distinguishable for ordering purposes.
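The primary-weight comparison described in this answer can be sketched in a few lines of Python. This is a simplified illustration, not MySQL's implementation: it parses only the two allkeys lines quoted above, keeps the first weight of each collation element, and drops zero primary weights, which the Unicode Collation Algorithm treats as ignorable at that level. The function name is made up for the sketch.

```python
import re

# The two allkeys 4.0.0 lines quoted above.
ALLKEYS_LINES = [
    "304B  ; [.1E57.0020.000E.304B] # HIRAGANA LETTER KA",
    "304C  ; [.1E57.0020.000E.304B][.0000.0140.0002.3099] # HIRAGANA LETTER GA; QQCM",
]

def primary_key(line):
    """Primary weights of one allkeys line, with zero weights dropped."""
    # Keep only the collation elements between the ';' and the '#'.
    elements = line.split(";", 1)[1].split("#", 1)[0]
    # The first four hex digits of each bracketed element are its
    # primary weight.
    weights = re.findall(r"\[[.*]([0-9A-F]{4})\.", elements)
    return [int(w, 16) for w in weights if int(w, 16) != 0]

keys = {line[:4]: primary_key(line) for line in ALLKEYS_LINES}
print(keys)
print(keys["304B"] == keys["304C"])
```

Both code points end up with the same primary key, which is why a comparison that looks only at primary weights treats them as equal.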
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-sorting-problems-unicode-2">
+
+    <title>Strings don't sort correctly in Unicode (II)</title>
+
+    <para>
+      You're using Unicode (<literal>ucs2</literal> or
+      <literal>utf8</literal>), and you know what the Unicode sort order
+      is (see the previous question and answer), but MySQL still seems
+      to sort your table wrong? This might be easy.
+
+<programlisting>
+mysql> <userinput>SHOW CREATE TABLE t\G</userinput>
+******************** 1. row ******************
+Table: t
+Create Table: CREATE TABLE `t` (
+`s1` char(1) CHARACTER SET ucs2 DEFAULT NULL
+) ENGINE=MyISAM DEFAULT CHARSET=latin1
+1 row in set (0.00 sec)
+</programlisting>
+
+      Hmm, the character set looks okay. Let's look at the
+      <literal>information_schema</literal> for this column.
+
+<programlisting>
+mysql> <userinput>SELECT column_name, character_set_name, collation_name</userinput>
+    -> <userinput>FROM information_schema.columns</userinput>
+    -> <userinput>WHERE column_name = 's1'</userinput>
+    -> <userinput>AND table_name = 't';</userinput>
++-------------+--------------------+-----------------+
+| column_name | character_set_name | collation_name  |
++-------------+--------------------+-----------------+
+| s1          | ucs2               | ucs2_general_ci |
++-------------+--------------------+-----------------+
+1 row in set (0.01 sec)
+</programlisting>
+
+      Oops, the collation is <literal>ucs2_general_ci</literal> instead
+      of <literal>ucs2_unicode_ci</literal>! Here's why:
+
+<programlisting>
+mysql> <userinput>SHOW CHARSET LIKE 'ucs2%';</userinput>
++---------+---------------+-------------------+--------+
+| Charset | Description   | Default collation | Maxlen |
++---------+---------------+-------------------+--------+
+| ucs2    | UCS-2 Unicode | ucs2_general_ci   |      2 |
++---------+---------------+-------------------+--------+
+1 row in set (0.00 sec)
+</programlisting>
+
+      For <literal>ucs2</literal> and <literal>utf8</literal>, the
+      <quote>general</quote> collation is the default. To specify that
+      you wanted a <quote>unicode</quote> collation, you should have
+      specified <literal>COLLATE ucs2_unicode_ci</literal>.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-supplementary-chars-rejected">
+
+    <title>My supplementary characters get rejected</title>
+
+    <para>
+      Right. MySQL doesn't support supplementary characters (characters
+      which need more than 3 bytes with UTF-8). We support only what
+      Unicode calls the <emphasis>Basic Multilingual Plane / Plane
+      0</emphasis>. Only a few very rare Han characters are
+      supplementary; support for them is uncommon. This has led to bug
+      #12600 (<ulink url="http://bugs.mysql.com/bug.php?id=12600"/>)
+      which we rejected as <quote>not a bug</quote>. With
+      <literal>utf8</literal>, we must truncate an input string when we
+      encounter bytes that we don't understand. Otherwise, we wouldn't
+      know how long the bad multi-byte character is. A workaround is: if
+      you use <literal>ucs2</literal> instead of
+      <literal>utf8</literal>, then the bad characters will change to
+      question marks, but there will be no truncation. Or change the
+      data type to <literal>BLOB</literal> or <literal>BINARY</literal>,
+      which have no validity checking. In our bugs database, bug #14052
+      (<ulink url="http://bugs.mysql.com/bug.php?id=14052"/>) is a
+      feature request for Wikipedia, asking us to support supplementary
+      characters extending <literal>ucs2</literal> as well as
+      <literal>utf8</literal>.
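To check in advance whether a string contains supplementary characters, a small Python sketch such as the following may help. It is an illustration under stated assumptions, not MySQL code: the function names are invented here, and the truncation helper only imitates the behavior described above (keeping the text up to the first character outside the Basic Multilingual Plane).

```python
def utf8_length(char):
    """Number of bytes one character needs in UTF-8."""
    return len(char.encode("utf-8"))

def is_supplementary(char):
    """True when the code point lies outside the range U+0000..U+FFFF."""
    return ord(char) not in range(0x10000)

def truncate_at_supplementary(text):
    """Keep the text up to the first supplementary character."""
    for index, char in enumerate(text):
        if is_supplementary(char):
            return text[:index]
    return text

bmp_char = chr(0x696E)       # a Han character inside the BMP
supp_char = chr(0x20000)     # a Han character in a supplementary plane
print(utf8_length(bmp_char), utf8_length(supp_char))
print(truncate_at_supplementary("ab" + supp_char + "cd"))
```

A character inside the BMP needs at most three bytes in UTF-8, while a supplementary character needs four.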
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-cjkv">
+
+    <title>Shouldn't it be CJKV (V for Vietnamese)?</title>
+
+    <para>
+      No. The term CJKV (Chinese Japanese Korean Vietnamese) refers to
+      character sets which contain Han (originally Chinese) characters.
+      MySQL has no plan to support the old Vietnamese script using Han
+      characters. MySQL does of course support the modern Vietnamese
+      script with Western characters. Another question that has come up
+      (once) is a request for a specialized Vietnamese collation; see
+      <ulink url="http://bugs.mysql.com/bug.php?id=4745"/>. We might do
+      something about it someday, if many more requests arise.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-fixing-cjk-problems">
+
+    <title>Will MySQL fix any CJK problems in version 5.1?</title>
+
+    <remark role="update">
+      [SH] Remove (or rewrite) whole section once the fixes it talks
+      about are implemented.
+    </remark>
+
+    <para>
+      Yes. We're changing the names of files and directories. Here's an
+      example, using <command>mysql</command> as
+      <literal>root</literal> under Linux:
+
+      <orderedlist>
+
+        <listitem>
+          <para>
+            Create a table with a name containing a Han character:
+
+<programlisting>
+mysql> <userinput>CREATE TABLE tab_楮 (s1 INT);</userinput>
+Query OK, 0 rows affected (0.07 sec)
+</programlisting>
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Find out where MySQL stores database files:
+
+<programlisting>
+mysql> <userinput>SHOW VARIABLES LIKE 'datadir';</userinput>
++---------------+-----------------------+
+| Variable_name | Value                 |
++---------------+-----------------------+
+| datadir       | /usr/local/mysql/var/ |
++---------------+-----------------------+
+1 row in set (0.00 sec)
+</programlisting>
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Look at the directory to see the MyISAM table files:
+
+<programlisting>
+# cd /usr/local/mysql/var/dba
+# dir tab_*
+-rw-rw----  1 root root    0 2006-05-16 10:22 tab_@stripped
+-rw-rw----  1 root root 1024 2006-05-16 10:22 tab_@stripped
+-rw-rw----  1 root root 8556 2006-05-16 10:22 tab_@stripped
+</programlisting>
+          </para>
+        </listitem>
+
+      </orderedlist>
+
+      Notice that MySQL has converted the Han character to
+      <literal>@</literal> + (Unicode value of Han character), that is,
+      to a purely ASCII representation. This solves an old problem, that
+      database files weren't portable, because some computers wouldn't
+      allow <literal>楮</literal> in a file name. Conversion to the new
+      file names will be automatic when you upgrade to version 5.1. This
+      should take care of bug #6313 in our bugs database,
+      <ulink url="http://bugs.mysql.com/bug.php?id=6313"/>.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-manual-translation">
+
+    <title>When will MySQL translate the manual again?</title>
+
+    <remark role="update">
+      [SH] Update as CJK translations of manuals are updated.
+    </remark>
+
+    <para>
+      A Beijing-based group has produced a Simplified Chinese version
+      for us under contract. It's complete and can be found on
+      <ulink url="http://dev.mysql.com/doc/#chinese-5.1"/>. It's up to
+      date as of version 5.1.2. The Japanese manual can be downloaded
+      from this page: <ulink url="http://dev.mysql.com/doc/"/>. (Scroll
+      down the page until you see the word <quote>Japanese</quote>.) It
+      is still for version 4.1.
+    </para>
+
+  </section>
+
+  <section id="cjk-faq-contact">
+
+    <title>Whom can I talk to?</title>
+
+    <remark role="update">
+      [SH] Update as things change.
+    </remark>
+
+    <para>
+      Check <ulink url="http://dev.mysql.com/user-groups/"/> to see if
+      there is a MySQL user group near you. If there isn't, why not
+      start one yourself? To contact a sales engineer in MySQL KK's
+      Japan office:
+
+<programlisting>
+Tel: +81(0)3-5326-3133
+Fax: +81(0)3-5326-3001
+Email: dsaito@stripped
+</programlisting>
+
+      To see feature requests about language issues:
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            Go to <ulink url="http://bugs.mysql.com"/>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Click <guimenu>Advanced Search</guimenu>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            In the <guilabel>Severity</guilabel> dropdown box, click
+            <literal>S4 (Feature Request)</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            In the list box beside <guilabel>Category</guilabel>, click
+            <literal>Character Sets</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            Click the <guibutton>Search</guibutton> button.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      You can post CJK questions, or see previous answers, on MySQL's
+      <quote>Character Sets, Collation, Unicode</quote> forum:
+      <ulink url="http://forums.mysql.com/list.php?103"/>. MySQL plans
+      to add native-language forums on
+      <ulink url="http://forums.mysql.com/"/> very soon.
+    </para>
+
+  </section>
+
+</section>
