Author: paul
Date: 2006-01-15 05:57:51 +0100 (Sun, 15 Jan 2006)
New Revision: 836
Log:
r6216@frost: paul | 2006-01-14 21:58:15 -0600
General revisions.
Modified:
trunk/
trunk/refman-4.1/charset.xml
trunk/refman-5.0/charset.xml
trunk/refman-5.1/charset.xml
Property changes on: trunk
___________________________________________________________________
Name: svk:merge
- b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:6215
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:2175
+ b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:6216
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:2175
Modified: trunk/refman-4.1/charset.xml
===================================================================
--- trunk/refman-4.1/charset.xml 2006-01-15 04:57:13 UTC (rev 835)
+++ trunk/refman-4.1/charset.xml 2006-01-15 04:57:51 UTC (rev 836)
@@ -2301,22 +2301,22 @@
<listitem>
<para>
Metadata must include all characters in all languages.
- Otherwise, users wouldn't be able to name columns and tables
- in their own languages.
+ Otherwise, users would not be able to name columns and tables
+ using their own languages.
</para>
</listitem>
</itemizedlist>
<para>
- In order to satisfy both requirements, MySQL stores metadata in a
- Unicode character set, namely UTF-8. This does not cause any
- disruption if you never use accented characters. But if you do,
- you should be aware that metadata is in UTF-8.
+ To satisfy both requirements, MySQL stores metadata in a Unicode
+ character set, namely UTF-8. This does not cause any disruption if
+ you never use accented or non-Latin characters. But if you do, you
+ should be aware that metadata is in UTF-8.
</para>
<para>
- This means that the return values of the
+ The metadata requirements mean that the return values of the
<literal>USER()</literal>, <literal>CURRENT_USER()</literal>,
<literal>DATABASE()</literal>, and <literal>VERSION()</literal>
functions have the UTF-8 character set by default, as do synonyms
@@ -2340,32 +2340,40 @@
<para>
Storage of metadata using Unicode does <emphasis>not</emphasis>
- mean that the headers of columns and the results of
- <literal>DESCRIBE</literal> functions are in the
+ mean that the server returns headers of columns and the results of
+ <literal>DESCRIBE</literal> functions in the
<literal>character_set_system</literal> character set by default.
When you use <literal>SELECT column1 FROM t</literal>, the name
<literal>column1</literal> itself is returned from the server to
- the client in the character set as determined by the <literal>SET
- NAMES</literal> statement. More specifically, the character set
- used is determined by the value of the
- <literal>character_set_results</literal> system variable. If this
- variable is set to <literal>NULL</literal>, no conversion is
- performed and the server returns metadata using its original
- character set (the set indicated by
- <literal>character_set_system</literal>).
+ the client in the character set determined by the value of the
+ <literal>character_set_results</literal> system variable, which
+ has a default value of <literal>latin1</literal>. If you want the
+ server to pass metadata results back in a different character set,
+ use the <literal>SET NAMES</literal> statement to force the server
+ to perform character set conversion. <literal>SET NAMES</literal>
+ sets the <literal>character_set_results</literal> and other
+ related system variables. (See
+ <xref linkend="charset-connection"/>.) Alternatively, a client
+ program can perform the conversion after receiving the result from
+ the server. It is more efficient for the client to perform the
+ conversion, but this option is not available for many clients
+ until late in the MySQL 4.x product cycle.
</para>
<para>
- If you want the server to pass metadata results back in a
- non-UTF-8 character set, then use <literal>SET NAMES</literal> to
- force the server to perform character set conversion (see
- <xref linkend="charset-connection"/>), or else set the client to
- do the conversion. It is always more efficient to set the client
- to do the conversion, but this option is not available for many
- clients until late in the MySQL 4.x product cycle.
+ If <literal>character_set_results</literal> is set to
+ <literal>NULL</literal>, no conversion is performed and the server
+ returns metadata using its original character set (the set
+ indicated by <literal>character_set_system</literal>).
</para>
<para>
+ Beginning with MySQL 4.1.1, error messages returned from the
+ server to the client are converted to the client character set
+ automatically, as with metadata.
+ </para>
+
+ <para>
If you are using (for example) the <literal>USER()</literal>
function for comparison or assignment within a single statement,
don't worry. MySQL performs some automatic conversion for you.
@@ -2402,13 +2410,6 @@
strings.
</para>
- <para>
- <emphasis role="bold">Note</emphasis>: Beginning with MySQL 4.1.1,
- the <filename>errmsg.txt</filename> files all use UTF-8.
- Conversion to the client character set is automatic, as with
- metadata.
- </para>
-
</section>
<section id="charset-upgrading">
@@ -2808,11 +2809,18 @@
<title>&title-charset-charsets;</title>
<para>
- MySQL supports 70+ collations for 30+ character sets. The
- character sets and their default collations are displayed by the
- <literal>SHOW CHARACTER SET</literal> statement:
+ MySQL supports 70+ collations for 30+ character sets. This section
+ indicates which character sets MySQL supports. There is one
+ subsection for each group of related character sets. For each
+ character set, the allowable collations are listed.
</para>
+ <para>
+ You can always list the available character sets and their default
+ collations with the <literal>SHOW CHARACTER SET</literal>
+ statement:
+ </para>
+
<programlisting>
mysql> <userinput>SHOW CHARACTER SET;</userinput>
+----------+-----------------------------+---------------------+
@@ -2990,53 +2998,6 @@
</listitem>
</itemizedlist>
-
- <para>
- Note that there's a limitation in MySQL 4.1 that results in
- two characters not being correctly handled when a user tries
- to change their case using <literal>LOWER()</literal> or
- <literal>UPPER()</literal>:
- </para>
-
- <itemizedlist>
-
- <listitem>
- <para>
- LATIN SMALL LETTER DOTLESS i
- </para>
- </listitem>
-
- <listitem>
- <para>
- LATIN CAPITAL LETTER I WITH DOT ABOVE
- </para>
- </listitem>
-
- </itemizedlist>
-
- <para>
- Here are two workarounds for MySQL 4.1:
- </para>
-
- <orderedlist>
-
- <listitem>
- <para>
- Use UCS2 if you have Turkish data.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Use these function calls:
- </para>
-
-<programlisting>
-CONVERT(LOWER(CONVERT(<replaceable>col</replaceable> USING ucs2)) USING utf8)
-</programlisting>
- </listitem>
-
- </orderedlist>
</listitem>
<listitem>
@@ -3165,6 +3126,53 @@
</itemizedlist>
+ <para>
+ There is a limitation in MySQL 4.1 that results in two
+ characters not being correctly handled when a user tries to
+ change their case using <literal>LOWER()</literal> or
+ <literal>UPPER()</literal>:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ LATIN SMALL LETTER DOTLESS i
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ LATIN CAPITAL LETTER I WITH DOT ABOVE
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ Here are two workarounds for MySQL 4.1:
+ </para>
+
+ <orderedlist>
+
+ <listitem>
+ <para>
+ Use UCS2 if you have Turkish data.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Use these function calls:
+ </para>
+
+<programlisting>
+CONVERT(LOWER(CONVERT(<replaceable>col</replaceable> USING ucs2)) USING utf8)
+</programlisting>
+ </listitem>
+
+ </orderedlist>
+
<indexterm type="type">
<primary>Unicode Collation Algorithm</primary>
</indexterm>
Modified: trunk/refman-5.0/charset.xml
===================================================================
--- trunk/refman-5.0/charset.xml 2006-01-15 04:57:13 UTC (rev 835)
+++ trunk/refman-5.0/charset.xml 2006-01-15 04:57:51 UTC (rev 836)
@@ -2202,9 +2202,9 @@
<emphasis>contents</emphasis> of the database — is metadata.
Thus column names, database names, usernames, version names, and
most of the string results from <literal>SHOW</literal> are
- metadata. This is also true of the contents of tables in the
- <literal>INFORMATION_SCHEMA</literal> database, because those
- tables by definition contain information about database objects.
+ metadata. This is also true of the contents of tables in
+ <literal>INFORMATION_SCHEMA</literal>, because those tables by
+ definition contain information about database objects.
</para>
<para>
@@ -2217,33 +2217,32 @@
<para>
All metadata must be in the same character set. Otherwise,
neither the <literal>SHOW</literal> commands nor
- <literal>SELECT</literal> queries against tables in the
- <literal>INFORMATION_SCHEMA</literal> database would work
- properly because different rows in the same column of the
- results of these operations would be in different character
- sets.
+ <literal>SELECT</literal> statements for tables in
+ <literal>INFORMATION_SCHEMA</literal> would work properly
+ because different rows in the same column of the results of
+ these operations would be in different character sets.
</para>
</listitem>
<listitem>
<para>
Metadata must include all characters in all languages.
- Otherwise, users wouldn't be able to name columns and tables
- in their own languages.
+ Otherwise, users would not be able to name columns and tables
+ using their own languages.
</para>
</listitem>
</itemizedlist>
<para>
- In order to satisfy both requirements, MySQL stores metadata in a
- Unicode character set, namely UTF-8. This does not cause any
- disruption if you never use accented characters. But if you do,
- you should be aware that metadata is in UTF-8.
+ To satisfy both requirements, MySQL stores metadata in a Unicode
+ character set, namely UTF-8. This does not cause any disruption if
+ you never use accented or non-Latin characters. But if you do, you
+ should be aware that metadata is in UTF-8.
</para>
<para>
- This means that the return values of the
+ The metadata requirements mean that the return values of the
<literal>USER()</literal>, <literal>CURRENT_USER()</literal>,
<literal>DATABASE()</literal>, and <literal>VERSION()</literal>
functions have the UTF-8 character set by default, as do synonyms
@@ -2267,32 +2266,40 @@
<para>
Storage of metadata using Unicode does <emphasis>not</emphasis>
- mean that the headers of columns and the results of
- <literal>DESCRIBE</literal> functions are in the
+ mean that the server returns headers of columns and the results of
+ <literal>DESCRIBE</literal> functions in the
<literal>character_set_system</literal> character set by default.
When you use <literal>SELECT column1 FROM t</literal>, the name
<literal>column1</literal> itself is returned from the server to
- the client in the character set as determined by the <literal>SET
- NAMES</literal> statement. More specifically, the character set
- used is determined by the value of the
- <literal>character_set_results</literal> system variable. If this
- variable is set to <literal>NULL</literal>, no conversion is
- performed and the server returns metadata using its original
- character set (the set indicated by
- <literal>character_set_system</literal>).
+ the client in the character set determined by the value of the
+ <literal>character_set_results</literal> system variable, which
+ has a default value of <literal>latin1</literal>. If you want the
+ server to pass metadata results back in a different character set,
+ use the <literal>SET NAMES</literal> statement to force the server
+ to perform character set conversion. <literal>SET NAMES</literal>
+ sets the <literal>character_set_results</literal> and other
+ related system variables. (See
+ <xref linkend="charset-connection"/>.) Alternatively, a client
+ program can perform the conversion after receiving the result from
+ the server. It is more efficient for the client to perform the
+ conversion, but this option is not always available for all
+ clients.
</para>
<para>
- If you want the server to pass metadata results back in a
- non-UTF-8 character set, then use <literal>SET NAMES</literal> to
- force the server to perform character set conversion (see
- <xref linkend="charset-connection"/>), or else cause the client to
- perform the conversion. It is more efficient to have the client
- perform the conversion, but this option is not always available
- for all clients.
+ If <literal>character_set_results</literal> is set to
+ <literal>NULL</literal>, no conversion is performed and the server
+ returns metadata using its original character set (the set
+ indicated by <literal>character_set_system</literal>).
</para>
<para>
+ Error messages returned from the server to the client are
+ converted to the client character set automatically, as with
+ metadata.
+ </para>
+
+ <para>
If you are using (for example) the <literal>USER()</literal>
function for comparison or assignment within a single statement,
don't worry. MySQL performs some automatic conversion for you.
@@ -2329,13 +2336,6 @@
strings.
</para>
- <para>
- <emphasis role="bold">Note</emphasis>: In MySQL &current-series;,
- the <filename>errmsg.txt</filename> files all use UTF-8.
- Conversion to the client character set is automatic, as with
- metadata.
- </para>
-
</section>
<section id="charset-charsets">
@@ -2343,11 +2343,18 @@
<title>&title-charset-charsets;</title>
<para>
- MySQL supports 70+ collations for 30+ character sets. The
- character sets and their default collations are displayed by the
- <literal>SHOW CHARACTER SET</literal> statement:
+ MySQL supports 70+ collations for 30+ character sets. This section
+ indicates which character sets MySQL supports. There is one
+ subsection for each group of related character sets. For each
+ character set, the allowable collations are listed.
</para>
+ <para>
+ You can always list the available character sets and their default
+ collations with the <literal>SHOW CHARACTER SET</literal>
+ statement:
+ </para>
+
<programlisting>
mysql> <userinput>SHOW CHARACTER SET;</userinput>
+----------+-----------------------------+---------------------+
@@ -2408,34 +2415,135 @@
<literal>ucs2</literal> (UCS-2 Unicode) collations:
</para>
-<programlisting>
-mysql> <userinput>SHOW COLLATION LIKE 'ucs2%';</userinput>
-+--------------------+---------+-----+---------+----------+---------+
-| Collation | Charset | Id | Default | Compiled | Sortlen |
-+--------------------+---------+-----+---------+----------+---------+
-| ucs2_general_ci | ucs2 | 35 | Yes | Yes | 1 |
-| ucs2_bin | ucs2 | 90 | | Yes | 1 |
-| ucs2_unicode_ci | ucs2 | 128 | | Yes | 8 |
-| ucs2_icelandic_ci | ucs2 | 129 | | Yes | 8 |
-| ucs2_latvian_ci | ucs2 | 130 | | Yes | 8 |
-| ucs2_romanian_ci | ucs2 | 131 | | Yes | 8 |
-| ucs2_slovenian_ci | ucs2 | 132 | | Yes | 8 |
-| ucs2_polish_ci | ucs2 | 133 | | Yes | 8 |
-| ucs2_estonian_ci | ucs2 | 134 | | Yes | 8 |
-| ucs2_spanish_ci | ucs2 | 135 | | Yes | 8 |
-| ucs2_swedish_ci | ucs2 | 136 | | Yes | 8 |
-| ucs2_turkish_ci | ucs2 | 137 | | Yes | 8 |
-| ucs2_czech_ci | ucs2 | 138 | | Yes | 8 |
-| ucs2_danish_ci | ucs2 | 139 | | Yes | 8 |
-| ucs2_lithuanian_ci | ucs2 | 140 | | Yes | 8 |
-| ucs2_slovak_ci | ucs2 | 141 | | Yes | 8 |
-| ucs2_spanish2_ci | ucs2 | 142 | | Yes | 8 |
-| ucs2_roman_ci | ucs2 | 143 | | Yes | 8 |
-| ucs2_persian_ci | ucs2 | 144 | | Yes | 8 |
-| ucs2_esperanto_ci | ucs2 | 145 | | Yes | 8 |
-| ucs2_hungarian_ci | ucs2 | 146 | | Yes | 8 |
-+--------------------+---------+-----+---------+----------+---------+
-</programlisting>
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>ucs2_bin</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_czech_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_danish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_esperanto_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_estonian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_general_ci</literal> (default)
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_hungarian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_icelandic_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_latvian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_lithuanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_persian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_polish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_roman_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_romanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_slovak_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_slovenian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_spanish2_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_spanish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_swedish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_turkish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_unicode_ci</literal>
+ </para>
+ </listitem>
+
+ </itemizedlist>
</listitem>
<listitem>
@@ -2443,34 +2551,135 @@
<literal>utf8</literal> (UTF-8 Unicode) collations:
</para>
-<programlisting>
-mysql> <userinput>SHOW COLLATION LIKE 'utf8%';</userinput>
-+--------------------+---------+-----+---------+----------+---------+
-| Collation | Charset | Id | Default | Compiled | Sortlen |
-+--------------------+---------+-----+---------+----------+---------+
-| utf8_general_ci | utf8 | 33 | Yes | Yes | 1 |
-| utf8_bin | utf8 | 83 | | Yes | 1 |
-| utf8_unicode_ci | utf8 | 192 | | Yes | 8 |
-| utf8_icelandic_ci | utf8 | 193 | | Yes | 8 |
-| utf8_latvian_ci | utf8 | 194 | | Yes | 8 |
-| utf8_romanian_ci | utf8 | 195 | | Yes | 8 |
-| utf8_slovenian_ci | utf8 | 196 | | Yes | 8 |
-| utf8_polish_ci | utf8 | 197 | | Yes | 8 |
-| utf8_estonian_ci | utf8 | 198 | | Yes | 8 |
-| utf8_spanish_ci | utf8 | 199 | | Yes | 8 |
-| utf8_swedish_ci | utf8 | 200 | | Yes | 8 |
-| utf8_turkish_ci | utf8 | 201 | | Yes | 8 |
-| utf8_czech_ci | utf8 | 202 | | Yes | 8 |
-| utf8_danish_ci | utf8 | 203 | | Yes | 8 |
-| utf8_lithuanian_ci | utf8 | 204 | | Yes | 8 |
-| utf8_slovak_ci | utf8 | 205 | | Yes | 8 |
-| utf8_spanish2_ci | utf8 | 206 | | Yes | 8 |
-| utf8_roman_ci | utf8 | 207 | | Yes | 8 |
-| utf8_persian_ci | utf8 | 208 | | Yes | 8 |
-| utf8_esperanto_ci | utf8 | 209 | | Yes | 8 |
-| utf8_hungarian_ci | utf8 | 210 | | Yes | 8 |
-+--------------------+---------+-----+---------+----------+---------+
-</programlisting>
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>utf8_bin</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_czech_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_danish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_esperanto_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_estonian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_general_ci</literal> (default)
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_hungarian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_icelandic_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_latvian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_lithuanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_persian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_polish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_roman_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_romanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_slovak_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_slovenian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_spanish2_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_spanish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_swedish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_turkish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_unicode_ci</literal>
+ </para>
+ </listitem>
+
+ </itemizedlist>
</listitem>
</itemizedlist>
Modified: trunk/refman-5.1/charset.xml
===================================================================
--- trunk/refman-5.1/charset.xml 2006-01-15 04:57:13 UTC (rev 835)
+++ trunk/refman-5.1/charset.xml 2006-01-15 04:57:51 UTC (rev 836)
@@ -2194,9 +2194,9 @@
<emphasis>contents</emphasis> of the database — is metadata.
Thus column names, database names, usernames, version names, and
most of the string results from <literal>SHOW</literal> are
- metadata. This is also true of the contents of tables in the
- <literal>INFORMATION_SCHEMA</literal> database, because those
- tables by definition contain information about database objects.
+ metadata. This is also true of the contents of tables in
+ <literal>INFORMATION_SCHEMA</literal>, because those tables by
+ definition contain information about database objects.
</para>
<para>
@@ -2209,33 +2209,32 @@
<para>
All metadata must be in the same character set. Otherwise,
neither the <literal>SHOW</literal> commands nor
- <literal>SELECT</literal> queries against tables in the
- <literal>INFORMATION_SCHEMA</literal> database would work
- properly because different rows in the same column of the
- results of these operations would be in different character
- sets.
+ <literal>SELECT</literal> statements for tables in
+ <literal>INFORMATION_SCHEMA</literal> would work properly
+ because different rows in the same column of the results of
+ these operations would be in different character sets.
</para>
</listitem>
<listitem>
<para>
Metadata must include all characters in all languages.
- Otherwise, users wouldn't be able to name columns and tables
- in their own languages.
+ Otherwise, users would not be able to name columns and tables
+ using their own languages.
</para>
</listitem>
</itemizedlist>
<para>
- In order to satisfy both requirements, MySQL stores metadata in a
- Unicode character set, namely UTF-8. This does not cause any
- disruption if you never use accented characters. But if you do,
- you should be aware that metadata is in UTF-8.
+ To satisfy both requirements, MySQL stores metadata in a Unicode
+ character set, namely UTF-8. This does not cause any disruption if
+ you never use accented or non-Latin characters. But if you do, you
+ should be aware that metadata is in UTF-8.
</para>
<para>
- This means that the return values of the
+ The metadata requirements mean that the return values of the
<literal>USER()</literal>, <literal>CURRENT_USER()</literal>,
<literal>DATABASE()</literal>, and <literal>VERSION()</literal>
functions have the UTF-8 character set by default, as do synonyms
@@ -2259,32 +2258,40 @@
<para>
Storage of metadata using Unicode does <emphasis>not</emphasis>
- mean that the headers of columns and the results of
- <literal>DESCRIBE</literal> functions are in the
+ mean that the server returns headers of columns and the results of
+ <literal>DESCRIBE</literal> functions in the
<literal>character_set_system</literal> character set by default.
When you use <literal>SELECT column1 FROM t</literal>, the name
<literal>column1</literal> itself is returned from the server to
- the client in the character set as determined by the <literal>SET
- NAMES</literal> statement. More specifically, the character set
- used is determined by the value of the
- <literal>character_set_results</literal> system variable. If this
- variable is set to <literal>NULL</literal>, no conversion is
- performed and the server returns metadata using its original
- character set (the set indicated by
- <literal>character_set_system</literal>).
+ the client in the character set determined by the value of the
+ <literal>character_set_results</literal> system variable, which
+ has a default value of <literal>latin1</literal>. If you want the
+ server to pass metadata results back in a different character set,
+ use the <literal>SET NAMES</literal> statement to force the server
+ to perform character set conversion. <literal>SET NAMES</literal>
+ sets the <literal>character_set_results</literal> and other
+ related system variables. (See
+ <xref linkend="charset-connection"/>.) Alternatively, a client
+ program can perform the conversion after receiving the result from
+ the server. It is more efficient for the client to perform the
+ conversion, but this option is not always available for all
+ clients.
</para>
<para>
- If you want the server to pass metadata results back in a
- non-UTF-8 character set, then use <literal>SET NAMES</literal> to
- force the server to perform character set conversion (see
- <xref linkend="charset-connection"/>), or else cause the client to
- perform the conversion. It is more efficient to have the client
- perform the conversion, but this option is not always available
- for all clients.
+ If <literal>character_set_results</literal> is set to
+ <literal>NULL</literal>, no conversion is performed and the server
+ returns metadata using its original character set (the set
+ indicated by <literal>character_set_system</literal>).
</para>
<para>
+ Error messages returned from the server to the client are
+ converted to the client character set automatically, as with
+ metadata.
+ </para>
+
+ <para>
If you are using (for example) the <literal>USER()</literal>
function for comparison or assignment within a single statement,
don't worry. MySQL performs some automatic conversion for you.
@@ -2321,13 +2328,6 @@
strings.
</para>
- <para>
- <emphasis role="bold">Note</emphasis>: In MySQL &current-series;,
- the <filename>errmsg.txt</filename> files all use UTF-8.
- Conversion to the client character set is automatic, as with
- metadata.
- </para>
-
</section>
<section id="charset-charsets">
@@ -2335,11 +2335,18 @@
<title>&title-charset-charsets;</title>
<para>
- MySQL supports 70+ collations for 30+ character sets. The
- character sets and their default collations are displayed by the
- <literal>SHOW CHARACTER SET</literal> statement:
+ MySQL supports 70+ collations for 30+ character sets. This section
+ indicates which character sets MySQL supports. There is one
+ subsection for each group of related character sets. For each
+ character set, the allowable collations are listed.
</para>
+ <para>
+ You can always list the available character sets and their default
+ collations with the <literal>SHOW CHARACTER SET</literal>
+ statement:
+ </para>
+
<programlisting>
mysql> <userinput>SHOW CHARACTER SET;</userinput>
+----------+-----------------------------+---------------------+
@@ -2400,34 +2407,135 @@
<literal>ucs2</literal> (UCS-2 Unicode) collations:
</para>
-<programlisting>
-mysql> <userinput>SHOW COLLATION LIKE 'ucs2%';</userinput>
-+--------------------+---------+-----+---------+----------+---------+
-| Collation | Charset | Id | Default | Compiled | Sortlen |
-+--------------------+---------+-----+---------+----------+---------+
-| ucs2_general_ci | ucs2 | 35 | Yes | Yes | 1 |
-| ucs2_bin | ucs2 | 90 | | Yes | 1 |
-| ucs2_unicode_ci | ucs2 | 128 | | Yes | 8 |
-| ucs2_icelandic_ci | ucs2 | 129 | | Yes | 8 |
-| ucs2_latvian_ci | ucs2 | 130 | | Yes | 8 |
-| ucs2_romanian_ci | ucs2 | 131 | | Yes | 8 |
-| ucs2_slovenian_ci | ucs2 | 132 | | Yes | 8 |
-| ucs2_polish_ci | ucs2 | 133 | | Yes | 8 |
-| ucs2_estonian_ci | ucs2 | 134 | | Yes | 8 |
-| ucs2_spanish_ci | ucs2 | 135 | | Yes | 8 |
-| ucs2_swedish_ci | ucs2 | 136 | | Yes | 8 |
-| ucs2_turkish_ci | ucs2 | 137 | | Yes | 8 |
-| ucs2_czech_ci | ucs2 | 138 | | Yes | 8 |
-| ucs2_danish_ci | ucs2 | 139 | | Yes | 8 |
-| ucs2_lithuanian_ci | ucs2 | 140 | | Yes | 8 |
-| ucs2_slovak_ci | ucs2 | 141 | | Yes | 8 |
-| ucs2_spanish2_ci | ucs2 | 142 | | Yes | 8 |
-| ucs2_roman_ci | ucs2 | 143 | | Yes | 8 |
-| ucs2_persian_ci | ucs2 | 144 | | Yes | 8 |
-| ucs2_esperanto_ci | ucs2 | 145 | | Yes | 8 |
-| ucs2_hungarian_ci | ucs2 | 146 | | Yes | 8 |
-+--------------------+---------+-----+---------+----------+---------+
-</programlisting>
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>ucs2_bin</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_czech_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_danish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_esperanto_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_estonian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_general_ci</literal> (default)
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_hungarian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_icelandic_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_latvian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_lithuanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_persian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_polish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_roman_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_romanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_slovak_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_slovenian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_spanish2_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_spanish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_swedish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_turkish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>ucs2_unicode_ci</literal>
+ </para>
+ </listitem>
+
+ </itemizedlist>
</listitem>
<listitem>
@@ -2435,34 +2543,135 @@
<literal>utf8</literal> (UTF-8 Unicode) collations:
</para>
-<programlisting>
-mysql> <userinput>SHOW COLLATION LIKE 'utf8%';</userinput>
-+--------------------+---------+-----+---------+----------+---------+
-| Collation | Charset | Id | Default | Compiled | Sortlen |
-+--------------------+---------+-----+---------+----------+---------+
-| utf8_general_ci | utf8 | 33 | Yes | Yes | 1 |
-| utf8_bin | utf8 | 83 | | Yes | 1 |
-| utf8_unicode_ci | utf8 | 192 | | Yes | 8 |
-| utf8_icelandic_ci | utf8 | 193 | | Yes | 8 |
-| utf8_latvian_ci | utf8 | 194 | | Yes | 8 |
-| utf8_romanian_ci | utf8 | 195 | | Yes | 8 |
-| utf8_slovenian_ci | utf8 | 196 | | Yes | 8 |
-| utf8_polish_ci | utf8 | 197 | | Yes | 8 |
-| utf8_estonian_ci | utf8 | 198 | | Yes | 8 |
-| utf8_spanish_ci | utf8 | 199 | | Yes | 8 |
-| utf8_swedish_ci | utf8 | 200 | | Yes | 8 |
-| utf8_turkish_ci | utf8 | 201 | | Yes | 8 |
-| utf8_czech_ci | utf8 | 202 | | Yes | 8 |
-| utf8_danish_ci | utf8 | 203 | | Yes | 8 |
-| utf8_lithuanian_ci | utf8 | 204 | | Yes | 8 |
-| utf8_slovak_ci | utf8 | 205 | | Yes | 8 |
-| utf8_spanish2_ci | utf8 | 206 | | Yes | 8 |
-| utf8_roman_ci | utf8 | 207 | | Yes | 8 |
-| utf8_persian_ci | utf8 | 208 | | Yes | 8 |
-| utf8_esperanto_ci | utf8 | 209 | | Yes | 8 |
-| utf8_hungarian_ci | utf8 | 210 | | Yes | 8 |
-+--------------------+---------+-----+---------+----------+---------+
-</programlisting>
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal>utf8_bin</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_czech_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_danish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_esperanto_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_estonian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_general_ci</literal> (default)
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_hungarian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_icelandic_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_latvian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_lithuanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_persian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_polish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_roman_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_romanian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_slovak_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_slovenian_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_spanish2_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_spanish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_swedish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_turkish_ci</literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>utf8_unicode_ci</literal>
+ </para>
+ </listitem>
+
+ </itemizedlist>
</listitem>
</itemizedlist>