Author: paul
Date: 2006-01-15 05:56:50 +0100 (Sun, 15 Jan 2006)
New Revision: 834
Log:
r6210@frost: paul | 2006-01-14 19:55:11 -0600
General revisions.
Modified:
trunk/
trunk/refman-4.1/charset.xml
trunk/refman-4.1/installing.xml
trunk/refman-4.1/language-structure.xml
trunk/refman-5.0/charset.xml
trunk/refman-5.0/installing.xml
trunk/refman-5.0/language-structure.xml
trunk/refman-5.1/charset.xml
trunk/refman-5.1/installing.xml
trunk/refman-5.1/language-structure.xml
trunk/refman-common/titles.en.ent
Property changes on: trunk
___________________________________________________________________
Name: svk:merge
- b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:6209
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:2175
+ b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:6210
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:2175
Modified: trunk/refman-4.1/charset.xml
===================================================================
--- trunk/refman-4.1/charset.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-4.1/charset.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -22,10 +22,6 @@
</indexterm>
<indexterm>
- <primary>UTF8</primary>
- </indexterm>
-
- <indexterm>
<primary>UTF-8</primary>
</indexterm>
@@ -508,6 +504,12 @@
</para>
<para>
+ All database options are stored in a text file named
+ <filename>db.opt</filename> that can be found in the database
+ directory.
+ </para>
+
+ <para>
The <literal>CHARACTER SET</literal> and
<literal>COLLATE</literal> clauses make it possible to create
databases with different character sets and collations on the
@@ -820,7 +822,8 @@
</programlisting>
<para>
- MySQL determines a literal's character set and collation thus:
+ MySQL determines a literal's character set and collation in the
+ following manner:
</para>
<itemizedlist>
@@ -1464,6 +1467,26 @@
SELECT * FROM t1 ORDER BY a COLLATE latin1_bin;
</programlisting>
+ <para>
+ The <literal>BINARY</literal> attribute in character column
+ definitions has a similar effect. A character column defined
+ with the <literal>BINARY</literal> attribute is assigned the
+ binary collation of the column's character set.
+ </para>
+
+ <para>
+ The effect of <literal>BINARY</literal> in both contexts differs
+ from its effect prior to MySQL 4.1. Formerly,
+ <literal>BINARY</literal> resulted in binary string comparisons
+ or binary string column definitions. A binary string is a string
+ of bytes that has no character set or collation, which differs
+ from a non-binary character string that has a binary collation.
+ For both types of strings, comparisons are based on the numeric
+ values of the string unit, but for non-binary strings the unit
+ is the character and some character sets allow multi-byte
+ characters. See <xref linkend="binary-varbinary"/>.
+ </para>
+
</section>
<section id="charset-collate-tricky">
@@ -1471,11 +1494,10 @@
<title>&title-charset-collate-tricky;</title>
<para>
- In the great majority of queries, it is obvious what collation
- MySQL uses to resolve a comparison operation. For example, in
- the following cases, it should be clear that the collation is
- <quote>the column collation of column
- <literal>x</literal></quote>:
+ In the great majority of statements, it is obvious what
+ collation MySQL uses to resolve a comparison operation. For
+ example, in the following cases, it should be clear that the
+ collation is the collation of column <literal>x</literal>:
</para>
<programlisting>
@@ -1504,7 +1526,7 @@
called <quote>coercibility</quote> rules. Basically, this means:
Both <literal>x</literal> and <literal>'Y'</literal> have
collations, so which collation takes precedence? This can be
- difficult to resolve, but the following rules take care of most
+ difficult to resolve, but the following rules cover most
situations:
</para>
@@ -1560,7 +1582,7 @@
</para>
<para>
- Those rules resolve ambiguities thus:
+ Those rules resolve ambiguities in the following manner:
</para>
<itemizedlist>
@@ -1637,9 +1659,9 @@
<title>&title-charset-collation-charset;</title>
<para>
- Recall that each character set has one or more collations, and
- each collation is associated with one and only one character
- set. Therefore, the following statement causes an error message
+ Each character set has one or more collations, but each
+ collation is associated with one and only one character set.
+ Therefore, the following statement causes an error message
because the <literal>latin2_bin</literal> collation is not legal
with the <literal>latin1</literal> character set:
</para>
@@ -1760,8 +1782,8 @@
</programlisting>
<para>
- The resulting order of the values for different collations is
- shown in this table:
+ The following table shows the resulting order of the values if
+ we use <literal>ORDER BY</literal> with different collations:
</para>
<informaltable>
@@ -1800,10 +1822,8 @@
</informaltable>
<para>
- The table is an example that shows what the effect would be if
- we used different collations in an <literal>ORDER BY</literal>
- clause. The character that causes the different sort orders in
- this example is the U with two dots over it
+ The character that causes the different sort orders in this
+ example is the U with two dots over it
(<literal>ü</literal>), which the Germans call "U-umlaut".
</para>
@@ -1873,15 +1893,18 @@
<literal>RTRIM()</literal>, <literal>SOUNDEX()</literal>,
<literal>SUBSTRING()</literal>, <literal>TRIM()</literal>,
<literal>UCASE()</literal>, and <literal>UPPER()</literal>.
- (Also note: The <literal>REPLACE()</literal> function, unlike
- all other functions, always ignores the collation of the string
- input and performs a case-sensitive comparison.)
</para>
<para>
+ Note: The <literal>REPLACE()</literal> function, unlike all
+ other functions, always ignores the collation of the string
+ input and performs a case-sensitive comparison.
+ </para>
+
+ <para>
For operations that combine multiple string inputs and return a
single string output, the <quote>aggregation rules</quote> of
- standard SQL apply:
+ standard SQL apply for determining the collation of the result:
</para>
<itemizedlist>
@@ -1889,7 +1912,7 @@
<listitem>
<para>
If an explicit <literal>COLLATE
- <replaceable>X</replaceable></literal> occurs, then use
+ <replaceable>X</replaceable></literal> occurs, use
<replaceable>X</replaceable>.
</para>
</listitem>
@@ -1898,7 +1921,7 @@
<para>
If explicit <literal>COLLATE
<replaceable>X</replaceable></literal> and <literal>COLLATE
- <replaceable>Y</replaceable></literal> occur, then raise an
+ <replaceable>Y</replaceable></literal> occur, raise an
error.
</para>
</listitem>
@@ -1906,7 +1929,7 @@
<listitem>
<para>
Otherwise, if all collations are
- <replaceable>X</replaceable>, then use
+ <replaceable>X</replaceable>, use
<replaceable>X</replaceable>.
</para>
</listitem>
@@ -1920,8 +1943,8 @@
</itemizedlist>
<para>
- For example, with <literal>CASE ... WHEN a THEN b WHEN b THEN c
- COLLATE <replaceable>X</replaceable> END</literal>, the
+ For example, with <literal>CASE … WHEN a THEN b WHEN b
+ THEN c COLLATE <replaceable>X</replaceable> END</literal>, the
resulting collation is <replaceable>X</replaceable>. The same
applies for <literal>CASE</literal>, <literal>UNION</literal>,
<literal>||</literal>, <literal>CONCAT()</literal>,
@@ -1970,8 +1993,8 @@
</programlisting>
<para>
- <literal>CONVERT(... USING ...)</literal> is implemented
- according to the standard SQL specification.
+ <literal>CONVERT(… USING …)</literal> is
+ implemented according to the standard SQL specification.
</para>
<para>
@@ -1998,7 +2021,7 @@
<literal>character_set_connection</literal> and
<literal>collation_connection</literal> system variables. If you
use <literal>CAST()</literal> with <literal>CHARACTER SET
- X</literal>, then the resulting character set and collation are
+ X</literal>, the resulting character set and collation are
<literal>X</literal> and the default collation of
<literal>X</literal>.
</para>
@@ -2006,8 +2029,8 @@
<para>
You may not use a <literal>COLLATE</literal> clause inside a
<literal>CAST()</literal>, but you may use it outside. That is,
- <literal>CAST(... COLLATE ...)</literal> is illegal, but
- <literal>CAST(...) COLLATE ...</literal> is legal.
+ <literal>CAST(… COLLATE …)</literal> is illegal,
+ but <literal>CAST(…) COLLATE …</literal> is legal.
</para>
<para>
@@ -2030,7 +2053,8 @@
<literal>SHOW CHARACTER SET</literal>, <literal>SHOW
COLLATION</literal>, and <literal>SHOW CREATE DATABASE</literal>
are new. <literal>SHOW CREATE TABLE</literal> and <literal>SHOW
- COLUMNS</literal> are modified.
+ COLUMNS</literal> are modified. These statements are described
+ here briefly. For more information, see <xref linkend="show"/>.
</para>
<para>
@@ -2053,10 +2077,6 @@
</programlisting>
<para>
- See <xref linkend="show-character-set"/>.
- </para>
-
- <para>
The output from <literal>SHOW COLLATION</literal> includes all
available character sets. It takes an optional
<literal>LIKE</literal> clause that indicates which collation
@@ -2080,17 +2100,9 @@
</programlisting>
<para>
- See <xref linkend="show-collation"/>.
- </para>
-
- <para>
<literal>SHOW CREATE DATABASE</literal> displays the
<literal>CREATE DATABASE</literal> statement that creates a
- given database. The result includes all database options.
- <literal>DEFAULT CHARACTER SET</literal> and
- <literal>COLLATE</literal> are supported. All database options
- are stored in a text file named <filename>db.opt</filename> that
- can be found in the database directory.
+ given database:
</para>
<programlisting>
@@ -2103,7 +2115,8 @@
</programlisting>
<para>
- See <xref linkend="show-create-database"/>.
+ If no <literal>COLLATE</literal> clause is shown, the default
+ collation for the character set applies.
</para>
<para>
@@ -2115,17 +2128,13 @@
</para>
<para>
- See <xref linkend="show-create-table"/>.
- </para>
-
- <para>
The <literal>SHOW COLUMNS</literal> statement displays the
collations of a table's columns when invoked as <literal>SHOW
FULL COLUMNS</literal>. Columns with <literal>CHAR</literal>,
<literal>VARCHAR</literal>, or <literal>TEXT</literal> data
- types have non-<literal>NULL</literal> collations. Numeric and
- other non-character types have <literal>NULL</literal>
- collations. For example:
+ types have collations. Numeric and other non-character types
+ have no collation (indicated by <literal>NULL</literal> as the
+ <literal>Collation</literal> value). For example:
</para>
<programlisting>
@@ -2154,14 +2163,10 @@
</programlisting>
<para>
- The character set is not part of the display. (The character set
- name is implied by the collation name.)
+ The character set is not part of the display but is implied by
+ the collation name.
</para>
- <para>
- See <xref linkend="show-columns"/>.
- </para>
-
</section>
</section>
@@ -2185,7 +2190,7 @@
<listitem>
<para>
- <literal>utf8</literal>, the UTF8 encoding of the Unicode
+ <literal>utf8</literal>, the UTF-8 encoding of the Unicode
character set.
</para>
</listitem>
@@ -2196,10 +2201,11 @@
In UCS-2 (binary Unicode representation), every character is
represented by a two-byte Unicode code with the most significant
byte first. For example: "LATIN CAPITAL LETTER A" has the code
- 0x0041 and it's stored as a two-byte sequence: 0x00 0x41.
- "CYRILLIC SMALL LETTER YERU" (Unicode 0x044B) is stored as a
- two-byte sequence: 0x04 0x4B. For Unicode characters and their
- codes, please refer to the
+ <literal>0x0041</literal> and it is stored as a two-byte sequence:
+ <literal>0x00 0x41</literal>. "CYRILLIC SMALL LETTER YERU"
+ (Unicode <literal>0x044B</literal>) is stored as a two-byte
+ sequence: <literal>0x04 0x4B</literal>. For Unicode characters and
+ their codes, please refer to the
<ulink url="http://www.unicode.org/">Unicode Home Page</ulink>.
</para>
@@ -2209,9 +2215,9 @@
</para>
<para>
- The UTF8 character set (transform Unicode representation) is an
+ The UTF-8 character set (transform Unicode representation) is an
alternative way to store Unicode data. It is implemented according
- to RFC 3629. The idea of the UTF8 character set is that various
+ to RFC 3629. The idea of the UTF-8 character set is that various
Unicode characters are encoded using byte sequences of different
lengths:
</para>
@@ -2245,19 +2251,20 @@
<para>
RFC 3629 describes encoding sequences that take from one to four
- bytes. Currently, MySQL UTF8 support does not include four-byte
- sequences. (An older standard for UTF8 encoding is given by RFC
- 2279, which describes UTF8 sequences that take from one to six
- bytes. RFC 3629 renders RFC 2279 obsolete; for this reason,
- sequences with five and six bytes are no longer used.)
+ bytes. Currently, MySQL support for UTF-8 does not include
+ four-byte sequences. (An older standard for UTF-8 encoding is
+ given by RFC 2279, which describes UTF-8 sequences that take from
+ one to six bytes. RFC 3629 renders RFC 2279 obsolete; for this
+ reason, sequences with five and six bytes are no longer used.)
</para>
<para>
- <emphasis role="bold">Tip</emphasis>: To save space with UTF8, use
- <literal>VARCHAR</literal> instead of <literal>CHAR</literal>.
- Otherwise, MySQL has to reserve 30 bytes for a <literal>CHAR(10)
- CHARACTER SET utf8</literal> column, because this is the maximum
- possible length.
+ <emphasis role="bold">Tip</emphasis>: To save space with UTF-8,
+ use <literal>VARCHAR</literal> instead of <literal>CHAR</literal>.
+ Otherwise, MySQL must reserve three bytes for each character in a
+ <literal>CHAR CHARACTER SET utf8</literal> column because that is
+ the maximum possible length. For example, MySQL must reserve 30
+ bytes for a <literal>CHAR(10) CHARACTER SET utf8</literal> column.
</para>
</section>
@@ -2303,16 +2310,16 @@
<para>
In order to satisfy both requirements, MySQL stores metadata in a
- Unicode character set, namely UTF8. This does not cause any
+ Unicode character set, namely UTF-8. This does not cause any
disruption if you never use accented characters. But if you do,
- you should be aware that metadata is in UTF8.
+ you should be aware that metadata is in UTF-8.
</para>
<para>
This means that the return values of the
<literal>USER()</literal>, <literal>CURRENT_USER()</literal>,
<literal>DATABASE()</literal>, and <literal>VERSION()</literal>
- functions have the UTF8 character set by default, as do synonyms
+ functions have the UTF-8 character set by default, as do synonyms
such as <literal>SESSION_USER()</literal> and
<literal>SYSTEM_USER()</literal>.
</para>
@@ -2349,9 +2356,9 @@
</para>
<para>
- If you want the server to pass metadata results back in a non-UTF8
- character set, then use <literal>SET NAMES</literal> to force the
- server to perform character set conversion (see
+ If you want the server to pass metadata results back in a
+ non-UTF-8 character set, then use <literal>SET NAMES</literal> to
+ force the server to perform character set conversion (see
<xref linkend="charset-connection"/>), or else set the client to
do the conversion. It is always more efficient to set the client
to do the conversion, but this option is not available for many
@@ -2371,7 +2378,7 @@
<para>
This works because the contents of
<literal>latin1_column</literal> are automatically converted to
- UTF8 before the comparison.
+ UTF-8 before the comparison.
</para>
<programlisting>
@@ -2397,8 +2404,9 @@
<para>
<emphasis role="bold">Note</emphasis>: Beginning with MySQL 4.1.1,
- the <filename>errmsg.txt</filename> files all use UTF8. Conversion
- to the client character set is automatic, as with metadata.
+ the <filename>errmsg.txt</filename> files all use UTF-8.
+ Conversion to the client character set is automatic, as with
+ metadata.
</para>
</section>
Modified: trunk/refman-4.1/installing.xml
===================================================================
--- trunk/refman-4.1/installing.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-4.1/installing.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -3855,8 +3855,8 @@
<para>
<guimenuitem>Best Support For
Multilingualism</guimenuitem>: Choose this option if you
- want to use <literal>UTF8</literal> as the default server
- character set. <literal>UTF8</literal> can store
+ want to use <literal>utf8</literal> as the default server
+ character set. <literal>utf8</literal> can store
characters from many different languages in a single
character set.
</para>
@@ -12302,7 +12302,7 @@
<listitem>
<para>
- If you have used column prefix indexes on UTF8 columns
+ If you have used column prefix indexes on UTF-8 columns
or other multi-byte character set columns in MySQL 4.1.0
to 4.1.5, you must rebuild the tables when you upgrade
to MySQL 4.1.6 or later.
@@ -12315,9 +12315,9 @@
values of 128 to 255) in database names, table names,
constraint names, or column names in versions of MySQL
earlier than 4.1, you cannot upgrade to MySQL 4.1
- directly, because 4.1 uses UTF8 to store metadata names.
- Use <literal>RENAME TABLE</literal> to overcome this if
- the accent character is in the table name or the
+ directly, because 4.1 uses UTF-8 to store metadata
+ names. Use <literal>RENAME TABLE</literal> to overcome
+ this if the accent character is in the table name or the
database name, or rebuild the table.
</para>
</listitem>
@@ -12429,7 +12429,7 @@
<para>
<emphasis role="bold">Important note:</emphasis> MySQL 4.1
stores table names and column names in
- <literal>UTF8</literal>. If you have table names or column
+ <literal>utf8</literal>. If you have table names or column
names that use characters outside of the standard 7-bit
US-ASCII range, you may have to do a
<command>mysqldump</command> of your tables in MySQL 4.0 and
@@ -12490,7 +12490,7 @@
character set using the instructions in
<xref linkend="charset-conversion"/>. Also, database, table,
and column identifiers are stored internally using Unicode
- (UTF8) regardless of the default character set. See
+ (UTF-8) regardless of the default character set. See
<xref linkend="legal-names"/>.
</para>
</listitem>
Modified: trunk/refman-4.1/language-structure.xml
===================================================================
--- trunk/refman-4.1/language-structure.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-4.1/language-structure.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -809,7 +809,7 @@
<para>
Beginning with MySQL 4.1, identifiers are stored using Unicode
- (UTF8). This applies to identifiers in table definitions that
+ (UTF-8). This applies to identifiers in table definitions that are
stored in <filename>.frm</filename> files and to identifiers
stored in the grant tables in the <literal>mysql</literal>
database. Although Unicode identifiers can include multi-byte
Modified: trunk/refman-5.0/charset.xml
===================================================================
--- trunk/refman-5.0/charset.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-5.0/charset.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -22,10 +22,6 @@
</indexterm>
<indexterm>
- <primary>UTF8</primary>
- </indexterm>
-
- <indexterm>
<primary>UTF-8</primary>
</indexterm>
@@ -495,6 +491,12 @@
</para>
<para>
+ All database options are stored in a text file named
+ <filename>db.opt</filename> that can be found in the database
+ directory.
+ </para>
+
+ <para>
The <literal>CHARACTER SET</literal> and
<literal>COLLATE</literal> clauses make it possible to create
databases with different character sets and collations on the
@@ -807,7 +809,8 @@
</programlisting>
<para>
- MySQL determines a literal's character set and collation thus:
+ MySQL determines a literal's character set and collation in the
+ following manner:
</para>
<itemizedlist>
@@ -1456,6 +1459,26 @@
SELECT * FROM t1 ORDER BY a COLLATE latin1_bin;
</programlisting>
+ <para>
+ The <literal>BINARY</literal> attribute in character column
+ definitions has a similar effect. A character column defined
+ with the <literal>BINARY</literal> attribute is assigned the
+ binary collation of the column's character set.
+ </para>
+
+ <para>
+ The effect of <literal>BINARY</literal> in both contexts differs
+ from its effect prior to MySQL 4.1. Formerly,
+ <literal>BINARY</literal> resulted in binary string comparisons
+ or binary string column definitions. A binary string is a string
+ of bytes that has no character set or collation, which differs
+ from a non-binary character string that has a binary collation.
+ For both types of strings, comparisons are based on the numeric
+ values of the string unit, but for non-binary strings the unit
+ is the character and some character sets allow multi-byte
+ characters. See <xref linkend="binary-varbinary"/>.
+ </para>
+
</section>
<section id="charset-collate-tricky">
@@ -1463,11 +1486,10 @@
<title>&title-charset-collate-tricky;</title>
<para>
- In the great majority of queries, it is obvious what collation
- MySQL uses to resolve a comparison operation. For example, in
- the following cases, it should be clear that the collation is
- <quote>the column collation of column
- <literal>x</literal></quote>:
+ In the great majority of statements, it is obvious what
+ collation MySQL uses to resolve a comparison operation. For
+ example, in the following cases, it should be clear that the
+ collation is the collation of column <literal>x</literal>:
</para>
<programlisting>
@@ -1496,7 +1518,7 @@
called <quote>coercibility</quote> rules. Basically, this means:
Both <literal>x</literal> and <literal>'Y'</literal> have
collations, so which collation takes precedence? This can be
- difficult to resolve, but the following rules take care of most
+ difficult to resolve, but the following rules cover most
situations:
</para>
@@ -1552,7 +1574,7 @@
</para>
<para>
- Those rules resolve ambiguities thus:
+ Those rules resolve ambiguities in the following manner:
</para>
<itemizedlist>
@@ -1629,9 +1651,9 @@
<title>&title-charset-collation-charset;</title>
<para>
- Recall that each character set has one or more collations, and
- each collation is associated with one and only one character
- set. Therefore, the following statement causes an error message
+ Each character set has one or more collations, but each
+ collation is associated with one and only one character set.
+ Therefore, the following statement causes an error message
because the <literal>latin2_bin</literal> collation is not legal
with the <literal>latin1</literal> character set:
</para>
@@ -1671,8 +1693,8 @@
</programlisting>
<para>
- The resulting order of the values for different collations is
- shown in this table:
+ The following table shows the resulting order of the values if
+ we use <literal>ORDER BY</literal> with different collations:
</para>
<informaltable>
@@ -1711,10 +1733,8 @@
</informaltable>
<para>
- The table is an example that shows what the effect would be if
- we used different collations in an <literal>ORDER BY</literal>
- clause. The character that causes the different sort orders in
- this example is the U with two dots over it
+ The character that causes the different sort orders in this
+ example is the U with two dots over it
(<literal>ü</literal>), which the Germans call "U-umlaut".
</para>
@@ -1756,7 +1776,7 @@
<para>
This section describes operations that take character set
- information into account in MySQL &current-series;.
+ information into account.
</para>
<section id="charset-result">
@@ -1784,15 +1804,18 @@
<literal>RTRIM()</literal>, <literal>SOUNDEX()</literal>,
<literal>SUBSTRING()</literal>, <literal>TRIM()</literal>,
<literal>UCASE()</literal>, and <literal>UPPER()</literal>.
- (Also note: The <literal>REPLACE()</literal> function, unlike
- all other functions, always ignores the collation of the string
- input and performs a case-sensitive comparison.)
</para>
<para>
+ Note: The <literal>REPLACE()</literal> function, unlike all
+ other functions, always ignores the collation of the string
+ input and performs a case-sensitive comparison.
+ </para>
+
+ <para>
For operations that combine multiple string inputs and return a
single string output, the <quote>aggregation rules</quote> of
- standard SQL apply:
+ standard SQL apply for determining the collation of the result:
</para>
<itemizedlist>
@@ -1800,7 +1823,7 @@
<listitem>
<para>
If an explicit <literal>COLLATE
- <replaceable>X</replaceable></literal> occurs, then use
+ <replaceable>X</replaceable></literal> occurs, use
<replaceable>X</replaceable>.
</para>
</listitem>
@@ -1809,7 +1832,7 @@
<para>
If explicit <literal>COLLATE
<replaceable>X</replaceable></literal> and <literal>COLLATE
- <replaceable>Y</replaceable></literal> occur, then raise an
+ <replaceable>Y</replaceable></literal> occur, raise an
error.
</para>
</listitem>
@@ -1817,7 +1840,7 @@
<listitem>
<para>
Otherwise, if all collations are
- <replaceable>X</replaceable>, then use
+ <replaceable>X</replaceable>, use
<replaceable>X</replaceable>.
</para>
</listitem>
@@ -1831,8 +1854,8 @@
</itemizedlist>
<para>
- For example, with <literal>CASE ... WHEN a THEN b WHEN b THEN c
- COLLATE <replaceable>X</replaceable> END</literal>, the
+ For example, with <literal>CASE … WHEN a THEN b WHEN b
+ THEN c COLLATE <replaceable>X</replaceable> END</literal>, the
resulting collation is <replaceable>X</replaceable>. The same
applies for <literal>CASE</literal>, <literal>UNION</literal>,
<literal>||</literal>, <literal>CONCAT()</literal>,
@@ -1881,8 +1904,8 @@
</programlisting>
<para>
- <literal>CONVERT(... USING ...)</literal> is implemented
- according to the standard SQL specification.
+ <literal>CONVERT(… USING …)</literal> is
+ implemented according to the standard SQL specification.
</para>
<para>
@@ -1909,7 +1932,7 @@
<literal>character_set_connection</literal> and
<literal>collation_connection</literal> system variables. If you
use <literal>CAST()</literal> with <literal>CHARACTER SET
- X</literal>, then the resulting character set and collation are
+ X</literal>, the resulting character set and collation are
<literal>X</literal> and the default collation of
<literal>X</literal>.
</para>
@@ -1917,8 +1940,8 @@
<para>
You may not use a <literal>COLLATE</literal> clause inside a
<literal>CAST()</literal>, but you may use it outside. That is,
- <literal>CAST(... COLLATE ...)</literal> is illegal, but
- <literal>CAST(...) COLLATE ...</literal> is legal.
+ <literal>CAST(… COLLATE …)</literal> is illegal,
+ but <literal>CAST(…) COLLATE …</literal> is legal.
</para>
<para>
@@ -1940,10 +1963,23 @@
character set information. These include <literal>SHOW CHARACTER
SET</literal>, <literal>SHOW COLLATION</literal>, <literal>SHOW
CREATE DATABASE</literal>, <literal>SHOW CREATE TABLE</literal>
- and <literal>SHOW COLUMNS</literal>.
+ and <literal>SHOW COLUMNS</literal>. These statements are
+ described here briefly. For more information, see
+ <xref linkend="show"/>.
</para>
<para>
+ <literal>INFORMATION_SCHEMA</literal> has several tables that
+ contain information similar to that displayed by the
+ <literal>SHOW</literal> statements. For example, the
+ <literal>CHARACTER_SETS</literal> and
+ <literal>COLLATIONS</literal> tables contain the information
+ displayed by <literal>SHOW CHARACTER SET</literal> and
+ <literal>SHOW COLLATION</literal>.
+ See <xref linkend="information-schema"/>.
+ </para>
+
+ <para>
The <literal>SHOW CHARACTER SET</literal> command shows all
available character sets. It takes an optional
<literal>LIKE</literal> clause that indicates which character
@@ -1963,10 +1999,6 @@
</programlisting>
<para>
- See <xref linkend="show-character-set"/>.
- </para>
-
- <para>
The output from <literal>SHOW COLLATION</literal> includes all
available character sets. It takes an optional
<literal>LIKE</literal> clause that indicates which collation
@@ -1990,17 +2022,9 @@
</programlisting>
<para>
- See <xref linkend="show-collation"/>.
- </para>
-
- <para>
<literal>SHOW CREATE DATABASE</literal> displays the
<literal>CREATE DATABASE</literal> statement that creates a
- given database. The result includes all database options.
- <literal>DEFAULT CHARACTER SET</literal> and
- <literal>COLLATE</literal> are supported. All database options
- are stored in a text file named <filename>db.opt</filename> that
- can be found in the database directory.
+ given database:
</para>
<programlisting>
@@ -2013,7 +2037,8 @@
</programlisting>
<para>
- See <xref linkend="show-create-database"/>.
+ If no <literal>COLLATE</literal> clause is shown, the default
+ collation for the character set applies.
</para>
<para>
@@ -2025,17 +2050,13 @@
</para>
<para>
- See <xref linkend="show-create-table"/>.
- </para>
-
- <para>
The <literal>SHOW COLUMNS</literal> statement displays the
collations of a table's columns when invoked as <literal>SHOW
FULL COLUMNS</literal>. Columns with <literal>CHAR</literal>,
<literal>VARCHAR</literal>, or <literal>TEXT</literal> data
- types have non-<literal>NULL</literal> collations. Numeric and
- other non-character types have <literal>NULL</literal>
- collations. For example:
+ types have collations. Numeric and other non-character types
+ have no collation (indicated by <literal>NULL</literal> as the
+ <literal>Collation</literal> value). For example:
</para>
<programlisting>
@@ -2064,14 +2085,10 @@
</programlisting>
<para>
- The character set is not part of the display. (The character set
- name is implied by the collation name.)
+ The character set is not part of the display but is implied by
+ the collation name.
</para>
- <para>
- See <xref linkend="show-columns"/>.
- </para>
-
</section>
</section>
@@ -2095,7 +2112,7 @@
<listitem>
<para>
- <literal>utf8</literal>, the UTF8 encoding of the Unicode
+ <literal>utf8</literal>, the UTF-8 encoding of the Unicode
character set.
</para>
</listitem>
@@ -2106,10 +2123,11 @@
In UCS-2 (binary Unicode representation), every character is
represented by a two-byte Unicode code with the most significant
byte first. For example: "LATIN CAPITAL LETTER A" has the code
- 0x0041 and it's stored as a two-byte sequence: 0x00 0x41.
- "CYRILLIC SMALL LETTER YERU" (Unicode 0x044B) is stored as a
- two-byte sequence: 0x04 0x4B. For Unicode characters and their
- codes, please refer to the
+ <literal>0x0041</literal> and it is stored as a two-byte sequence:
+ <literal>0x00 0x41</literal>. "CYRILLIC SMALL LETTER YERU"
+ (Unicode <literal>0x044B</literal>) is stored as a two-byte
+ sequence: <literal>0x04 0x4B</literal>. For Unicode characters and
+ their codes, please refer to the
<ulink url="http://www.unicode.org/">Unicode Home Page</ulink>.
</para>
@@ -2119,9 +2137,9 @@
</para>
<para>
- The UTF8 character set (transform Unicode representation) is an
+ The UTF-8 character set (transform Unicode representation) is an
alternative way to store Unicode data. It is implemented according
- to RFC 3629. The idea of the UTF8 character set is that various
+ to RFC 3629. The idea of the UTF-8 character set is that various
Unicode characters are encoded using byte sequences of different
lengths:
</para>
@@ -2155,19 +2173,20 @@
<para>
RFC 3629 describes encoding sequences that take from one to four
- bytes. Currently, MySQL UTF8 support does not include four-byte
- sequences. (An older standard for UTF8 encoding is given by RFC
- 2279, which describes UTF8 sequences that take from one to six
- bytes. RFC 3629 renders RFC 2279 obsolete; for this reason,
- sequences with five and six bytes are no longer used.)
+ bytes. Currently, MySQL support for UTF-8 does not include
+ four-byte sequences. (An older standard for UTF-8 encoding is
+ given by RFC 2279, which describes UTF-8 sequences that take from
+ one to six bytes. RFC 3629 renders RFC 2279 obsolete; for this
+ reason, sequences with five and six bytes are no longer used.)
</para>
<para>
- <emphasis role="bold">Tip</emphasis>: To save space with UTF8, use
- <literal>VARCHAR</literal> instead of <literal>CHAR</literal>.
- Otherwise, MySQL has to reserve 30 bytes for a <literal>CHAR(10)
- CHARACTER SET utf8</literal> column, because this is the maximum
- possible length.
+ <emphasis role="bold">Tip</emphasis>: To save space with UTF-8,
+ use <literal>VARCHAR</literal> instead of <literal>CHAR</literal>.
+ Otherwise, MySQL must reserve three bytes for each character in a
+ <literal>CHAR CHARACTER SET utf8</literal> column because that is
+ the maximum possible length. For example, MySQL must reserve 30
+ bytes for a <literal>CHAR(10) CHARACTER SET utf8</literal> column.
</para>
</section>
@@ -2218,16 +2237,16 @@
<para>
In order to satisfy both requirements, MySQL stores metadata in a
- Unicode character set, namely UTF8. This does not cause any
+ Unicode character set, namely UTF-8. This does not cause any
disruption if you never use accented characters. But if you do,
- you should be aware that metadata is in UTF8.
+ you should be aware that metadata is in UTF-8.
</para>
<para>
This means that the return values of the
<literal>USER()</literal>, <literal>CURRENT_USER()</literal>,
<literal>DATABASE()</literal>, and <literal>VERSION()</literal>
- functions have the UTF8 character set by default, as do synonyms
+ functions have the UTF-8 character set by default, as do synonyms
such as <literal>SESSION_USER()</literal> and
<literal>SYSTEM_USER()</literal>.
</para>
@@ -2264,9 +2283,9 @@
</para>
<para>
- If you want the server to pass metadata results back in a non-UTF8
- character set, then use <literal>SET NAMES</literal> to force the
- server to perform character set conversion (see
+ If you want the server to pass metadata results back in a
+ non-UTF-8 character set, then use <literal>SET NAMES</literal> to
+ force the server to perform character set conversion (see
<xref linkend="charset-connection"/>), or else cause the client to
perform the conversion. It is more efficient to have the client
perform the conversion, but this option is not always available
@@ -2286,7 +2305,7 @@
<para>
This works because the contents of
<literal>latin1_column</literal> are automatically converted to
- UTF8 before the comparison.
+ UTF-8 before the comparison.
</para>
<programlisting>
@@ -2312,8 +2331,9 @@
<para>
<emphasis role="bold">Note</emphasis>: In MySQL &current-series;,
- the <filename>errmsg.txt</filename> files all use UTF8. Conversion
- to the client character set is automatic, as with metadata.
+ the <filename>errmsg.txt</filename> files all use UTF-8.
+ Conversion to the client character set is automatic, as with
+ metadata.
</para>
</section>
Modified: trunk/refman-5.0/installing.xml
===================================================================
--- trunk/refman-5.0/installing.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-5.0/installing.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -3893,8 +3893,8 @@
<para>
<guimenuitem>Best Support For
Multilingualism</guimenuitem>: Choose this option if you
- want to use <literal>UTF8</literal> as the default server
- character set. <literal>UTF8</literal> can store
+ want to use <literal>utf8</literal> as the default server
+ character set. <literal>utf8</literal> can store
characters from many different languages in a single
character set.
</para>
Modified: trunk/refman-5.0/language-structure.xml
===================================================================
--- trunk/refman-5.0/language-structure.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-5.0/language-structure.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -834,7 +834,7 @@
</para>
<para>
- Identifiers are stored using Unicode (UTF8). This applies to
+ Identifiers are stored using Unicode (UTF-8). This applies to
identifiers in table definitions that are stored in
<filename>.frm</filename> files and to identifiers stored in the
grant tables in the <literal>mysql</literal> database. The sizes
Modified: trunk/refman-5.1/charset.xml
===================================================================
--- trunk/refman-5.1/charset.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-5.1/charset.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -22,10 +22,6 @@
</indexterm>
<indexterm>
- <primary>UTF8</primary>
- </indexterm>
-
- <indexterm>
<primary>UTF-8</primary>
</indexterm>
@@ -495,6 +491,12 @@
</para>
<para>
+ All database options are stored in a text file named
+ <filename>db.opt</filename> that can be found in the database
+ directory.
+ </para>
+
+ <para>
The <literal>CHARACTER SET</literal> and
<literal>COLLATE</literal> clauses make it possible to create
databases with different character sets and collations on the
@@ -807,7 +809,8 @@
</programlisting>
<para>
- MySQL determines a literal's character set and collation thus:
+ MySQL determines a literal's character set and collation in the
+ following manner:
</para>
<itemizedlist>
@@ -1456,6 +1459,26 @@
SELECT * FROM t1 ORDER BY a COLLATE latin1_bin;
</programlisting>
+ <para>
+ The <literal>BINARY</literal> attribute in character column
+ definitions has a similar effect. A character column defined
+ with the <literal>BINARY</literal> attribute is assigned the
+ binary collation of the column's character set.
+ </para>
+
+ <para>
+ The effect of <literal>BINARY</literal> in both contexts differs
+ from its effect prior to MySQL 4.1. Formerly,
+ <literal>BINARY</literal> resulted in binary string comparisons
+ or binary string column definitions. A binary string is a string
+ of bytes that has no character set or collation, which differs
+ from a non-binary character string that has a binary collation.
+ For both types of strings, comparisons are based on the numeric
+ values of the string unit, but for non-binary strings the unit
+ is the character and some character sets allow multi-byte
+ characters. See <xref linkend="binary-varbinary"/>.
+ </para>
+
</section>
<section id="charset-collate-tricky">
@@ -1463,11 +1486,10 @@
<title>&title-charset-collate-tricky;</title>
<para>
- In the great majority of queries, it is obvious what collation
- MySQL uses to resolve a comparison operation. For example, in
- the following cases, it should be clear that the collation is
- <quote>the column collation of column
- <literal>x</literal></quote>:
+ In the great majority of statements, it is obvious what
+ collation MySQL uses to resolve a comparison operation. For
+ example, in the following cases, it should be clear that the
+ collation is the collation of column <literal>x</literal>:
</para>
<programlisting>
@@ -1496,7 +1518,7 @@
called <quote>coercibility</quote> rules. Basically, this means:
Both <literal>x</literal> and <literal>'Y'</literal> have
collations, so which collation takes precedence? This can be
- difficult to resolve, but the following rules take care of most
+ difficult to resolve, but the following rules cover most
situations:
</para>
@@ -1551,7 +1573,7 @@
</para>
<para>
- Those rules resolve ambiguities thus:
+ Those rules resolve ambiguities in the following manner:
</para>
<itemizedlist>
@@ -1614,13 +1636,6 @@
See <xref linkend="information-functions"/>.
</para>
- <para>
- There is no system constant or ignorable coercibility. Functions
- such as <literal>USER()</literal> have a coercibility of 2
- rather than 3, and literals have a coercibility of 3 rather than
- 4.
- </para>
-
</section>
<section id="charset-collation-charset">
@@ -1628,9 +1643,9 @@
<title>&title-charset-collation-charset;</title>
<para>
- Recall that each character set has one or more collations, and
- each collation is associated with one and only one character
- set. Therefore, the following statement causes an error message
+ Each character set has one or more collations, but each
+ collation is associated with one and only one character set.
+ Therefore, the following statement causes an error message
because the <literal>latin2_bin</literal> collation is not legal
with the <literal>latin1</literal> character set:
</para>
@@ -1670,8 +1685,8 @@
</programlisting>
<para>
- The resulting order of the values for different collations is
- shown in this table:
+ The following table shows the resulting order of the values if
+ we use <literal>ORDER BY</literal> with different collations:
</para>
<informaltable>
@@ -1710,10 +1725,8 @@
</informaltable>
<para>
- The table is an example that shows what the effect would be if
- we used different collations in an <literal>ORDER BY</literal>
- clause. The character that causes the different sort orders in
- this example is the U with two dots over it
+ The character that causes the different sort orders in this
+ example is the U with two dots over it
(<literal>ü</literal>), which the Germans call "U-umlaut".
</para>
@@ -1755,7 +1768,7 @@
<para>
This section describes operations that take character set
- information into account in MySQL &current-series;.
+ information into account.
</para>
<section id="charset-result">
@@ -1783,15 +1796,18 @@
<literal>RTRIM()</literal>, <literal>SOUNDEX()</literal>,
<literal>SUBSTRING()</literal>, <literal>TRIM()</literal>,
<literal>UCASE()</literal>, and <literal>UPPER()</literal>.
- (Also note: The <literal>REPLACE()</literal> function, unlike
- all other functions, always ignores the collation of the string
- input and performs a case-sensitive comparison.)
</para>
<para>
+ Note: The <literal>REPLACE()</literal> function, unlike all
+ other functions, always ignores the collation of the string
+ input and performs a case-sensitive comparison.
+ </para>
+
+ <para>
For operations that combine multiple string inputs and return a
single string output, the <quote>aggregation rules</quote> of
- standard SQL apply:
+ standard SQL apply for determining the collation of the result:
</para>
<itemizedlist>
@@ -1799,7 +1815,7 @@
<listitem>
<para>
If an explicit <literal>COLLATE
- <replaceable>X</replaceable></literal> occurs, then use
+ <replaceable>X</replaceable></literal> occurs, use
<replaceable>X</replaceable>.
</para>
</listitem>
@@ -1808,7 +1824,7 @@
<para>
If explicit <literal>COLLATE
<replaceable>X</replaceable></literal> and <literal>COLLATE
- <replaceable>Y</replaceable></literal> occur, then raise an
+ <replaceable>Y</replaceable></literal> occur, raise an
error.
</para>
</listitem>
@@ -1816,7 +1832,7 @@
<listitem>
<para>
Otherwise, if all collations are
- <replaceable>X</replaceable>, then use
+ <replaceable>X</replaceable>, use
<replaceable>X</replaceable>.
</para>
</listitem>
@@ -1830,8 +1846,8 @@
</itemizedlist>
<para>
- For example, with <literal>CASE ... WHEN a THEN b WHEN b THEN c
- COLLATE <replaceable>X</replaceable> END</literal>, the
+ For example, with <literal>CASE … WHEN a THEN b WHEN b
+ THEN c COLLATE <replaceable>X</replaceable> END</literal>, the
resulting collation is <replaceable>X</replaceable>. The same
applies for <literal>CASE</literal>, <literal>UNION</literal>,
<literal>||</literal>, <literal>CONCAT()</literal>,
@@ -1880,8 +1896,8 @@
</programlisting>
<para>
- <literal>CONVERT(... USING ...)</literal> is implemented
- according to the standard SQL specification.
+ <literal>CONVERT(… USING …)</literal> is
+ implemented according to the standard SQL specification.
</para>
<para>
@@ -1908,7 +1924,7 @@
<literal>character_set_connection</literal> and
<literal>collation_connection</literal> system variables. If you
use <literal>CAST()</literal> with <literal>CHARACTER SET
- X</literal>, then the resulting character set and collation are
+ X</literal>, the resulting character set and collation are
<literal>X</literal> and the default collation of
<literal>X</literal>.
</para>
@@ -1916,8 +1932,8 @@
<para>
You may not use a <literal>COLLATE</literal> clause inside a
<literal>CAST()</literal>, but you may use it outside. That is,
- <literal>CAST(... COLLATE ...)</literal> is illegal, but
- <literal>CAST(...) COLLATE ...</literal> is legal.
+ <literal>CAST(… COLLATE …)</literal> is illegal,
+ but <literal>CAST(…) COLLATE …</literal> is legal.
</para>
<para>
@@ -1939,10 +1955,23 @@
character set information. These include <literal>SHOW CHARACTER
SET</literal>, <literal>SHOW COLLATION</literal>, <literal>SHOW
CREATE DATABASE</literal>, <literal>SHOW CREATE TABLE</literal>
- and <literal>SHOW COLUMNS</literal>.
+ and <literal>SHOW COLUMNS</literal>. These statements are
+ described here briefly. For more information, see
+ <xref linkend="show"/>.
</para>
<para>
+ <literal>INFORMATION_SCHEMA</literal> has several tables that
+ contain information similar to that displayed by the
+ <literal>SHOW</literal> statements. For example, the
+ <literal>CHARACTER_SETS</literal> and
+ <literal>COLLATIONS</literal> tables contain the information
+ displayed by <literal>SHOW CHARACTER SET</literal> and
+ <literal>SHOW COLLATION</literal>.
+ See <xref linkend="information-schema"/>.
+ </para>
+
+ <para>
The <literal>SHOW CHARACTER SET</literal> command shows all
available character sets. It takes an optional
<literal>LIKE</literal> clause that indicates which character
@@ -1962,10 +1991,6 @@
</programlisting>
<para>
- See <xref linkend="show-character-set"/>.
- </para>
-
- <para>
The output from <literal>SHOW COLLATION</literal> includes all
available character sets. It takes an optional
<literal>LIKE</literal> clause that indicates which collation
@@ -1989,17 +2014,9 @@
</programlisting>
<para>
- See <xref linkend="show-collation"/>.
- </para>
-
- <para>
<literal>SHOW CREATE DATABASE</literal> displays the
<literal>CREATE DATABASE</literal> statement that creates a
- given database. The result includes all database options.
- <literal>DEFAULT CHARACTER SET</literal> and
- <literal>COLLATE</literal> are supported. All database options
- are stored in a text file named <filename>db.opt</filename> that
- can be found in the database directory.
+ given database:
</para>
<programlisting>
@@ -2012,7 +2029,8 @@
</programlisting>
<para>
- See <xref linkend="show-create-database"/>.
+ If no <literal>COLLATE</literal> clause is shown, the default
+ collation for the character set applies.
</para>
<para>
@@ -2024,17 +2042,13 @@
</para>
<para>
- See <xref linkend="show-create-table"/>.
- </para>
-
- <para>
The <literal>SHOW COLUMNS</literal> statement displays the
collations of a table's columns when invoked as <literal>SHOW
FULL COLUMNS</literal>. Columns with <literal>CHAR</literal>,
<literal>VARCHAR</literal>, or <literal>TEXT</literal> data
- types have non-<literal>NULL</literal> collations. Numeric and
- other non-character types have <literal>NULL</literal>
- collations. For example:
+ types have collations. Numeric and other non-character types
+ have no collation (indicated by <literal>NULL</literal> as the
+ <literal>Collation</literal> value). For example:
</para>
<programlisting>
@@ -2063,14 +2077,10 @@
</programlisting>
<para>
- The character set is not part of the display. (The character set
- name is implied by the collation name.)
+ The character set is not part of the display but is implied by
+ the collation name.
</para>
- <para>
- See <xref linkend="show-columns"/>.
- </para>
-
</section>
</section>
@@ -2094,7 +2104,7 @@
<listitem>
<para>
- <literal>utf8</literal>, the UTF8 encoding of the Unicode
+ <literal>utf8</literal>, the UTF-8 encoding of the Unicode
character set.
</para>
</listitem>
@@ -2105,10 +2115,11 @@
In UCS-2 (binary Unicode representation), every character is
represented by a two-byte Unicode code with the most significant
byte first. For example: "LATIN CAPITAL LETTER A" has the code
- 0x0041 and it's stored as a two-byte sequence: 0x00 0x41.
- "CYRILLIC SMALL LETTER YERU" (Unicode 0x044B) is stored as a
- two-byte sequence: 0x04 0x4B. For Unicode characters and their
- codes, please refer to the
+ <literal>0x0041</literal> and it is stored as a two-byte sequence:
+ <literal>0x00 0x41</literal>. "CYRILLIC SMALL LETTER YERU"
+ (Unicode <literal>0x044B</literal>) is stored as a two-byte
+ sequence: <literal>0x04 0x4B</literal>. For Unicode characters and
+ their codes, please refer to the
<ulink url="http://www.unicode.org/">Unicode Home Page</ulink>.
</para>
@@ -2118,9 +2129,9 @@
</para>
<para>
- The UTF8 character set (transform Unicode representation) is an
+ The UTF-8 character set (transform Unicode representation) is an
alternative way to store Unicode data. It is implemented according
- to RFC 3629. The idea of the UTF8 character set is that various
+ to RFC 3629. The idea of the UTF-8 character set is that various
Unicode characters are encoded using byte sequences of different
lengths:
</para>
@@ -2154,19 +2165,20 @@
<para>
RFC 3629 describes encoding sequences that take from one to four
- bytes. Currently, MySQL UTF8 support does not include four-byte
- sequences. (An older standard for UTF8 encoding is given by RFC
- 2279, which describes UTF8 sequences that take from one to six
- bytes. RFC 3629 renders RFC 2279 obsolete; for this reason,
- sequences with five and six bytes are no longer used.)
+ bytes. Currently, MySQL support for UTF-8 does not include
+ four-byte sequences. (An older standard for UTF-8 encoding is
+ given by RFC 2279, which describes UTF-8 sequences that take from
+ one to six bytes. RFC 3629 renders RFC 2279 obsolete; for this
+ reason, sequences with five and six bytes are no longer used.)
</para>
<para>
- <emphasis role="bold">Tip</emphasis>: To save space with UTF8, use
- <literal>VARCHAR</literal> instead of <literal>CHAR</literal>.
- Otherwise, MySQL has to reserve 30 bytes for a <literal>CHAR(10)
- CHARACTER SET utf8</literal> column, because this is the maximum
- possible length.
+ <emphasis role="bold">Tip</emphasis>: To save space with UTF-8,
+ use <literal>VARCHAR</literal> instead of <literal>CHAR</literal>.
+ Otherwise, MySQL must reserve three bytes for each character in a
+ <literal>CHAR CHARACTER SET utf8</literal> column because that is
+ the maximum possible length. For example, MySQL must reserve 30
+ bytes for a <literal>CHAR(10) CHARACTER SET utf8</literal> column.
</para>
</section>
@@ -2217,16 +2229,16 @@
<para>
In order to satisfy both requirements, MySQL stores metadata in a
- Unicode character set, namely UTF8. This does not cause any
+ Unicode character set, namely UTF-8. This does not cause any
disruption if you never use accented characters. But if you do,
- you should be aware that metadata is in UTF8.
+ you should be aware that metadata is in UTF-8.
</para>
<para>
This means that the return values of the
<literal>USER()</literal>, <literal>CURRENT_USER()</literal>,
<literal>DATABASE()</literal>, and <literal>VERSION()</literal>
- functions have the UTF8 character set by default, as do synonyms
+ functions have the UTF-8 character set by default, as do synonyms
such as <literal>SESSION_USER()</literal> and
<literal>SYSTEM_USER()</literal>.
</para>
@@ -2263,9 +2275,9 @@
</para>
<para>
- If you want the server to pass metadata results back in a non-UTF8
- character set, then use <literal>SET NAMES</literal> to force the
- server to perform character set conversion (see
+ If you want the server to pass metadata results back in a
+ non-UTF-8 character set, then use <literal>SET NAMES</literal> to
+ force the server to perform character set conversion (see
<xref linkend="charset-connection"/>), or else cause the client to
perform the conversion. It is more efficient to have the client
perform the conversion, but this option is not always available
@@ -2285,7 +2297,7 @@
<para>
This works because the contents of
<literal>latin1_column</literal> are automatically converted to
- UTF8 before the comparison.
+ UTF-8 before the comparison.
</para>
<programlisting>
@@ -2311,8 +2323,9 @@
<para>
<emphasis role="bold">Note</emphasis>: In MySQL &current-series;,
- the <filename>errmsg.txt</filename> files all use UTF8. Conversion
- to the client character set is automatic, as with metadata.
+ the <filename>errmsg.txt</filename> files all use UTF-8.
+ Conversion to the client character set is automatic, as with
+ metadata.
</para>
</section>
Modified: trunk/refman-5.1/installing.xml
===================================================================
--- trunk/refman-5.1/installing.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-5.1/installing.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -3890,8 +3890,8 @@
<para>
<guimenuitem>Best Support For
Multilingualism</guimenuitem>: Choose this option if you
- want to use <literal>UTF8</literal> as the default server
- character set. <literal>UTF8</literal> can store
+ want to use <literal>utf8</literal> as the default server
+ character set. <literal>utf8</literal> can store
characters from many different languages in a single
character set.
</para>
Modified: trunk/refman-5.1/language-structure.xml
===================================================================
--- trunk/refman-5.1/language-structure.xml 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-5.1/language-structure.xml 2006-01-15 04:56:50 UTC (rev 834)
@@ -834,7 +834,7 @@
</para>
<para>
- Identifiers are stored using Unicode (UTF8). This applies to
+ Identifiers are stored using Unicode (UTF-8). This applies to
identifiers in table definitions that are stored in
<filename>.frm</filename> files and to identifiers stored in the
grant tables in the <literal>mysql</literal> database. The sizes
Modified: trunk/refman-common/titles.en.ent
===================================================================
--- trunk/refman-common/titles.en.ent 2006-01-15 04:56:33 UTC (rev 833)
+++ trunk/refman-common/titles.en.ent 2006-01-15 04:56:50 UTC (rev 834)
@@ -155,7 +155,7 @@
<!ENTITY title-charset-result "Result Strings">
<!ENTITY title-charset-se-me-sets "South European and Middle East Character Sets">
<!ENTITY title-charset-server "Server Character Set and Collation">
-<!ENTITY title-charset-show "<literal>SHOW</literal> Statements">
+<!ENTITY title-charset-show "<literal>SHOW</literal> Statements and <literal>INFORMATION_SCHEMA</literal>">
<!ENTITY title-charset-syntax "Specifying Character Sets and Collations">
<!ENTITY title-charset-table "Table Character Set and Collation">
<!ENTITY title-charset-unicode "Unicode Support">
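
Reviewer note (not part of the commit): the byte counts quoted in the Unicode hunks above can be checked with a short sketch. It is illustrative only. Python's encoders stand in for MySQL's ucs2 and utf8 character sets, and the helper names (ucs2_bytes, utf8_length, reserved_char_bytes) are hypothetical, not MySQL functions. The three-byte limit is taken from the revised Tip paragraph.

```python
# Illustration only: Python codecs stand in for MySQL's ucs2/utf8 character sets.
# All helper names below are hypothetical and exist only for this sketch.

# Assumption taken from the diff text: MySQL utf8 at this revision stores
# at most three bytes per character (no four-byte sequences).
MAX_UTF8_BYTES_PER_CHAR = 3


def ucs2_bytes(ch: str) -> bytes:
    """Encode one BMP character as two bytes, most significant byte first."""
    code = ord(ch)
    if code > 0xFFFF:
        raise ValueError("character is outside the two-byte UCS-2 range")
    return code.to_bytes(2, "big")


def utf8_length(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def reserved_char_bytes(declared_length: int) -> int:
    """Worst-case bytes reserved for a fixed-length CHAR(n) utf8 column."""
    return declared_length * MAX_UTF8_BYTES_PER_CHAR


if __name__ == "__main__":
    print(ucs2_bytes("A").hex())        # LATIN CAPITAL LETTER A
    print(ucs2_bytes("\u044b").hex())   # CYRILLIC SMALL LETTER YERU
    print(utf8_length("A"), utf8_length("\u044b"), utf8_length("\u20ac"))
    print(reserved_char_bytes(10))      # CHAR(10) CHARACTER SET utf8
```

The same arithmetic shows why the Tip recommends VARCHAR: a fixed-length column has to budget for the widest encoding of every character, whereas a variable-length column stores only the bytes actually used.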