Author: shinz
Date: 2007-08-22 16:14:47 +0200 (Wed, 22 Aug 2007)
New Revision: 7504
Log:
Add example for the effect that Ä=A and similar equalities have on comparisons or
searches
Modified:
trunk/refman-4.1/charset.xml
trunk/refman-5.0/charset.xml
trunk/refman-5.1/charset.xml
trunk/refman-5.2/charset.xml
Modified: trunk/refman-4.1/charset.xml
===================================================================
--- trunk/refman-4.1/charset.xml 2007-08-22 04:44:11 UTC (rev 7503)
+++ trunk/refman-4.1/charset.xml 2007-08-22 14:14:47 UTC (rev 7504)
Changed blocks: 5, Lines Added: 87, Lines Deleted: 2; 4110 bytes
@@ -1953,9 +1953,14 @@
<section id="charset-collation-effect">
- <title>An Example of the Effect of Collation</title>
+ <title>Examples of the Effect of Collation</title>
<para>
+ <emphasis role="bold">Example 1: Sorting German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
Suppose that column <literal>X</literal> in table
<literal>T</literal> has these <literal>latin1</literal> column
values:
@@ -2052,6 +2057,75 @@
</itemizedlist>
+ <para>
+ <emphasis role="bold">Example 2: Searching for German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
+ Suppose you have three tables that differ only by the character
+ set and collation used:
+ </para>
+
+<programlisting>
+mysql> <userinput>CREATE TABLE german1 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german1_ci;</userinput>
+mysql> <userinput>CREATE TABLE german2 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german2_ci;</userinput>
+mysql> <userinput>CREATE TABLE germanutf8 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=utf8 COLLATE=utf8_general_ci;</userinput>
+</programlisting>
+
+ <para>
+ Each table contains two records:
+ </para>
+
+<programlisting>
+mysql> <userinput>INSERT INTO german1 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO german2 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO germanutf8 VALUES ('Bar'), ('Bär');</userinput>
+</programlisting>
+
+ <para>
+ Two of the above collations have an <literal>A = Ä</literal>
+ equality, and one (<literal>latin1_german2_ci</literal>) has no
+ such equality. For that reason, comparisons produce these
+ results:
+ </para>
+
+<programlisting>
+mysql> <userinput>SELECT * FROM german1 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM german2 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM germanutf8 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+</programlisting>
+
+ <para>
+ This is not a bug but rather a consequence of the sorting
+ properties of <literal>latin1_german1_ci</literal> and
+ <literal>utf8_general_ci</literal> (the sorting shown is done
+ according to the German DIN 5007 standard).
+ </para>
+
</section>
</section>
@@ -3494,7 +3568,9 @@
<para>
For example, the following equalities hold in both
<literal>utf8_general_ci</literal> and
- <literal>utf8_unicode_ci</literal>:
+ <literal>utf8_unicode_ci</literal> (for the effect this has in
+ comparisons or when doing searches, see
+ <xref linkend="charset-collation-effect"/>):
</para>
<programlisting>
@@ -3781,6 +3857,9 @@
Normung</foreignphrase> (the German equivalent of ANSI).
DIN-1 is called the <quote>dictionary collation</quote> and
DIN-2 is called the <quote>phone book collation.</quote>
+ For an example of the effect this has in comparisons or when
+ doing searches, see
+ <xref linkend="charset-collation-effect"/>.
</para>
<itemizedlist>
@@ -3822,6 +3901,12 @@
</itemizedlist>
<para>
+ For an example of the effect this has in comparisons or when
+ doing searches, see
+ <xref linkend="charset-collation-effect"/>.
+ </para>
+
+ <para>
In the <literal>latin1_spanish_ci</literal> collation,
‘<literal>ñ</literal>’ (n-tilde) is a separate
letter between ‘<literal>n</literal>’ and
Modified: trunk/refman-5.0/charset.xml
===================================================================
--- trunk/refman-5.0/charset.xml 2007-08-22 04:44:11 UTC (rev 7503)
+++ trunk/refman-5.0/charset.xml 2007-08-22 14:14:47 UTC (rev 7504)
Changed blocks: 4, Lines Added: 81, Lines Deleted: 2; 3643 bytes
@@ -1885,9 +1885,14 @@
<section id="charset-collation-effect">
- <title>An Example of the Effect of Collation</title>
+ <title>Examples of the Effect of Collation</title>
<para>
+ <emphasis role="bold">Example 1: Sorting German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
Suppose that column <literal>X</literal> in table
<literal>T</literal> has these <literal>latin1</literal> column
values:
@@ -1984,6 +1989,75 @@
</itemizedlist>
+ <para>
+ <emphasis role="bold">Example 2: Searching for German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
+ Suppose you have three tables that differ only by the character
+ set and collation used:
+ </para>
+
+<programlisting>
+mysql> <userinput>CREATE TABLE german1 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german1_ci;</userinput>
+mysql> <userinput>CREATE TABLE german2 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german2_ci;</userinput>
+mysql> <userinput>CREATE TABLE germanutf8 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=utf8 COLLATE=utf8_general_ci;</userinput>
+</programlisting>
+
+ <para>
+ Each table contains two records:
+ </para>
+
+<programlisting>
+mysql> <userinput>INSERT INTO german1 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO german2 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO germanutf8 VALUES ('Bar'), ('Bär');</userinput>
+</programlisting>
+
+ <para>
+ Two of the above collations have an <literal>A = Ä</literal>
+ equality, and one (<literal>latin1_german2_ci</literal>) has no
+ such equality. For that reason, comparisons produce these
+ results:
+ </para>
+
+<programlisting>
+mysql> <userinput>SELECT * FROM german1 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM german2 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM germanutf8 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+</programlisting>
+
+ <para>
+ This is not a bug but rather a consequence of the sorting
+ properties of <literal>latin1_german1_ci</literal> and
+ <literal>utf8_general_ci</literal> (the sorting shown is done
+ according to the German DIN 5007 standard).
+ </para>
+
</section>
</section>
@@ -3388,7 +3462,9 @@
<para>
For example, the following equalities hold in both
<literal>utf8_general_ci</literal> and
- <literal>utf8_unicode_ci</literal>:
+ <literal>utf8_unicode_ci</literal> (for the effect this has in
+ comparisons or when doing searches, see
+ <xref linkend="charset-collation-effect"/>):
</para>
<programlisting>
@@ -3675,6 +3751,9 @@
Normung</foreignphrase> (the German equivalent of ANSI).
DIN-1 is called the <quote>dictionary collation</quote> and
DIN-2 is called the <quote>phone book collation.</quote>
+ For an example of the effect this has in comparisons or when
+ doing searches, see
+ <xref linkend="charset-collation-effect"/>.
</para>
<itemizedlist>
Modified: trunk/refman-5.1/charset.xml
===================================================================
--- trunk/refman-5.1/charset.xml 2007-08-22 04:44:11 UTC (rev 7503)
+++ trunk/refman-5.1/charset.xml 2007-08-22 14:14:47 UTC (rev 7504)
Changed blocks: 4, Lines Added: 81, Lines Deleted: 2; 3643 bytes
@@ -1882,9 +1882,14 @@
<section id="charset-collation-effect">
- <title>An Example of the Effect of Collation</title>
+ <title>Examples of the Effect of Collation</title>
<para>
+ <emphasis role="bold">Example 1: Sorting German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
Suppose that column <literal>X</literal> in table
<literal>T</literal> has these <literal>latin1</literal> column
values:
@@ -1981,6 +1986,75 @@
</itemizedlist>
+ <para>
+ <emphasis role="bold">Example 2: Searching for German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
+ Suppose you have three tables that differ only by the character
+ set and collation used:
+ </para>
+
+<programlisting>
+mysql> <userinput>CREATE TABLE german1 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german1_ci;</userinput>
+mysql> <userinput>CREATE TABLE german2 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german2_ci;</userinput>
+mysql> <userinput>CREATE TABLE germanutf8 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=utf8 COLLATE=utf8_general_ci;</userinput>
+</programlisting>
+
+ <para>
+ Each table contains two records:
+ </para>
+
+<programlisting>
+mysql> <userinput>INSERT INTO german1 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO german2 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO germanutf8 VALUES ('Bar'), ('Bär');</userinput>
+</programlisting>
+
+ <para>
+ Two of the above collations have an <literal>A = Ä</literal>
+ equality, and one (<literal>latin1_german2_ci</literal>) has no
+ such equality. For that reason, comparisons produce these
+ results:
+ </para>
+
+<programlisting>
+mysql> <userinput>SELECT * FROM german1 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM german2 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM germanutf8 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+</programlisting>
+
+ <para>
+ This is not a bug but rather a consequence of the sorting
+ properties of <literal>latin1_german1_ci</literal> and
+ <literal>utf8_general_ci</literal> (the sorting shown is done
+ according to the German DIN 5007 standard).
+ </para>
+
</section>
</section>
@@ -3385,7 +3459,9 @@
<para>
For example, the following equalities hold in both
<literal>utf8_general_ci</literal> and
- <literal>utf8_unicode_ci</literal>:
+ <literal>utf8_unicode_ci</literal> (for the effect this has in
+ comparisons or when doing searches, see
+ <xref linkend="charset-collation-effect"/>):
</para>
<programlisting>
@@ -3672,6 +3748,9 @@
Normung</foreignphrase> (the German equivalent of ANSI).
DIN-1 is called the <quote>dictionary collation</quote> and
DIN-2 is called the <quote>phone book collation.</quote>
+ For an example of the effect this has in comparisons or when
+ doing searches, see
+ <xref linkend="charset-collation-effect"/>.
</para>
<itemizedlist>
Modified: trunk/refman-5.2/charset.xml
===================================================================
--- trunk/refman-5.2/charset.xml 2007-08-22 04:44:11 UTC (rev 7503)
+++ trunk/refman-5.2/charset.xml 2007-08-22 14:14:47 UTC (rev 7504)
Changed blocks: 4, Lines Added: 81, Lines Deleted: 2; 3643 bytes
@@ -1882,9 +1882,14 @@
<section id="charset-collation-effect">
- <title>An Example of the Effect of Collation</title>
+ <title>Examples of the Effect of Collation</title>
<para>
+ <emphasis role="bold">Example 1: Sorting German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
Suppose that column <literal>X</literal> in table
<literal>T</literal> has these <literal>latin1</literal> column
values:
@@ -1981,6 +1986,75 @@
</itemizedlist>
+ <para>
+ <emphasis role="bold">Example 2: Searching for German
+ Umlauts</emphasis>
+ </para>
+
+ <para>
+ Suppose you have three tables that differ only by the character
+ set and collation used:
+ </para>
+
+<programlisting>
+mysql> <userinput>CREATE TABLE german1 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german1_ci;</userinput>
+mysql> <userinput>CREATE TABLE german2 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=latin1 COLLATE=latin1_german2_ci;</userinput>
+mysql> <userinput>CREATE TABLE germanutf8 (</userinput>
+ -> <userinput> c CHAR(10)</userinput>
+ -> <userinput>) CHARSET=utf8 COLLATE=utf8_general_ci;</userinput>
+</programlisting>
+
+ <para>
+ Each table contains two records:
+ </para>
+
+<programlisting>
+mysql> <userinput>INSERT INTO german1 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO german2 VALUES ('Bar'), ('Bär');</userinput>
+mysql> <userinput>INSERT INTO germanutf8 VALUES ('Bar'), ('Bär');</userinput>
+</programlisting>
+
+ <para>
+ Two of the above collations have an <literal>A = Ä</literal>
+ equality, and one (<literal>latin1_german2_ci</literal>) has no
+ such equality. For that reason, comparisons produce these
+ results:
+ </para>
+
+<programlisting>
+mysql> <userinput>SELECT * FROM german1 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM german2 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bär |
++------+
+mysql> <userinput>SELECT * FROM germanutf8 WHERE c = 'Bär';</userinput>
++------+
+| c |
++------+
+| Bar |
+| Bär |
++------+
+</programlisting>
+
+ <para>
+ This is not a bug but rather a consequence of the sorting
+ properties of <literal>latin1_german1_ci</literal> and
+ <literal>utf8_general_ci</literal> (the sorting shown is done
+ according to the German DIN 5007 standard).
+ </para>
+
</section>
</section>
@@ -3385,7 +3459,9 @@
<para>
For example, the following equalities hold in both
<literal>utf8_general_ci</literal> and
- <literal>utf8_unicode_ci</literal>:
+ <literal>utf8_unicode_ci</literal> (for the effect this has in
+ comparisons or when doing searches, see
+ <xref linkend="charset-collation-effect"/>):
</para>
<programlisting>
@@ -3672,6 +3748,9 @@
Normung</foreignphrase> (the German equivalent of ANSI).
DIN-1 is called the <quote>dictionary collation</quote> and
DIN-2 is called the <quote>phone book collation.</quote>
+ For an example of the effect this has in comparisons or when
+ doing searches, see
+ <xref linkend="charset-collation-effect"/>.
</para>
<itemizedlist>