Author: paul
Date: 2008-05-08 21:21:53 +0200 (Thu, 08 May 2008)
New Revision: 10700
Log:
r31265@frost: paul | 2008-05-08 14:10:46 -0500
Additions to adding-collation section.
Modified:
trunk/refman-6.0/collation-tmp.xml
Property changes on: trunk
___________________________________________________________________
Name: svk:merge
- 4767c598-dc10-0410-bea0-d01b485662eb:/mysqldoc-local/mysqldoc/trunk:35828
7d8d2c4e-af1d-0410-ab9f-b038ce55645b:/mysqldoc-local/mysqldoc:31262
b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:14218
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:31182
+ 4767c598-dc10-0410-bea0-d01b485662eb:/mysqldoc-local/mysqldoc/trunk:35828
7d8d2c4e-af1d-0410-ab9f-b038ce55645b:/mysqldoc-local/mysqldoc:31265
b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:14218
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:31182
Modified: trunk/refman-6.0/collation-tmp.xml
===================================================================
--- trunk/refman-6.0/collation-tmp.xml 2008-05-08 17:09:56 UTC (rev 10699)
+++ trunk/refman-6.0/collation-tmp.xml 2008-05-08 19:21:53 UTC (rev 10700)
Changed blocks: 6, Lines Added: 230, Lines Deleted: 32; 9102 bytes
@@ -1,6 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
-<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
-"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
+<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN" "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"[
<!ENTITY % all.entities SYSTEM "all-entities.ent">
%all.entities;
]>
@@ -21,10 +20,31 @@
</para>
<para>
+ Collations are based on weights. For each character in a character
+ set, its character code maps to a weight. Characters with equal
+ weights compare as equal, and characters with unequal weights
+ compare according to the relative magnitude of their weights.
+ </para>
+
+ <remark role="todo">
+ 6.0-specific note follows
+ </remark>
+
+ <para>
+ The <function role="sql">WEIGHT_STRING()</function> function can be
+ used to see the weights for the characters in a string.
+ </para>
+
+ <para>
+ [SHOW SOME EXAMPLES]
+ </para>
+
+ <para>
This section describes how to add a collation to a character set.
The instructions assume that the character set already exists. If
the character set does not exist, see
- <xref linkend="adding-character-set"/>.
+ <xref
+ linkend="adding-character-set"/>.
<remark role="todo">
4.1-specific remark; for 4.1 manual only.
@@ -42,38 +62,49 @@
<listitem>
<para>
- Simple collation for 8-bit character set.
+ Simple collations for 8-bit character sets.
</para>
+
+ <para>
+ These are implemented using an array of 256 weights that defines
+ a one-to-one mapping from character codes to weights.
+ </para>
</listitem>
<listitem>
<para>
- Complex collation for 8-bit character set.
+ Complex collations for 8-bit character sets.
</para>
+
+ <para>
+ These are implemented using functions in a C source file.
+ </para>
</listitem>
<listitem>
<para>
- Collation for non-Unicode multi-byte character set. Characters
- in the ASCII range map character codes to weights in
- case-insensitive fashion. Multi-byte characters outside the
- ASCII range have two types of relationship between character
- codes and weights:
+ Collations for non-Unicode multi-byte character sets.
</para>
+ <para>
+ For characters in the ASCII range, character codes map to
+ weights in case-insensitive fashion. For multi-byte characters
+ outside the ASCII range, there are two types of relationship
+ between character codes and weights:
+ </para>
+
<itemizedlist>
<listitem>
<para>
- Type 1: Weights are equal to character codes.
+ Weights are equal to character codes.
</para>
</listitem>
<listitem>
<para>
- Type 2: There is a one-to-one mapping from character codes
- to weights, but a code is not necessarily equal to the
- weight.
+ There is a one-to-one mapping from character codes to
+ weights, but a code is not necessarily equal to the weight.
</para>
</listitem>
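The weight-array idea in the hunk above (a 256-entry table for simple 8-bit collations, where equal weights compare as equal) can be pictured with a small Python sketch. The table contents, the ASCII-only case folding, and the helper names `build_weights`, `sort_key`, and `compare` are assumptions made for illustration; this is not MySQL's implementation, and it ignores details such as trailing-space handling.

```python
# Illustrative only: a hypothetical case-insensitive weight table for an
# 8-bit character set. Not taken from the MySQL sources.

def build_weights():
    """Return a 256-entry list mapping each character code to a weight."""
    weights = list(range(256))            # default: weight == character code
    for code in range(ord("a"), ord("z") + 1):
        weights[code] = code - 32         # give a-z the same weights as A-Z
    return weights

WEIGHTS = build_weights()

def sort_key(data: bytes):
    """Translate a byte string into its sequence of weights."""
    return [WEIGHTS[b] for b in data]

def compare(a: bytes, b: bytes) -> int:
    """Return -1, 0, or 1 according to the weight sequences of a and b."""
    ka, kb = sort_key(a), sort_key(b)
    return (ka > kb) - (ka < kb)
```

With this table, `compare(b"abc", b"ABC")` returns 0 because each lowercase letter shares a weight with its uppercase counterpart, while `compare(b"a", b"B")` returns -1 because weight 0x41 is less than weight 0x42.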
@@ -82,24 +113,24 @@
<listitem>
<para>
- Collation for Unicode multi-byte character set.
+ Collations for Unicode multi-byte character sets.
</para>
<itemizedlist>
<listitem>
<para>
- Type 1: Not based on the Unicode Collation Algorithm (UCA).
+ Based on the Unicode Collation Algorithm (UCA).
+
+ <remark role="todo">
+ There are several properties here.
+ </remark>
</para>
</listitem>
<listitem>
<para>
- Type 2: Based on the UCA.
-
- <remark role="todo">
- There are several properties here.
- </remark>
+ Not based on the UCA.
</para>
</listitem>
@@ -109,23 +140,66 @@
</itemizedlist>
<para>
- The following sections describe how to add collations without
- recompiling MySQL. There are instructions for adding a simple 8-bit
- collation, and for adding a Unicode collation that is based on the
- UCA.
+ Some collations can be added to MySQL without recompiling:
</para>
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ Simple collations for 8-bit character sets
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ UCA-based collations for Unicode character sets
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Binary (<literal><replaceable>xxx</replaceable>_bin</literal>)
+ collations
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
<para>
- For instructions on adding a collation that requires recompiling,
- use the instructions at <xref linkend="adding-character-set"/>.
- However, instead of adding all the information required for a
- complete character set, just modify existing files for the
- appropriate existing character set. Add new data structures,
- functions, and configuration information similar to what is already
- present for existing collations.
+ The following sections describe how to add collations of the first
+ two types. All existing character sets already have a binary
+ collation, so there is no need here to describe how to add one.
</para>
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ How to add a simple 8-bit collation
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ How to add a Unicode collation that is based on the UCA
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
<para>
+ To add a collation that requires recompiling, use the instructions
+ at <xref
+ linkend="adding-character-set"/>. However, instead of
+ adding all the information required for a complete character set,
+ just modify the existing files for the appropriate character set.
+ Add new data structures, functions, and configuration information
+ similar to what is already present for the character set's current
+ collations.
+ </para>
+
+ <para>
<emphasis role="bold">Additional resources</emphasis>
</para>
@@ -147,4 +221,128 @@
</itemizedlist>
+ <section id="adding-collation-choosing-id">
+
+ <title>Choosing a Collation ID</title>
+
+ <para>
+ Each collation must have a unique ID, so to add a collation, you
+ must choose an ID value that is not currently used.
+ </para>
+
+ <para>
+ To determine the largest currently used ID, issue the following
+ statement:
+ </para>
+
+<programlisting>
+mysql> <userinput>SELECT MAX(ID) FROM INFORMATION_SCHEMA.COLLATIONS;</userinput>
++---------+
+| MAX(ID) |
++---------+
+| 210 |
++---------+
+</programlisting>
+
+ <para>
+ In this case, you can choose an ID higher than 210 for the new
+ collation.
+ </para>
+
+ <para>
+ To display a list of all currently used IDs, issue this statement:
+ </para>
+
+<programlisting>
+mysql> <userinput>SELECT ID FROM INFORMATION_SCHEMA.COLLATIONS ORDER BY ID;</userinput>
++-----+
+| ID |
++-----+
+| 1 |
+| 2 |
+| ... |
+| 52 |
+| 53 |
+| 57 |
+| 58 |
+| ... |
+| 98 |
+| 99 |
+| 128 |
+| 129 |
+| ... |
+| 210 |
++-----+
+</programlisting>
+
+ <para>
+ You can either choose an ID within the current range of IDs that
+ is not used, or choose an ID that is higher than the current
+ maximum ID. For example, in the output just shown, there are
+ unused IDs between 53 and 57, and between 99 and 128. Or you could
+ choose an ID higher than 210.
+ </para>
+
+ </section>
+
+ <section id="adding-collation-simple-8bit">
+
+ <title>Adding a Simple Collation for an 8-Bit Character Set</title>
+
+ <para>
+ To add a simple collation for an 8-bit character set, use this
+ procedure:
+ </para>
+
+ <orderedlist>
+
+ <listitem>
+ <para>
+ Choose a collation ID, as shown in
+ <xref linkend="adding-collation-choosing-id"
+ />.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Choose a name for the collation and list it in the
+ <filename>Index.xml</filename> file.
+ </para>
+ </listitem>
+
+ </orderedlist>
+
+ </section>
+
+ <section id="adding-collation-unicode-uca">
+
+ <title>Adding a UCA Collation for a Unicode Character Set</title>
+
+ <para>
+ To add a UCA collation for a Unicode character set, use this
+ procedure:
+ </para>
+
+ <orderedlist>
+
+ <listitem>
+ <para>
+ Choose a collation ID, as shown in
+ <xref linkend="adding-collation-choosing-id"
+ />.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Choose a name for the collation and list it in the
+ <filename>Index.xml</filename> file.
+ </para>
+ </listitem>
+
+ </orderedlist>
+
+ </section>
+
</section>
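The ID selection described in the "Choosing a Collation ID" section of the last hunk (pick an unused ID inside the current range, or one above the maximum) can be sketched in Python. The function name `unused_ids` and the short sample list are hypothetical; in practice the input would be the IDs returned by the `INFORMATION_SCHEMA.COLLATIONS` query shown in the diff.

```python
def unused_ids(used_ids):
    """Return (gaps, next_free) for a collection of collation IDs in use.

    gaps      -- unused IDs lying between the lowest and highest used ID
    next_free -- the first ID above the current maximum
    """
    used = set(used_ids)
    lowest, highest = min(used), max(used)
    gaps = [i for i in range(lowest, highest) if i not in used]
    return gaps, highest + 1


# Hypothetical fragment of an ID list, mirroring the 53/57 gap in the text.
sample = [52, 53, 57, 58]
gaps, next_free = unused_ids(sample)
print(gaps, next_free)   # [54, 55, 56] 59
```

Either kind of value satisfies the uniqueness requirement stated in the section; the sketch only makes the two options explicit.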