List: Commits
From: paul
Date: January 30 2006 3:00pm
Subject: svn commit - mysqldoc@docsrva: r1124 - in trunk: . refman-4.1 refman-5.0 refman-5.1 refman-common
Author: paul
Date: 2006-01-30 16:00:57 +0100 (Mon, 30 Jan 2006)
New Revision: 1124

Log:
 r6910@frost:  paul | 2006-01-30 08:57:54 -0600
 Move regexp.xml to refman-common directory.


Added:
   trunk/refman-common/regexp.xml
Removed:
   trunk/refman-4.1/regexp.xml
   trunk/refman-5.0/regexp.xml
   trunk/refman-5.1/regexp.xml
Modified:
   trunk/


Property changes on: trunk
___________________________________________________________________
Name: svk:merge
   - b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:6904
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:2588
   + b5ec3a16-e900-0410-9ad2-d183a3acac99:/mysqldoc-local/mysqldoc/trunk:6910
bf112a9c-6c03-0410-a055-ad865cd57414:/mysqldoc-local/mysqldoc/trunk:2588

Deleted: trunk/refman-4.1/regexp.xml

Deleted: trunk/refman-5.0/regexp.xml

Deleted: trunk/refman-5.1/regexp.xml

Added: trunk/refman-common/regexp.xml
===================================================================
--- trunk/refman-common/regexp.xml	2006-01-30 14:13:09 UTC (rev 1123)
+++ trunk/refman-common/regexp.xml	2006-01-30 15:00:57 UTC (rev 1124)
@@ -0,0 +1,477 @@
+<?xml version="1.0" encoding="utf-8"?>
+<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
+"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"
+[
+  <!ENTITY % fixedchars.entities  SYSTEM "../refman-common/fixedchars.ent">
+  %fixedchars.entities;
+  <!ENTITY % title.entities       SYSTEM "../refman-common/titles.en.ent">
+  %title.entities;
+  <!ENTITY % versions.entities    SYSTEM "versions.ent">
+  %versions.entities;
+]>
+<appendix id="regexp">
+
+  <title>&title-regexp;</title>
+
+  <indexterm>
+    <primary>regex</primary>
+  </indexterm>
+
+  <indexterm>
+    <primary>regular expression syntax</primary>
+    <secondary>described</secondary>
+  </indexterm>
+
+  <indexterm>
+    <primary>syntax</primary>
+    <secondary>regular expression</secondary>
+  </indexterm>
+
+  <para>
+    A regular expression is a powerful way of specifying a pattern for a
+    complex search.
+  </para>
+
+  <para>
+    MySQL uses Henry Spencer's implementation of regular expressions,
+    which is aimed at conformance with POSIX 1003.2. See
+    <xref linkend="credits"/>. MySQL uses the extended version to
+    support pattern-matching operations performed with the
+    <literal>REGEXP</literal> operator in SQL statements. See
+    <xref linkend="pattern-matching"/>.
+  </para>
+
+  <para>
+    This appendix is a summary, with examples, of the special characters
+    and constructs that can be used in MySQL for
+    <literal>REGEXP</literal> operations. It does not contain all the
+    details that can be found in Henry Spencer's
+    <literal>regex(7)</literal> manual page. That manual page is
+    included in MySQL source distributions, in the
+    <filename>regex.7</filename> file under the
+    <filename>regex</filename> directory.
+  </para>
+
+  <para>
+    A regular expression describes a set of strings. The simplest
+    regular expression is one that has no special characters in it. For
+    example, the regular expression <literal>hello</literal> matches
+    <literal>hello</literal> and nothing else.
+  </para>
+
+  <para>
+    Non-trivial regular expressions use certain special constructs so
+    that they can match more than one string. For example, the regular
+    expression <literal>hello|word</literal> matches either the string
+    <literal>hello</literal> or the string <literal>word</literal>.
+  </para>
+
+  <para>
+    As a more complex example, the regular expression
+    <literal>B[an]*s</literal> matches any of the strings
+    <literal>Bananas</literal>, <literal>Baaaaas</literal>,
+    <literal>Bs</literal>, and any other string starting with a
+    <literal>B</literal>, ending with an <literal>s</literal>, and
+    containing any number of <literal>a</literal> or
+    <literal>n</literal> characters in between.
+  </para>
+
+  <para>
+    A regular expression for the <literal>REGEXP</literal> operator may
+    use any of the following special characters and constructs:
+  </para>
+
+  <itemizedlist>
+
+    <listitem>
+      <para>
+        <literal>^</literal>
+      </para>
+
+      <para>
+        Match the beginning of a string.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'fo\nfo' REGEXP '^fo$';</userinput>                   -&gt; 0
+mysql&gt; <userinput>SELECT 'fofo' REGEXP '^fo';</userinput>                      -&gt; 1
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>$</literal>
+      </para>
+
+      <para>
+        Match the end of a string.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'fo\no' REGEXP '^fo\no$';</userinput>                 -&gt; 1
+mysql&gt; <userinput>SELECT 'fo\no' REGEXP '^fo$';</userinput>                    -&gt; 0
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>.</literal>
+      </para>
+
+      <para>
+        Match any character (including carriage return and newline).
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'fofo' REGEXP '^f.*$';</userinput>                    -&gt; 1
+mysql&gt; <userinput>SELECT 'fo\r\nfo' REGEXP '^f.*$';</userinput>                -&gt; 1
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>a*</literal>
+      </para>
+
+      <para>
+        Match any sequence of zero or more <literal>a</literal>
+        characters.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'Ban' REGEXP '^Ba*n';</userinput>                     -&gt; 1
+mysql&gt; <userinput>SELECT 'Baaan' REGEXP '^Ba*n';</userinput>                   -&gt; 1
+mysql&gt; <userinput>SELECT 'Bn' REGEXP '^Ba*n';</userinput>                      -&gt; 1
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>a+</literal>
+      </para>
+
+      <para>
+        Match any sequence of one or more <literal>a</literal>
+        characters.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'Ban' REGEXP '^Ba+n';</userinput>                     -&gt; 1
+mysql&gt; <userinput>SELECT 'Bn' REGEXP '^Ba+n';</userinput>                      -&gt; 0
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>a?</literal>
+      </para>
+
+      <para>
+        Match either zero or one <literal>a</literal> character.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'Bn' REGEXP '^Ba?n';</userinput>                      -&gt; 1
+mysql&gt; <userinput>SELECT 'Ban' REGEXP '^Ba?n';</userinput>                     -&gt; 1
+mysql&gt; <userinput>SELECT 'Baan' REGEXP '^Ba?n';</userinput>                    -&gt; 0
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>de|abc</literal>
+      </para>
+
+      <para>
+        Match either of the sequences <literal>de</literal> or
+        <literal>abc</literal>.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'pi' REGEXP 'pi|apa';</userinput>                     -&gt; 1
+mysql&gt; <userinput>SELECT 'axe' REGEXP 'pi|apa';</userinput>                    -&gt; 0
+mysql&gt; <userinput>SELECT 'apa' REGEXP 'pi|apa';</userinput>                    -&gt; 1
+mysql&gt; <userinput>SELECT 'apa' REGEXP '^(pi|apa)$';</userinput>                -&gt; 1
+mysql&gt; <userinput>SELECT 'pi' REGEXP '^(pi|apa)$';</userinput>                 -&gt; 1
+mysql&gt; <userinput>SELECT 'pix' REGEXP '^(pi|apa)$';</userinput>                -&gt; 0
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>(abc)*</literal>
+      </para>
+
+      <para>
+        Match zero or more instances of the sequence
+        <literal>abc</literal>.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'pi' REGEXP '^(pi)*$';</userinput>                    -&gt; 1
+mysql&gt; <userinput>SELECT 'pip' REGEXP '^(pi)*$';</userinput>                   -&gt; 0
+mysql&gt; <userinput>SELECT 'pipi' REGEXP '^(pi)*$';</userinput>                  -&gt; 1
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>{1}</literal>, <literal>{2,3}</literal>
+      </para>
+
+      <para>
+        <literal>{n}</literal> or <literal>{m,n}</literal> notation
+        provides a more general way of writing regular expressions that
+        match many occurrences of the previous atom (or
+        <quote>piece</quote>) of the pattern. <literal>m</literal> and
+        <literal>n</literal> are integers.
+      </para>
+
+      <itemizedlist>
+
+        <listitem>
+          <para>
+            <literal>a*</literal>
+          </para>
+
+          <para>
+            Can be written as <literal>a{0,}</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>a+</literal>
+          </para>
+
+          <para>
+            Can be written as <literal>a{1,}</literal>.
+          </para>
+        </listitem>
+
+        <listitem>
+          <para>
+            <literal>a?</literal>
+          </para>
+
+          <para>
+            Can be written as <literal>a{0,1}</literal>.
+          </para>
+        </listitem>
+
+      </itemizedlist>
+
+      <para>
+        To be more precise, <literal>a{n}</literal> matches exactly
+        <literal>n</literal> instances of <literal>a</literal>.
+        <literal>a{n,}</literal> matches <literal>n</literal> or more
+        instances of <literal>a</literal>. <literal>a{m,n}</literal>
+        matches <literal>m</literal> through <literal>n</literal>
+        instances of <literal>a</literal>, inclusive.
+      </para>
+
+      <para>
+        <literal>m</literal> and <literal>n</literal> must be in the
+        range from <literal>0</literal> to <literal>RE_DUP_MAX</literal>
+        (default 255), inclusive. If both <literal>m</literal> and
+        <literal>n</literal> are given, <literal>m</literal> must be
+        less than or equal to <literal>n</literal>.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'abcde' REGEXP 'a[bcd]{2}e';</userinput>              -&gt; 0
+mysql&gt; <userinput>SELECT 'abcde' REGEXP 'a[bcd]{3}e';</userinput>              -&gt; 1
+mysql&gt; <userinput>SELECT 'abcde' REGEXP 'a[bcd]{1,10}e';</userinput>           -&gt; 1
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>[a-dX]</literal>, <literal>[^a-dX]</literal>
+      </para>
+
+      <para>
+        Matches any character that is (or is not, if <literal>^</literal>
+        is used) either <literal>a</literal>, <literal>b</literal>,
+        <literal>c</literal>, <literal>d</literal>, or
+        <literal>X</literal>. A <literal>-</literal> character between
+        two other characters forms a range that matches all characters
+        from the first character to the second. For example,
+        <literal>[0-9]</literal> matches any decimal digit. To include a
+        literal <literal>]</literal> character, it must immediately
+        follow the opening bracket <literal>[</literal>. To include a
+        literal <literal>-</literal> character, it must be written first
+        or last. Any character that does not have a defined special
+        meaning inside a <literal>[]</literal> pair matches only itself.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'aXbc' REGEXP '[a-dXYZ]';</userinput>                 -&gt; 1
+mysql&gt; <userinput>SELECT 'aXbc' REGEXP '^[a-dXYZ]$';</userinput>               -&gt; 0
+mysql&gt; <userinput>SELECT 'aXbc' REGEXP '^[a-dXYZ]+$';</userinput>              -&gt; 1
+mysql&gt; <userinput>SELECT 'aXbc' REGEXP '^[^a-dXYZ]+$';</userinput>             -&gt; 0
+mysql&gt; <userinput>SELECT 'gheis' REGEXP '^[^a-dXYZ]+$';</userinput>            -&gt; 1
+mysql&gt; <userinput>SELECT 'gheisa' REGEXP '^[^a-dXYZ]+$';</userinput>           -&gt; 0
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>[.characters.]</literal>
+      </para>
+
+      <para>
+        Within a bracket expression (written using <literal>[</literal>
+        and <literal>]</literal>), matches the sequence of characters of
+        that collating element. <literal>characters</literal> is either
+        a single character or a character name like
+        <literal>newline</literal>. You can find the full list of
+        character names in the <filename>regex/cname.h</filename> file.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT '~' REGEXP '[[.~.]]';</userinput>                     -&gt; 1
+mysql&gt; <userinput>SELECT '~' REGEXP '[[.tilde.]]';</userinput>                 -&gt; 1
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>[=character_class=]</literal>
+      </para>
+
+      <para>
+        Within a bracket expression (written using <literal>[</literal>
+        and <literal>]</literal>),
+        <literal>[=character_class=]</literal> represents an equivalence
+        class. It matches all characters with the same collation value,
+        including itself. For example, if <literal>o</literal> and
+        <literal>(+)</literal> are the members of an equivalence class,
+        then <literal>[[=o=]]</literal>, <literal>[[=(+)=]]</literal>,
+        and <literal>[o(+)]</literal> are all synonymous. An equivalence
+        class may not be used as an endpoint of a range.
+      </para>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>[:character_class:]</literal>
+      </para>
+
+      <para>
+        Within a bracket expression (written using <literal>[</literal>
+        and <literal>]</literal>),
+        <literal>[:character_class:]</literal> represents a character
+        class that matches all characters belonging to that class. The
+        standard class names are:
+      </para>
+
+      <informaltable>
+        <tgroup cols="2">
+          <colspec colwidth="10*"/>
+          <colspec colwidth="90*"/>
+          <tbody>
+            <row>
+              <entry><literal>alnum</literal></entry>
+              <entry>Alphanumeric characters</entry>
+            </row>
+            <row>
+              <entry><literal>alpha</literal></entry>
+              <entry>Alphabetic characters</entry>
+            </row>
+            <row>
+              <entry><literal>blank</literal></entry>
+              <entry>Space and tab characters</entry>
+            </row>
+            <row>
+              <entry><literal>cntrl</literal></entry>
+              <entry>Control characters</entry>
+            </row>
+            <row>
+              <entry><literal>digit</literal></entry>
+              <entry>Digit characters</entry>
+            </row>
+            <row>
+              <entry><literal>graph</literal></entry>
+              <entry>Graphic characters</entry>
+            </row>
+            <row>
+              <entry><literal>lower</literal></entry>
+              <entry>Lowercase alphabetic characters</entry>
+            </row>
+            <row>
+              <entry><literal>print</literal></entry>
+              <entry>Graphic or space characters</entry>
+            </row>
+            <row>
+              <entry><literal>punct</literal></entry>
+              <entry>Punctuation characters</entry>
+            </row>
+            <row>
+              <entry><literal>space</literal></entry>
+              <entry>Space, tab, newline, and carriage return</entry>
+            </row>
+            <row>
+              <entry><literal>upper</literal></entry>
+              <entry>Uppercase alphabetic characters</entry>
+            </row>
+            <row>
+              <entry><literal>xdigit</literal></entry>
+              <entry>Hexadecimal digit characters</entry>
+            </row>
+          </tbody>
+        </tgroup>
+      </informaltable>
+
+      <para>
+        These stand for the character classes defined in the
+        <literal>ctype(3)</literal> manual page. A particular locale may
+        provide other class names. A character class may not be used as
+        an endpoint of a range.
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'justalnums' REGEXP '[[:alnum:]]+';</userinput>       -&gt; 1
+mysql&gt; <userinput>SELECT '!!' REGEXP '[[:alnum:]]+';</userinput>               -&gt; 0
+</programlisting>
+    </listitem>
+
+    <listitem>
+      <para>
+        <literal>[[:&lt;:]]</literal>, <literal>[[:&gt;:]]</literal>
+      </para>
+
+      <para>
+        These markers stand for word boundaries. They match the
+        beginning and end of words, respectively. A word is a sequence
+        of word characters that is not preceded by or followed by word
+        characters. A word character is an alphanumeric character in the
+        <literal>alnum</literal> class or an underscore
+        (<literal>_</literal>).
+      </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT 'a word a' REGEXP '[[:&lt;:]]word[[:&gt;:]]';</userinput>   -&gt; 1
+mysql&gt; <userinput>SELECT 'a xword a' REGEXP '[[:&lt;:]]word[[:&gt;:]]';</userinput>  -&gt; 0
+</programlisting>
+    </listitem>
+
+  </itemizedlist>
+
+  <para>
+    To use a literal instance of a special character in a regular
+    expression, precede it by two backslash (\) characters. The MySQL
+    parser interprets one of the backslashes, and the regular expression
+    library interprets the other. For example, to match the string
+    <literal>1+2</literal> that contains the special
+    <literal>+</literal> character, only the last of the following
+    regular expressions is the correct one:
+  </para>
+
+<programlisting>
+mysql&gt; <userinput>SELECT '1+2' REGEXP '1+2';</userinput>                       -&gt; 0
+mysql&gt; <userinput>SELECT '1+2' REGEXP '1\+2';</userinput>                      -&gt; 0
+mysql&gt; <userinput>SELECT '1+2' REGEXP '1\\+2';</userinput>                     -&gt; 1
+</programlisting>
+
+</appendix>


Property changes on: trunk/refman-common/regexp.xml
___________________________________________________________________
Name: svn:eol-style
   + LF
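
Note on the final example in the appendix above (escaping a literal
"+"): the two layers of backslash handling can be sketched outside
MySQL. The following is a minimal Python sketch, not MySQL code. The
helper names mysql_unescape and regexp are hypothetical, the unescape
rules are a simplified model of the SQL string-literal parser, and
Python's re module stands in for Henry Spencer's library (the two agree
on these simple patterns, but not in general).

```python
import re

# Simplified, assumed model of SQL string-literal escapes: a few known
# sequences are translated; for any other character the backslash is dropped.
KNOWN_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "0": "\0",
                 "\\": "\\", "'": "'", '"': '"'}


def mysql_unescape(literal_body: str) -> str:
    """Return the string the parser would hand on after reading a quoted literal."""
    out = []
    i = 0
    while i < len(literal_body):
        ch = literal_body[i]
        if ch == "\\" and i + 1 < len(literal_body):
            nxt = literal_body[i + 1]
            out.append(KNOWN_ESCAPES.get(nxt, nxt))  # e.g. \+ becomes +
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def regexp(literal_subject: str, literal_pattern: str) -> int:
    """Mimic 'subject' REGEXP 'pattern': parser layer first, regex layer second."""
    subject = mysql_unescape(literal_subject)
    pattern = mysql_unescape(literal_pattern)
    return 1 if re.search(pattern, subject) else 0


# Raw strings keep the backslashes exactly as they would be typed in SQL.
print(regexp("1+2", r"1+2"))    # 0: + is a quantifier, so this needs "12", "112", ...
print(regexp("1+2", r"1\+2"))   # 0: the parser layer drops the backslash first
print(regexp("1+2", r"1\\+2"))  # 1: the regex layer receives 1\+2, a literal +
```

Only the third form leaves a backslash for the regular expression layer
to interpret, which is why the appendix calls for two backslashes.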
