List:Commits« Previous MessageNext Message »
From:jon Date:February 24 2006 5:45pm
Subject:svn commit - mysqldoc@docsrva: r1440 - in trunk: refman-5.1 refman-common
View as plain text  
Author: jstephens
Date: 2006-02-24 17:45:03 +0100 (Fri, 24 Feb 2006)
New Revision: 1440

Log:

trunk/refman-common/news-5.1.xml - Fix typo
trunk/refman-common/titles.en.ent - New section title
trunk/refman-5.1/sql-syntax.xml - Couple of fixes relating to partitioning
trunk/refman-5.1/partitioning.xml 
  - added note about partition keys and primary keys (Thanks, Jeb!)
  - New section: Partition Pruning



Modified:
   trunk/refman-5.1/partitioning.xml
   trunk/refman-5.1/sql-syntax.xml
   trunk/refman-common/news-5.1.xml
   trunk/refman-common/titles.en.ent

Modified: trunk/refman-5.1/partitioning.xml
===================================================================
--- trunk/refman-5.1/partitioning.xml	2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-5.1/partitioning.xml	2006-02-24 16:45:03 UTC (rev 1440)
@@ -363,7 +363,8 @@
           queries that may not have been so when the partitioning scheme
           was first set up. This capability, sometimes referred to as
           <firstterm>partition pruning</firstterm>, was implemented in
-          MySQL 5.1.6.
+          MySQL 5.1.6. For additional informaiton, see 
+          <xref linkend="partitioning-pruning"/>. 
         </para>
       </listitem>
 
@@ -406,8 +407,8 @@
     </itemizedlist>
 
     <para>
-      Be sure to check this page and chapter frequently for updates as
-      Partitioning development for MySQL 5.1 continues.
+      Be sure to check this section and chapter frequently for updates
+      as Partitioning development continues.
     </para>
 
   </section>
@@ -3352,7 +3353,261 @@
     </section>
 
   </section>
+  
+  <section id="partitioning-pruning">
+    <title>&title-partitioning-pruning;</title>
+    
+    <para>
+      This section discusses <firstterm>partition pruning</firstterm>,
+      an opimisation which was implemented for partitioned tables in
+      MySQL 5.1.6.
+    </para>
+    
+    <para>
+      The core concept behind partitioning is relatively simple, and can
+      be described as <quote>Do not scan partitions where there can be
+        no matching values</quote>. For example, suppose you have a
+      partitioned table <literal>t1</literal> defined by this statement: 
+    </para>
+    
+<programlisting>
+CREATE TABLE t1 (
+    fname VARCHAR(50) NOT NULL,
+    lname VARCHAR(50) NOT NULL,
+    region_code TINYINT UNSIGNED NOT NULL,
+    dob DATE NOT NULL
+)
+PARTITION BY RANGE( region_code ) (
+    PARTITION p0 VALUES LESS THAN (64),
+    PARTITION p1 VALUES LESS THAN (128),
+    PARTITION p2 VALUES LESS THAN (192)
+    PARTITION p3 VALUES LESS THAN MAXVALUE
+);
+</programlisting>
+    
+    <para>
+      Consider the case where you wish to obtain results from a query
+      such as this one:
+    </para>
 
+<programlisting>
+SELECT fname, lname, postcode, dob
+    FROM t1 
+    WHERE region_code &gt; 125 AND region_code &lt; 130; 
+</programlisting>
+    
+    <para>
+      It is easy to see that none of the rows which ought to be returned
+      will be in either of the partitions <literal>p0</literal> or
+      <literal>p3</literal>; that is, we need to search only in
+      partitions <literal>p1</literal> and <literal>p2</literal>
to find
+      matching rows. By doing so, it is possible to expend much time
+      time and effort in finding matching rows than it is to scan all
+      partitions in the table. This <quote>cutting away</quote> of
+      unneeded partitions is known as <firstterm>pruning</firstterm>.
+      When the optimiser can make use of partition pruning in performing
+      a query, execution of the query can be an order of magnitude
+      faster than the same query against a non-partitioned table
+      containing the same column definitions and data.
+    </para>
+    
+    <para>
+      The query optimiser can perform pruning whenever a
+      <literal>WHERE</literal> condition can be reduced to either one of
+      the following:
+    </para>
+    
+    <itemizedlist>
+      
+      <listitem>
+        <para>
+          <literal><replaceable>partition_column</replaceable> =
+            <replaceable>constant</replaceable></literal>
+        </para>
+      </listitem>
+      
+      <listitem>
+        <para>
+          <literal><replaceable>partition_column</replaceable> IN
+            (<replaceable>constant1</replaceable>,
+            <replaceable>constant2</replaceable>, ...,
+            <replaceable>constantN</replaceable>)</literal>
+        </para>
+      </listitem>
+            
+    </itemizedlist>
+    
+    <para>
+      In the first case, the optimiser simply evaluates evaluates the
+      partitioning expression for the value given, determines which
+      partition contains that value, and scans only this partition. In
+      the second case, the optimiser evaluates the partitioning
+      expression for each value in the list, creates a list of
+      matching partitions, and then scans only the partitions in this
+      partition list.
+    </para>
+    
+    <para>
+      Pruning can also be applied to short ranges, which the optimiser
+      can convert into equivalent lists of values. For instance, in the
+      previous example, the <literal>WHERE</literal> clause can be
+      converted to <literal>WHERE region_code IN (125, 126, 127, 128,
+        129, 130)</literal>. Then the optimiser can that the first three
+      values in the list are found in partition <literal>p1</literal>,
+      the remaining three values in partition <literal>p2</literal>, and
+      that the other partitions contain no relevant values and so do not
+      need to be searched for matching rows.
+    </para>
+    
+    <para>
+      This type of optimisation can be applied whenever the partitioning
+      expression consists of an equality or a range which can be reduced
+      to a set of equalities. It can also be employed when the
+      partitioning expression represents an incresing or decreasing
+      relationship or uses a function such as <literal>YEAR()</literal>
+      or <literal>TO_DAYS()</literal> that produces an integer value
+      when applied to a <literal>DATE</literal> or
+      <literal>DATETIME</literal> column value. For example, suppose
+      that table <literal>t2</literal>, defined as shown here, is
+      partitioned on a <literal>DATE</literal> column:
+    </para>
+    
+<programlisting>
+CREATE TABLE t2 (
+    fname VARCHAR(50) NOT NULL,
+    lname VARCHAR(50) NOT NULL,
+    region_code TINYINT UNSIGNED NOT NULL,
+    dob DATE NOT NULL
+)
+PARTITION BY RANGE( YEAR(dob) ) (
+    PARTITION d0 VALUES LESS THAN (1970),
+    PARTITION d1 VALUES LESS THAN (1975),
+    PARTITION d2 VALUES LESS THAN (1980),
+    PARTITION d3 VALUES LESS THAN (1985),
+    PARTITION d4 VALUES LESS THAN (1990),
+    PARTITION d5 VALUES LESS THAN (2000),
+    PARTITION d6 VALUES LESS THAN (2005),
+    PARTITION d7 VALUES LESS THAN MAXVALUE
+);
+</programlisting>
+    
+    <para>
+      The following queries on <literal>t2</literal> can make of use
+      partition pruning:
+    </para>
+
+<programlisting>
+SELECT * FROM t2 WHERE dob = '1982-06-23'; 
+
+SELECT * FROM t2 WHERE dob BETWEEN '1991-02-15' AND '1997-04-25';
+
+SELECT * FROM t2 WHERE YEAR(dob) IN (1979, 1980, 1983, 1985, 1986, 1988);
+
+SELECT * FROM t2 WHERE dob &gt;= '1984-06-21' AND dob &lt;= '1999-06-21'
+</programlisting>
+    
+    <para>
+      In the case of the last query, the optimiser can also act as
+      follows:
+    </para>
+    
+    <orderedlist>
+      
+      <listitem>
+        <para>
+          <emphasis>Find the partition containing the low end of the
+            range</emphasis>.
+        </para>
+        
+        <para>
+          <literal>YEAR('1984-06-21')</literal> yields the
+          value <literal>1974</literal>, which is found in partition
+          <literal>d3</literal>.
+        </para>
+      </listitem>
+      
+      <listitem>
+        <para>
+          <emphasis>Find the partition containing the high end of the
+            range</emphasis>.
+        </para>
+        
+        <para>
+          <literal>YEAR('1999-06-21')</literal> evaluates to
+          <literal>1999</literal>, which is found in partition
+          <literal>d5</literal>.
+        </para>
+      </listitem>
+      
+      <listitem>
+        <para>
+          <emphasis>Scan only these two partitions and any partitions
+            that may lie between them</emphasis>.
+        </para>
+        
+        <para>
+          In this case, this means that only partitions
+          <literal>d3</literal>, <literal>d4</literal>, and
+          <literal>d5</literal> are scanned. The remaining partitions
+          may be safely ignored (and are ignored).
+        </para>
+      </listitem>
+      
+    </orderedlist>
+    
+    <para>
+      So far, we have looked only at examples using
+      <literal>RANGE</literal> partitioning, but pruning can be applied
+      with other partitioning types as well.
+    </para>
+    
+    <para>
+      Consider a table that is partitioned by <literal>LIST</literal>, where
+      the partitioning expression is increasing or decreasing, such as
+      the table <literal>t3</literal> shown here. (In this example, we
+      assume for the sake of brevity that the
+      <literal>region_code</literal> column is limited to values between
+      1 and 10 inclusive.)
+    </para>
+    
+<programlisting>
+CREATE TABLE t3 (
+    fname VARCHAR(50) NOT NULL,
+    lname VARCHAR(50) NOT NULL,
+    region_code TINYINT UNSIGNED NOT NULL,
+    dob DATE NOT NULL
+)
+PARTITION BY LIST(region_code) (
+    PARTITION r0 VALUES IN (1, 3),
+    PARTITION r1 VALUES IN (2, 5, 8),
+    PARTITION r2 VALUES IN (4, 9),
+    PARTITION r3 VALUES IN (6, 7, 10)
+);
+</programlisting>
+    
+    <para>
+      For a query such as <literal>SELECT * FROM t3 WHERE region_code
+        BETWEEN 1 AND 3</literal>, the optimiser determines in which
+      partitions the values 1, 2, and 3 are found (<literal>r0</literal>
+      and <literal>r1</literal>) and skips the remaining ones
+      (<literal>r2</literal> and <literal>r3</literal>).
+    </para>
+    
+    <para>
+      For tables that are partitioned by <literal>HASH</literal> or
+      <literal>KEY</literal>, partition pruning is also possible.
+      However, it can be used only on integer columns of tables using
+      these partitioning types. For example, if the table has a
+      <literal>DATE</literal> column, a query with a
+      <literal>WHERE</literal> condition such as <literal>WHERE
date_col
+        &gt;=- '2001-04-14' AND date_col &lt;= '2005-10-15'</literal>
+      does not use partition pruning, but if the table stores year
+      values in an <literal>INT</literal> column, then a query having
+      <literal>WHERE year_col &gt;= 2001 AND year_col &lt;=
+        2005</literal> can be pruned.
+    </para>
+  </section>  
+  
   <section id="partitioning-limitations">
 
     <title>&title-partitioning-limitations;</title>
@@ -3397,12 +3652,77 @@
       <listitem>
         <para>
           A partitioning key must be either an integer column or an
-          expression that resolves to an integer. Currently,
-          <literal>NULL</literal> values are also permitted; however,
-          this is subject to change.
+          expression that resolves to an integer. The column or
+          expression value may also be <literal>NULL</literal>. (See 
+          <xref linkend="partitioning-handling-nulls"/>.)
         </para>
       </listitem>
+      
+      <listitem>
+        <para>
+          If a table that is to be partitioned has a primary key, then
+          any columns used in the partitioning expression must be part
+          of the primary key. For example, the following two statements
+          are invalid:
+        </para>
+        
+<programlisting>
+CREATE TABLE t1 (
+    col1 INT NOT NULL,
+    col2 DATE NOT NULL,
+    col3 INT NOT NULL,
+    col4 INT NOT NULL,
+    PRIMARY KEY(col1, col2)
+)
+PARTITION BY HASH(col3)
+PARTITIONS 4;
 
+CREATE TABLE t1 (
+    col1 INT NOT NULL,
+    col2 DATE NOT NULL,
+    col3 INT NOT NULL,
+    col4 INT NOT NULL,
+    PRIMARY KEY(col1, col3)
+)
+PARTITION BY HASH( YEAR(col2) )
+PARTITIONS 4;
+</programlisting>
+        
+        <para>
+          In both cases, the primary key does not include all columns
+          referenced in the partitioning expression. However, both of
+          the next two statements are valid:
+        </para>
+        
+<programlisting>
+CREATE TABLE t1 (
+    col1 INT NOT NULL,
+    col2 DATE NOT NULL,
+    col3 INT NOT NULL,
+    col4 INT NOT NULL,
+    PRIMARY KEY(col1, col2)
+)
+PARTITION BY HASH(col1 + YEAR(col2))
+PARTITIONS 4;
+
+CREATE TABLE t1 (
+    col1 INT NOT NULL,
+    col2 DATE NOT NULL,
+    col3 INT NOT NULL,
+    col4 INT NOT NULL,
+    PRIMARY KEY(col1, col2, col4)
+)
+PARTITION BY HASH(col1 + YEAR(col2))
+PARTITIONS 4;
+</programlisting>
+        
+        <para>
+          If the table does not have a primary key, then this
+          restriction does not apply, and you may use any column or
+          columns in the partitioning expression.
+        </para>
+      </listitem>
+
       <listitem>
         <para>
           Subpartitions are limited to <literal>HASH</literal> or

Modified: trunk/refman-5.1/sql-syntax.xml
===================================================================
--- trunk/refman-5.1/sql-syntax.xml	2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-5.1/sql-syntax.xml	2006-02-24 16:45:03 UTC (rev 1440)
@@ -966,7 +966,9 @@
             <emphasis role="bold">Note</emphasis>: For partitions that
             have not been explicitly named, MySQL automatically provides
             the default names <literal>p0</literal>,
-            <literal>p1</literal>, <literal>p2</literal>, and so
on.
+            <literal>p1</literal>, <literal>p2</literal>, and so
on. As
+            of MySQL 5.1.7, the same is true with regard to
+            subpartitions.
           </para>
 
           <para>
@@ -1476,7 +1478,7 @@
         COMPRESSED CONNECTION CSV FEDERATED FIXED DYNAMIC DEFAULT
         DELAY_KEY_WRITE DELETE DIRECTORY INDEX FIRST FOREIGN ROW_FORMAT
         FULL FULLTEXT HEAP INNOBASE INNODB ISAM INSERT_METHOD MAX_ROWS
-        MIN_ROWS PACK_KEYS PARTIAL MERGE MRG_MYISAM MYISAM NDB
+        MIN_ROWS PACK_KEYS PARTIAL PARTITION MERGE MRG_MYISAM MYISAM NDB
         NDBCLUSTER NO REDUNDANT REFERENCES SERIAL WITH PARSER
       </remark>
 

Modified: trunk/refman-common/news-5.1.xml
===================================================================
--- trunk/refman-common/news-5.1.xml	2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-common/news-5.1.xml	2006-02-24 16:45:03 UTC (rev 1440)
@@ -143,7 +143,7 @@
       <listitem>
         <para>
           Attempting to add a new partition to a table partitioned by a
-          unque key would cause an <errortext>Out of memory</errortext>
+          unique key would cause an <errortext>Out of memory</errortext>
           error. (Bug #17169)
         </para>
       </listitem>

Modified: trunk/refman-common/titles.en.ent
===================================================================
--- trunk/refman-common/titles.en.ent	2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-common/titles.en.ent	2006-02-24 16:45:03 UTC (rev 1440)
@@ -1336,6 +1336,7 @@
 <!ENTITY title-partitioning-management-hash-key "Management of
<literal>HASH</literal> and <literal>KEY</literal>
Partitions">
 <!ENTITY title-partitioning-management-range-list "Management of
<literal>RANGE</literal> and <literal>LIST</literal>
Partitions">
 <!ENTITY title-partitioning-overview "Overview of Partitioning in MySQL">
+<!ENTITY title-partitioning-pruning "Partition Pruning">
 <!ENTITY title-partitioning-range "<literal>RANGE</literal>
Partitioning">
 <!ENTITY title-partitioning-subpartitions "Subpartitioning">
 <!ENTITY title-partitioning-types "Partition Types">

Thread
svn commit - mysqldoc@docsrva: r1440 - in trunk: refman-5.1 refman-commonjon24 Feb