Author: jstephens
Date: 2006-02-24 17:45:03 +0100 (Fri, 24 Feb 2006)
New Revision: 1440
Log:
trunk/refman-common/news-5.1.xml - Fix typo
trunk/refman-common/titles.en.ent - New section title
trunk/refman-5.1/sql-syntax.xml - Couple of fixes relating to partitioning
trunk/refman-5.1/partitioning.xml
- added note about partition keys and primary keys (Thanks, Jeb!)
- New section: Partition Pruning
Modified:
trunk/refman-5.1/partitioning.xml
trunk/refman-5.1/sql-syntax.xml
trunk/refman-common/news-5.1.xml
trunk/refman-common/titles.en.ent
Modified: trunk/refman-5.1/partitioning.xml
===================================================================
--- trunk/refman-5.1/partitioning.xml 2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-5.1/partitioning.xml 2006-02-24 16:45:03 UTC (rev 1440)
@@ -363,7 +363,8 @@
queries that may not have been so when the partitioning scheme
was first set up. This capability, sometimes referred to as
<firstterm>partition pruning</firstterm>, was implemented in
- MySQL 5.1.6.
+ MySQL 5.1.6. For additional informaiton, see
+ <xref linkend="partitioning-pruning"/>.
</para>
</listitem>
@@ -406,8 +407,8 @@
</itemizedlist>
<para>
- Be sure to check this page and chapter frequently for updates as
- Partitioning development for MySQL 5.1 continues.
+ Be sure to check this section and chapter frequently for updates
+ as Partitioning development continues.
</para>
</section>
@@ -3352,7 +3353,261 @@
</section>
</section>
+
+ <section id="partitioning-pruning">
+ <title>&title-partitioning-pruning;</title>
+
+ <para>
+ This section discusses <firstterm>partition pruning</firstterm>,
+ an opimisation which was implemented for partitioned tables in
+ MySQL 5.1.6.
+ </para>
+
+ <para>
+ The core concept behind partitioning is relatively simple, and can
+ be described as <quote>Do not scan partitions where there can be
+ no matching values</quote>. For example, suppose you have a
+ partitioned table <literal>t1</literal> defined by this statement:
+ </para>
+
+<programlisting>
+CREATE TABLE t1 (
+ fname VARCHAR(50) NOT NULL,
+ lname VARCHAR(50) NOT NULL,
+ region_code TINYINT UNSIGNED NOT NULL,
+ dob DATE NOT NULL
+)
+PARTITION BY RANGE( region_code ) (
+ PARTITION p0 VALUES LESS THAN (64),
+ PARTITION p1 VALUES LESS THAN (128),
+ PARTITION p2 VALUES LESS THAN (192)
+ PARTITION p3 VALUES LESS THAN MAXVALUE
+);
+</programlisting>
+
+ <para>
+ Consider the case where you wish to obtain results from a query
+ such as this one:
+ </para>
+<programlisting>
+SELECT fname, lname, postcode, dob
+ FROM t1
+ WHERE region_code > 125 AND region_code < 130;
+</programlisting>
+
+ <para>
+ It is easy to see that none of the rows which ought to be returned
+ will be in either of the partitions <literal>p0</literal> or
+ <literal>p3</literal>; that is, we need to search only in
+ partitions <literal>p1</literal> and <literal>p2</literal>
to find
+ matching rows. By doing so, it is possible to expend much time
+ time and effort in finding matching rows than it is to scan all
+ partitions in the table. This <quote>cutting away</quote> of
+ unneeded partitions is known as <firstterm>pruning</firstterm>.
+ When the optimiser can make use of partition pruning in performing
+ a query, execution of the query can be an order of magnitude
+ faster than the same query against a non-partitioned table
+ containing the same column definitions and data.
+ </para>
+
+ <para>
+ The query optimiser can perform pruning whenever a
+ <literal>WHERE</literal> condition can be reduced to either one of
+ the following:
+ </para>
+
+ <itemizedlist>
+
+ <listitem>
+ <para>
+ <literal><replaceable>partition_column</replaceable> =
+ <replaceable>constant</replaceable></literal>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal><replaceable>partition_column</replaceable> IN
+ (<replaceable>constant1</replaceable>,
+ <replaceable>constant2</replaceable>, ...,
+ <replaceable>constantN</replaceable>)</literal>
+ </para>
+ </listitem>
+
+ </itemizedlist>
+
+ <para>
+ In the first case, the optimiser simply evaluates evaluates the
+ partitioning expression for the value given, determines which
+ partition contains that value, and scans only this partition. In
+ the second case, the optimiser evaluates the partitioning
+ expression for each value in the list, creates a list of
+ matching partitions, and then scans only the partitions in this
+ partition list.
+ </para>
+
+ <para>
+ Pruning can also be applied to short ranges, which the optimiser
+ can convert into equivalent lists of values. For instance, in the
+ previous example, the <literal>WHERE</literal> clause can be
+ converted to <literal>WHERE region_code IN (125, 126, 127, 128,
+ 129, 130)</literal>. Then the optimiser can that the first three
+ values in the list are found in partition <literal>p1</literal>,
+ the remaining three values in partition <literal>p2</literal>, and
+ that the other partitions contain no relevant values and so do not
+ need to be searched for matching rows.
+ </para>
+
+ <para>
+ This type of optimisation can be applied whenever the partitioning
+ expression consists of an equality or a range which can be reduced
+ to a set of equalities. It can also be employed when the
+ partitioning expression represents an incresing or decreasing
+ relationship or uses a function such as <literal>YEAR()</literal>
+ or <literal>TO_DAYS()</literal> that produces an integer value
+ when applied to a <literal>DATE</literal> or
+ <literal>DATETIME</literal> column value. For example, suppose
+ that table <literal>t2</literal>, defined as shown here, is
+ partitioned on a <literal>DATE</literal> column:
+ </para>
+
+<programlisting>
+CREATE TABLE t2 (
+ fname VARCHAR(50) NOT NULL,
+ lname VARCHAR(50) NOT NULL,
+ region_code TINYINT UNSIGNED NOT NULL,
+ dob DATE NOT NULL
+)
+PARTITION BY RANGE( YEAR(dob) ) (
+ PARTITION d0 VALUES LESS THAN (1970),
+ PARTITION d1 VALUES LESS THAN (1975),
+ PARTITION d2 VALUES LESS THAN (1980),
+ PARTITION d3 VALUES LESS THAN (1985),
+ PARTITION d4 VALUES LESS THAN (1990),
+ PARTITION d5 VALUES LESS THAN (2000),
+ PARTITION d6 VALUES LESS THAN (2005),
+ PARTITION d7 VALUES LESS THAN MAXVALUE
+);
+</programlisting>
+
+ <para>
+ The following queries on <literal>t2</literal> can make of use
+ partition pruning:
+ </para>
+
+<programlisting>
+SELECT * FROM t2 WHERE dob = '1982-06-23';
+
+SELECT * FROM t2 WHERE dob BETWEEN '1991-02-15' AND '1997-04-25';
+
+SELECT * FROM t2 WHERE YEAR(dob) IN (1979, 1980, 1983, 1985, 1986, 1988);
+
+SELECT * FROM t2 WHERE dob >= '1984-06-21' AND dob <= '1999-06-21'
+</programlisting>
+
+ <para>
+ In the case of the last query, the optimiser can also act as
+ follows:
+ </para>
+
+ <orderedlist>
+
+ <listitem>
+ <para>
+ <emphasis>Find the partition containing the low end of the
+ range</emphasis>.
+ </para>
+
+ <para>
+ <literal>YEAR('1984-06-21')</literal> yields the
+ value <literal>1974</literal>, which is found in partition
+ <literal>d3</literal>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <emphasis>Find the partition containing the high end of the
+ range</emphasis>.
+ </para>
+
+ <para>
+ <literal>YEAR('1999-06-21')</literal> evaluates to
+ <literal>1999</literal>, which is found in partition
+ <literal>d5</literal>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <emphasis>Scan only these two partitions and any partitions
+ that may lie between them</emphasis>.
+ </para>
+
+ <para>
+ In this case, this means that only partitions
+ <literal>d3</literal>, <literal>d4</literal>, and
+ <literal>d5</literal> are scanned. The remaining partitions
+ may be safely ignored (and are ignored).
+ </para>
+ </listitem>
+
+ </orderedlist>
+
+ <para>
+ So far, we have looked only at examples using
+ <literal>RANGE</literal> partitioning, but pruning can be applied
+ with other partitioning types as well.
+ </para>
+
+ <para>
+ Consider a table that is partitioned by <literal>LIST</literal>, where
+ the partitioning expression is increasing or decreasing, such as
+ the table <literal>t3</literal> shown here. (In this example, we
+ assume for the sake of brevity that the
+ <literal>region_code</literal> column is limited to values between
+ 1 and 10 inclusive.)
+ </para>
+
+<programlisting>
+CREATE TABLE t3 (
+ fname VARCHAR(50) NOT NULL,
+ lname VARCHAR(50) NOT NULL,
+ region_code TINYINT UNSIGNED NOT NULL,
+ dob DATE NOT NULL
+)
+PARTITION BY LIST(region_code) (
+ PARTITION r0 VALUES IN (1, 3),
+ PARTITION r1 VALUES IN (2, 5, 8),
+ PARTITION r2 VALUES IN (4, 9),
+ PARTITION r3 VALUES IN (6, 7, 10)
+);
+</programlisting>
+
+ <para>
+ For a query such as <literal>SELECT * FROM t3 WHERE region_code
+ BETWEEN 1 AND 3</literal>, the optimiser determines in which
+ partitions the values 1, 2, and 3 are found (<literal>r0</literal>
+ and <literal>r1</literal>) and skips the remaining ones
+ (<literal>r2</literal> and <literal>r3</literal>).
+ </para>
+
+ <para>
+ For tables that are partitioned by <literal>HASH</literal> or
+ <literal>KEY</literal>, partition pruning is also possible.
+ However, it can be used only on integer columns of tables using
+ these partitioning types. For example, if the table has a
+ <literal>DATE</literal> column, a query with a
+ <literal>WHERE</literal> condition such as <literal>WHERE
date_col
+ >=- '2001-04-14' AND date_col <= '2005-10-15'</literal>
+ does not use partition pruning, but if the table stores year
+ values in an <literal>INT</literal> column, then a query having
+ <literal>WHERE year_col >= 2001 AND year_col <=
+ 2005</literal> can be pruned.
+ </para>
+ </section>
+
<section id="partitioning-limitations">
<title>&title-partitioning-limitations;</title>
@@ -3397,12 +3652,77 @@
<listitem>
<para>
A partitioning key must be either an integer column or an
- expression that resolves to an integer. Currently,
- <literal>NULL</literal> values are also permitted; however,
- this is subject to change.
+ expression that resolves to an integer. The column or
+ expression value may also be <literal>NULL</literal>. (See
+ <xref linkend="partitioning-handling-nulls"/>.)
</para>
</listitem>
+
+ <listitem>
+ <para>
+ If a table that is to be partitioned has a primary key, then
+ any columns used in the partitioning expression must be part
+ of the primary key. For example, the following two statements
+ are invalid:
+ </para>
+
+<programlisting>
+CREATE TABLE t1 (
+ col1 INT NOT NULL,
+ col2 DATE NOT NULL,
+ col3 INT NOT NULL,
+ col4 INT NOT NULL,
+ PRIMARY KEY(col1, col2)
+)
+PARTITION BY HASH(col3)
+PARTITIONS 4;
+CREATE TABLE t1 (
+ col1 INT NOT NULL,
+ col2 DATE NOT NULL,
+ col3 INT NOT NULL,
+ col4 INT NOT NULL,
+ PRIMARY KEY(col1, col3)
+)
+PARTITION BY HASH( YEAR(col2) )
+PARTITIONS 4;
+</programlisting>
+
+ <para>
+ In both cases, the primary key does not include all columns
+ referenced in the partitioning expression. However, both of
+ the next two statements are valid:
+ </para>
+
+<programlisting>
+CREATE TABLE t1 (
+ col1 INT NOT NULL,
+ col2 DATE NOT NULL,
+ col3 INT NOT NULL,
+ col4 INT NOT NULL,
+ PRIMARY KEY(col1, col2)
+)
+PARTITION BY HASH(col1 + YEAR(col2))
+PARTITIONS 4;
+
+CREATE TABLE t1 (
+ col1 INT NOT NULL,
+ col2 DATE NOT NULL,
+ col3 INT NOT NULL,
+ col4 INT NOT NULL,
+ PRIMARY KEY(col1, col2, col4)
+)
+PARTITION BY HASH(col1 + YEAR(col2))
+PARTITIONS 4;
+</programlisting>
+
+ <para>
+ If the table does not have a primary key, then this
+ restriction does not apply, and you may use any column or
+ columns in the partitioning expression.
+ </para>
+ </listitem>
+
<listitem>
<para>
Subpartitions are limited to <literal>HASH</literal> or
Modified: trunk/refman-5.1/sql-syntax.xml
===================================================================
--- trunk/refman-5.1/sql-syntax.xml 2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-5.1/sql-syntax.xml 2006-02-24 16:45:03 UTC (rev 1440)
@@ -966,7 +966,9 @@
<emphasis role="bold">Note</emphasis>: For partitions that
have not been explicitly named, MySQL automatically provides
the default names <literal>p0</literal>,
- <literal>p1</literal>, <literal>p2</literal>, and so
on.
+ <literal>p1</literal>, <literal>p2</literal>, and so
on. As
+ of MySQL 5.1.7, the same is true with regard to
+ subpartitions.
</para>
<para>
@@ -1476,7 +1478,7 @@
COMPRESSED CONNECTION CSV FEDERATED FIXED DYNAMIC DEFAULT
DELAY_KEY_WRITE DELETE DIRECTORY INDEX FIRST FOREIGN ROW_FORMAT
FULL FULLTEXT HEAP INNOBASE INNODB ISAM INSERT_METHOD MAX_ROWS
- MIN_ROWS PACK_KEYS PARTIAL MERGE MRG_MYISAM MYISAM NDB
+ MIN_ROWS PACK_KEYS PARTIAL PARTITION MERGE MRG_MYISAM MYISAM NDB
NDBCLUSTER NO REDUNDANT REFERENCES SERIAL WITH PARSER
</remark>
Modified: trunk/refman-common/news-5.1.xml
===================================================================
--- trunk/refman-common/news-5.1.xml 2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-common/news-5.1.xml 2006-02-24 16:45:03 UTC (rev 1440)
@@ -143,7 +143,7 @@
<listitem>
<para>
Attempting to add a new partition to a table partitioned by a
- unque key would cause an <errortext>Out of memory</errortext>
+ unique key would cause an <errortext>Out of memory</errortext>
error. (Bug #17169)
</para>
</listitem>
Modified: trunk/refman-common/titles.en.ent
===================================================================
--- trunk/refman-common/titles.en.ent 2006-02-24 15:18:24 UTC (rev 1439)
+++ trunk/refman-common/titles.en.ent 2006-02-24 16:45:03 UTC (rev 1440)
@@ -1336,6 +1336,7 @@
<!ENTITY title-partitioning-management-hash-key "Management of
<literal>HASH</literal> and <literal>KEY</literal>
Partitions">
<!ENTITY title-partitioning-management-range-list "Management of
<literal>RANGE</literal> and <literal>LIST</literal>
Partitions">
<!ENTITY title-partitioning-overview "Overview of Partitioning in MySQL">
+<!ENTITY title-partitioning-pruning "Partition Pruning">
<!ENTITY title-partitioning-range "<literal>RANGE</literal>
Partitioning">
<!ENTITY title-partitioning-subpartitions "Subpartitioning">
<!ENTITY title-partitioning-types "Partition Types">
| Thread |
|---|
| • svn commit - mysqldoc@docsrva: r1440 - in trunk: refman-5.1 refman-common | jon | 24 Feb |