List:Commits« Previous MessageNext Message »
From:Sergey Petrunia Date:January 5 2006 2:47pm
Subject:bk commit into 5.1 tree (sergefp:1.2031)
View as plain text  
Below is the list of changes that have just been committed into a local
5.1 repository of psergey. When psergey does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet
  1.2031 06/01/05 17:47:07 sergefp@stripped +3 -0
  WL#2985 "Partition-pruning", "range walking" addition: better comments.

  sql/sql_partition.cc
    1.22 06/01/05 17:46:58 sergefp@stripped +14 -10
    WL#2985 "Partition-pruning", "range walking" addition: better comments.

  sql/opt_range.cc
    1.195 06/01/05 17:46:58 sergefp@stripped +86 -21
    WL#2985 "Partition-pruning", "range walking" addition: better comments.

  sql/handler.h
    1.177 06/01/05 17:46:58 sergefp@stripped +2 -3
    WL#2985 "Partition-pruning", "range walking" addition: better comments.

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	sergefp
# Host:	newbox.mylan
# Root:	/home/psergey/mysql-5.1-ppruning-r4

--- 1.176/sql/handler.h	2006-01-04 11:08:49 +03:00
+++ 1.177/sql/handler.h	2006-01-05 17:46:58 +03:00
@@ -690,11 +690,10 @@
   /* 
     A bitmap of partitions used by the current query. 
     Usage pattern:
-    * It is guaranteed that all partitions are set to be unused on query start.
+    * The handler->extra(HA_EXTRA_RESET) call at query start/end sets all
+      partitions to be unused.
     * Before index/rnd_init(), partition pruning code sets the bits for used
       partitions.
-    * The handler->extra(HA_EXTRA_RESET) call at query end sets all partitions
-      to be unused.
   */
   MY_BITMAP used_partitions;
 

--- 1.194/sql/opt_range.cc	2006-01-04 11:08:49 +03:00
+++ 1.195/sql/opt_range.cc	2006-01-05 17:46:58 +03:00
@@ -2115,7 +2115,7 @@
        The list of intervals we'll obtain will look like this:
        ((t1.a, t1.b) = (1,'foo')),
        ((t1.a, t1.b) = (2,'bar')), 
-       ((t1,a, t1.b) > (10,'zz'))  (**)
+       ((t1,a, t1.b) > (10,'zz'))
        
     2. for each interval I 
        {
@@ -2574,30 +2574,95 @@
     This function 
       * recursively walks the SEL_ARG* tree collecting partitioning "intervals"
       * finds the partitions one needs to use to get rows in these intervals
-      * marks these partitions as used
-  
-  NOTES    
-    WHAT IS CONSIDERED TO BE "INTERVALS"
-    A partition pruning "interval" is equivalent to condition in one of the 
-    forms:
+      * marks these partitions as used.
+    The next session desribes the process in greater detail.
+ 
+  IMPLEMENTATION
+    TYPES OF RESTRICTIONS THAT WE CAN OBTAIN PARTITIONS FOR    
+    We can find out which [sub]partitions to use if we obtain restrictions on 
+    [sub]partitioning fields in the following form:
+    1.  "partition_field1=const1 AND ... AND partition_fieldN=constN"
+    1.1  Same as (1) but for subpartition fields
+
+    If partitioning supports interval analysis (i.e. partitioning is a
+    function of a single table field, and partition_info::
+    get_part_iter_for_interval != NULL), then we can also use condition in
+    this form:
+    2.  "const1 <=? partition_field <=? const2"
+    2.1  Same as (2) but for subpartition_field
+
+    INFERRING THE RESTRICTIONS FROM SEL_ARG TREE
     
-    "partition_field1=const1 AND ... AND partition_fieldN=constN"        (1)
-    "subpartition_field1=const1 AND ... AND subpartition_fieldN=constN"  (2)
-    "(1) AND (2)"                                                        (3)
+    The below is an example of what SEL_ARG tree may represent:
     
-    In (1) and (2) all [sub]partitioning fields must be used, and "x=const"
-    includes "x IS NULL". 
+    (start)
+     |                           $
+     |   Partitioning keyparts   $  subpartitioning keyparts
+     |                           $
+     |     ...          ...      $
+     |      |            |       $
+     | +---------+  +---------+  $  +-----------+  +-----------+
+     \-| par1=c1 |--| par2=c2 |-----| subpar1=c3|--| subpar2=c5|
+       +---------+  +---------+  $  +-----------+  +-----------+
+            |                    $        |             |
+            |                    $        |        +-----------+ 
+            |                    $        |        | subpar2=c6|
+            |                    $        |        +-----------+ 
+            |                    $        |
+            |                    $  +-----------+  +-----------+
+            |                    $  | subpar1=c4|--| subpar2=c8|
+            |                    $  +-----------+  +-----------+
+            |                    $         
+            |                    $
+       +---------+               $  +------------+  +------------+
+       | par1=c2 |------------------| subpar1=c10|--| subpar2=c12|
+       +---------+               $  +------------+  +------------+
+            |                    $
+           ...                   $
+
+    The up-down connections are connections via SEL_ARG::left and
+    SEL_ARG::right. A horizontal connection to the right is the
+    SEL_ARG::next_key_part connection.
     
-    If partitioning is performed using 
-       
-       PARTITION BY RANGE(unary_monotonic_func(single_partition_field)),
-       
-    then the following is also an interval:
+    find_used_partitions() traverses the entire tree via recursion on
+     * SEL_ARG::next_key_part (from left to right on the picture)
+     * SEL_ARG::left|right (up/down on the pic). Left-right recursion is
+       performed for each depth level.
+    
+    Recursion descent on SEL_ARG::next_key_part is used to accumulate (in
+    ppar->arg_stack) constraints on partitioning and subpartitioning fields.
+    For the example in the above picture, one of stack states is:
+      in find_used_partitions(key_tree = "subpar2=c5") (***)
+      in find_used_partitions(key_tree = "subpar1=c3")
+      in find_used_partitions(key_tree = "par2=c2")   (**)
+      in find_used_partitions(key_tree = "par1=c1")
+      in prune_partitions(...)
+    We apply partitioning limits as soon as possible, e.g. when we reach the
+    depth (**), we find which partition(s) correspond to "par1=c1 AND par2=c2",
+    and save them in ppar->part_iter.
+    When we reach the depth (***), we find which subpartition(s) correspond to
+    "subpar1=c3 AND subpar2=c5", and then mark appropriate subpartitions in
+    appropriate subpartitions as used.
+    
+    It is possible that constraints on some partitioning fields are missing.
+    For the above example, consider this stack state:
+      in find_used_partitions(key_tree = "subpar2=c12") (***)
+      in find_used_partitions(key_tree = "subpar1=c10")
+      in find_used_partitions(key_tree = "par1=c2")
+      in prune_partitions(...)
+    Here we don't have constraints for all partitioning fields. Since we've
+    never set the ppar->part_iter to contain used set of partitions, we use
+    its default "all partitions" value.  We get  subpartition id for 
+    "subpar1=c3 AND subpar2=c5", and mark that subpartition as used in every
+    partition.
+
+    The inverse is also possible: we may get constraints on partitioning
+    fields, but not constraints on subpartitioning fields. In that case,
+    calls to find_used_partitions() with depth below (**) will return -1,
+    and we will mark entire partition as used.
 
-           "   const1 OP1 single_partition_field OP2 const2"              (4)
-     
-    where OP1 and OP2 are '<' OR '<=', and const_i can be +/- inf.
-    Everything else is not a partition pruning "interval".
+  TODO
+    Replace recursion on SEL_ARG::left and SEL_ARG::right with a loop
 
   RETURN
     1   OK, one or more [sub]partitions are marked as used.

--- 1.21/sql/sql_partition.cc	2006-01-05 03:12:44 +03:00
+++ 1.22/sql/sql_partition.cc	2006-01-05 17:46:58 +03:00
@@ -3673,11 +3673,11 @@
 
   DESCRIPTION
     Initialize partition set iterator to walk over the interval in
-    ordered-list-of-partitions (for RANGE partitioning) or 
-    ordered-list-of-list-constants (for LIST partitioning) space.
+    ordered-array-of-partitions (for RANGE partitioning) or 
+    ordered-array-of-list-constants (for LIST partitioning) space.
 
   IMPLEMENTATION
-    This function is applied when partitioning is done by
+    This function is used when partitioning is done by
     <RANGE|LIST>(ascending_func(t.field)), and we can map an interval in
     t.field space into a sub-array of partition_info::range_int_array or
     partition_info::list_array (see get_partition_id_range_for_endpoint,
@@ -3686,7 +3686,7 @@
     The function performs this interval mapping, and sets the iterator to
     traverse the sub-array and return appropriate partitions.
     
-  RETURN 
+  RETURN
     0 - No matching partitions (iterator not initialized)
     1 - Ok, iterator intialized for traversal of matching partitions.
    -1 - All partitions would match (iterator not initialized)
@@ -3760,7 +3760,7 @@
 
 
 /*
-  Partitioning Interval Analysis: Initialize iterator to walk integer interval
+  Partitioning Interval Analysis: Initialize iterator to walk field interval
 
   SYNOPSIS
     get_part_iter_for_interval_via_walking()
@@ -3776,7 +3776,8 @@
   DESCRIPTION
     Initialize partition set iterator to walk over interval in integer field
     space. That is, for "const1 <=? t.field <=? const2" interval, initialize 
-    the iterator to do this: 
+    the iterator to return a set of [sub]partitions obtained with the
+    following procedure:
       get partition id for t.field = const1,   return it
       get partition id for t.field = const1+1, return it
        ...                 t.field = const1+2, ...
@@ -3790,7 +3791,7 @@
       "c1 <=? t.field <=? c2", where c1 and c2 are finite. 
     Intervals with +inf/-inf, and [NULL, c1] interval can be processed but
     that is more tricky and I don't have time to do it right now.
-    
+
     Additionally we have these requirements:
     * number of values in the interval must be less then number of
       [sub]partitions, and 
@@ -3799,7 +3800,7 @@
     The rationale behind these requirements is that if they are not met
     we're likely to hit most of the partitions and traversing the interval
     will only add overhead. So it's better return "all partitions used" in
-    this case.
+    that case.
 
   RETURN
     0 - No matching partitions, iterator not initialized
@@ -3917,7 +3918,7 @@
       part_iter  Partition set iterator structure
 
   DESCRIPTION
-    This is special implementation of PARTITION_ITERATOR::get_next() for
+    This implementation of PARTITION_ITERATOR::get_next() is special for 
     LIST partitioning: it enumerates partition ids in 
     part_info->list_array[i] where i runs over [min_idx, max_idx] interval.
 
@@ -3937,13 +3938,16 @@
 
 
 /*
-  PARTITION_ITERATOR::get_next implementation: walk over integer interval
+  PARTITION_ITERATOR::get_next implementation: walk over field-space interval
 
   SYNOPSIS
     get_next_partition_via_walking()
       part_iter  Partitioning iterator
 
   DESCRIPTION
+    This implementation of PARTITION_ITERATOR::get_next() returns ids of
+    partitions that contain records with partitioning field value within
+    [start_val, end_val] interval.
 
   RETURN 
     partition id
Thread
bk commit into 5.1 tree (sergefp:1.2031)Sergey Petrunia5 Jan