List:Commits« Previous MessageNext Message »
From:mhansson Date:March 2 2007 7:03pm
Subject:bk commit into 5.1 tree (mhansson:1.2435) BUG#24791
View as plain text  
Below is the list of changes that have just been committed into a local
5.1 repository of martin. When martin does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2007-03-02 19:03:01+01:00, mhansson@stripped +6 -0
  Bug#24791: Union with AVG-groups generates wrong results
  
  A UNION query in MySQL creates a temporary table for each SELECT clause
  in the query, to store the result of the partial query. The procedure 
  that stores these results was being too meticulous in trying to store 
  FLOAT and DOUBLE values with exact precision, causing values that had too
  high precision to lose integer digits in favor of decimal digits. This 
  can, if the maximum precision is, say, 5 significant digits, lead to a 
  value of 123.000 to be returned from a UNION query as 10.000. 
  
  The maximum precision for getting non-approximate values is hardware 
  specific. Please refer to
  http://dev.mysql.com/doc/internals/en/floating-point-types.html
  for a discussion of this.
  
  The obvious fix is of course to switch to non-fixed precision
  (using approximatations for all values with too many significant
  digits). This is how normal INSERT statements work. 
  
  Here we run into a design problem. The class Item_type_holder 
  represents fields in the result tuple, and it is this class that has
  the final word in setting the number of significant digits prior to
  storing. However, at this point (The method
  Item_type_holder::join_types), all field metadata is gone.
  
  Item_type_holders are constructed from Item objects, a questionable
  design, but this is a fact. So there is no option but to create a 
  means of passing the information about significant digits (or more
  generally speaking, minimum field length) on the Item objects. We do this
  with the member min_length.
  
  Hence, Item_field objects initialize their min_length (as the new field
  is called) from their Field's field_length, and this will be propagated
  all the way to the Item_type_holder through its constructor.
  
  But there is one more twist to the story: Division operations, / and 
  AVG() increase the precision of the result according to the system
  variable div_prec_increment, so when the Item's that represent these
  operations fixate their precision, they must also increase their
  minimum field length.

  mysql-test/r/temp_table.result@stripped, 2007-03-02 19:02:59+01:00, mhansson@stripped
+21 -0
    Bug#24791
    
    correct results

  mysql-test/t/temp_table.test@stripped, 2007-03-02 19:02:59+01:00, mhansson@stripped
+16 -0
    Bug#24791
    
    Test case. The test
    - A table where the precision goes above max exact precision for AVG
    - A table where the precision goes above max exact precision for /
    - A table where the precision is above max exact precision already.
    
    In order to cover all the code, the field with too-great precision
    must be placed both first and non-first in a union.

  sql/item.cc@stripped, 2007-03-02 19:02:59+01:00, mhansson@stripped +41 -1
    Bug#24791
    
    Delta 1 and 2:
    The new field Item::min_length is initialized to its default value 0.
    
    Delta 3:
    When creating an Item_field the min_length is initialized to the 
    field's field_length. This guarantees that a FLOAT or DOUBLE
    value from the field will not be stored in a temporary table with a 
    smaller maximum value than the original field can hold.
    
    Delta 4: 
    Commented Item_type_holder::join_types
    
    Delta 5:
    Item_type_holder::join types is modified so that it will detect whenever
    it sets its max_length less than its or its argument's min_length. If 
    this happened it should not try to keep a fixed number of decimals. 
    
    If the above is not done, and some item more decimal places we may 
    lose integer places if we "hit the ceiling" for number of allowed 
    significant digits for FLOAT and DOUBLE in a table. These are 
    hard-coded in the line above to FLT_DIG+6 and 
    DBL_DIG+7 for float and double, respectively.
    
    Delta 6:
    The comment speaks for itself.

  sql/item.h@stripped, 2007-03-02 19:02:59+01:00, mhansson@stripped +30 -0
    Bug#24791
    
    Delta 1:
    Commented field max_length
    
    Delta 2:
    New field min_length

  sql/item_func.cc@stripped, 2007-03-02 19:02:59+01:00, mhansson@stripped +4 -3
    Bug#24791
    
    - Commented the method fix_length_and_dec
    - If number of decimals is updated, so should the minimum number of 
    significant digits.

  sql/item_sum.cc@stripped, 2007-03-02 19:02:59+01:00, mhansson@stripped +20 -2
    Bug#24791
    
    - If number of decimals is updated, so should the minimum number of 
    significant digits.

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	mhansson
# Host:	linux-st28.site
# Root:	/home/martin/mysql/src/5.1o-bug24791

--- 1.241/sql/item.cc	2007-02-14 01:09:33 +01:00
+++ 1.242/sql/item.cc	2007-03-02 19:02:59 +01:00
@@ -336,7 +336,8 @@ int Item::save_date_in_field(Field *fiel
 Item::Item():
   rsize(0), name(0), orig_name(0), name_length(0), fixed(0),
   is_autogenerated_name(TRUE),
-  collation(&my_charset_bin, DERIVATION_COERCIBLE)
+  collation(&my_charset_bin, DERIVATION_COERCIBLE),
+  min_length(0)
 {
   marker= 0;
   maybe_null=null_value=with_sum_func=unsigned_flag=0;
@@ -374,6 +375,7 @@ Item::Item(THD *thd, Item *item):
   name(item->name),
   orig_name(item->orig_name),
   max_length(item->max_length),
+  min_length(item->min_length),
   marker(item->marker),
   decimals(item->decimals),
   maybe_null(item->maybe_null),
@@ -1715,6 +1717,7 @@ void Item_field::set_field(Field *field_
   maybe_null=field->maybe_null();
   decimals= field->decimals();
   max_length= field_par->max_display_length();
+  min_length= field_par->field_length;
   table_name= *field_par->table_name;
   field_name= field_par->field_name;
   db_name= field_par->table->s->db.str;
@@ -6376,6 +6379,37 @@ enum_field_types Item_type_holder::get_r
     thd     thread handler
     item    given item to join its parameters with this item ones
 
+  DESCRIPTION
+    This method is called for inferring the result types of sql UNIONs. For
+    a query 
+    
+      SELECT c_1_1, ..., c_1_n
+      UNION
+      SELECT c_2_1, ..., c_2_n
+      UNION
+      ...
+      UNION
+      SELECT c_m_1, ..., c_n_n
+
+    Each result field c_1_j is represented by an Item_type_holder and its
+    join_types will be called for each c_i_j in order to infer the type of the
+    corresponding field in the result tuple. In each call, this method will 
+    update max_length and fld_type in order to find a type that can contain
+    values of fields c_1_j and c_i_j. This type is known as the merge type. After
+    all UNIONs have been processed, each Item_type_holder is used to form a
+    result tuple.
+
+    If the merge type is inferred to be FLOAT or DOUBLE with 
+    fixed number of decimals and precision (FLOAT(M, D) or DOUBLE(M,D) 
+    respectively), the hardware's "hard limit" comes into play, enforcing a 
+    maximum value based on the number of decimal places. This hard limit is 
+    hard-coded to FLT_DIG+6 and DBL_DIG+7, respectively.
+
+    Setting min_length to a value > 0 for any item passed to join_types
+    tells the algorithm to switch to a non-fixed number of decimal places if the
+    max_length is lower than this value, which will in turn tell the Field 
+    objects not to attempt to store a precise value.
+
   RETURN
     TRUE   error - types are incompatible
     FALSE  OK
@@ -6451,6 +6485,12 @@ bool Item_type_holder::join_types(THD *t
     }
     else
       max_length= (fld_type == MYSQL_TYPE_FLOAT) ? FLT_DIG+6 : DBL_DIG+7;
+    /* 
+       Switch to non-fixed number of decimals if there is a risk that
+       a value may get too large to fit within the new max_length.
+    */
+    if (max_length < min_length || max_length < item->min_length)
+      decimals= NOT_FIXED_DEC;
     break;
   }
   default:

--- 1.224/sql/item.h	2007-02-02 23:22:13 +01:00
+++ 1.225/sql/item.h	2007-03-02 19:02:59 +01:00
@@ -475,7 +475,37 @@ public:
   /* Original item name (if it was renamed)*/
   my_string orig_name;
   Item *next;
+
+  /*
+    Upper bound on the number of characters neccessary for representing this 
+    Item as a string. If the item represents a number with specified precision,
+    this is also the number of digits that can be stored for the number.
+
+    max_length is often associated with display length. But there are 
+    exceptions, most notably in type holders for temporary tables that evaluate
+    the type to C float or double. There the max_length suddenly denotes the 
+    maximum length that an item with item->decimals number of decimal places can 
+    be stored as in memory with exact precision.
+   */
   uint32 max_length;
+
+  /*
+    Lower bound on the number of characters neccessary for displaying this Item
+    as a string. If the item represents a number, this is also the number of 
+    digits that must be shown, and in case of temporary tables, stored.  
+
+    Most of the times it is not necessary to use this field, and
+    it can safely be left to its default value 0. However, when field lengths 
+    are chosen for storing fields back into tables - as in UNION queries 
+    (temporary tables) and CREATE TABLE ... AS SELECT ... (regular tables) - 
+    MySQL tries to store DOUBLE and FLOAT values with exact precision if 
+    possible. This may lead to a smaller max_length. But if the exact precision
+    length would be smaller than min_length, approximate values with varaible
+    precision will be used instead.
+
+    For more information, see Item_type_holder::join_types
+   */
+  uint32 min_length;  
   uint name_length;                     /* Length of name */
   int8 marker;
   uint8 decimals;

--- 1.353/sql/item_func.cc	2007-02-13 09:30:00 +01:00
+++ 1.354/sql/item_func.cc	2007-03-02 19:02:59 +01:00
@@ -1299,11 +1299,12 @@ void Item_func_div::fix_length_and_dec()
   switch(hybrid_type) {
   case REAL_RESULT:
   {
-    decimals=max(args[0]->decimals,args[1]->decimals)+prec_increment;
+    decimals= max(args[0]->decimals,args[1]->decimals) + prec_increment;
     set_if_smaller(decimals, NOT_FIXED_DEC);
-    max_length=args[0]->max_length - args[0]->decimals + decimals;
-    uint tmp=float_length(decimals);
+    max_length= args[0]->max_length - args[0]->decimals + decimals;
+    uint tmp= float_length(decimals);
     set_if_smaller(max_length,tmp);
+    min_length= max(args[0]->max_length, args[1]->max_length) + prec_increment;
     break;
   }
   case INT_RESULT:

--- 1.213/sql/item_sum.cc	2007-02-06 15:12:11 +01:00
+++ 1.214/sql/item_sum.cc	2007-03-02 19:02:59 +01:00
@@ -1088,7 +1088,23 @@ void Item_sum_count::cleanup()
 
 
 /*
-  Avgerage
+  Fix field length and number of decimals for average function. 
+
+  SYNOPSIS
+    Item_sum_avg::fix_length_and_dec()
+  
+  DESCRIPTION
+    As calculation of average uses division by definition, we must increase 
+    precision by the system variable div_precision_increment in sql.
+
+  IMPLEMENTATION
+    Fairly straightforward, there is a field for 'decimals' all Item objects,
+    so number of decimals is increased accordingly. Naturally, the length of the
+    field has to be increased also, which is kept in the field  'max_length'. 
+
+  SEE ALSO
+    - Item::max_length
+    - Item::decimals
 */
 void Item_sum_avg::fix_length_and_dec()
 {
@@ -1105,8 +1121,10 @@ void Item_sum_avg::fix_length_and_dec()
     f_scale=  args[0]->decimals;
     dec_bin_size= my_decimal_get_binary_size(f_precision, f_scale);
   }
-  else
+  else {
     decimals= min(args[0]->decimals + prec_increment, NOT_FIXED_DEC);
+    min_length= args[0]->max_length + prec_increment;
+  }
 }
 
 

--- 1.25/mysql-test/r/temp_table.result	2006-09-29 17:39:12 +02:00
+++ 1.26/mysql-test/r/temp_table.result	2007-03-02 19:02:59 +01:00
@@ -163,3 +163,24 @@ select * from t1;
 a
 42
 drop table t1;
+CREATE TABLE t1 ( c FLOAT( 20, 14 ) );
+INSERT INTO t1 VALUES( 12139 );
+CREATE TABLE t2 ( c FLOAT(30,18) );
+INSERT INTO t2 VALUES( 123456 );
+SELECT AVG( c ) FROM t1 UNION SELECT 1;
+AVG( c )
+12139
+1
+SELECT 1 UNION SELECT AVG( c ) FROM t1;
+1
+1
+12139
+SELECT 1 UNION SELECT * FROM t2 UNION SELECT 1;
+1
+1
+123456
+SELECT c/1 FROM t1 UNION SELECT 1;
+c/1
+12139
+1
+DROP TABLE t1, t2;

--- 1.17/mysql-test/t/temp_table.test	2007-01-22 17:42:49 +01:00
+++ 1.18/mysql-test/t/temp_table.test	2007-03-02 19:02:59 +01:00
@@ -176,3 +176,19 @@ truncate t1;
 insert into t1 values (42);
 select * from t1;
 drop table t1;
+
+#
+# Bug #24791  	Union with AVG-groups generates wrong results
+#
+CREATE TABLE t1 ( c FLOAT( 20, 14 ) );
+INSERT INTO t1 VALUES( 12139 );
+
+CREATE TABLE t2 ( c FLOAT(30,18) );
+INSERT INTO t2 VALUES( 123456 );
+
+SELECT AVG( c ) FROM t1 UNION SELECT 1;
+SELECT 1 UNION SELECT AVG( c ) FROM t1;
+SELECT 1 UNION SELECT * FROM t2 UNION SELECT 1;
+SELECT c/1 FROM t1 UNION SELECT 1;
+
+DROP TABLE t1, t2;
Thread
bk commit into 5.1 tree (mhansson:1.2435) BUG#24791mhansson2 Mar