From: Michael R. Gile
Date: April 19 1999 12:22am
Subject: Re: Serious sum() bug in MySQL 3.22.21
List-Archive: http://lists.mysql.com/mysql/2042
Message-Id: <4.2.0.32.19990418185559.03f35ba0@ns2.wsg.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
At 11:42 PM 4/18/1999 +0300, Michael Widenius wrote:
>Hi!
>
>>>>>> "Michael" == Michael R Gile writes:
>
>>> Hi!
>>>
>>> This is probably a bug in the FreeBSD math library.
>
>Michael> as you can see, monty doesn't like freebsd (and is very quick to blame it)
>
>As I at the same time got a mail that stated 'Works find on intel
>linux 2.0.36', I was inclined to think this was a precision bug in the atof()
>function. I don't have anything against FreeBsd.
Then I retract my statement. I was referring to some previous responses from you where you have indicated your strong preference of Linux over FreeBSD. Even if this particular problem occured on FreeBSD and not Linux, that could indicate that Mysql used a function in a manner different from its original intention. In any case, I must have misinterpreted your statement. However, this all has little bearing on the original questions, so I will try to return to the subject at hand.
>Michael> however, i get the same (-0.00) result on the following systems
>Michael> yes, i know they are not the most recent version, but should at least indicate that it isn't just a "freebsd" bug
>
>Michael> freebsd 2.2.7 Mysql 3.22.14b-gamma
>Michael> RedHat Linux 5.2 Mysql 3.22.10-beta
>Michael> solaris 2.5 Mysql 3.21.33b
>
>Michael> (note that the `dc' application produces 0 in all cases, not -0.00)
>
>>> Note that its common practice in most languages to NEVER compare a
>>> floating point value that may have fractions with '=', but instead
>>> always use:
>>>
>>> total between -0.5 and 0.5
>
>Michael> this is good advice. why do you not use it in Mysql (i take it this is the question you answer next)?
>
>>> Note that in MySQL 3.23 you will have not fixed point float fields, so
>>> in this case the sum above will be a very small value <> 0
>>>
>>> I think that if I change the '=' compare in MySQL to be 'smarter' it
>>> will brake more applications than it will fix :(
>
>Michael> don't you think that *NOT* having fixed point float fields will break more applications than it will fix? it is my understanding that all many SQL applications depend on the fact that the values are fixed point.
>
>The problem is not that easy to solve genreally within MySQL. If
>MySQL did some rounding when doing comparisons you will get problems
>with queries like:
>
>SELECT * from table_name WHERE floating_point_value > 0
>
>and
>
>SELECT * from table_name WHERE floating_point_value = 0
>
>The same row will be found in both of the above queries !
>(This is even more serious than the original problem)
>
>Note that if you declare your column as:
>
>amount decimal(10,2)
>
>This means that the stored value will be exact.
as in the example, yes.
so what you are saying is that Mysql stores the data exactly (not floating)
>This doesn't guarantee that the sum() of the values will be exact or
>even that a comparison of type:
>
>select * from table where amount = 10.95
>
>will find any rows. This is because 10.50 is here given as an
>floating point value, which is not an exact value for a computer.
>
>(MySQL for example calculates sum() with double precision)
so even though Mysql _STORES_ data as a fixed point, it doesn't perform fixed point math.
>I shall add to crash-me a test how different databases solves this,
>but I fear the result is as it's in other languages:
well, i know that if my calculator did this with a simple sum, i would replace it.
>You should never compare a floating point value with =
I think the distinction here is that no one was aware of the fact that the sum of a fixed point field was calculated with floating point math.
I don't believe anyone disagrees that it is bad to compare floating points with equals, but lets get back to the original question, and illustrate why.
tcobb> INSERT INTO test_decimal_bug VALUES (1,19.95);
tcobb> INSERT INTO test_decimal_bug VALUES (2,0.00);
tcobb> INSERT INTO test_decimal_bug VALUES (3,-39.95);
tcobb> INSERT INTO test_decimal_bug VALUES (4,20.00);
Since the 19.95 and 39.95 both have the same fraction, we may wonder why this answer is being presented.
Well, let us try to repeat this with a simple C program to verify that Mysql is not at fault.
#include
main() {
printf("%f + %f + %f = %2.20f\n", 19.95, -39.95, 20.00,
(19.95-39.95+20.00) );
printf("%f + %f + %f = %2.20f\n", 20.00, 19.95, -39.95,
(20.00+19.95-39.95) );
}
On almost every C compiler I could find, this yeilds
19.950000 + -39.950000 + 20.000000 = -0.00000000000000355271
20.000000 + 19.950000 + -39.950000 = 0.00000000000000000000
As Monty says, this shows that the number is indeed less than zero.
The reason why the first result is not equal to the second one is that there are still a fixed number of bits in the floating point values. Since 39 > 2^5, and 19.95 < 2^5, they have a DIFFERENT NUMBER of remaining bits for the fraction. This error is propagated to the answer.
Now that we know the source of the problem, what is the solution?
I believe that traditional databases like COBOL use BCD (binary coded decimals) for all math. However, the only commercial C++ compiler that I am aware of which is shipped with a BCD library is Borland/Inprise (which is certainly not available for UNIX).
>Regards,
>Monty
>
>PS: If anyone can give me a set of rules that solves this and will
> ALWAYS work, I am more than happy to implement these.
>
> I have thought about defining the comparison for fixed floating
> point values to:
>
> floor(value1 * 10^decimals) op floor(value2 * 10^decimals)
>
> but this will only shift the precision and give other problems
> with larger decimal values.
I think we now understand that all decimal math in Mysql is performed in floating point. I think that the Mysql documenation needs make this VERY clear that the decimal(x,y) notation is purely for storage, and HAS NO EFFECT ON CALCULATIONS. Aside from locating a BCD library, I agree with you now that there is no point in creating new problems by using any of these hacks.
What I _personally_ would like to see is that any operations which are performed on floating point variables should produce a floating point result which makes it very clear that the variable is not an fixed point value. In other words, if the result is -0.00000000000000355271, then show it (and more importantly, return it to the perl and c Mysql libraries). This should make it more clear that floating point math is involved, just as if the variable was calculated from within the host language. If someone wants a less precise result, then force them to round() it.
In summary, I (and the originator of this message) had wrongly assumed that the fixed point storage of data of Mysql also meant that calculations were performed with fixed point math. Thanks Monty, for clearing up this issue.
======================================================
Michael Gile gilem@stripped
President (518)435-0682
Web Services Group http://www.wsg.net/