Hello Darla and Martin,
I was completely wrong here, thank you, Martin, for correcting
me. Your explanation to the behavior of rand() was absolutely correct.
The most funniest thing is that I was asked about a month ago
very similar question as this was and I gave very similar answer
as Martin's was and now I'm falling to same little trap myself ;-)
....very silly me....
I will show you the previous mail, it will show you how confusing the
rand() can be sometimes. In the end there is also comment from Monty
which states that the behaviour of RAND() will not be changed, at
least for a while. And maybe I learn something from myself too... ;-)
----------------------------------------------------------------------
>> Hello Jamppa.
>> Can you explain this.
>> Check the value from TAL
>>
mysql> select * from forening;
>> +-------------+------------------------+
>> | Föreningsnr | Föreningsnamn |
>> +-------------+------------------------+
>> | 29 | OK Västra Sverige |
>> | 34 | OK Norrbotten |
>> | 36 | OK Syd |
>> | 37 | OK Stockholm |
>> | 40 | OK Värmland |
>> | 41 | OK Västerbotten |
>> | 53 | OK Köping |
>> | 58 | OK Gotland |
>> | 65 | OK Örnsköldsvik |
>> | 68 | OK Piteå |
>> | 50 | OKP Stationsutveckling |
>> | 00 | Okänd |
>> +-------------+------------------------+
>> 12 rows in set (0.01 sec)
>>
mysql> select föreningsnamn ,RAND() AS TAL from forening having TAL <
mysql> 0.1;
>> +---------------+--------+
>> | föreningsnamn | TAL |
>> +---------------+--------+
>> | OK Köping | 0.6521 | <-- Grather than 0.1
>> | OK Gotland | 0.5186 | <-- Grather than 0.1
>> | Okänd | 0.1073 | <-- Grather than 0.1
>> +---------------+--------+
>> 3 rows in set (0.00 sec)
>>
>>
>> Allan
Jani> Hello Allan,
Jani> This is very interesting issue!
Jani> Logically thinking it seems to be a bug at first, but
Jani> on the second thought it is correct behavior...
Jani> If you write:
Jani> select föreningsnamn ,RAND() AS TAL from forening having TAL < 0.1;
Jani> it is the same thing as if you write:
Jani> select föreningsnamn ,RAND() AS TAL from forening having RAND() < 0.1;
> This is correct.
Jani> But from the second query one can see that the first RAND() gives
Jani> different values than the second RAND(). Now the values shown as
Jani> 'TAL' are not the same values that are in the 'having' -clause test;
Jani> the latter ones we cannot know anything about.
Jani> The question is that should MySQL store the values it gets
Jani> from the first part of the query (...RAND() as TAL...) and
Jani> use them in the test or not? The RAND() function is the only
Jani> one, which can cause this kind of behavior, I think.
> Yes.
Jani> I wonder what Monty says about it, but I'm afraid that fixing
Jani> this would require planning the HAVING function in MySQL in
Jani> totally new way, it is probably a rather big job...
> Yes; I haven't any plans to fix this in the near future as one in
> cases like this doesn't need the RAND() column in the output for anything...
>
> Regards,
> Monty
----------------------------------------------------------------------
Darla Baker writes:
> I think the thing I find most interesting is that I got the
> original idea from
> reading the O'Reilly book "MySQL and mSQL" where on page 110 there is an
> example which uses virtually the same query:
>
> SELECT name
> FROM entries
> WHERE id = ROUND( (RAND()*34) + 1)
>
> and in the comments it states "...which should match one of the
> ID numbers in
> the table."
>
> Darla
Have to say, the book is wrong here. It will return from 0 to 34 rows,
34 with very little probability. Let's assume there are rows 1 - 34 in
the id.
In principle it goes like this:
MySQL finds the first row from the table, which contains some number
in id, for example number 5. (There is no rule in which order the rows
will be tested, it is up to MySQL what the program desides is best
choice)
Then MySQL does a test: is 5 = ROUND( (RAND()*34) + 1)
Well ROUND( (RAND()*34) + 1) will return 5 with probability
1/34 and with good luck the number 5 is returned as id.
Then MySQL finds the next row from the table, let's say id is this
time 6 and the test is 6 = ROUND( (RAND()*34) + 1) ? Same 1/34
probability again. And so on the testing goes like that until all the
rows in the table are tested. When the result is then printed the
'lucky' id's from the table are printed, probably none.
This is exactly same as Martin explains below, with just little
different words.
> Martin Ramsch wrote:
>
> > On Sun, 1999-10-03 19:22:40 +0300, Jani Tolonen wrote:
> > > This is a bug. rand() doesn't work in the where statement. We'll
> > > fix this ASAP, send the patch to list and fix this in the newer
> > > versions of MySQL.
> >
> > Maybe I'm completely wrong here, but I don't think that Darla Baker
> > has found a bug in the basic sense ...
> >
> > Example (using MySQL V3.22.19b):
> >
> > CREATE TABLE tmp ( id INT );
> > INSERT INTO tmp VALUES (1), (2), (3), (4);
> > SELECT * FROM tmp WHERE id=CEILING(RAND()*4);
> >
> > The big question is, when does RAND() compute a random number?
> > Only once for the complete select statement?
> > Or once for each row?
> >
> > The current behaviour is to compute a new random number for each row,
> > so for the above select statement the probability to get a certain
> > number of result rows is:
> > 75% * 75% * 75% * 75% * (0 over 4) = 31.6% for no row
> > 25% * 75% * 75% * 75% * (1 over 4) = 42.2% for 1 row
> > 25% * 25% * 75% * 75% * (2 over 4) = 21.1% for 2 rows
> > 25% * 25% * 25% * 75% * (3 over 4) = 4.7% for 3 rows
> > 25% * 25% * 25% * 25% * (4 over 4) = 0.4% for 4 rows
> > And actually, running the query 1000 times I got
> > 300, 439, 214, 43, and 4 results with 0, 1, 2, 3, and 4 rows.
> > So it does work!
> >
> > I think, the 'once per row' behaviour makes most sense if RAND() is
> > used in the SELECT part or with SET in UPDATEs. And 'once per select'
> > makes more sense in the WHERE and HAVING part.
> >
> > Think of an UPDATE statement like this:
> > "Update a random row with a random value"
> >
> > UPDATE tmp SET id=CEILING(RAND()*4) WHERE id=CEILING(RAND()*4);
> > ^^^^^^term1^^^^^^ ^^^^^^term2^^^^^^
> >
> > Here one would expect, that term2 is computed first giving a certain
> > fixed random value which is used to choose the relevant columns, and
> > then for each of these columns term1 is evaluated to give a new random
> > number each.
This is correct.
> > Is the 'bug fix' about changing behaviour along these lines?
> >
> > Regards,
> > Martin
> > --
No. There will be no 'bug fix' in this case, I was wrong.
Maybe a 'behavior' change in the future, but just maybe.
The problem is that changing the behavior of rand() would require
changing the whole select and where structure in MySQL and it is
not a small job. And there is always the question that which is
actually the expected behavior; the one that feels right at
the first thought, or the second after thinking little further
the whole thing. I guess the current behavior is correct after all.
Just one more example:
mysql> select rand(),rand(),rand() from test limit 3;
+------------------+-------------------+------------------+
| rand() | rand() | rand() |
+------------------+-------------------+------------------+
| 0.51334429021305 | 0.43675433978136 | 0.6437388590013 |
| 0.90843130639608 | 0.61093998571014 | 0.3294060400961 |
| 0.81421027408374 | 0.082833280864016 | 0.97153561280252 |
+------------------+-------------------+------------------+
3 rows in set (0.00 sec)
All the values differ. And it doesn't matter where the 'rand()' is, in
the actual select, where, having, etc. part of the query; the value is
always evaluated again and again, for any rand() that exists.
I am very sorry, if I managed to cause confusion with my previous
mail. Apologies.
Best Regards,
- Jani
--
+-------------------------------------------------------------+
| TcX ____ __ _____ _____ ___ |
| /*/\*\/\*\ /*/ \*\ /*/ \*\ |*| Jani Tolonen |
| /*/ /*/ /*/ \*\_ |*| |*||*| jani@stripped |
| /*/ /*/ /*/\*\/*/ \*\|*| |*||*| Helsinki |
| /*/ /*/ /*/\*\_/*/ \*\_/*/ |*|_____ Finland |
| ^^^^^^^^^^^^/*/^^^^^^^^^^^\*\^^^^^^^^^^^ |
| /*/ \*\ Developers Team |
+-------------------------------------------------------------+