Neroux,
RAND() gives a roughly rectangular distribution, not a normal
distribution, so samples of fewer than 100 or so values from it are
likely to show large but statistically _insignificant_ differences. In
larger samples, I don't see any tendency of RAND() to produce more
values near its mean (.5) than near its limits (0,1). You can check this
yourself by populating a table with RAND() values then running a simple
crosstab query eg
SELECT
SUM(IF(r<0.1,1,0)) AS P1,
SUM(IF(r>=0.1 AND r<0.2,1,0)) AS P2,
SUM(IF(r>=0.2 AND r<0.3,1,0)) AS P3,
SUM(IF(r>=0.3 AND r<0.4,1,0)) AS P4,
SUM(IF(r>=0.4 AND r<0.5,1,0)) AS P5,
SUM(IF(r>=0.5 AND r<0.6,1,0)) AS P6,
SUM(IF(r>=0.6 AND r<0.7,1,0)) AS P7,
SUM(IF(r>=0.7 AND r<0.8,1,0)) AS P8,
SUM(IF(r>=0.8 AND r<0.9,1,0)) AS P9,
SUM(IF(r>=0.9 AND r<1.0,1,0)) AS P10
FROM randresults;
PB
-----
neroux wrote:
> Hello,
>
> I am having a table with an integer column which I use for probability
> calculations (the higher a value the more often it is chosen).
>
> Now I am having the following query, which should actually incorporate
> these probabilities, however it seems to prefer values from the middle
> range
>
> SELECT * FROM table WHERE field>=(SELECT RAND()*MAX(field) FROM
> table) ORDER BY field LIMIT 1
>
> If I split it up it still seems to prefer mid-range values over values
> closer to the higher end, however it doesnt "completely" ignore the
> highest value anymore.
>
> random_value = SELECT RAND()*MAX(field) FROM table;
> SELECT * FROM table WHERE field>=random_value ORDER BY field LIMIT 1
>
> Does anyone have an explanation for that or a better solution? Thanks
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam? Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
>
--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.375 / Virus Database: 267.15.11/264 - Release Date: 2/17/2006