There are a couple of problems with using any database for doing this.
- Rows in a table are inherently in no particular order. That means they are
neither sorted nor random.
- Depending upon the keys you are using (an auto-increment field for example),
you might be able to select a random row; but that wouldn't guarantee that you
wouldn't get the same row twice.
There is a way around this that sounds tempting, but it would be hideously
1. Create a copy of your original table, `t2`, with an auto-increment index
`aui` (either from the original table or created at this point.
2. Create yet another table, `t3`, with the same structure but no data in it.
3. Select a random record from `t2` using
SELECT ROUND(RAND()*MAX(`t2`.`aui`)) FROM `t2` INTO @RANDREC;
4. Copy that randomly-selected record into `t3` using
INSERT INTO `t3` SELECT * FROM `t2` WHERE `t2`.`aui` = @RANDREC;
5. Remove the selected record from `t2` using
DELETE FROM `t2` WHERE `t2`.`aui` = @RANDREC;
6. Loop back to step 3 until `t2` is empty.
Here's why, tempting as it is, it wouldn't work: as the number of records in
`t2` dwindles, it will become less and less likely that @RANDREC will actually
match an existing value of `t2`.`aui`. By the time you got down to a handful
of records, it might take years to find one.
Now if the LIMIT clause accepted variables instead of constants, you could
replace steps 3 and 4 with something like this:
SELECT ROUND(RAND()*COUNT(`t2`.`aui`)) INTO @RANDSTART;
SELECT FROM `t2` LIMIT @RANDSTART, 1;
And that would actually work (give or take some corner cases I haven't thought
about very hard).
In point of fact, that's basically what you need to do whether you do it in
MySQL or in an external program. The difference is that in an external program
you can effectively renumber your rows as you remove them, so that it won't
get harder and harder to find an actual row. Even then, a random shuffle is
likely to be time-consuming.
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032
860.674.8796 / FAX: 860.674.8341
Web site: www.the-infoshop.com
>From: Adarsh Sharma [mailto:adarsh.sharma@stripped]
>Sent: Wednesday, July 13, 2011 9:59 AM
>Subject: Re: How to Shuffle data
>Do i need to move output to a CSV file & then write a Java or C code for it.
>I think a procedure can do it easily , If I know the proper function for
>> What you want to do is not shuffle, but reverse the domain order, starting
>> from the TLD (top level domain). You may pick the string functions you
>> need from the string functions page of the manual, but the problem is that
>> there may be any number of subdomains, so that in principle you would need
>> a loop construct. If I were in your position, I wouldn't use MySQL to do
>>> Dear all,
>>> I have million of sites stored in url column of a mysql table.
>>> Some few examples are :-
>>> I want to take the output in a tsv file the sites url in the below forms :
>>> I need to shuffle the . words . Is there any in built function in mysql
>>> to achieve this.
>>> MySQL General Mailing List
>>> For list archives: http://lists.mysql.com/mysql
>>> To unsubscribe: