List: General Discussion
From: (Hal
Date: July 13 2011 9:03pm
Subject: Re: How to Shuffle data
>>>> 2011/07/13 16:10 +0530, Adarsh Sharma >>>>
www.facebook.com/home
adelaide.yourguide.com/news/local/news/entertainment/cd-review-day-and-age-the-killers/1401702.aspx
abclive.in/abclive_business/2393.html
abclive.in/abclive_business/assocham_manufacturing_companies.html
abclive.in/abclive_business/b-ramalinga-raju-satyam-financial-irregularities.html
aktualne.centrum.cz/report/krimi/clanek.phtml?id=635342
aktualne.centrum.cz/report/krimi/clanek.phtml?id=635306

I want to write the sites' URLs to a TSV file in the forms below:

com.faebook.com/home
com.yourguide.adelaide/news/local/news/entertainment/cd-review-day-and-age-the-killers/1401702.aspx
in.abclive/abclive_business/2393.html
in.abclive/abclive_business/assocham_manufacturing_companies.html
in.abclive/abclive_business/b-ramalinga-raju-satyam-financial-irregularities.html
cz.centrum.aktualne/report/krimi/clanek.phtml?id=635306
cz.centrum.aktualne/report/krimi/clanek.phtml?id=635342

I need to shuffle the words separated by dots. Is there any built-in function in MySQL to achieve this?
<<<<<<<<


Well, this will give you the domain name: SUBSTRING_INDEX(url, '/', 1). After that, you
really want a version of "FIND_IN_SET" that takes a number and yields a string, but I
have not seen such in MySQL. That leaves you with "LOCATE" to find each dot, one by one,
and "SUBSTRING" to pick each word out, or with nested calls of "SUBSTRING_INDEX":
 SUBSTRING_INDEX(SUBSTRING_INDEX(dom, '.', i), '.', -1)
"SUBSTRING_INDEX" is very obliging: when i exceeds the number of separators it simply
returns the whole string. So the only way, using only it, to determine that one has
reached the last word is that
 SUBSTRING_INDEX(dom, '.', i) = SUBSTRING_INDEX(dom, '.', i+1)
And yes, this is a loop within an SQL procedure or function.
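
To make the two expressions concrete, here is a minimal sketch using one of the URLs from the question. The user variables @url and @dom are names picked only for illustration; the statements assume nothing beyond the built-in SUBSTRING_INDEX.

```sql
-- Illustrative variables; the URL is one of the sample rows from the question.
SET @url = 'aktualne.centrum.cz/report/krimi/clanek.phtml?id=635342';

-- Everything before the first slash is the domain name.
SET @dom = SUBSTRING_INDEX(@url, '/', 1);        -- 'aktualne.centrum.cz'

-- Pick out word i = 2: keep the first two words, then take the last of those.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(@dom, '.', 2), '.', -1) AS word2;   -- 'centrum'

-- End test: false while more words follow (i = 2), true at the last word (i = 3).
SELECT SUBSTRING_INDEX(@dom, '.', 2) = SUBSTRING_INDEX(@dom, '.', 3) AS at_end_2,  -- 0
       SUBSTRING_INDEX(@dom, '.', 3) = SUBSTRING_INDEX(@dom, '.', 4) AS at_end_3;  -- 1
```

The equality first holds when i equals the number of words, so a loop can stop right after handling the word for which the test is true.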

Are you, aside from 'com.faebook.com', only reversing the words? That is much easier than
randomly picking them for the outcome. It is also guaranteed to differ from the original
whenever the first and last words differ, which is relevant because most domain names are
so short that a random permutation of their words is quite likely to be the same as the
original: with only three, the probability is one sixth, with only twain, one half.
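
If plain reversal is what is wanted, the loop can be sketched as a stored function along these lines. The name reverse_domain and all variable names are my own, not anything built into MySQL, and the DELIMITER lines assume the mysql command-line client; treat it as a sketch rather than a finished solution.

```sql
DELIMITER //

-- Hypothetical helper: reverses the dot-separated words of the domain part of a URL.
CREATE FUNCTION reverse_domain(url TEXT) RETURNS TEXT
DETERMINISTIC
BEGIN
  DECLARE dom TEXT;
  DECLARE rest TEXT;
  DECLARE result TEXT DEFAULT '';
  DECLARE i INT DEFAULT 1;

  -- Without this guard the end test below would never become true for NULL.
  IF url IS NULL THEN
    RETURN NULL;
  END IF;

  SET dom  = SUBSTRING_INDEX(url, '/', 1);          -- domain name
  SET rest = SUBSTRING(url, CHAR_LENGTH(dom) + 1);  -- path, including the leading '/'

  words: LOOP
    -- Prepend word i; CONCAT_WS skips the NULL that stands for an empty result.
    SET result = CONCAT_WS('.',
                   SUBSTRING_INDEX(SUBSTRING_INDEX(dom, '.', i), '.', -1),
                   NULLIF(result, ''));
    -- Stop once asking for one more word changes nothing.
    IF SUBSTRING_INDEX(dom, '.', i) = SUBSTRING_INDEX(dom, '.', i + 1) THEN
      LEAVE words;
    END IF;
    SET i = i + 1;
  END LOOP words;

  RETURN CONCAT(result, rest);
END//

DELIMITER ;

SELECT reverse_domain('abclive.in/abclive_business/2393.html');
-- 'in.abclive/abclive_business/2393.html'
```

Note that 'www.facebook.com/home' comes out as 'com.facebook.www/home', not as the 'com.faebook.com/home' shown in the question. For the TSV file, SELECT ... INTO OUTFILE writes tab-separated fields by default, so selecting reverse_domain(url) from the table that holds the URLs should be enough.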
