From: Peter Brawley Date: December 3 2007 4:12pm Subject: Re: Help with SQL query construction List-Archive: http://lists.mysql.com/mysql/210307 Message-Id: <47542B05.6050905@earthlink.net> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Marcus, >I've managed to do this with a Perl-DBI script, but >would much prefer to do it completely with MySQL instead. You could port it to a recursive stored procedure. It would probably be slower, and what would you have gained? PB Marcus Claesson wrote: > Hi! > > I have a "SQL query construction" question that I hope someone can help > me with. After comparing a bunch of DNA fragments (see name below) with > a larger reference sequence I get a ordered list ranked according to > similarities, and with start/stop co-ordinates where the fragments map > to the reference sequence: > > +------+------+-------+------+----------+ > | name | rank | start | stop | sub_rank | > +------+------+-------+------+----------+ > | A | 1 | 1 | 1000 | NULL | > | B | 2 | 2 | 998 | NULL | > | C | 4 | 1100 | 2000 | NULL | > | D | 3 | 3050 | 4100 | NULL | > | E | 5 | 2040 | 3000 | NULL | > | F | 6 | 1102 | 2000 | NULL | > | G | 7 | 1098 | 1998 | NULL | > | H | 8 | 3048 | 4100 | NULL | > | I | 9 | 3051 | 4102 | NULL | > +------+------+-------+------+----------+ > > A graphical representation of fragments mapped to the ref sequence: > > ref========================================> > 1 A------> > 2 B----> > 3 D------> > 4 C------> > 5 E----> > 6 F-------> > 7 G-------> > 8 H-------> > 9 I-------> > > Now, I want to group fragments in each overlapping position and sub-rank > them according to their rank in that position. The final table would > then look like: > +------+------+-------+------+----------+ > | name | rank | start | stop | sub_rank | > +------+------+-------+------+----------+ > | A | 1 | 1 | 1000 | 1 | > | B | 2 | 2 | 998 | 2 | > | C | 4 | 1100 | 2000 | 1 | > | D | 3 | 3050 | 4100 | 1 | > | E | 5 | 2040 | 3000 | 1 | > | F | 6 | 1102 | 2000 | 2 | > | G | 7 | 1098 | 1998 | 3 | > | H | 8 | 3048 | 4100 | 2 | > | I | 9 | 3051 | 4102 | 3 | > +------+------+-------+------+----------+ > > Is this possible to achieve using SQL queries alone (perhaps with GROUP > BY, nested SELECTs etc)? > > I've managed to do this with a Perl-DBI script, but would much prefer to > do it completely with MySQL instead. The Perl code is below and below > that is the MySQL-dump of the test data set... > > Many thanks in advance! > Marcus > > > while (@{$dbh->selectcol_arrayref("SELECT rank FROM test WHERE sub_rank > IS NULL")}) { > @null_sub_ranks = @{$dbh->selectcol_arrayref("SELECT rank FROM test > WHERE sub_rank IS NULL AND NOT (start>=(SELECT stop FROM test WHERE rank > = (SELECT min(rank) FROM test WHERE sub_rank IS NULL)) OR stop <= > (SELECT start FROM test WHERE rank = (SELECT min(rank) FROM test WHERE > sub_rank IS NULL)))")}; > for ($rank=0; $rank < scalar(@null_sub_ranks); $rank++ ) { > $sub_rank = $rank + 1; > $dbh->do("UPDATE test SET sub_rank=$sub_rank WHERE rank= > $null_sub_ranks[$rank]"); > } > } > > > -- MySQL dump 10.10 > -- > -- Host: localhost Database: bxb > -- ------------------------------------------------------ > -- Server version 5.0.22 > > /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */; > /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */; > /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */; > /*!40101 SET NAMES utf8 */; > /*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */; > /*!40103 SET TIME_ZONE='+00:00' */; > /*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */; > /*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, > FOREIGN_KEY_CHECKS=0 */; > /*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' > */; > /*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */; > > -- > -- Table structure for table `test` > -- > > DROP TABLE IF EXISTS `test`; > CREATE TABLE `test` ( > `name` text, > `rank` int(11) default NULL, > `start` int(11) default NULL, > `stop` int(11) default NULL, > `sub_rank` int(11) default NULL > ) ENGINE=MyISAM DEFAULT CHARSET=latin1; > > -- > -- Dumping data for table `test` > -- > > > /*!40000 ALTER TABLE `test` DISABLE KEYS */; > LOCK TABLES `test` WRITE; > INSERT INTO `test` VALUES > ('A',1,1,1000,NULL),('B',2,2,998,NULL),('C',4,1100,2000,NULL),('D',3,3050,4100,NULL),('E',5,2040,3000,NULL),('F',6,1102,2000,NULL),('G',7,1098,1998,NULL),('H',8,3048,4100,NULL),('I',9,3051,4102,NULL); > UNLOCK TABLES; > /*!40000 ALTER TABLE `test` ENABLE KEYS */; > /*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */; > > /*!40101 SET SQL_MODE=@OLD_SQL_MODE */; > /*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */; > /*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */; > /*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */; > /*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */; > /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */; > /*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */; > > >