List:General Discussion« Previous MessageNext Message »
From:Marcus Claesson Date:December 3 2007 12:48pm
Subject:Help with SQL query construction
View as plain text  
Hi!

I have a "SQL query construction" question that I hope someone can help
me with. After comparing a bunch of DNA fragments (see name below) with
a larger reference sequence I get a ordered list ranked according to
similarities, and with start/stop co-ordinates where the fragments map
to the reference sequence:

+------+------+-------+------+----------+
| name | rank | start | stop | sub_rank |
+------+------+-------+------+----------+
| A    |    1 |     1 | 1000 |   NULL   |
| B    |    2 |     2 |  998 |   NULL   |
| C    |    4 |  1100 | 2000 |   NULL   |
| D    |    3 |  3050 | 4100 |   NULL   |
| E    |    5 |  2040 | 3000 |   NULL   | 
| F    |    6 |  1102 | 2000 |   NULL   |
| G    |    7 |  1098 | 1998 |   NULL   |
| H    |    8 |  3048 | 4100 |   NULL   |
| I    |    9 |  3051 | 4102 |   NULL   |
+------+------+-------+------+----------+

A graphical representation of fragments mapped to the ref sequence:

ref========================================>
1  A------>
2  B---->
3                              D------>
4           C------>
5                     E---->
6           F------->
7          G------->
8                              H------->
9                               I------->

Now, I want to group fragments in each overlapping position and sub-rank
them according to their rank in that position. The final table would
then look like:
+------+------+-------+------+----------+
| name | rank | start | stop | sub_rank |
+------+------+-------+------+----------+
| A    |    1 |     1 | 1000 |     1    |
| B    |    2 |     2 |  998 |     2    |
| C    |    4 |  1100 | 2000 |     1    |
| D    |    3 |  3050 | 4100 |     1    |
| E    |    5 |  2040 | 3000 |     1    |
| F    |    6 |  1102 | 2000 |     2    |
| G    |    7 |  1098 | 1998 |     3    |
| H    |    8 |  3048 | 4100 |     2    |
| I    |    9 |  3051 | 4102 |     3    |
+------+------+-------+------+----------+

Is this possible to achieve using SQL queries alone (perhaps with GROUP
BY, nested SELECTs etc)?

I've managed to do this with a Perl-DBI script, but would much prefer to
do it completely with MySQL instead. The Perl code is below and below
that is the MySQL-dump of the test data set...

Many thanks in advance!
Marcus


while (@{$dbh->selectcol_arrayref("SELECT rank FROM test WHERE sub_rank
IS NULL")}) {
    @null_sub_ranks = @{$dbh->selectcol_arrayref("SELECT rank FROM test
WHERE sub_rank IS NULL AND NOT (start>=(SELECT stop FROM test WHERE rank
= (SELECT min(rank) FROM test WHERE sub_rank IS NULL)) OR stop <=
(SELECT start FROM test WHERE rank = (SELECT min(rank) FROM test WHERE
sub_rank IS NULL)))")};
    for ($rank=0; $rank < scalar(@null_sub_ranks); $rank++ ) {
	$sub_rank = $rank + 1;
	$dbh->do("UPDATE test SET sub_rank=$sub_rank WHERE rank=
$null_sub_ranks[$rank]");
    }
}


-- MySQL dump 10.10
--
-- Host: localhost    Database: bxb
-- ------------------------------------------------------
-- Server version       5.0.22

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS,
FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO'
*/;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `test`
--

DROP TABLE IF EXISTS `test`;
CREATE TABLE `test` (
  `name` text,
  `rank` int(11) default NULL,
  `start` int(11) default NULL,
  `stop` int(11) default NULL,
  `sub_rank` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `test`
--


/*!40000 ALTER TABLE `test` DISABLE KEYS */;
LOCK TABLES `test` WRITE;
INSERT INTO `test` VALUES
('A',1,1,1000,NULL),('B',2,2,998,NULL),('C',4,1100,2000,NULL),('D',3,3050,4100,NULL),('E',5,2040,3000,NULL),('F',6,1102,2000,NULL),('G',7,1098,1998,NULL),('H',8,3048,4100,NULL),('I',9,3051,4102,NULL);
UNLOCK TABLES;
/*!40000 ALTER TABLE `test` ENABLE KEYS */;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

Thread
Help with SQL query constructionMarcus Claesson3 Dec
  • RE: Help with SQL query constructionJeff Mckeon3 Dec
  • Re: Help with SQL query constructionPeter Brawley3 Dec