From: Peter Brawley Date: January 23 2009 9:31pm Subject: Re: Need to pivot rows into columns List-Archive: http://lists.mysql.com/mysql/216011 Message-Id: <497A3746.5070306@earthlink.net> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="------------080602060300000109020905" --------------080602060300000109020905 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Attila, >I would like to select only the most recent "targettime" within 1 minute and >only display only the rows that are "the latest" and print out all of the >stats as columns on a per toolname basis: One way: a three-step: 1. There is a formula (see "Group data by time periods" at http://www.artfulsoftware.com/queries.php) for calculating periods that are evenly divisible into 60 mins: |((60/periodMinutes) * HOUR( timevalue ) + FLOOR( MINUTE( timevalue ) / periodMinutes )). |For your query, periodMinutes=1, so it simplifies to 60*HOUR(target_time) + FLOOR(MINUTE(target_time)). 2. Use an exclusion join (see "Within-group aggregates" at http://www.artfulsoftware.com/queries.php) to find the latest rows to the nearest minute (from the above formula) per scenario and toolname group: SELECT a.* FROM data a LEFT JOIN data b ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) < 60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time)) WHERE b.id IS NULL; 3. Use the above as the inner query for an outer query which pivots (see steps 2 and 3 in "Group column statistics in rows" at http://www.artfulsoftware.com/queries.php) the last two columns of the above result: SELECT target_time, scenarioname, toolname, SUM( CASE statname WHEN 'byte_count' THEN statvalue ELSE 0 END ) AS bytes, SUM( CASE statname WHEN 'udp_count' THEN statvalue ELSE 0 END ) AS udps, SUM( CASE statname WHEN 'tcp_count' THEN statvalue ELSE 0 END ) AS tcps FROM ( SELECT a.target_time,a.scenarioname,a.toolname,a.statname,a.statvalue FROM data a LEFT JOIN data b ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) < 60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time)) WHERE b.id IS NULL ) AS stats GROUP BY target_time,scenarioname,toolname; +---------------------+--------------+----------+-------+------+------+ | target_time | scenarioname | toolname | bytes | udps | tcps | +---------------------+--------------+----------+-------+------+------+ | 2009-01-21 19:34:00 | scenario1 | tool1 | 100 | 200 | 300 | | 2009-01-21 19:34:00 | scenario1 | tool2 | 400 | 500 | 600 | +---------------------+--------------+----------+-------+------+------+ PB ----- Attila wrote: > Hi, > > I have the following table: > > CREATE TABLE DATA ( > TARGET_TIME datetime NOT NULL, > SCENARIONAME varchar(20) NOT NULL, > TOOLNAME varchar(20) NOT NULL, > STATNAME varchar(100) NOT NULL, > STATVALUE int(10) NOT NULL, > PRIMARY KEY (TARGET_TIME, SCENARIONAME, SIMTOOLNAME, STATNAME) > ); > > I am trying to collect statistics from running tools and feeding them into > the DB. > > If I have the following entries: > > ("2009-01-21 19:34:00", "scenario1", "tool1", "byte_count", 100), > ("2009-01-21 19:34:00", "scenario1", "tool1", "udp_count", 200), > ("2009-01-21 19:34:00", "scenario1", "tool1", "tcp_count", 300), > ("2009-01-21 19:34:00", "scenario1", "tool2", "byte_count", 400), > ("2009-01-21 19:34:00", "scenario1", "tool2", "udp_count", 500), > ("2009-01-21 19:34:00", "scenario1", "tool2", "tcp_count", 600), > > ("2009-01-21 19:33:00", "scenario1", "tool1", "byte_count", 10), > ("2009-01-21 19:33:00", "scenario1", "tool1", "udp_count", 20), > ("2009-01-21 19:33:00", "scenario1", "tool1", "tcp_count", 30), > ("2009-01-21 19:33:00", "scenario1", "tool2", "byte_count", 40), > ("2009-01-21 19:33:00", "scenario1", "tool2", "udp_count", 50), > ("2009-01-21 19:33:00", "scenario1", "tool2", "tcp_count", 60) > (Notice that the "old" targettime will be ignored in this particular query > as we are only interested in the "latest") > > I would like to select only the most recent "targettime" within 1 minute and > only display only the rows that are "the latest" and print out all of the > stats as columns on a per toolname basis: > > Targettime Scenario Toolname > byte_count udp_count tcp_count > 2009-01-21 19:34:00 scenario1 tool1 > 100 200 300 > 2009-01-21 19:34:00 scenario1 tool2 400 > 500 600 > > The purpose of the query is to display the "latest statistics" for each > scenario/toolname group. It is important that the original data comes in > rows (statname may not be known ahead of time, so it must be stored as a > string in a field value). > > I am using MySQL and have found some possibilities with SQL Server (Pivot, > crosstab, etc) ... but the requirements call for MySQL. > Is the above query possible? > > Thanks a lot! > > ------------------------------------------------------------------------ > > > No virus found in this incoming message. > Checked by AVG - http://www.avg.com > Version: 8.0.176 / Virus Database: 270.10.12/1909 - Release Date: 1/22/2009 7:08 AM > > --------------080602060300000109020905--