List:General Discussion« Previous MessageNext Message »
From:Peter Brawley Date:January 23 2009 9:31pm
Subject:Re: Need to pivot rows into columns
View as plain text  
Attila,

>I would like to select only the most recent "targettime" within 1 minute and
>only display only the rows that are "the latest" and print out all of the
>stats as columns on a per toolname basis:

One way: a three-step:

1. There is a formula (see "Group data by time periods" at 
http://www.artfulsoftware.com/queries.php) for calculating periods that 
are evenly divisible into 60 mins: |((60/periodMinutes) * HOUR( 
timevalue ) + FLOOR( MINUTE( timevalue ) / periodMinutes )). |For your 
query, periodMinutes=1, so it simplifies to 60*HOUR(target_time) + 
FLOOR(MINUTE(target_time)).

2. Use an exclusion join (see "Within-group aggregates" at 
http://www.artfulsoftware.com/queries.php) to find the latest rows to 
the nearest minute (from the above formula) per scenario and toolname group:

SELECT a.*
FROM data a
LEFT JOIN data b
  ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname
  AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) < 
60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time))
WHERE b.id IS NULL;

3. Use the above as the inner query for an outer query which pivots (see 
steps 2 and 3 in "Group column statistics in rows" at 
http://www.artfulsoftware.com/queries.php) the last two columns of the 
above result:

SELECT
  target_time,
  scenarioname,
  toolname,
  SUM( CASE statname WHEN 'byte_count' THEN statvalue ELSE 0 END ) AS bytes,
  SUM( CASE statname WHEN 'udp_count'  THEN statvalue ELSE 0 END ) AS udps,
  SUM( CASE statname WHEN 'tcp_count'  THEN statvalue ELSE 0 END ) AS tcps
FROM (
  SELECT a.target_time,a.scenarioname,a.toolname,a.statname,a.statvalue
  FROM data a
  LEFT JOIN data b
    ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname
    AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) < 
60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time))
  WHERE b.id IS NULL
) AS stats
GROUP BY target_time,scenarioname,toolname;
+---------------------+--------------+----------+-------+------+------+
| target_time         | scenarioname | toolname | bytes | udps | tcps |
+---------------------+--------------+----------+-------+------+------+
| 2009-01-21 19:34:00 | scenario1    | tool1    |   100 |  200 |  300 |
| 2009-01-21 19:34:00 | scenario1    | tool2    |   400 |  500 |  600 |
+---------------------+--------------+----------+-------+------+------+

PB

-----

Attila wrote:
> Hi,
>
> I have the following table:
>
> CREATE TABLE DATA (
>   TARGET_TIME datetime NOT NULL,
>   SCENARIONAME    varchar(20) NOT NULL,
>   TOOLNAME varchar(20) NOT NULL,
>   STATNAME varchar(100) NOT NULL,
>   STATVALUE int(10) NOT NULL,
>   PRIMARY KEY (TARGET_TIME, SCENARIONAME, SIMTOOLNAME, STATNAME)
> );
>
> I am trying to collect statistics from running tools and feeding them into
> the DB.
>
> If I have the following entries:
>
> ("2009-01-21 19:34:00", "scenario1", "tool1", "byte_count", 100),
> ("2009-01-21 19:34:00", "scenario1", "tool1", "udp_count", 200),
> ("2009-01-21 19:34:00", "scenario1", "tool1", "tcp_count", 300),
> ("2009-01-21 19:34:00", "scenario1", "tool2", "byte_count", 400),
> ("2009-01-21 19:34:00", "scenario1", "tool2", "udp_count", 500),
> ("2009-01-21 19:34:00", "scenario1", "tool2", "tcp_count", 600),
>
> ("2009-01-21 19:33:00", "scenario1", "tool1", "byte_count", 10),
> ("2009-01-21 19:33:00", "scenario1", "tool1", "udp_count", 20),
> ("2009-01-21 19:33:00", "scenario1", "tool1", "tcp_count", 30),
> ("2009-01-21 19:33:00", "scenario1", "tool2", "byte_count", 40),
> ("2009-01-21 19:33:00", "scenario1", "tool2", "udp_count", 50),
> ("2009-01-21 19:33:00", "scenario1", "tool2", "tcp_count", 60)
> (Notice that the "old" targettime  will be ignored in this particular query
> as we are only interested in the "latest")
>
> I would like to select only the most recent "targettime" within 1 minute and
> only display only the rows that are "the latest" and print out all of the
> stats as columns on a per toolname basis:
>
> Targettime                         Scenario        Toolname
> byte_count   udp_count     tcp_count
> 2009-01-21 19:34:00           scenario1       tool1
> 100             200               300
> 2009-01-21 19:34:00           scenario1       tool2              400
>            500               600
>
> The purpose of the query is to display the "latest statistics" for each
> scenario/toolname group.  It is important that the original data comes in
> rows (statname may not be known ahead of time, so it must be stored as a
> string in a field value).
>
> I am using MySQL and have found some possibilities with SQL Server (Pivot,
> crosstab, etc) ... but the requirements call for MySQL.
> Is the above query possible?
>
> Thanks a lot!
>   
> ------------------------------------------------------------------------
>
>
> No virus found in this incoming message.
> Checked by AVG - http://www.avg.com 
> Version: 8.0.176 / Virus Database: 270.10.12/1909 - Release Date: 1/22/2009 7:08 AM
>
>   

Thread
Need to pivot rows into columnsAttila22 Jan
  • RE: Need to pivot rows into columnsOFT)22 Jan
    • RE: Need to pivot rows into columnsmos22 Jan
  • Re: Need to pivot rows into columnsPeter Brawley23 Jan
    • Re: Need to pivot rows into columnsAttila1 Feb