List:General Discussion« Previous MessageNext Message »
From:Attila Date:February 1 2009 4:11pm
Subject:Re: Need to pivot rows into columns
View as plain text  
Hey,
Thanks a lot! I didn't see this response until now.

I did not try it as I went with the column based solution. However
this looks like it will work.

I will implement and let you know.

Thanks again,



On 1/23/09, Peter Brawley <peter.brawley@stripped> wrote:
> Attila,
>
>>I would like to select only the most recent "targettime" within 1 minute
>> and
>>only display only the rows that are "the latest" and print out all of the
>>stats as columns on a per toolname basis:
>
> One way: a three-step:
>
> 1. There is a formula (see "Group data by time periods" at
> http://www.artfulsoftware.com/queries.php) for calculating periods that
> are evenly divisible into 60 mins: |((60/periodMinutes) * HOUR(
> timevalue ) + FLOOR( MINUTE( timevalue ) / periodMinutes )). |For your
> query, periodMinutes=1, so it simplifies to 60*HOUR(target_time) +
> FLOOR(MINUTE(target_time)).
>
> 2. Use an exclusion join (see "Within-group aggregates" at
> http://www.artfulsoftware.com/queries.php) to find the latest rows to
> the nearest minute (from the above formula) per scenario and toolname group:
>
> SELECT a.*
> FROM data a
> LEFT JOIN data b
>   ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname
>   AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) <
> 60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time))
> WHERE b.id IS NULL;
>
> 3. Use the above as the inner query for an outer query which pivots (see
> steps 2 and 3 in "Group column statistics in rows" at
> http://www.artfulsoftware.com/queries.php) the last two columns of the
> above result:
>
> SELECT
>   target_time,
>   scenarioname,
>   toolname,
>   SUM( CASE statname WHEN 'byte_count' THEN statvalue ELSE 0 END ) AS bytes,
>   SUM( CASE statname WHEN 'udp_count'  THEN statvalue ELSE 0 END ) AS udps,
>   SUM( CASE statname WHEN 'tcp_count'  THEN statvalue ELSE 0 END ) AS tcps
> FROM (
>   SELECT a.target_time,a.scenarioname,a.toolname,a.statname,a.statvalue
>   FROM data a
>   LEFT JOIN data b
>     ON a.scenarioname=b.scenarioname AND a.toolname=b.toolname
>     AND 60*HOUR(a.target_time)+FLOOR(MINUTE(a.target_time)) <
> 60*HOUR(b.target_time)+FLOOR(MINUTE(b.target_time))
>   WHERE b.id IS NULL
> ) AS stats
> GROUP BY target_time,scenarioname,toolname;
> +---------------------+--------------+----------+-------+------+------+
> | target_time         | scenarioname | toolname | bytes | udps | tcps |
> +---------------------+--------------+----------+-------+------+------+
> | 2009-01-21 19:34:00 | scenario1    | tool1    |   100 |  200 |  300 |
> | 2009-01-21 19:34:00 | scenario1    | tool2    |   400 |  500 |  600 |
> +---------------------+--------------+----------+-------+------+------+
>
> PB
>
> -----
>
> Attila wrote:
>> Hi,
>>
>> I have the following table:
>>
>> CREATE TABLE DATA (
>>   TARGET_TIME datetime NOT NULL,
>>   SCENARIONAME    varchar(20) NOT NULL,
>>   TOOLNAME varchar(20) NOT NULL,
>>   STATNAME varchar(100) NOT NULL,
>>   STATVALUE int(10) NOT NULL,
>>   PRIMARY KEY (TARGET_TIME, SCENARIONAME, SIMTOOLNAME, STATNAME)
>> );
>>
>> I am trying to collect statistics from running tools and feeding them into
>> the DB.
>>
>> If I have the following entries:
>>
>> ("2009-01-21 19:34:00", "scenario1", "tool1", "byte_count", 100),
>> ("2009-01-21 19:34:00", "scenario1", "tool1", "udp_count", 200),
>> ("2009-01-21 19:34:00", "scenario1", "tool1", "tcp_count", 300),
>> ("2009-01-21 19:34:00", "scenario1", "tool2", "byte_count", 400),
>> ("2009-01-21 19:34:00", "scenario1", "tool2", "udp_count", 500),
>> ("2009-01-21 19:34:00", "scenario1", "tool2", "tcp_count", 600),
>>
>> ("2009-01-21 19:33:00", "scenario1", "tool1", "byte_count", 10),
>> ("2009-01-21 19:33:00", "scenario1", "tool1", "udp_count", 20),
>> ("2009-01-21 19:33:00", "scenario1", "tool1", "tcp_count", 30),
>> ("2009-01-21 19:33:00", "scenario1", "tool2", "byte_count", 40),
>> ("2009-01-21 19:33:00", "scenario1", "tool2", "udp_count", 50),
>> ("2009-01-21 19:33:00", "scenario1", "tool2", "tcp_count", 60)
>> (Notice that the "old" targettime  will be ignored in this particular
>> query
>> as we are only interested in the "latest")
>>
>> I would like to select only the most recent "targettime" within 1 minute
>> and
>> only display only the rows that are "the latest" and print out all of the
>> stats as columns on a per toolname basis:
>>
>> Targettime                         Scenario        Toolname
>> byte_count   udp_count     tcp_count
>> 2009-01-21 19:34:00           scenario1       tool1
>> 100             200               300
>> 2009-01-21 19:34:00           scenario1       tool2              400
>>            500               600
>>
>> The purpose of the query is to display the "latest statistics" for each
>> scenario/toolname group.  It is important that the original data comes in
>> rows (statname may not be known ahead of time, so it must be stored as a
>> string in a field value).
>>
>> I am using MySQL and have found some possibilities with SQL Server (Pivot,
>> crosstab, etc) ... but the requirements call for MySQL.
>> Is the above query possible?
>>
>> Thanks a lot!
>>
>> ------------------------------------------------------------------------
>>
>>
>> No virus found in this incoming message.
>> Checked by AVG - http://www.avg.com
>> Version: 8.0.176 / Virus Database: 270.10.12/1909 - Release Date:
>> 1/22/2009 7:08 AM
>>
>>
>


-- 
Attila
Software Developer
atteeela@stripped
Thread
Need to pivot rows into columnsAttila22 Jan
  • RE: Need to pivot rows into columnsOFT)22 Jan
    • RE: Need to pivot rows into columnsmos22 Jan
  • Re: Need to pivot rows into columnsPeter Brawley23 Jan
    • Re: Need to pivot rows into columnsAttila1 Feb