List:General Discussion« Previous MessageNext Message »
From:Peter Brawley Date:July 14 2005 5:01pm
Subject:Re: SQL 'clustering' query?
View as plain text  
Dan,

Hi, I remember reading about an SQL query type which did something like
select all 'aircraft hangers' which contained exactly (or at least) some
given set of aircraft.

Or did it select the list of pilots qualified to fly all the airplanes in
the hanger... 

Anyway, I forget the syntax (and the fancy name for this kind of query).


I think the concept you're after is relational division. The aeroplane 
hangar was one of Celko's examples. Another from him is at 
http://www.artfulsoftware.com/queries.php#28, other examples at
http://www.artfulsoftware.com/queries.php#22, 
http://www.artfulsoftware.com/queries.php#33,

PB


Dan Bolser wrote:

>Hi, I remember reading about an SQL query type which did something like
>select all 'aircraft hangers' which contained exactly (or at least) some
>given set of aircraft.
>
>Or did it select the list of pilots qualified to fly all the airplanes in
>the hanger... 
>
>Anyway, I forget the syntax (and the fancy name for this kind of query).
>
>What I want to do is the following, given this data...
>
>
>Table: ATTRIBUTE_LIST;
>
>ID	ATTRIBUTE
>W	A
>W	B
>W	C
>X       A
>X	B
>X	C
>Y	A
>Y	B
>Y	C
>Y	D
>Z	E
>
>--> SQL MAGIC --> 
>
>
>Table: CLUSTERS
>
>G_ID	ID
>1	W
>1	X
>2	Y
>3	Z
>
>
>That is, to group together all ID's with the same 'set' of ATTRIBUTES.
>
>Currently I am doing this using 'GROUP_CONCAT', but my attribute list just
>went above the limit for the GROUP_CONCAT column...
>
>+---------+------+--------------------------------------+
>| Level   | Code | Message                              |
>+---------+------+--------------------------------------+
>| Warning | 1260 | 7 line(s) were cut by GROUP_CONCAT() |
>+---------+------+--------------------------------------+
>
>My query looks roughly like this...
>
>
>SET @i:=0, @x:='', @row:='';
>#
>DROP    TABLE CLUSTERS;
>CREATE  TABLE CLUSTERS
>  (PRIMARY KEY (ID), INDEX (G_ID))
>#
>SELECT
>  ID, G_ID
>  #
>FROM
>(
>  SELECT
>    ID,
>    #
>    @x:= ATTR_LIST               AS HIDDEN1,
>    #
>    IF(@row = @x, @i, @i:=@i+1) AS G_ID,
>    #
>       @row:= @x                 AS HIDDEN2
>    #
>  FROM 
>  (
>    SELECT 
>      ID,
>      GROUP_CONCAT(ATTRIBUTE) AS ATTR_LIST,
>    FROM
>      ATTRIBUTE_LIST
>    GROUP BY
>      ID
>    #
>  ) AS vt1
>  #
>  ORDER BY          -- This is very important for
>    ATTR_LIST       -- the overall query.
>  #
>) AS vt2;
>
>(And thats the highly simplified version!)
>
>
>I can't shake the feeling that this 'string based' approach (while quite
>speedy) is inherently messy, and that a proper 'set based' approach
>should exist, and shouldn't have the limitation in the number of
>attributes that the above method has.
>
>In general I would really like to (somehow) develop a suite of easy to use
>'SQL CLUSTER' commands, as the data mining community needs that kind of
>thing in nice general (set based) abundance :)
>
>Anyway, thanks for any feedback on any of the above,
>
>Dan.
>
>
>  
>


-- 
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.323 / Virus Database: 267.8.15/49 - Release Date: 7/14/2005

Thread
SQL 'clustering' query?Dan Bolser14 Jul
  • Re: SQL 'clustering' query?Peter Brawley14 Jul
    • Re: SQL 'clustering' query?Dan Bolser15 Jul