List:General Discussion« Previous MessageNext Message »
From:Dan Bolser Date:July 15 2005 7:37am
Subject:Re: SQL 'clustering' query?
View as plain text  
On Thu, 14 Jul 2005, Peter Brawley wrote:

>Dan,
>
>Hi, I remember reading about an SQL query type which did something like
>select all 'aircraft hangers' which contained exactly (or at least) some
>given set of aircraft.
>
>Or did it select the list of pilots qualified to fly all the airplanes in
>the hanger... 
>
>Anyway, I forget the syntax (and the fancy name for this kind of query).
>
>
>I think the concept you're after is relational division. The aeroplane 
>hangar was one of Celko's examples. Another from him is at 
>http://www.artfulsoftware.com/queries.php#28, other examples at
>http://www.artfulsoftware.com/queries.php#22, 
>http://www.artfulsoftware.com/queries.php#33,

Cheers, I will try to rephrase my query below as 'RELATIONAL DIVISION' :)

In the short term I solved my problem with 

group_concat_max_len system variable - For some reason I thought it would
already be at the maximum lenght. I doubled it up and saw my warnings
dissapear 

Warning: 1260

Thanks very much for the above links,

Dan.


>
>PB
>
>
>Dan Bolser wrote:
>
>>Hi, I remember reading about an SQL query type which did something like
>>select all 'aircraft hangers' which contained exactly (or at least) some
>>given set of aircraft.
>>
>>Or did it select the list of pilots qualified to fly all the airplanes in
>>the hanger... 
>>
>>Anyway, I forget the syntax (and the fancy name for this kind of query).
>>
>>What I want to do is the following, given this data...
>>
>>
>>Table: ATTRIBUTE_LIST;
>>
>>ID	ATTRIBUTE
>>W	A
>>W	B
>>W	C
>>X       A
>>X	B
>>X	C
>>Y	A
>>Y	B
>>Y	C
>>Y	D
>>Z	E
>>
>>--> SQL MAGIC --> 
>>
>>
>>Table: CLUSTERS
>>
>>G_ID	ID
>>1	W
>>1	X
>>2	Y
>>3	Z
>>
>>
>>That is, to group together all ID's with the same 'set' of ATTRIBUTES.
>>
>>Currently I am doing this using 'GROUP_CONCAT', but my attribute list just
>>went above the limit for the GROUP_CONCAT column...
>>
>>+---------+------+--------------------------------------+
>>| Level   | Code | Message                              |
>>+---------+------+--------------------------------------+
>>| Warning | 1260 | 7 line(s) were cut by GROUP_CONCAT() |
>>+---------+------+--------------------------------------+
>>
>>My query looks roughly like this...
>>
>>
>>SET @i:=0, @x:='', @row:='';
>>#
>>DROP    TABLE CLUSTERS;
>>CREATE  TABLE CLUSTERS
>>  (PRIMARY KEY (ID), INDEX (G_ID))
>>#
>>SELECT
>>  ID, G_ID
>>  #
>>FROM
>>(
>>  SELECT
>>    ID,
>>    #
>>    @x:= ATTR_LIST               AS HIDDEN1,
>>    #
>>    IF(@row = @x, @i, @i:=@i+1) AS G_ID,
>>    #
>>       @row:= @x                 AS HIDDEN2
>>    #
>>  FROM 
>>  (
>>    SELECT 
>>      ID,
>>      GROUP_CONCAT(ATTRIBUTE) AS ATTR_LIST,
>>    FROM
>>      ATTRIBUTE_LIST
>>    GROUP BY
>>      ID
>>    #
>>  ) AS vt1
>>  #
>>  ORDER BY          -- This is very important for
>>    ATTR_LIST       -- the overall query.
>>  #
>>) AS vt2;
>>
>>(And thats the highly simplified version!)
>>
>>
>>I can't shake the feeling that this 'string based' approach (while quite
>>speedy) is inherently messy, and that a proper 'set based' approach
>>should exist, and shouldn't have the limitation in the number of
>>attributes that the above method has.
>>
>>In general I would really like to (somehow) develop a suite of easy to use
>>'SQL CLUSTER' commands, as the data mining community needs that kind of
>>thing in nice general (set based) abundance :)
>>
>>Anyway, thanks for any feedback on any of the above,
>>
>>Dan.
>>
>>
>>  
>>
>
>
>

Thread
SQL 'clustering' query?Dan Bolser14 Jul
  • Re: SQL 'clustering' query?Peter Brawley14 Jul
    • Re: SQL 'clustering' query?Dan Bolser15 Jul