List:General Discussion« Previous MessageNext Message »
From:James Fryer Date:September 24 2009 10:42am
Subject:Excluding records that don't match condition
View as plain text  
I have a database of Titles (books, magazines, etc.) with a M:M relation 
to Keywords.

I would like to be able to generate queries for the condition "Return 
titles matching X with keywords NOT matching A".

This seems quite hard to do. Here is a minimal table structure:

CREATE TABLE Title
     (
     id int(10) unsigned NOT NULL,
     title CHAR(2)
     );
CREATE TABLE Keyword
     (
     id int(10) unsigned NOT NULL,
     kw CHAR(1)
     );
CREATE TABLE TitleKeyword
     (
     title_id int(10) unsigned NOT NULL,
     keyword_id int(10) unsigned NOT NULL
     );

# X1: A, B
# X2: B, C
# X3: C
# Y1: A, B
# Y2: B, C
# Y3: C
INSERT INTO Title (id, title) VALUES (1, 'X1'), (2, 'X2'), (3, 'X3'), 
(4, 'Y1'), (5, 'Y2'), (6, 'Y3');
INSERT INTO Keyword (id, kw) VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO TitleKeyword VALUES (1, 1), (1, 2), (2, 2), (2, 3), (3, 3), 
(4, 1), (4, 2), (5, 2), (5, 3), (6, 3);

Naively I tried this query:

SELECT DISTINCT
     title
FROM
     Title
JOIN
     TitleKeyword ON Title.id=title_id
JOIN
     Keyword ON Keyword.id=keyword_id
WHERE
     title LIKE 'X%'
     AND kw <> 'A'
;

but this includes X1 because it matches B as well as A. I only want X2, 
X3 returned.

This works:

SELECT
     title
FROM
     Title
WHERE title LIKE 'X%'
     AND Title.id NOT IN(
     SELECT
         title_id
     FROM
         TitleKeyword
     JOIN
         Keyword ON Keyword.id=keyword_id
     WHERE
         kw = 'A'
     )
;

However, this uses subselects which I have always found slow, and there 
may be many keywords (thousands), and I believe IN() is not recommended 
for large lists.

So my question is, can this query be rewritten to use JOINs? It seems 
hard to me because all the keywords need to be examined to eliminate the 
title. On the other hand it must be a common requirement so there may be 
something I have overlooked.

Many thanks in advance,

James
-- 
James Fryer  /  jim@stripped  /  jim@stripped
Thread
Excluding records that don't match conditionJames Fryer24 Sep
  • Re: Excluding records that don't match conditionMike Spreitzer24 Sep
    • Re: Excluding records that don't match conditionJames Fryer25 Sep