List:General Discussion« Previous MessageNext Message »
From:shawn green Date:June 11 2013 9:16pm
Subject:Re: How do I select all rows of table that have some rows in another
table (AND, not OR)
View as plain text  
Hello Daevid,

On 6/11/2013 3:59 PM, Daevid Vincent wrote:
> I am trying to implement a filter so that a user could select various genres
> they want "in" or "out". Perhaps they like 'action' and 'car chases' but
> don't like 'foreign' and 'drifting' (or whatever...)
>
> So I want something sort of like this, however IN() is using an "OR"
> comparison when I need it to be an "AND"
>
> SELECT DISTINCT
>          s.`scene_id` AS `id`,
>          GROUP_CONCAT(sg.`genre_id`)
> FROM
>          `dvds` AS d
>          JOIN `scenes_list` AS s
>                  ON s.`dvd_id` = d.`dvd_id`
>          JOIN `scenes_genres` AS sg
>                  ON sg.`scene_id` = s.`scene_id`
>                  AND sg.`genre_id` IN (10,38)
>                  AND sg.`genre_id` NOT IN (22,61)
> GROUP BY s.`scene_id`;
>
> This is giving me way way too many rows returned.
>
> For example, I would expect this scene_id to be in the result set:
>
> SELECT * FROM scenes_genres WHERE scene_id = 17;
>
> scene_id  genre_id
> --------  ----------
>        17           1
>        17           3
>        17          10 <--
>        17          19
>        17          38 <--
>        17          53
>        17          58
>        17          59
>
> And this scene ID to NOT be in the result set:
>
> SELECT * FROM scenes_genres WHERE scene_id = 11;
>
> scene_id  genre_id
> --------  ----------
>        11           1
>        11          10 <--
>        11          19
>        11          31
>        11          32
> 		       <-- but does not have 38
>        11          59
>
> I've tried various subselect ideas, but of course this fails b/c genre_id
> can't be multiple things at one time (AND)
>
>     JOIN `scenes_genres` AS sg
>       ON sg.`scene_id` = s.`scene_id`
>          AND sg.`genre_id` IN (
>            SELECT `genre_id` FROM `scenes_genres`
>             WHERE `genre_id` = 10
>                    AND `genre_id` = 38
>                    AND `genre_id` <> 22
>                    AND `genre_id` <> 61
>         )
>
> And straight up like this failure too...
>
>    JOIN `scenes_genres` AS sg
>      ON sg.`scene_id` = s.`scene_id`
>         AND (sg.`genre_id` = 10 AND sg.`genre_id` = 38)
>         AND NOT (sg.`genre_id` = 22 AND sg.`genre_id` = 61)
>
> So I'm sort of out of ideas at this point and hoping someone has a way to do
> this.
>
> Also, just for S&G this is how we are currently implementing it, but we feel
> the REGEXP is killing our queries and while "clever" is a bit hacky and
> nullifies any indexes we have on the genres column as it requires a
> file_sort table scan to compare substrings basically...
>
> SELECT * FROM scene_all_genres WHERE scene_id = 17;
>
> scene_id  genres
> --------  ------------------------
>        17  1|3|10|19|38|53|58|59|
>
> SELECT * FROM scene_all_genres WHERE scene_id = 11;
>
> scene_id  genres
> --------  -------------------
>        11  1|10|19|31|32|59|
>
> SELECT DISTINCT
>          s.`scene_id` AS `id`,
>          sg.`genres`
> FROM
>          `scene_all_genres` AS sg,
>          `dvds` AS d,
>          `scenes_list` AS s
> WHERE 	 dvd_id` = d.`dvd_id`
>          AND sg.`scene_id` = s.`scene_id`
>          AND sg.`genres` REGEXP '[[:<:]]10[[:>:]].*[[:<:]]38[[:>:]]'
>          AND sg.`genres` NOT REGEXP
> '(([[:<:]]22[[:>:]])|([[:<:]]61[[:>:]]))'
>
> ;
>
> http://dev.mysql.com/doc/refman/5.0/en/regexp.html#operator_regexp
>
> [[:<:]], [[:>:]]
>
> These markers stand for word boundaries. They match the beginning and end of
> words, respectively. A word is a sequence of word characters that is not
> preceded by or followed by word characters. A word character is an
> alphanumeric character in the alnum class or an underscore (_).
>

To me it looks like you want to know how to match N for N when looking 
for subset of properties. It's a multi-step process using plain SQL but 
it's fast:

1) Build a list of scenes containing the genres you want to see

CREATE TEMPORARY TABLE tmpList(key(scene_id, genre_id))
SELECT scene_id, genre_id
FROM scenes_genres
WHERE genre_id` IN (10,38)  <-- in this case you have 2 terms to mach

2) From that list, determine which of those scenes also contain unwanted 
genres and remove them.

DELETE tmpList
FROM tmpList t
INNER JOIN scenes_genres sg
   on sg.scene_id = t.scene_id
   and sg.genre_id IN (22,61)

# at this point, tmpList contains all scenes that have any of the 
desired genres but none of the unwanted ones.

3) Check to see if any scene has all N matches.

SELECT scene_id, count(genre_id) as matches
FROM tmpList
GROUP BY scene_id
HAVING matches = 2
/* ^--- this is the number of terms you are trying to match. */


Of course you can modify the last query to eliminate the HAVING clause 
and pick the top 5 matching scenes (even if they are partial matches) 
like this

SELECT scene_id, count(genre_id) as matches
FROM tmpList
GROUP BY scene_id
ORDER BY matches DESC
LIMIT 5

Let us know if this is not what you wanted to do.



Note to the rest of the list:

Yes, we could have done most of this with self-joins or subqueries, an 
EXISTS, and a NOT EXISTS inside a single statement but I think the 
execution time would have been miserable. I also think that that 
approach also would not have allowed us to evaluate a partial match 
(like 5 out of 7 target genres), only complete matches would have been 
returned.

I am still very interested in seeing alternative solutions :)

-- 
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN
Thread
How do I select all rows of table that have some rows in another table (AND, not OR)Daevid Vincent11 Jun
  • Re: How do I select all rows of table that have some rows in anothertable (AND, not OR)shawn green11 Jun
    • RE: How do I select all rows of table that have some rows in another table (AND, not OR)Daevid Vincent11 Jun
      • RE: How do I select all rows of table that have some rows inanother table (AND, not OR)Rick James12 Jun
        • RE: How do I select all rows of table that have some rows in another table (AND, not OR)Daevid Vincent12 Jun
          • RE: How do I select all rows of table that have some rows inanother table (AND, not OR)Rick James12 Jun
      • Re: How do I select all rows of table that have some rows in anothertable (AND, not OR)shawn green12 Jun
  • Re: How do I select all rows of table that have some rows inanother table (AND, not OR)hsv13 Jun
RE: How do I select all rows of table that have some rows in another table (AND, not OR)Daevid Vincent12 Jun
Re: How do I select all rows of table that have some rows inanother table (AND, not OR)hsv13 Jun
RE: How do I select all rows of table that have some rows in another table (AND, not OR)hsv18 Jun