List:General Discussion« Previous MessageNext Message »
From:SGreen Date:February 1 2006 2:31pm
Subject:Re: NOT IN vs IS NULL
View as plain text  
Devananda,

I have to support Peter on this one.  What he submitted to you is a 
perfectly appropriate solution. It seems as though you rejected his 
assistance before even trying to see if it would work. There have been and 
continue to be SQL-driven databases around that have not had and do not 
have subquery support. They function perfectly well without them.  By your 
response, it appears to me you don't know how a LEFT JOIN is supposed to 
operate. That is one of the most basic tools of any relational DBA's 
arsenal.

I implore everyone not comfortable with them to learn their "JOIN"s (INNER 
JOIN, LEFT JOIN, RIGHT JOIN, CROSS JOIN, etc.) before you learn how to 
subquery (and I do NOT mean that comma-separated crap that poses as a 
CROSS JOIN). So many of the performance problems posted to this list are 
subquery-related when a simple, straightforward JOIN or two will solve the 
same problem usually with _much_ better response time. This post is a 
perfect example.

Explicit JOIN statements have the advantage of possibly using indexes at 
every layer of data aggregation. Subqueries lose that ability at the level 
of each subquery.  Some subqueries, depending on how they are written, are 
executed once PER ROW of your parent query data. If it takes .01 seconds 
to run a subquery by itself and you have 10000 rows of parent data, that 
would be 100 seconds of nothing but subquery execution time before you 
even get to the GROUP BY or ORDER BY processing part of the query. 
Rewriting the same query to use an explicit JOIN may turn the .01 seconds 
of the "naked" subquery into .02-.04 seconds because of the indexing 
advantage and because it computes a set of matches between the table only 
once.

Subqueries have their uses and are appropriate for many situations. 
However, IMHO, they should be your second-line of attack because of their 
inherent weaknesses. I believe this to be true for the majority of the 
RDBM systems in the world today.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine



Peter Brawley <peter.brawley@stripped> wrote on 02/01/2006 12:25:01 
AM:

> Devananda,
> 
> >Definitely not. The first LEFT JOIN, with the clauses 
> >"ON pt.login_id=pab1.login_id ... WHERE pab1.login_id IS NULL",
> > does not make sense. 
> 
> ?!? It's a standard JOIN replacement for NOT IN (...). 
> 
> >There is not a 1to1 relationship,
> 
> These joins neither find nor assume 1:1 relationships.
> 
> > and pab1.login_id will rarely, if ever, be NULL in this query. 
> 
> If you LEFT JOIN table a to table b using column c WHERE b.c IS 
> NULL, you get the rows in b which have no matching c values in a, 
> which is logically equivalent to c in a and NOT IN b.
> 
> This query ...
> 
> SELECT DISTINCT pt.offer_id
> FROM paytable AS pt
> INNER JOIN offers AS o USING (offer_id)
> LEFT JOIN publisher_advertiser_blocks AS pab1 
>   ON pt.login_id=pab1.login_id
> LEFT JOIN publisher_advertiser_blocks AS pab2 ON o.
> advertiser_id=pab2.advertiser_id
> WHERE pt.login_id=1
>   AND pab1.login_id IS NULL OR pab2.advertiser_id IS NULL;
> 
> gives the same result as yours does, on the data you provided.
> 
> PB
> 
> -----
> 
> Devananda wrote: 
> Peter Brawley wrote: 
> Devananda,
> 
> OK, get the diescription in ordinary language right, and the query 
> ought to write itself. Your schema is like this? ...
> and you want the paytable rows ....
>   (i) for which there is an offers row matching paytable.offer_id, and
>   (ii) for which there is no pab row where pab.login_id=paytable.
> login_id and pab.advertiser_id=offer_id?
> Not quite.  I want all the offer_id's which are stored in paytable ...
>    (i) for a specified login_id
>    (ii) that do not belong to advertiser_id's which that login_id is
> blocked from.
> 
> Would that be ...
> 
> SELECT pt.offer_id, pt.login_id, o.advertiser_id
> FROM paytable AS pt 
> INNER JOIN offers AS o USING (offer_id)
> LEFT JOIN publisher_advertiser_blocks AS pab1 ON 
pt.login_id=pab1.login_id 
> LEFT JOIN publisher_advertiser_blocks AS pab2 ON o.
> advertiser_id=pab2.advertiser_id
> WHERE pab1.login_id IS NULL OR pab2.advertiser_id IS NULL; 
> Definitely not. The first LEFT JOIN, with the clauses "ON pt.
> login_id=pab1.login_id ... WHERE pab1.login_id IS NULL", does not 
> make sense. There is not a 1to1 relationship, and pab1.login_id will
> rarely, if ever, be NULL in this query. 
> 
> For example, let's say login_id (1) has a hundred rows in paytable, 
> each belonging to a different offer_id (1) ... (100), and of those, 
> 10 belong to advertiser_id (1), 5 belong to advertiser_id (2), and 
> the rest to (3). If pab (login_id, advertiser_id) contains the rows 
> (1,1),(1,2) then the query should return to me 85 rows from 
> paytable. Here's example data:
> 
> INSERT INTO login_data (login_id) VALUES (1);
> INSERT INTO advertiser_data (advertiser_id) VALUES (1), (2), (3);
> INSERT INTO offers (offer_id, advertiser_id) VALUES (1, 1), (2, 1), 
> ..., (11, 2), (12, 2), ..., (16, 3), ..., (100,3);
> INSERT INTO paytable (offer_id, login_id) VALUES (1, 1), (2, 1), ..., 
(100,1);
> INSERT INTO pab (login_id, advertiser_id) VALUES (1,1), (1,2);
> 
> In this case, I would want all the rows in paytable: 
>    (i) for login_id (1), 
>    (ii) that do not belong to advertiser_id's (1) or (2). 
> 
> In SQL, this could be ...
> 
> SELECT offer_id 
> FROM paytable 
> WHERE login_id=1 AND offer_id NOT IN 
> ( 
>  SELECT offer_id 
>  FROM offers 
>  LEFT JOIN publisher_advertiser_blocks AS pab USING (advertiser_id) 
>  WHERE pab.login_id=1
> );
> 
> ... but I'd like to get away from the NOT IN (...) clause. Maybe I 
> am stuck using it?
> 
> I apologize if I've been unable to adequately explain the way these 
> tables relate to each other; honestly, I have had a bit of trouble 
> wrapping my brain around it all, myself.
> 
> Thanks again for all the suggestions,
> Devananda

> 
> PB

> 
> No virus found in this outgoing message.
> Checked by AVG Free Edition.
> Version: 7.1.375 / Virus Database: 267.14.25/247 - Release Date: 
1/31/2006
> 
> 

> -- 
> Devananda vdv
> 
> 
> http://devananda-vdv.blogspot.com/
> http://mycat.sourceforge.net/
> 
> 
> 
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.375 / Virus Database: 267.14.25/247 - Release Date: 
1/31/2006
>   No virus found in this outgoing message.
> Checked by AVG Free Edition.
> Version: 7.1.375 / Virus Database: 267.14.25/247 - Release Date: 
1/31/2006
> 
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql?unsub=1
Thread
NOT IN vs IS NULLDevananda27 Jan
  • Re: NOT IN vs IS NULLPeter Brawley28 Jan
Re: NOT IN vs IS NULLPeter Brawley31 Jan
Re: NOT IN vs IS NULLPeter Brawley1 Feb
  • Re: NOT IN vs IS NULLSGreen1 Feb
    • Re: NOT IN vs IS NULLDevananda1 Feb
      • Re: NOT IN vs IS NULLSGreen1 Feb
        • Re: NOT IN vs IS NULLDevananda2 Feb