List:General Discussion« Previous MessageNext Message »
From:Joerg Bruehe Date:April 16 2008 1:19pm
Subject:Re: select does too much work to find rows where primary key does
not match
View as plain text  
Hi Patrick, all !


Patrick J. McEvoy wrote:
> I have two MyISAM tables; each uses 'phone' as a primary key. Finding 
> rows where the primary keys match is efficient:
> 
> mysql> explain select bar.phone from foo,bar where foo.phone=bar.phone;
> [[...]}

Ok, let us take some simple example. Say tables "foo" and "bar" both 
have three rows each, with "phone" values 1, 2, and 3.
Then your matching lines will be:

   foo.phone   bar.phone
       1           1
       2           2
       3           3

Column foo.phone is shown for explanation only, your select would not 
return it.
Each individual value in bar.phone will be returned as often as there is 
an identical value in foo.phone.

I trust that is close to what you expect.
My sample data here do not show what will happen if values in foo.phone 
are not distinct - figure yourself.

> 
> 
> Finding rows in one table that do not match a row in the other table is 
> wildly inefficient:
> 
> mysql> explain select bar.phone from foo,bar where foo.phone!=bar.phone;
> [[...]]

Your resulting data from this select would be:

   foo.phone   bar.phone
       1           2
       1           3
       2           1
       2           3
       3           1
       3           2

Again, foo.phone is shown for explanation only.

> 
> (This is the same for 'NOT', '!=', or '<>'.)

Correct.

> 
> The amount of work should be identical in both cases: grab a row, look 
> up by primary key in the other table, proceed.

No, it isn't:
A  "select ... from foo, bar where CONDITION" effectively creates the 
cartesian product of both tables and then removes all lines 
(combinations) which do not meet the condition.

Of course, the system uses better strategies if possible, evaluating 
indexes etc, but the resulting data will be the same.

Assuming tables "foo" and "bar" each have a column "num" with the values 
1 to 100, a condition "... where foo.num = bar.num" will lead to a 
result with 100 rows.

But "... where foo.num != bar.num" will lead to a table of 9,900 rows:
For each of the 100 values in "foo.num", there will be 99 entries in 
"bar.num" that satisfy the inequality condition.

> 
> My real goal is to delete rows in the smaller table if there is no match 
> in the larger table:
> 
>     delete from bar using foo,bar where not bar.phone=foo.phone;

See above - wrong approach.

What you need is a subquery or an outer join, as proposed in the other 
replies.

> 
> [[...]]

HTH,
Jörg

-- 
Joerg Bruehe, Senior Production Engineer
MySQL AB, www.mysql.com


Thread
select does too much work to find rows where primary key does not matchPatrick J. McEvoy15 Apr
  • Re: select does too much work to find rows where primary key does not matchPhil15 Apr
    • Re: select does too much work to find rows where primary key does not matchddevaudreuil16 Apr
  • Re: select does too much work to find rows where primary key doesnot matchJoerg Bruehe16 Apr
Re: select does too much work to find rows where primary key does not matchPatrick J. McEvoy16 Apr