mos a écrit :
> This should be so simple, yet I've struck out.
>
> I have 2 tables, each with a common column called "pid" which is an
> integer and is a unique index. There are approx 18 million rows in each
> table, and one of the tables has approx 5000 fewer rows than the other
> table. So it should be a piece of cake finding the missing rows right?
>
> Well I did a
>
> select * from t1 left join t2 on t1.pid=t2.pid where t2.pid is null
> select * from t2 left join t1 on t2.pid=t1.pid where t1.pid is null
>
> and both queries return a null set. I then checked both tables and none
> of them have pid as null.
> I then counted the number of non-unique pid's and there aren't any (of
> course with a unique index I didn't think there would be)
>
> Ok, so there are no rows in t1 that aren't in t2, and vice versa.
> There are no duplicate sid values and no empty sid values.
> I physically counted the rows in each table and they are indeed off by
> around 5000 rows.
> I checked the tables for consistency and they passed.
>
> How can anyone explain this? How do I find the missing rows?
> TIA
>
> Mike
>
>
The 2 queries you paste seemed correct and should output the result
unless there is something really strange happening.
If you are using a version that support subquery you could try
select * from t1 where id not in (select id from t2);
Not sure which exactly is suppose to be faster but it's worth a try!
Is it possible that you are hitting some kind of limit on maximum join
number in your server ? I'm not even sure if a limit of that kind exists
(Just putting my tought on the table)
--
Mathieu Bruneau
aka ROunofF
===
GPG keys available @ http://rounoff.darktech.org