This should be so simple, yet I've struck out.
I have 2 tables, each with a common column called "pid" which is an integer
and is a unique index. There are approx 18 million rows in each table, and
one of the tables has approx 5000 fewer rows than the other table. So it
should be a piece of cake finding the missing rows right?
Well I did a
select * from t1 left join t2 on t1.pid=t2.pid where t2.pid is null
select * from t2 left join t1 on t2.pid=t1.pid where t1.pid is null
and both queries return a null set. I then checked both tables and none of
them have pid as null.
I then counted the number of non-unique pid's and there aren't any (of
course with a unique index I didn't think there would be)
Ok, so there are no rows in t1 that aren't in t2, and vice versa.
There are no duplicate sid values and no empty sid values.
I physically counted the rows in each table and they are indeed off by
around 5000 rows.
I checked the tables for consistency and they passed.
How can anyone explain this? How do I find the missing rows?
TIA
Mike