The PK thread has reminded me of a question I had but never resolved when designing the
table structure of the big data warehouse app I was droning on about just now in the
aforementioned thread. As I need to import some hundreds of millions of rows in the next
week, I think now would be a good time to get a definite answer!
The core of the app is a mass of data, broken into many tables that I normally only need
to query individually. Because I felt uneasy not including a primary key, and needed to get
a proof-of-concept db running, I ended up putting an auto_increment int column in the data
tables. (Yes, I know, an extra 4 bytes per row when I was talking about saving every byte
possible in my last post. <blush>) But the PK column is never used, either as a
foreign key or in app code for the table itself, and I couldn't put a PK on a combination
of other columns because I don't think I can be sure of uniqueness. Can I just drop the
PK column?
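
To make the question concrete, here is a rough sketch of the kind of thing I mean. The
table and column names are made up for illustration (they are not my real schema), and
I'm assuming MySQL syntax:

```sql
-- Hypothetical data table; names and column types are illustrative only.
CREATE TABLE data_sample (
    id INT NOT NULL AUTO_INCREMENT,  -- surrogate key, never referenced anywhere
    sample_date DATE NOT NULL,
    reading INT NOT NULL,
    PRIMARY KEY (id)
);

-- What I'm proposing: drop the surrogate column. In MySQL this also removes
-- the primary key index, since that index is built on this column alone.
ALTER TABLE data_sample DROP COLUMN id;
```

After that the table would have no primary key (and no unique index) at all, which is
the situation I'm asking about.
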
BTW I'm sure this is addressed in all those good books on database design and theory I
should have, but never have, read. But I'm a bit short of time, and it's quicker just to
pick the brains of you folks! Quicker for me, that is - sorry!
TIA,
James Harvard