List:General Discussion« Previous MessageNext Message »
From:Kenneth Wagner Date:December 21 2005 11:09pm
Subject:Re: Reason for Auto-increment primary keys?
View as plain text  
Rhino,

What I do is put the ID (integer Primary Key, auto-increment, unique) first.
First key in every file.

Then define my indexes.

You could, do it the other way as you ask. But, I found this way is 
consistent.
It can be traced anywhere on any file. Timestamp on important or critical 
files
is also a big plus. It's not needed on static or almost static files like 
counties,
states, departments, etc. But orders, parts, inventory, customers and so on
benefit from a timestamp. Beside SQL is optimized for it AFAIK.

This hasn't ever been an issue aside from a few people asking about how to
use the ID? Or what's it for?

Ken

----- Original Message ----- 
From: "Rhino" <rhino1@stripped>
To: "Kenneth Wagner" <kenneth_wagner@stripped>; <SGreen@stripped>
Cc: "mysql" <mysql@stripped>
Sent: Wednesday, December 21, 2005 4:57 PM
Subject: Re: Reason for Auto-increment primary keys?


>
> ----- Original Message ----- 
> From: <SGreen@stripped>
> To: "Kenneth Wagner" <kenneth_wagner@stripped>
> Cc: "mysql" <mysql@stripped>; "Rhino" <rhino1@stripped>
> Sent: Wednesday, December 21, 2005 5:15 PM
> Subject: Re: Reason for Auto-increment primary keys?
>
>
>> "Kenneth Wagner" <kenneth_wagner@stripped> wrote on 12/21/2005
>> 04:27:53 PM:
>>
>>> Hi Rhino,
>>>
>>> Excellent question. Felt as you do, initially.
>>>
>>> Here's what changed my mind.
>>>
>>> Integer keys are fast. And small. Hence, they take very little RAM
>> space.
>>>
>>> They are contiguous. A missing PK is easy to find. There's a gap in the
>>> number sequence.
>>> Can't do this with the part description. No way to tell if a record is
>>> missing.
>>>
>>> Example: The system gets hung up or crashes and a reboot is needed.
>>> How to test the integrity of the parts table. I.e., anything missing?
>> Check
>>> the PK for
>>> continuity is a good place to start. With a timestamp I would even know
>> the
>>> date
>>> where the file got truncated. Example. It's Dec 20th. The highest date
>> in
>>> the file is
>>> Dec 1st at rec# 1203023. That's where the analysis would begin. Other
>> files
>>> that
>>> didn't get truncated but have the related key # in them would tip me off
>> as
>>> to how
>>> much is missing. Like an order file.
>>>
>>> Speed. Especially where related files are concerned. Foreign keys. Links
>> on
>>> integer
>>> fields are faster, smaller and more efficient. Keys remain smaller and
>>> faster.
>>>
>>> Activity testing: Let's say I do some statistical testing.  Like how
>> many
>>> new parts
>>> per month on average. Easy to do with the integer PK. Even easier if it
>> has
>>> a timestamp.
>>> Then if the average suddenly drops or increases I would want to know
>> why. Or
>>> modify
>>> my DB tables or coding. Note that the timestamp does not have to be in
>> your
>>> example
>>> table. It could be in an insert/update table that just tracks what has
>> been
>>> added or updated
>>> by PK, timestamp, activity type and updatedbyuserID.
>>>
>>> So, there's 2 cents worth.
>>>
>>> Wondering how relevant this is?
>>>
>>> HTH,
>>>
>>> Ken Wagner
>>>
>>>
>>>
>>> ----- Original Message ----- 
>>> From: "Rhino" <rhino1@stripped>
>>> To: "mysql" <mysql@stripped>
>>> Sent: Wednesday, December 21, 2005 2:54 PM
>>> Subject: Reason for Auto-increment primary keys?
>>>
>>>
>>> > One technique that I see a lot on this mailing list is people putting
>>> > auto-incremented integer primary keys on their tables.
>>> >
>>> > Maybe I'm just "old school" but I've always thought that you should
>> choose
>>> > a primary key based on data that is actually in the table whenever
>>> > possible, rather than generating a new value out of thin air.
>>> >
>>> > The only exception that comes to mind is things like ID numbers; for
>>> > example, it is better to use an internally-generated integer for an
>>> > employee number than it is to use an employee's name. Even the
>> combination
>>> > of first name and last name is not necessarily unique - I could cite a
>>
>>> > real life example -and, of course, people can change their names. That
>>
>>> > makes names less desireable than a generated value when you are trying
>> to
>>> > uniquely indentify such entities. In such a case, a nice, reasonable
>> short
>>> > integer is easier.
>>> >
>>> > I just found this rather good definition of primary keys at
>>> > http://www.utexas.edu/its/windows/database/datamodeling/dm/keys.html.
>> The
>>> > relevant bit says that a primary key must have:
>>> > - a non-null value for each instance of the entity
>>> > - a value that is unique for each instance of an entity
>>> > - a value that must not change or become null during the life of the
>> each
>>> > instance of the entity
>>> >
>>> > That article makes the same basic remarks about name vs. ID but makes
>> the
>>> > point that it is more commonly the case that table designers will use
>>> > something like a social security number - an _externally_ generated
>>> > number - to distinguish between employees rather than an
>>> > internally-generated number.
>>> >
>>> > But the trend in this mailing list is toward using generated values as
>>
>>> > primary keys in virtually EVERY table, even when good primary keys can
>> be
>>> > found in the (non-generated) data already existing in the table.
>>> >
>>> > Now, I haven't done anything remotely resembling a quantified analysis
>> so
>>> > maybe I'm wildly exaggerating this trend. But I do seem to recall a
>> lot of
>>> > table descriptions with auto-generated keys and I don't think they
>> were
>>> > all a name vs. ID scenario....
>>> >
>>> > Has anyone else noticed a similar trend?
>>> >
>>> > If this trend is real, it doesn't seem like a very good trend to me.
>> For
>>> > example, if you were keeping track of parts in a warehouse, why would
>>> > anyone make a table that looked like this:
>>> > ID (autogenerated PK)     PART_NO    PART_DESCRIPTION
>>> > 1                                   A01             Widget
>>> > 2                                    B03            Grapple Grommet
>>> > 3                                    A02            Snow Shovel
>>> > 4                                    D11            Whisk
>>> > 5                                    C04            Duct Tape
>>> >
>>> > when this table is simpler:
>>> >
>>> > PART_NO (PK)   PART_DESCRIPTION
>>> > A01                 Widget
>>> > B03                Grapple Grommet
>>> > A02                Snow Shovel
>>> > D11                Whisk
>>> > C04                Duct Tape
>>> >
>>> > Would anyone care to convince me that the first version of the table
>> is
>>> > "better" than the second version in some way?
>>> >
>>> > I just want to be sure that no one has come along with some new and
>>> > compelling reason to autogenerate keys when perfectly good keys can be
>>
>>> > found within the data already. I don't mind being "old school" but I
>> don't
>>> > want to be "out to lunch" :-)
>>> >
>>> >
>>> > Rhino
>>> >
>>> >
>>> >
>>> > -- 
>>> > No virus found in this outgoing message.
>>> > Checked by AVG Free Edition.
>>> > Version: 7.1.371 / Virus Database: 267.14.2/208 - Release Date:
>> 20/12/2005
>>> >
>>
>> I agree with every point that Kenneth just made. Integers compare 5 to 50
>> times faster than strings (depending on the length of the string) and
>> usually take up much less room. That means that more index items can fit
>> into memory and you are less likely to cause memory paging during an 
>> index
>> operation.
>>
>> I frequently define both an auto_inc field and a PK on other values. I 
>> use
>> the auto_inc field for FK relationships (due to the already mentioned
>> reasons) but the PK is there to preserve my data integrity.
>>
>> Basically, the heavy use of auto_increment is a practical compromise of
>> form vs. speed.
>>
> Well, you (meaning everyone who has responded, not just Shawn) have 
> certainly given me a wealth of good reasons to rethink my position on 
> auto-incremented keys! I will have to review these reasons very seriously 
> about this the next time I design a table....
>
> Shawn, I'm not quite clear what you are saying in your second last 
> paragraph. When you have this situation:
>
> ID (autogenerated)     PART_NO    PART_DESCRIPTION
> 1                              A01             Widget
> 2                              B03            Grapple Grommet
> 3                              A02            Snow Shovel
> 4                              D11            Whisk
> 5                              C04            Duct Tape
>
> Do you put the PK on ID alone, PART_NO alone, or the concatentation of ID 
> and PART_NO? I _think_ you mean that the PK is on PART_NO alone and that 
> ID is simply defined unique so that it can be the target for FKs that 
> refer to it but I want to be sure I'm not misreading you....
>
> Rhino
>
>
>
>
> -- 
> No virus found in this outgoing message.
> Checked by AVG Free Edition.
> Version: 7.1.371 / Virus Database: 267.14.2/208 - Release Date: 20/12/2005
>
>
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: 
> http://lists.mysql.com/mysql?unsub=1
>
> 


Thread
Reason for Auto-increment primary keys?Rhino21 Dec
  • Re: Reason for Auto-increment primary keys?Kenneth Wagner21 Dec
    • Re: Reason for Auto-increment primary keys?SGreen21 Dec
      • Re: Reason for Auto-increment primary keys?Rhino21 Dec
        • Re: Reason for Auto-increment primary keys?SGreen22 Dec
      • Re: Reason for Auto-increment primary keys?Kenneth Wagner22 Dec
        • Re: Reason for Auto-increment primary keys?David Griffiths22 Dec
        • Are primary keys essential?James Harvard22 Dec
      • Re: Are primary keys essential?Kenneth Wagner22 Dec
        • Re: Are primary keys essential?James Harvard22 Dec
          • Re: Are primary keys essential?James Harvard22 Dec
      • Re: Are primary keys essential?Kenneth Wagner22 Dec
      • Re: Are primary keys essential?Rhino22 Dec
      • Reporting tools for summary dataC.R.Vegelin22 Dec
        • Re: Reporting tools for summary dataJames Harvard22 Dec
    • Re: Reason for Auto-increment primary keys?Josh Trutwin21 Dec
  • Re: Reason for Auto-increment primary keys?Peter Brawley21 Dec
    • Re: Reason for Auto-increment primary keys?James Harvard22 Dec
  • Re: Reason for Auto-increment primary keys?Rudy Lippan21 Dec
RE: Are primary keys essential?SST - Adelaide)22 Dec