On Mon, Sep 15, 2003 at 07:29:38PM -0700, Jon Frisby wrote:
> That's an ugly way to make the distinction between A-mark and I-mark.
> In most situations, I'd move the relevant column(s) to a separate table,
> with a NULL-allowed column in that table and a FK reference back to the
> original table. The absence of a row in this child table indicates
> I-mark, and the presence of a row with a NULL in the column indicates an
> A-mark.
>
> Or alternatively you could just get a database that has two kinds of
> NULLs, specifically defined to represent the distinction you bring up.
> ( http://www.firstsql.com )
When the original ANSI committee was setting up the SQL specification,
there were people who wanted to have two NULLs, one for "not known"
and one for "not applicable". The committee decided that this was too
complicated, which I think was a mistake, but the unified NULL has
proven pretty useful. It does, however, require you to understand its
meaning and think about its use.
Instead of restructuring the tables, if the column were a text type, I
would disallow NULLs and specify the use of "Not Known" and "Not
Applicable" in the documentation. Of course, this won't work for
numeric columns. But for text columns, it has the advantage of being
self-documenting. You can set the default value to "Not Known", and
the client app will have to supply "Not Applicable" where applicable.
Generally, I've found that the following rules tend to eliminate
problems associated with the unified NULL:
1) If an attribute is essential, then don't allow NULLs in the column.
The data has to be supplied before a record can be created.
2) If you have to store incomplete records (i.e. missing essential
data), then put them in a seperate table, which is known to contain
incomplete records. You'll need some mechanism for checking the
records and transferring them to the main database when they are
complete.
3) If the record must go into the main database without the data, then
the data isn't essential. If the data isn't essential, then the
NULL is usually OK. For unessential attributes, it rarely matters
whether the data is not known or not applicable.
Bob Hall