On Friday 13 March 2009 09:48:36 Michael wrote:
> I'm sorry for the rant, but nulls in databases make me see red! NULLS
> ARE GARBAGE!
> You are using a 'NULL-safe equal to operator', meaning, WHERE NOT
> group_id = 3, and oh, by the way, I DON'T WANT ANY NULLS TO QUALIFY AS 3.
> And yes this will be much slower, because now we can't use the index on
> Null is not an initial value, it is really saying that nothing has ever
> been put into this field, item, or what-ever. It is uninitialized and
> no one can say exactly what is in it, this is why it is marked as null.
> Using the null-safe equal to operator seems wrong, of course too me
> having any nulls in your data is wrong, or just plain dirty data, and I
> wouldn't put any faith into the results from a database that contains
> nulls, because the output is unpredictable if it is not very carefully
> You (everyone in the SQL world) would be better off using the IS NULL
> and IS NOT NULL operators and the IFNULL() function to find all null
> values in their data, and initialize them to the appropriate initial
> value. This may require discussions with the application designers or
> project managers. Whom ever is responsible for these null values being
> your data all the frecking time? It is people that don't really
> understand what a null value is, and who also have the responsibility
> of designing a database application. So people like me who write the
> code need to add 'IS NOT NULL' to every conditional statement in our logic.
> What you want to do <noted in your email below> should be very simple,
> but only if you have "CLEAN DATA". I would clean or what we call scrub
> your data first, then you can execute a simple and very fast select
> statement. Use a temporary table if you need to, just don't try to
> process dirty data, the end results will only make you look bad to
> anyone looking at the results.
> My $0.02,
This is news to me, and I'd like to understand, so please don't take this as a
flame or anything.
I think I understand this part: When designing the database structure, you
don't want just one table with a million columns, most of which are empty for
any given row. What you do is break it up into several tables, with a
relationship between the tables. Thus the term "Relational database
management system" (RDBMS)
theoretically you break up your data into tables so that there is never a non-
applicable field for any given row. In the real world, this can lead to a lot
of complexity, so many people usually cheat and have a few "does not apply"
fields for some records.
Now, what I've been doing is using NULL for "does not apply" as it seems to be
more true to the data model than using some sort of fake data.
I freely admit that Nulls increase the complexity of the code, but isn't it
I would be interested in your explanation.