List:General Discussion« Previous MessageNext Message »
From:Jigal van Hemert Date:May 5 2005 9:43am
Subject:Re: Unique Index on multiple columns that can contain NULL in MySQL
View as plain text  
From: "Dennis Fogg"
> I'm getting lots of duplicate rows even though I have a
> unique index defined over multiple columns.
> The issue is that multiple NULL values are allowed,
> even when some values are not null.
> This could be as specified by the SQL standard,
> but it's certainly confusing for the developer.

It is confusing. You have to 'grok' NULL "values" to understand the problems
that may arise.
In Boolean logic you're used to two distinct values TRUE and FALSE. The
introduction of NULL actually introduced a second outcome of comparison
operators and functions. Since NULL represents "unknown", comparing NULL to
NULL will result in MAYBE.
If you look at NULL as being a yet unknown variable it starts to make sense:
(x = 2) : maybe true, maybe false, depending on the value of 'x'.
(x != 2) : maybe true, maybe false, depending on the value of 'x'.
In MySQL the outcome MAYBE is represented by NULL, so (NULL = NULL) -> NULL;
and (NULL != 2) -> NULL; (NULL * 1) -> NULL; etcetera.

Regarding indexes, there doesn't seem to be any logic involved, other than
"what the standard says". UNIQUE indexes may have multiple NULL values
(excepting BDB tables). A PRIMARY key is defined as a combination of UNIQUE
and NOT NULL.

> Here's the test case:
>
> Goal: prevent duplicate rows on the (c1, c2) pair:
This depends on your definition of 'duplicate' in the light of the
NULL-logic I explained earlier.
If there is already a 1-1 pair in the db, inserting another 1-1 pair will
require the index to compare the entry to be inserted with the current
entries in the index. Comparing 1-1 to 1-1 will result in TRUE, so the new
entry is rejected.
If you try the same with a NULL-NULL pair in the db and you try to insert
another NULL-NULL pair, the comparison does not result in TRUE (it would be
MAYBE, thus NULL) and the entry is accepted.

> +----+------+------+
> | pk | c1 | c2 |
> +----+------+------+
> | 1 | 1 | NULL |
> | 2 | 1 | NULL |
> | 3 | 1 | |
> | 4 | NULL | NULL |
> | 5 | NULL | NULL |
> +----+------+------+
> 5 rows in set (0.00 sec)
>
>
> Note: this works even with bdb engine in MySQL:
>
> mysql> alter table test_multi_column_null engine = bdb;
> Query OK, 5 rows affected (0.03 sec)
> Records: 5 Duplicates: 0 Warnings: 0

Does it? Try SHOW CREATE TABLE <tablename>; and check whether the engine is
really BDB??
If you do an ALTER TABLE <tablename> ENGINE=<current_engine_type>; the table
will be rebuilt anyway.

I tries your example and the engine type remained MyISAM...

> Conclusion: if you want to enforce uniqueness,
> don't use columns that allow NULL.

That entirely depends on your definition of 'uniqueness' as I explained
before. If you consider NULL to be equal to NULL (which it is not) then you
should only use NOT NULL columns.

Regards, Jigal.

Thread
Unique Index on multiple columns that can contain NULL in MySQLDennis Fogg5 May
  • Re: Unique Index on multiple columns that can contain NULL in MySQLHank5 May
  • Re: Unique Index on multiple columns that can contain NULL in MySQLJigal van Hemert5 May
  • Re: Unique Index on multiple columns that can contain NULL in MySQLMartijn Tonies5 May