Hi all,
I'm having the following idea and
I'm looking for a DBMS that would satisfy me.
I wish it was MySQL, one day. :-)
A conventional database index always spans
all rows of the table it's constructed on.
Sometimes, though, I'd prefer a finer resolution was possible:
I'd like to restrict the rows to be indexed.
Several examples:
1. A foreign key that realizes an optional relationship
(a "0,1" on the master side) that has only a low fraction
of non-NULLs.
But you have all the NULLs in the index,
although they won't ever be used:
* If I do joins, the NULLs won't match
(right, if I use the normal "=", not the "<=>",
but that's another story).
* If I select the "fk IS NULL" rows,
any sensible optimizer will decide to scan the table
instead of going through the index.
If I could construct the index with the condition
"fk IS NOT NULL":
CREATE INDEX idx_name
ON tab_name ( col1_name ) WHERE col1 IS NOT NULL;
I could get rid of the dead part of the complete index.
The index could be much smaller and the db operations faster.
2. Sometimes I don't trust my users and don't allow them
to do a real delete. Instead, I put an "erased" field
in the table and add an additional condition to my selects,
like "... and not erased". Now, if I have a uniqueness constraint,
I want it to work only on the "not erased" records.
I wish I could say
CREATE TABLE lookup1 (
id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
code CHAR(5) NOT NULL,
description VARCHAR(255),
erased BIT NOT NULL DEFAULT 0,
UNIQUE ( code ) WHERE NOT erased
);
3. A similar situation arises when I want to track history
and add the "valid_from" and "valid_to" fields.
The "valid_from" can't be null, but "valid_to" can:
a null value there signals the up-to-date status,
whereas a non-null means the validity of the record
has been closed.
Suppose I track identification documents of a person.
The data is precious enough for me that I don't want to delete
previous data, neither do I want to keep the non-up-to-date data
in a separate table.
At any given time, I'll honor just one identification of a type.
I wish I could do
CREATE TABLE Person_Ident_Document (
id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
person_id INTEGER UNSIGNED NOT NULL REFERENCES Person,
valid_from DATETIME NOT NULL,
valid_to DATETIME,
ident_doc_type VARCHAR(255) NOT NULL,
ident_doc_number VARCHAR(255) NOT NULL,
ident_doc_issued_when DATE,
ident_doc_issued_where VARCHAR(255),
ident_doc_issued_by VARCHAR(255),
UNIQUE ( person_id, ident_doc_type ) WHERE valid_to IS NULL
);
[End of examples.]
As far as I know, allowing for a conditional index
would not violate any of the principles of relational databases.
I think this is something other DBMS'es usually don't have,
but what would be a very handy extension over the standards.
The advantages are in a more efficient space usage,
faster index maintenance, faster database operations.
The only disadvantage I can see is that it's not implemented yet. :-)
Your comments?
Jan Dvorak
| Thread |
|---|
| • Index with condition? [A distant idea] | Jan Dvorak | 21 Nov |