List:Internals« Previous MessageNext Message »
From:sasha Date:November 21 2000 3:47pm
Subject:Re: Index with condition? [A distant idea]
View as plain text  
Jan Dvorak wrote:
> 
> Hi all,
> 
> I'm having the following idea and
> I'm looking for a DBMS that would satisfy me.
> I wish it was MySQL, one day. :-)
> 
> A conventional database index always spans
> all rows of the table it's constructed on.
> Sometimes, though, I'd prefer a finer resolution was possible:
> I'd like to restrict the rows to be indexed.
> 
> Several examples:
> 
> 1. A foreign key that realizes an optional relationship
>    (a "0,1" on the master side) that has only a low fraction
>    of non-NULLs.
>    But you have all the NULLs in the index,
>    although they won't ever be used:
>     * If I do joins, the NULLs won't match
>       (right, if I use the normal "=", not the "<=>",
>        but that's another story).
>     * If I select the "fk IS NULL" rows,
>       any sensible optimizer will decide to scan the table
>       instead of going through the index.
>    If I could construct the index with the condition
>    "fk IS NOT NULL":
> 
>     CREATE INDEX idx_name
>       ON tab_name ( col1_name ) WHERE col1 IS NOT NULL;
> 
>    I could get rid of the dead part of the complete index.
>    The index could be much smaller and the db operations faster.
> 
> 2. Sometimes I don't trust my users and don't allow them
>    to do a real delete.  Instead, I put an "erased" field
>    in the table and add an additional condition to my selects,
>    like "... and not erased".  Now, if I have a uniqueness constraint,
>    I want it to work only on the "not erased" records.
>    I wish I could say
> 
>     CREATE TABLE lookup1 (
>       id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>       code CHAR(5) NOT NULL,
>       description VARCHAR(255),
>       erased BIT NOT NULL DEFAULT 0,
>       UNIQUE ( code ) WHERE NOT erased
>     );
> 
> 3. A similar situation arises when I want to track history
>    and add the "valid_from" and "valid_to" fields.
>    The "valid_from" can't be null, but "valid_to" can:
>    a null value there signals the up-to-date status,
>    whereas a non-null means the validity of the record
>    has been closed.
> 
>    Suppose I track identification documents of a person.
>    The data is precious enough for me that I don't want to delete
>    previous data, neither do I want to keep the non-up-to-date data
>    in a separate table.
>    At any given time, I'll honor just one identification of a type.
>    I wish I could do
> 
>      CREATE TABLE Person_Ident_Document (
>        id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>        person_id INTEGER UNSIGNED NOT NULL REFERENCES Person,
>        valid_from DATETIME NOT NULL,
>        valid_to DATETIME,
>        ident_doc_type VARCHAR(255) NOT NULL,
>        ident_doc_number VARCHAR(255) NOT NULL,
>        ident_doc_issued_when DATE,
>        ident_doc_issued_where VARCHAR(255),
>        ident_doc_issued_by VARCHAR(255),
>        UNIQUE ( person_id, ident_doc_type ) WHERE valid_to IS NULL
>      );
> 
> [End of examples.]
> 
> As far as I know, allowing for a conditional index
> would not violate any of the principles of relational databases.
> 
> I think this is something other DBMS'es usually don't have,
> but what would be a very handy extension over the standards.
> 
> The advantages are in a more efficient space usage,
> faster index maintenance, faster database operations.
> The only disadvantage I can see is that it's not implemented yet. :-)

Monty - I think it is a good idea, and should not be too hard to do - you just
check the expr before you update the key, and check it again before doing a key
read - is there anything else involved, other than storing/retrieving the expr?

-- 
MySQL Development Team
   __  ___     ___ ____  __ 
  /  |/  /_ __/ __/ __ \/ /   Sasha Pachev <sasha@stripped>
 / /|_/ / // /\ \/ /_/ / /__  MySQL AB, http://www.mysql.com/
/_/  /_/\_, /___/\___\_\___/  Provo, Utah, USA
       <___/
Thread