List:General Discussion« Previous MessageNext Message »
From:Yves Goergen Date:February 11 2008 11:37am
Subject:Re: Inefficient query processing?
View as plain text  
On 11.02.2008 11:18 CE(S)T, Peter Brawley wrote:
> Unclear.

Okay. Then the long form.

My application is a messaging application that supports multiple users, 
messages with revisions, tags and access control.

A user is identified by a UserId which I also call "key". (Imagine it 
like the key you have for your front door.) A user can be assigned 
additional keys that give him way to additional messages. (Imagine it 
like the key for your office and the neighour's.) A user is also the 
author of a message revision.

A keylist supports storing multiple keys for another entity. This table 
is only a product of database normalisation. It is identified by a 
KeylistId.

Tags are predefined, i.e. an author cannot use tags that are not already 
there. Tags are identified by their TagId. Tags can also be used to 
restrict access to messages that have them assigned. For this, a tag can 
have a list of keys that can be used to access the message. If a keylist 
is assigned (ReadAccessKeylist IS NOT NULL), then any key from the list 
will do. E.g. you can restrict access on a certain tag to the sales and 
development departments, locking out research dept. and anonymous 
guests. If no key list is defined (ReadAccessKeylist IS NULL), then no 
access restrictions are imposed on that tag.

A message contains multiple revisions (can be one, too) that keep the 
actual data, the message content, subject, timestamp etc. A message has 
a MessageId, a revision is identified by a MessageId and a 
RevisionNumber, both starting at 1. A message can have its own 
individual access keylist that behaves the same was as for tags, but it 
only applies to the single respective message.

A message revision can be assigned tags that are used for classification 
and access control. These tag associations are stored in the 
message_revision_tag table. For search purposes, a single message 
revision is linked to that will be regarded. This is the SearchRevision 
of a message that is determined by other conditions and stored persistently.

This is the relevant part of the database schema:
message (MessageId)
message_revision (MessageId references message, RevisionNumber, Author
   references user(UserId), CreatedTime, ...)
keylist (KeylistId, Key references user(UserId))
user (UserId, AdditionalKeylist references keylist(KeylistId))
tag (TagId, ReadAccessKeylist references keylist(KeylistId))
message_revision_tag (MessageId, RevisionNumber, TagId references tag)

One additional constraint:
message_revision_tag (MessageId, RevisionNumber) reference to
   message_revision

The main search query only retrieves MessageIds. It must only return 
messages that the current session user has access to. This access may 
come from the messages' ReadAccessKeylist or any of the assigned tags' 
ReadAccessKeylist. The session user can be granted access for his own 
personal key (UserId) or one of the additional keys in his "keyring" 
(AdditionalKeylist). If there are multiple access lists, the user must 
pass all of them to get the message. I.e. if a message has an individual 
keylist and also some tags assigned that restrict access, the user must 
be in each of those key lists to get access.

This is probably the most tricky part to understand. Please tell me if 
you have questions.

The entire SQL query is now:

 > SELECT m."MessageId"
 > FROM "message" m
> WHERE
>   -- Grant all access to administrators (so that they can alter the access key
> lists)
>   (:isAdmin OR
>     -- Message has no read access keylist
>     (m."ReadAccessKeylist" IS NULL
>     OR EXISTS
>       -- Message's read access keylist contains personal or additional user key
>       (SELECT 1
>       FROM "message" m2, "keylist" m2k
>       WHERE m2k."KeylistId" = m2."ReadAccessKeylist" AND
>         m2."MessageId" = m."MessageId" AND
>         (m2k."UserId" = :sessionUserId OR
>           m2k."UserId" IN
>             (SELECT uk."UserId"
>             FROM "user" u, "keylist" uk
>             WHERE uk."KeylistId" = u."AdditionalKeylist" AND
>               u."UserId" = :sessionUserId))))
>     AND NOT EXISTS
>       -- There must not be any tag that denies access
>       (SELECT
>         EXISTS
>           -- Tag's read access keylist contains personal or additional user key
>           (SELECT 1
>           FROM "keylist" tk
>           WHERE tk."KeylistId" = t."ReadAccessKeylist" AND
>             (tk."UserId" = :sessionUserId OR
>               tk."UserId" IN
>                 (SELECT uk."UserId"
>                 FROM "user" u, "keylist" uk
>                 WHERE uk."KeylistId" = u."AdditionalKeylist" AND
>                   u."UserId" = :sessionUserId)))
>         AS "Allowed"
>       FROM "message_revision_tag" mrt
>         JOIN "tag" t USING ("TagId")
>       WHERE mrt."MessageId" = m."MessageId" AND
>         mrt."RevisionNumber" = m."SearchRevision" AND
>         t."ReadAccessKeylist" IS NOT NULL   -- Only regard tags with read access
> keylist (only they can fail the access test)
 >       HAVING NOT "Allowed")

:isAdmin is 0 or 1 depending on whether the session user is an 
administrator. Admins always get the result because they need to be able 
to see it to alter access.

:sessionUserId is the current session user's UserId. For anonymous 
guests, this is 0.

The partial query that I have quoted last time is mainly the main 
query's last condition in the WHERE clause. It handles access coming 
from tags. The first half of the condition handles access coming from 
the message's own ReadAccessKeylist.

(This is not a closed-source commercial thing. It is a web application 
that will be available on my website under the GPL when it's ready. It 
basically already works fine, just a little slow under some conditions.)

-- 
Yves Goergen "LonelyPixel" <nospam.list@stripped>
Visit my web laboratory at http://beta.unclassified.de
Thread
Inefficient query processing?Yves Goergen10 Feb
  • Re: Inefficient query processing?Peter Brawley11 Feb
    • Re: Inefficient query processing?Yves Goergen11 Feb
      • Re: Inefficient query processing?Peter Brawley11 Feb
        • Re: Inefficient query processing?Yves Goergen11 Feb
          • Re: Inefficient query processing?Peter Brawley11 Feb
            • Re: Inefficient query processing?Yves Goergen11 Feb
              • Re: Inefficient query processing?Peter Brawley11 Feb
                • Re: Inefficient query processing?Yves Goergen11 Feb
  • Re: Inefficient query processing?Perrin Harkins11 Feb
    • Re: Inefficient query processing?Yves Goergen11 Feb
      • Re: Inefficient query processing?Perrin Harkins11 Feb
      • Re: Inefficient query processing?Peter Brawley11 Feb