List:General Discussion« Previous MessageNext Message »
From:Peter Brawley Date:February 10 2008 11:29pm
Subject:Re: Inefficient query processing?
View as plain text  
Yves,

 > My problem is that the sub-select in line 7
 > ("SELECT 1") takes a rather long time

It might be possible to simplify. Do I have the schema right?

message (messageID)
keylist (keylistID)
tag ( tagID, readaccesskeylist references keylist(keylistID) )
message_revision_tag ( ???, messageID references message(messageID), 
tagID references tag(tagID))

Or is there a message_revisions table missing from your description? On 
what I have, if the condition tag.readaccesskeylist IS NOT NULL defines 
a 'banned' condition, perhaps the query can be written without 
referencing the keylist table, since the requirement simply translates 
to finding all messages with no tag having a non-null readaccesskeylist, 
i.e. ...
(i} find the messages which have a non-null tag.readaccesskeylist,
(ii) find the messages which are not in [i].

(i) Finding messages which have a deny-access tag looks like a simple join:

SELECT DISTINCT messageID
FROM message_revision_tag AS mrt
JOIN tag AS t ON mrt.tagID=t.tagID
WHERE t.readaccesskeylist IS NOT NULL;

(ii) We get the messages not in the above result with a simple exclusion 
join:

SELECT messageID
FROM message m
LEFT JOIN (
  SELECT DISTINCT messageID
  FROM message_revision_tag AS mrt
  JOIN tag AS t ON mrt.tagID=t.tagID
  WHERE t.readaccesskeylist IS NOT NULL
) AS banned USING (messageID)
WHERE banned.messageID IS NULL,

Or did I miss something?

PB
http://www.artfulsoftware.com

-----

Yves Goergen wrote:
> Hi,
>
> I have a performance problem with one of my SQL queries. It's a rather 
> complex one so I'll spare you the details. This is the situation:
>
> In my system, there's messages, tags and keylists. Each message has 
> message_revisions, each message_revision can be assigned tags (stored 
> in message_revision_tag). Each tag points to a keylist that contains 
> all keys that grant access to messages with that tag. If a tag has no 
> keylist assigned (ReadAccessKeylist IS NULL), then everybody may 
> access the messages.
>
> This query finds all messages that don't have a tag assigned that 
> would deny access to it. (Assume every message has only a single 
> revision with the number 1, for now. The actual user comparison with 
> another sub-select is hidden in <some more>.)
>
> SELECT m."MessageId"
> FROM "message" m
> WHERE
>   NOT EXISTS
>     (SELECT
>       EXISTS
>         (SELECT 1
>         FROM "keylist" tk
>         WHERE tk."KeylistId" = t."ReadAccessKeylist" AND <some more>)
>       AS "Allowed"
>     FROM "message_revision_tag" mrt
>       JOIN "tag" t USING ("TagId")
>     WHERE mrt."MessageId" = m."MessageId" AND
>       mrt."RevisionNumber" = 1 AND
>       t."ReadAccessKeylist" IS NOT NULL
>     HAVING NOT "Allowed")
>
> My problem is that the sub-select in line 7 ("SELECT 1") takes a 
> rather long time. (When I remove it, it's much faster.) I'm not sure 
> why, because there's not a single keylist in that table, however. 
> Another issue is that this query should actually never be regarded. 
> The condition in the second-last line is always false. A simple test 
> confirms that:
>
> SELECT COUNT(*) FROM "tag" WHERE "ReadAccessKeylist" IS NOT NULL
> -> 0
>
> So there should never be a reason why the FROM in line 11 would result 
> in a row (filtering out with the conditions, of course). But it still 
> gets executed. When I make sure that the condition is always false, by 
> adding a "AND 0" just before the HAVING clause, the whole thing runs 
> much faster.
>
> I have no separate timings, but it's in the magnitude of 1 vs. 5-10 
> milliseconds for the query. Run a hundred times makes a noticeable delay.
>
> My understanding of it all was that first the FROM clause is regarded 
> to see what rows there are. Then WHERE filters them, then SELECT will 
> pick some columns (and thereby execute my sub-select expression) and 
> finally HAVING filters again. Since in my theory there is no single 
> row, SELECT has nothing to do. But obviously it has.
>
> Some suggestion what's going on?
>
Thread
Inefficient query processing?Yves Goergen10 Feb
  • Re: Inefficient query processing?Peter Brawley11 Feb
    • Re: Inefficient query processing?Yves Goergen11 Feb
      • Re: Inefficient query processing?Peter Brawley11 Feb
        • Re: Inefficient query processing?Yves Goergen11 Feb
          • Re: Inefficient query processing?Peter Brawley11 Feb
            • Re: Inefficient query processing?Yves Goergen11 Feb
              • Re: Inefficient query processing?Peter Brawley11 Feb
                • Re: Inefficient query processing?Yves Goergen11 Feb
  • Re: Inefficient query processing?Perrin Harkins11 Feb
    • Re: Inefficient query processing?Yves Goergen11 Feb
      • Re: Inefficient query processing?Perrin Harkins11 Feb
      • Re: Inefficient query processing?Peter Brawley11 Feb