List:General Discussion« Previous MessageNext Message »
From:Yves Goergen Date:February 10 2008 10:30pm
Subject:Inefficient query processing?
View as plain text  
Hi,

I have a performance problem with one of my SQL queries. It's a rather 
complex one so I'll spare you the details. This is the situation:

In my system, there's messages, tags and keylists. Each message has 
message_revisions, each message_revision can be assigned tags (stored in 
message_revision_tag). Each tag points to a keylist that contains all 
keys that grant access to messages with that tag. If a tag has no 
keylist assigned (ReadAccessKeylist IS NULL), then everybody may access 
the messages.

This query finds all messages that don't have a tag assigned that would 
deny access to it. (Assume every message has only a single revision with 
the number 1, for now. The actual user comparison with another 
sub-select is hidden in <some more>.)

SELECT m."MessageId"
FROM "message" m
WHERE
   NOT EXISTS
     (SELECT
       EXISTS
         (SELECT 1
         FROM "keylist" tk
         WHERE tk."KeylistId" = t."ReadAccessKeylist" AND <some more>)
       AS "Allowed"
     FROM "message_revision_tag" mrt
       JOIN "tag" t USING ("TagId")
     WHERE mrt."MessageId" = m."MessageId" AND
       mrt."RevisionNumber" = 1 AND
       t."ReadAccessKeylist" IS NOT NULL
     HAVING NOT "Allowed")

My problem is that the sub-select in line 7 ("SELECT 1") takes a rather 
long time. (When I remove it, it's much faster.) I'm not sure why, 
because there's not a single keylist in that table, however. Another 
issue is that this query should actually never be regarded. The 
condition in the second-last line is always false. A simple test 
confirms that:

SELECT COUNT(*) FROM "tag" WHERE "ReadAccessKeylist" IS NOT NULL
-> 0

So there should never be a reason why the FROM in line 11 would result 
in a row (filtering out with the conditions, of course). But it still 
gets executed. When I make sure that the condition is always false, by 
adding a "AND 0" just before the HAVING clause, the whole thing runs 
much faster.

I have no separate timings, but it's in the magnitude of 1 vs. 5-10 
milliseconds for the query. Run a hundred times makes a noticeable delay.

My understanding of it all was that first the FROM clause is regarded to 
see what rows there are. Then WHERE filters them, then SELECT will pick 
some columns (and thereby execute my sub-select expression) and finally 
HAVING filters again. Since in my theory there is no single row, SELECT 
has nothing to do. But obviously it has.

Some suggestion what's going on?

-- 
Yves Goergen "LonelyPixel" <nospam.list@stripped>
Visit my web laboratory at http://beta.unclassified.de
Thread
Inefficient query processing?Yves Goergen10 Feb
  • Re: Inefficient query processing?Peter Brawley11 Feb
    • Re: Inefficient query processing?Yves Goergen11 Feb
      • Re: Inefficient query processing?Peter Brawley11 Feb
        • Re: Inefficient query processing?Yves Goergen11 Feb
          • Re: Inefficient query processing?Peter Brawley11 Feb
            • Re: Inefficient query processing?Yves Goergen11 Feb
              • Re: Inefficient query processing?Peter Brawley11 Feb
                • Re: Inefficient query processing?Yves Goergen11 Feb
  • Re: Inefficient query processing?Perrin Harkins11 Feb
    • Re: Inefficient query processing?Yves Goergen11 Feb
      • Re: Inefficient query processing?Perrin Harkins11 Feb
      • Re: Inefficient query processing?Peter Brawley11 Feb