Ok, so I guess it is more complicated than that.
This query which has 5M records that match its criteria returns
instantly:
SELECT ELEMS.id
FROM ELEMS
WHERE ((
ELEMS.nodeID IN (
SELECT link.childID
FROM link
JOIN path ON link.parentID=path.decendantId
WHERE (path.ancestorId = 1))))
LIMIT 0,100;
Now if I change the ancestorId criteria to a node group that does not
exist the query takes a very long time. Btw, it also looks like this is
being optimized into a less efficient EXISTS query. Anyway, the join
version doesn't have the same problem, it is fast if it is searching for
a conditions that has results or one that has none. Note that the JOIN
version requires a SELECT in case a node has multiple ancestors.
SELECT DISTINCT ELEMS.id
FROM ELEMS
JOIN link ON ELEMS.nodeID = link.childID
JOIN path ON link.parentID=path.decendantId
WHERE (path.ancestorId = 1)
LIMIT 0,100;
Anyone have any ideas why this is the case?
-----Original Message-----
From: Robert DiFalco [mailto:rdifalco@stripped]
Sent: Thursday, May 24, 2007 1:11 PM
To: mysql@stripped
Subject: Need confirmation: Subselects are broken with regards to index
usage?
I think I'm discovering that sub-selects in MySQL are broken. Is that
true? It seems like you cannot have a sub-select without doing a table
scan -- even for a constant IN expression -- this because it gets
re-written as an EXISTS that executes for each row.
Is that true? Forcing an index doesn't even seem to help.
R.