On the previous query optimization problem that I posted (for which no
response yet), I've ran into another twist.
The key/value pairs can be duplicated . I-e an Entity can have multiple
values for a key. This makes the count() > = no of conditions check
invalid. Now, I'm really breaking my brain trying to figure out what's
the accurate/fastest way to do this.
Here's the problem again:
I am using mysql 4.0.12 and am trying to build a parametric search
engine. The problem I believe can be essentially generalized as
Have a table named Entity (PK EntityID) and another table called
EntityProperties (PK EPID, Key varchar, Value varchar). EP stores
key/value proeprties of E.
What I would like to be able to do is to return a list of Entities that
match a given set of key/value pairs. The query in itself is simple:
Lets say I have 4 key value pairs (key1, value1) , (key2, value2),
(key3, value3), (key4, value4) entered for search. The resulting search
query would look like:
Entity E, EntityProperties EP
E.EID = EP.EID AND
(EP.KEY = key1 AND EP.VALUE = value1) OR
(EP.KEY = key2 AND EP.VALUE = value2) OR
(EP.KEY = key3 AND EP.VALUE = value3) OR
(EP.KEY = key4 AND EP.VALUE = value4) OR
GROUP BY E.EID
Having count(E.EID) >= 4;
The count is 4, as there are 4 matching conditions required.
The problem comes in when I have like 200,000 rows in E and 2 million
rows in EP, the query above just crawls to like 20 secs average.
So I create a 2 column index on EP (Key, value), (To do that I had to
change the table type to MyISAM from InnoDB as Value is of type Text
which does not support index creation on InnoDB), and the query does
become faster, comes down to around 7 secs avg.
But a) This is still not fast enough, I would like it to be under a
second, and b) I expect my EP to grow upto around 20-30 millions rows,
so it certainly won't scale too well.
I did think about creating key-value pair keywords, creating a column
in E to store them and then create a FULL-TEXT index and use a IN
BOOLEAN MODE search, however, In the key/value match conditions I use
above, I might also need comparison operators..i-e something like:
EP.KEY = key1 and EP.VALUE > value1
and that the current mysql full-text implementation does not allow.
I was also wondering if hash indexes can be used somehow, But I have no
idea a) how they work in mysql and b) whether they would be useful.
I would appreciate any advice/guidance in optimizing this query or a
totally different solution that accomplishes the same result and
showing me how to do this right.
|• Query Challenge - Another Issue||Aamir Baig||27 May|