Newbie on the list here having a bit of confusion at the moment why an
INNER JOIN is taking so long... I have replaced a few column names to
make it a bit more succinct:
SELECT COUNT(*)
FROM data, values, names
WHERE data.campaign_id = 22
AND names.name = 'content'
AND values.value = 'index'
AND data.name_id = names.id
AND data.value_id = value.id;
This query should pull out all of the index content from the data table
for campaign 22. name_id and value_id are indexed, as are the name and
value fields of the names and values tables. campaign_id is also indexed
in the data table and each name and value is unique per campaign. The
vardata dataset for this campaign that has around 163000 entries and the
above query takes nearly a minute to run. Total size of data table is
around 3 million records.
On the same machine, the following query takes roughly 2 seconds to run:
CREATE TEMPORARY TABLE IF NOT EXISTS names_temp
SELECT names.id as var_id, values.id as val_id
FROM values, names
WHERE names.campaign_id = 22
AND values.campaign_id = names.campaign_id
AND names.name = 'content'
AND values.value = 'index';
SELECT COUNT(*)
FROM vardata, names_temp
WHERE vardata.varNameId = names_temp.var_id
AND vardata.varValueId = names_temp.val_id;
After looking at EXPLAIN for both, I understand that the latter is
faster because it is doing lookups based on constant values, but I have
a gut feeling that this kind of join should be able to be done with
similar speed without having to use a temp table... Is there any way to
optimize the performance of the join query without having to go with the
two-query option?
--
Thanks,
Mathew
..................................
Mathew J. Ray
Sr. Interactive Developer
IQ Television Group
..................................