From: Peter Brawley Date: February 15 2005 4:07am Subject: Re: join speed vs. 2 queries List-Archive: http://lists.mysql.com/mysql/179936 Message-Id: <4211759E.6030008@earthlink.net> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit >I have a gut feeling that this kind of join should be able to be >done with similar speed without having to use a temp table Yep but remember the query engine uses one index per table so without seeing your EXPLAIN output I'd try indexing ... the data table on name_id,value_id,campaign_id, the names table on id,name, the values table on id,value, then writing the query as ... SELECT COUNT(*) FROM data INNER JOIN names ON data.name_id=names.id INNER JOIN values ON data.value_id=values.id WHERE data.campaign_id = 22 AND names.name = 'content' AND values.value = 'index' PB ----- Mathew Ray wrote: > Newbie on the list here having a bit of confusion at the moment why an > INNER JOIN is taking so long... I have replaced a few column names to > make it a bit more succinct: > > > SELECT COUNT(*) > FROM data, values, names > > WHERE data.campaign_id = 22 > AND names.name = 'content' > AND values.value = 'index' > > AND data.name_id = names.id > AND data.value_id = value.id; > > > This query should pull out all of the index content from the data > table for campaign 22. name_id and value_id are indexed, as are the > name and value fields of the names and values tables. campaign_id is > also indexed in the data table and each name and value is unique per > campaign. The vardata dataset for this campaign that has around 163000 > entries and the above query takes nearly a minute to run. Total size > of data table is around 3 million records. > > On the same machine, the following query takes roughly 2 seconds to run: > > > CREATE TEMPORARY TABLE IF NOT EXISTS names_temp > SELECT names.id as var_id, values.id as val_id > FROM values, names > > WHERE names.campaign_id = 22 > AND values.campaign_id = names.campaign_id > > AND names.name = 'content' > AND values.value = 'index'; > > SELECT COUNT(*) > FROM vardata, names_temp > WHERE vardata.varNameId = names_temp.var_id > AND vardata.varValueId = names_temp.val_id; > > > After looking at EXPLAIN for both, I understand that the latter is > faster because it is doing lookups based on constant values, but I > have a gut feeling that this kind of join should be able to be done > with similar speed without having to use a temp table... Is there any > way to optimize the performance of the join query without having to go > with the two-query option? > > > -- No virus found in this outgoing message. Checked by AVG Anti-Virus. Version: 7.0.300 / Virus Database: 265.8.7 - Release Date: 2/10/2005