From: Michael Widenius Date: July 16 2009 5:28pm Subject: Re: help with index_merge and clustering keys List-Archive: http://lists.mysql.com/internals/37197 Message-Id: <19039.25432.84019.768030@narttu.askmonty.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Hi! >>>>> "Sergey" == Sergey Petrunya writes: Sergey> Hi! Sergey> On Thu, Jul 09, 2009 at 12:06:03PM -0400, Zardosht Kasheff wrote: >> Does anyone have any pointers on any functions to look at for a hint >> of where to look here? >> >> For the example query, select * from foo where a = 1 and b = 1, if we >> have a primary key of a, the optimizer knows that an intersect is not >> necessary. Where is this code so that I may try to extend that logic >> to clustering keys? What do you mean with intersect here ? Do you mean that if you have a clustered key on 'b', then 'a' should be regarded as part of this key ? Sergey> Do I understand it correctly (from reading Sergey> http://blogs.tokutek.com/tokuview/introducing_multiple_clustering_indexes) that Sergey> each key with HA_CLUSTERED_KEY flag has an invisble 'tail' which has all table Sergey> columns that were not explicitly specified in the key? In other words, exactly how things works with InnoDB and it's primary key? Sergey> And the said 'tail' is not present in table->key_info[index_no].key_parts Sergey> array? (like it is for InnoDB's clustered primary key) In other words, for InnoDB clustered primary key we don't have things in key_parts either. Sergey> In that case, I see three possible solutions: Sergey> Solution1: Make the 'tail' be shown in table->key_info[index_no].key_parts. Sergey> Then the optimizer won't need to care about clustered keys having some Sergey> implicitly-present columns at the end. Don't think this is practical if you have many key segments (MySQL has a limit to how many key segments you can have). Sergey> If you're not after the broadest solution and just need to make the optimizer Sergey> not to construct index_merge/interesect(..., clustered_key, ...), here is patch Sergey> that should do exactly that: Sergey> Solution3: Sergey> === modified file 'sql/opt_range.cc' Sergey> --- sql/opt_range.cc 2009-03-24 13:58:52 +0000 Sergey> +++ sql/opt_range.cc 2009-07-10 09:22:49 +0000 Sergey> @@ -4535,8 +4535,16 @@ TRP_ROR_INTERSECT *get_best_ror_intersec Sergey> for (idx= 0, cur_ror_scan= tree->ror_scans; idx < param->keys; idx++) Sergey> { Sergey> ROR_SCAN_INFO *scan; Sergey> - if (!tree->ror_scans_map.is_set(idx)) Sergey> + /* Sergey> + Ignore non-ROR scans or 'clustered' (in Tokutek's sense) keys. Sergey> + */ Sergey> + uint keyno= param->real_keynr[idx]; Sergey> + if (!tree->ror_scans_map.is_set(idx) || Sergey> + (keyno != cpk_no && Sergey> + param->table->key_info[keyno].flags & HA_HA_CLUSTERED_KEY)) Sergey> + { Sergey> continue; Sergey> + } Sergey> if (!(scan= make_ror_scan(param, idx, tree->keys[idx]))) Sergey> return NULL; Sergey> if (param->real_keynr[idx] == cpk_no) Wouldn't it be easier to change the row: cpk_no= ((param->table->file->primary_key_is_clustered()) ? param->table->s->primary_key : MAX_KEY); to: cpk_no= table->file->clustered_on_key(); Where clustered_on_key() would return the either the key which is clustered or MAX_KEY ? Sergey> (about the special treatment of clustered primary key in index_merge/intersect: it is Sergey> included in index_merge but it's not really a scan. We get columns of clustered primary Sergey> key from secondary key scans, and then use clustered primary key scan condition Sergey> only for filtering) Isn't the above patch something that is generally useful even for InnoDB's clustered primary key? Regards, Monty