List:Internals« Previous MessageNext Message »
From:Rick James Date:July 10 2009 8:17pm
Subject:RE: help with index_merge and clustering keys
View as plain text  
Let's look at "select * from foo where a=1 and b = 1" in multiple contexts:

(I am assuming there an index containing both a and b.)

* Neither a nor b is clustered:  index_merge could be the most efficient way to perform
the query...  Find all PKs for a=1, find all PKs for b=1, take the intersection; then
fetch the rows.

* One of a or b is clustered:  As mentioned, use the clustered key only.  The other key
buys nothing.

* Both a and b are clustered:  Again, the other key is useless; simply use the clustered
key.

Are these the directions you are headed?

 
Rick James
MySQL Geeks - Consulting & Review

 

> -----Original Message-----
> From: Zardosht Kasheff [mailto:zardosht@stripped] 
> Sent: Friday, July 10, 2009 6:26 AM
> To: Sergey Petrunya
> Cc: internals@stripped; serg@stripped
> Subject: Re: help with index_merge and clustering keys
> 
> Hello Sergey,
> 
> Thank you very much for taking the time to look into this and provide
> feedback. I am trying the suggestion as we speak.
> 
> I want to answer several questions you have below.
> 
> 1) Yes, you are correct in understanding the clustering keys. Just
> like InnoDB has an "invisible tail" including all of the columns that
> does not show up in table->key_info[index_no].key_parts, so do
> clustering keys.
> 2) Of the proposed solutions below, I was hoping to implement #2 over
> time in a piece-wise fashion. I realize that the entire solution is
> difficult, so I was hoping to figure out pieces of the optimizer.
> There must be cases in the optimizer that deal with the fact that
> InnoDB's primary key is clustered. I was hoping to extend that logic
> to deal with clustering keys.
> 
> For the internals list, I also have a question of my own about the
> optimizer. Suppose I have an InnoDB table:
> create table foo (a int, b int, c int, d int ..., primary key (a),
> key(b), key(c))engine = InnoDB
> 
> Now suppose I have the query "select * from foo where a=1 and b = 1
> and c = 1 and d = 1".
> 
> The general way to answer this query is to use the keys to retrieve
> all rows were a, b, and c are all 1, and then filter the rows where d
> = 1. There seem to be two ways to retrieve all rows where a, b, and c
> are all 1. One is to do an intersect of b=1 and c=1, and then filter
> the rows retrieved where a != 1. Another way is to do a range query on
> the clustered primary key where a=1, and filter the rows where b != 1
> and c != 1. (There are also bad query plans where we can retrieve all
> rows where b=1 and filter those a !=1 and c != 1).
> 
> Where is the logic that makes that determines these specific options?
> Unless I do not understand the code correctly, it does not seem to be
> in  get_best_ror_intersect.
> 
> Thanks
> -Zardosht
> 
> On Fri, Jul 10, 2009 at 5:39 AM, Sergey 
> Petrunya<psergey@stripped> wrote:
> > Hi!
> >
> > On Thu, Jul 09, 2009 at 12:06:03PM -0400, Zardosht Kasheff wrote:
> >> Does anyone have any pointers on any functions to look at 
> for a hint
> >> of where to look here?
> >>
> >> For the example query, select * from foo where a = 1 and b 
> = 1, if we
> >> have a primary key of a, the optimizer knows that an 
> intersect is not
> >> necessary. Where is this code so that I may try to extend 
> that logic
> >> to clustering keys?
> >
> > Do I understand it correctly (from reading
> > 
> http://blogs.tokutek.com/tokuview/introducing_multiple_cluster
> ing_indexes) that
> > each key with HA_CLUSTERED_KEY flag has an invisble 'tail' 
> which has all table
> > columns that were not explicitly specified in the key?
> >
> > And the said 'tail' is not  present in 
> table->key_info[index_no].key_parts
> > array? (like it is for InnoDB's clustered primary key)
> >
> > In that case, I see three possible solutions:
> >
> > Solution1: Make the 'tail' be shown in 
> table->key_info[index_no].key_parts.
> > Then the optimizer won't need to care about clustered keys 
> having some
> > implicitly-present columns at the end.
> >
> > Solution2: Make the range/index_merge optimizer be aware of 
> the implicitly-
> > present 'tail' so that it uses tail members as regular key 
> parts (one of
> > the things the range optimizer will need will be the order 
> of elements in
> > the 'tail').
> >
> >
> > #1 and #2 will require relatively big changes/reviews etc, 
> although they will
> > provide complete solutions (the fact that the key is 
> clustered will be used to
> > the fullest extent).
> >
> > If you're not after the broadest solution and just need to 
> make the optimizer
> > not to construct index_merge/interesect(..., clustered_key, 
> ...), here is patch
> > that should do exactly that:
> >
> > Solution3:
> > === modified file 'sql/opt_range.cc'
> > --- sql/opt_range.cc    2009-03-24 13:58:52 +0000
> > +++ sql/opt_range.cc    2009-07-10 09:22:49 +0000
> > @@ -4535,8 +4535,16 @@ TRP_ROR_INTERSECT *get_best_ror_intersec
> >   for (idx= 0, cur_ror_scan= tree->ror_scans; idx < 
> param->keys; idx++)
> >   {
> >     ROR_SCAN_INFO *scan;
> > -    if (!tree->ror_scans_map.is_set(idx))
> > +    /*
> > +      Ignore non-ROR scans or 'clustered' (in Tokutek's 
> sense) keys.
> > +    */
> > +    uint keyno= param->real_keynr[idx];
> > +    if (!tree->ror_scans_map.is_set(idx) ||
> > +        (keyno != cpk_no &&
> > +         param->table->key_info[keyno].flags & 
> HA_HA_CLUSTERED_KEY))
> > +    {
> >       continue;
> > +    }
> >     if (!(scan= make_ror_scan(param, idx, tree->keys[idx])))
> >       return NULL;
> >     if (param->real_keynr[idx] == cpk_no)
> >
> >
> > (about the special treatment of clustered primary key in 
> index_merge/intersect: it is
> > included in index_merge but it's not really a scan. We get 
> columns of clustered primary
> > key from secondary key scans, and then use clustered 
> primary key scan condition
> > only for filtering)
> >
> > BR
> >  Sergey
> > --
> > Sergey Petrunia, Software Developer
> > Monty Program AB, http://askmonty.org
> > Blog: http://s.petrunia.net/blog
> >
> 
> -- 
> MySQL Internals Mailing List
> For list archives: http://lists.mysql.com/internals
> To unsubscribe:    
> http://lists.mysql.com/internals?unsub=1
> 
> 
Thread
help with index_merge and clustering keysZardosht Kasheff23 Jun
  • Re: help with index_merge and clustering keysZardosht Kasheff9 Jul
    • Re: help with index_merge and clustering keysSergey Petrunya10 Jul
      • Re: help with index_merge and clustering keysZardosht Kasheff10 Jul
        • RE: help with index_merge and clustering keysRick James10 Jul
          • RE: help with index_merge and clustering keysMichael Widenius16 Jul
            • Re: help with index_merge and clustering keysZardosht Kasheff16 Jul
            • RE: help with index_merge and clustering keysRick James16 Jul
              • RE: help with index_merge and clustering keysMichael Widenius19 Jul
        • Re: help with index_merge and clustering keysSergey Petrunya12 Jul
          • Re: help with index_merge and clustering keysZardosht Kasheff12 Jul
            • Re: help with index_merge and clustering keysSergey Petrunya15 Jul
              • RE: help with index_merge and clustering keysRick James15 Jul
              • Re: help with index_merge and clustering keysZardosht Kasheff24 Jul
      • Re: help with index_merge and clustering keysMichael Widenius16 Jul
Re: help with index_merge and clustering keysZardosht Kasheff10 Jul
Re: help with index_merge and clustering keysZardosht Kasheff16 Jul
  • RE: help with index_merge and clustering keysRick James16 Jul
    • Re: help with index_merge and clustering keysZardosht Kasheff16 Jul