List: | Internals | « Previous MessageNext Message » | |

From: | Rick James | Date: | July 15 2009 6:09pm |

Subject: | RE: help with index_merge and clustering keys | ||

View as plain text |

With ranges in the WHERE (a > 1 and b > 1 and c > 1 and d > 1) * With clustered keys all you can do is pick the range with the lowest cardinality, and ignore the other indexes during the execution. * If none of the keys are clustered, then index merge may be the most efficient way to execute the query. * With some keys clustered and some are not, it becomes a battle between the above schemes. Probably not easy to pick the optimal plan. Rick James MySQL Geeks - Consulting & Review > -----Original Message----- > From: Sergey Petrunya [mailto:psergey@stripped] > Sent: Wednesday, July 15, 2009 3:21 AM > To: Zardosht Kasheff > Cc: internals@stripped> Subject: Re: help with index_merge and clustering keys > > On Sun, Jul 12, 2009 at 01:01:14PM -0400, Zardosht Kasheff wrote: > > ah, yes, I made the mistake of making my example have > equality checks > > instead of ranges. > > > > So instead of "= 1", suppose each of the range checks are > "> 1" (or > > > 100000, some value where the optimizer needs records_in_range to > > determine the cost estimate of using a key). > > > > So for the simple example Rich mentioned, "select * from > foo where a > > > 1 and b > 1" > > The more complex example, "select * from foo where a > 1 and b > 1 > > and c > 1 and d > 1" > > > > What happens here? > Ok, here it goes: > > The whole process is in opt_range.cc, > SQL_SELECT::test_quick_select() function. > get_mm_tree() call analyzes the WHERE condition and > constructs possible > range/index_merge accesses. > > Then get_key_scans_params() is called, which uses > records_in_range() calls to > get #records estimates and costs for range accesses (i.e. > accesses using > individual indexes), and picks the best range access (this is > fairly easy as > the number of options is limited by the number of available indexes). > > Then it starts looking for best ROR[*]-intersection. This is > more complicated > as N ranges over different indexes give 2^N-N possible ways > to intersect it. > This is done in two steps > > * get_best_ror_intersect() searches for best ROR-intersection > > Then get_best_ror_intersect() is called which tries to construct a > ROR-intersection. The idea is that we available range scans by their > desirability: > > ranges[]= {range_scan(key1), range_scan(key2), range_scan(key3) } > > and then consider doing intersect on > ranges[0], ranges[1] > ranges[0], ranges[1], ranges[2] > ranges[0], ranges[1], ranges[2], ranges[3] > ... and so forth > > and pick the best intersection. > > After that, get_best_covering_ror_intersect() is called which > uses a similar > approach but only considers covering intersections (i.e. > those where one won't > have to call rnd_pos() to retrieve the full table record). > > This algorithm works, albeit not fully, it is known to make > poor choices in > certain situations. > > Hope this clarifies things a bit. > > [*] ROR=Rowid Ordered Retrieval. At the moment we only > support intersection of > ordered streams, more exactly rowid-ordered. index scans are > expected produce > records in rowid order when: > * it is a scan on clustered primary key > * it is a scan on "keyXpart1=const AND .. keyXpartN=const", > where N is the last > [visible] part. > > BR > Sergey > -- > Sergey Petrunia, Software Developer > Monty Program AB, http://askmonty.org > Blog: http://s.petrunia.net/blog > > -- > MySQL Internals Mailing List > For list archives: http://lists.mysql.com/internals > To unsubscribe: > http://lists.mysql.com/internals?unsub=1 > >