With ranges in the WHERE (a > 1 and b > 1 and c > 1 and d > 1)
* With clustered keys all you can do is pick the range with the lowest
cardinality, and ignore the other indexes during the execution.
* If none of the keys are clustered, then index merge may be the most
efficient way to execute the query.
* With some keys clustered and some are not, it becomes a battle between
the above schemes. Probably not easy to pick the optimal plan.
Rick James
MySQL Geeks - Consulting & Review
> -----Original Message-----
> From: Sergey Petrunya [mailto:psergey@stripped]
> Sent: Wednesday, July 15, 2009 3:21 AM
> To: Zardosht Kasheff
> Cc: internals@stripped
> Subject: Re: help with index_merge and clustering keys
>
> On Sun, Jul 12, 2009 at 01:01:14PM -0400, Zardosht Kasheff wrote:
> > ah, yes, I made the mistake of making my example have
> equality checks
> > instead of ranges.
> >
> > So instead of "= 1", suppose each of the range checks are
> "> 1" (or >
> > 100000, some value where the optimizer needs records_in_range to
> > determine the cost estimate of using a key).
> >
> > So for the simple example Rich mentioned, "select * from
> foo where a >
> > 1 and b > 1"
> > The more complex example, "select * from foo where a > 1 and b > 1
> > and c > 1 and d > 1"
> >
> > What happens here?
> Ok, here it goes:
>
> The whole process is in opt_range.cc,
> SQL_SELECT::test_quick_select() function.
> get_mm_tree() call analyzes the WHERE condition and
> constructs possible
> range/index_merge accesses.
>
> Then get_key_scans_params() is called, which uses
> records_in_range() calls to
> get #records estimates and costs for range accesses (i.e.
> accesses using
> individual indexes), and picks the best range access (this is
> fairly easy as
> the number of options is limited by the number of available indexes).
>
> Then it starts looking for best ROR[*]-intersection. This is
> more complicated
> as N ranges over different indexes give 2^N-N possible ways
> to intersect it.
> This is done in two steps
>
> * get_best_ror_intersect() searches for best ROR-intersection
>
> Then get_best_ror_intersect() is called which tries to construct a
> ROR-intersection. The idea is that we available range scans by their
> desirability:
>
> ranges[]= {range_scan(key1), range_scan(key2), range_scan(key3) }
>
> and then consider doing intersect on
> ranges[0], ranges[1]
> ranges[0], ranges[1], ranges[2]
> ranges[0], ranges[1], ranges[2], ranges[3]
> ... and so forth
>
> and pick the best intersection.
>
> After that, get_best_covering_ror_intersect() is called which
> uses a similar
> approach but only considers covering intersections (i.e.
> those where one won't
> have to call rnd_pos() to retrieve the full table record).
>
> This algorithm works, albeit not fully, it is known to make
> poor choices in
> certain situations.
>
> Hope this clarifies things a bit.
>
> [*] ROR=Rowid Ordered Retrieval. At the moment we only
> support intersection of
> ordered streams, more exactly rowid-ordered. index scans are
> expected produce
> records in rowid order when:
> * it is a scan on clustered primary key
> * it is a scan on "keyXpart1=const AND .. keyXpartN=const",
> where N is the last
> [visible] part.
>
> BR
> Sergey
> --
> Sergey Petrunia, Software Developer
> Monty Program AB, http://askmonty.org
> Blog: http://s.petrunia.net/blog
>
> --
> MySQL Internals Mailing List
> For list archives: http://lists.mysql.com/internals
> To unsubscribe:
> http://lists.mysql.com/internals?unsub=1
>
>