List:Internals« Previous MessageNext Message »
From:Antony T Curtis Date:March 5 2001 12:09pm
Subject:Re: Optimisation suggestion...
View as plain text  
Michael Widenius wrote:
> 
> Hi!
> 
> Sorry for the delayed reply; I have been on a vacation for a week and
> are slowly catching up with all old emails.

No probs... Glad you're back and hope you enjoyed your holiday.

> >>>>> "Antony" == Antony T Curtis <antony@stripped> writes:
> 
> Antony> Thimble Smith wrote:
> >>
> >> On Wed, Feb 21, 2001 at 12:59:43PM +0000, Antony T Curtis wrote:
> >> > In the case where there is a SELECT on a table where no index is
> >> > directly applicable, if there is a compound index where the second
> part
> >> > (or third, etc) of the index is used in the WHERE clause, it should be
> >> > possible to still use that index by being able to skip rows rather
> than
> >> > doing a brute-force search through all the records.
> >> >
> >> > Any hints where I should look to see if such an optimisation could be
> >> > implemented?
> >>
> >> It isn't clear to me how that would work.  Do you mean, to scan
> >> the whole index in order to find rows where the secondary index
> >> part matches, instead of scanning the data file?  Or do you have
> >> something else in mind?  I think this optimization wouldn't work,
> >> but I'm not sure I understand what you're suggesting.
> >>
> >> Tim
> 
> Antony> The kind of thing it'd do is something like this...
> 
> Antony> If the key-part to scan is the (n)th key part of the index,
> 
> Antony> 1. Seek index to start
> Antony> 2. Read key from index
> Antony> 3. Check (n)th key part against end-range, if less or equal, goto 6
> Antony> 4. Increment (n-1)th key part and seek to greater/equal key
> Antony> 5. goto stage 2
> Antony> 6. Check (n)th key part against start-range, if greater/equal, goto xx
> Antony> 7. Set (n)th key part to start-range and seek to greather/equal key
> Antony> 8. goto 2
> Antony> 9. (the key should point to valid row in range)
> Antony> 10. Seek to next key
> Antony> 11. if not end of index, goto 6
> 
> The problem is that the above will only work if a major part of the
> keys are identical in the first n-1 parts. (You would guess that at
> least 1/3 of the keys must be identical per key block for this to
> work, which is not a very common situation).
> 
> The major problem is that when it comes to scan indexes compared to
> scan a fixed length data file, index scanning is at least 3 times
> slower than data file scanning (for B-trees).  The reason is that
> there is much more seeks involved when scanning the index structure
> and also because the index structure is more complicated than the data
> file structure, especially when we talk about compressed indexes.

This method for searches is currently used by the legacy database used
here where the developers hand-write it for where they want to use it -
they call it bounce/bump index searches.

Perhaps it could be made available on a per-table or per-handler (as a
flag), or even some variable the user may set.

<snip>

> As MySQL keeps a note about which indexes can be used to satisfy the
> WHERE and the SELECT part, it would not be hard to do this
> optimization for the handlers where the data is stored in the trees
> (Currently only BDB tables).
> 
> The general optimization, where you allow any operator on the key
> parts is easy to do:
> 
> - Add a handler function:
>   read_next_unique(length-of-key-to-test)
>   and add in sql_select.cc two new functions:
>   join_read_next_unique() and init_join_read_next_unique()
>   that could be based on join_read_next() and init_join_read_next()
> 
>   in this case you need to call handler::read_next() as long as the
>   WHERE clause was satisfied and read_next_unique() when the WHERE
>   clause wasn't satisfied.
> 
> The general optimization to handle cases like:
> 
> SELECT * from test where key_part2>= "A" and key_part2 <="B" or
> key_part2 >="D" or key_part2 <="E"
> 
> is much harder to do;  In this case we could theoretically skip to
> next key_part1 key when: we have found a key (#, "F"), but to do this
> one would have to do a lot of new code, in the style of opt_range.cc

I think I'd read up on the code and see if I could develop an
implementation... If I get anything to work, this list would be the
first to know... Perhaps a useful feature to add in to 4.0.x

...

As an aside, is there a timescale for 4.0.x? I'd really like to see
unions and views ;)


-- 
ANTONY T CURTIS                     Tel: +44 (1635) 36222
Abacus Polar Holdings Ltd           Fax: +44 (1635) 38670
> The full impact of parenthood doesn't hit you until you multiply the
> number of your kids by 32 teeth.
Thread
character set conf fileChi-Wai Lau21 Feb
  • character set conf fileMichael Widenius21 Feb
  • Optimisation suggestion...Antony T Curtis21 Feb
    • Re: Optimisation suggestion...Thimble Smith21 Feb
  • Re: Optimisation suggestion...Antony T Curtis21 Feb
    • Re: Optimisation suggestion...Michael Widenius5 Mar
  • Re: Optimisation suggestion...Antony T Curtis5 Mar
    • Re: Optimisation suggestion...Michael Widenius5 Mar
  • Re: Optimisation suggestion...Antony T Curtis6 Mar
RE: Optimisation suggestion...Enrique Villar6 Mar
  • Re: Optimisation suggestion...Antony T Curtis6 Mar