On May 18, 2009, at 4:42 PM, Kevin Lewis wrote:
> I'm not sure if this code is in the context of searching from one
> page to the next, or also for the first node found from a parent
> page, in which case an extra expand would not be needed. But this
> looks like the right thing to do.
Kevin, you are correct about this. It is necessary when searching from
one page to the next at the same level, but it is not necessary for the
very first page we look at, where the expansion should already have been
done.
> Lars-Erik Bjørk wrote:
>> Hi Ann!
>> Reading your e-mail again, I believe that adding the
>> following piece of code will be the right thing to do:
>> for (;; first = true)
>>     {
>>     if (first)
>>         first = false;
>>     ...
>>     recordNumber = node.getNumber();
>>
>>     if (recordNumber >= 0)
>>         {
>>         node.expandKey(&indexKey); // This is the added line
>>         return recordNumber;
>>         }
>>     ...
>>     }
>> Given your explanation, I don't think the search works once the
>> BUCKET_END key diverges over time from the first entry in the next
>> block, unless we add this line (or unless the search only touches a
>> single page).
>> Btw, thank you for your nice explanation :)
>> On May 15, 2009, at 12:52 AM, Ann W. Harrison wrote:
>>> Lars-Erik Bjørk wrote:
>>>> I have been looking at the remaining cases where we are returned
>>>> few, or no, rows when doing a LIMIT query. It seems that when we
>>>> find a node with a recordNumber >= 0 as the first node on a page,
>>>> we forget to expand the key. I believed that this was deliberate,
>>>> as the key should already be expanded, being identical to the last
>>>> entry on the previous page. Looking at this in the debugger,
>>>> however, this does not seem to be the case.
>>> OK. As an index is created and grows, the last entry on page n will
>>> be the same as the first entry on page n + 1 and the first entry on
>>> each page will be completely expanded.
>>> The reason for expanding the first node completely is that we
>>> look at the first entry on an index page without having looked
>>> at the previous page as we're coming down from the top.
>>> The reason for duplicating entries is to avoid the cost of
>>> looking at the next page if the first entry on the next page
>>> is not in the range we want. The last entry on a page is
>>> NOT a valid index node and should not be used to look up
>>> a candidate record. If the key value of the last entry is
>>> in range, the look up should resume with the first entry on
>>> the next page.
>>> That's the case when we're adding entries to an index. When we
>>> remove entries, the rules change somewhat - and my guess is that
>>> they change too much. A specific difference, as Vlad pointed out,
>>> happens when you remove the first entry on a page. It's OK not
>>> to move the new first value to the previous page. If you remove
>>> the first entry on a page, the next entry is either the same or
>>> higher. In either case, the last entry on the previous page still
>>> marks the end of the range acceptably. There's some loss of
>>> optimization: you may read page n + 1 thinking that its first value
>>> is in range based on the last value on page n, only to find that
>>> the first actual value on that page is not in range, an old
>>> in-range value having been removed.
>>> However, if we don't expand the new first entry, all hell breaks
>>> loose. So,
>>> 1) The value of the last entry on a page is usually the
>>> same as the first entry on the next page.
>>> 2) The value of the last entry on a page is never greater
>>> than the value of the first entry on the next page.
>>> 3) The first entry on each page is completely expanded.
>>> If #3 isn't true, it should be.
>>> Falcon Storage Engine Mailing List
>>> For list archives: http://lists.mysql.com/falcon
>>> To unsubscribe: http://lists.mysql.com/falcon?unsub=1