List:Falcon Storage Engine« Previous MessageNext Message »
From:Lars-Erik Bjørk Date:July 14 2009 12:41pm
Subject:Possible reason for the eternal loop in addIndexEntry?
View as plain text  
Hi all,

I have looked at bug#46083 (falcon_pagesize2K fails in 
IndexRootPage::addIndexEntry() during recovery) today, and I have 
debugged the relevant parts using the log and tablespace files provided 
by Olav.

This is what seems to happen:

We try to add a node in IndexRootPage::addIndexEntry. Here is a piece of 
the code:


/* If the node fits on page, we're done */

AddNodeResult result;

for (;;)
    result = page->addNode(dbb, key, recordNumber);

The node does not fit on the page, and we are told to split the page at 
the middle.

In IndexPage::splitIndexPageMiddle, we search for a node to split on. We 
want to chose a node
where the prefix compression is poor.

for (; node.node < pageEnd; node.getNext(bucketEnd))
    //int l =

    if (node.offset || node.length)
        if (node.nextNode > midpoint)

        chain = node.node;


Then we check to make sure that this is not the last node on the page, 
and if it is, we split on the previous node (prevNode)
(I sent a mail about the calculation of the previous node some weeks 
ago, unrelated to this):

    // The split node should never be the last node on the page
    if ((UCHAR*) node.nextNode > (UCHAR*) this + length)
        node = prevNode;

What I see in the debugger is that (UCHAR*) node.nextNode == (UCHAR*) 
this + length (note *equal*). To my understanding, this *is* the last 
node on the page, and the condition should have been '>=' instead of 
just '>'. This means that we in this case split on the last node instead 
of the second to last node. On the next iteration we will navigate to 
this page once again, and try to insert into it. This page is still full 
(same length as the last time, we split on the END_BUCKET node I 
assume), and we will split it again. This goes on ... and on (I have 
looped through 14 of the iterations).

If I change this condition from '>' to '>=' and also actually update 
prevNode in the loop showed earlier, so that it does not refer to the 
first node on the page, the node will be inserted on the second 
iteration, after the first split. The recovery still segfaults, but on a 
different place that I have no knowledge of, further down the recovery 
process :)

Does this sound like a plausible theory?


Possible reason for the eternal loop in addIndexEntry?Lars-Erik Bjørk14 Jul
  • Re: Possible reason for the eternal loop in addIndexEntry?Kevin Lewis14 Jul
  • Gold Star for LarsJim Starkey14 Jul
    • Re: Gold Star for LarsLars-Erik Bjørk15 Jul
Re: Possible reason for the eternal loop in addIndexEntry?Lars-Erik Bjørk16 Jul