From: Kevin Lewis Date: July 14 2009 1:11pm Subject: Re: Possible reason for the eternal loop in addIndexEntry? List-Archive: http://lists.mysql.com/falcon/792 Message-Id: <4A5C8416.6010207@sun.com> MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT > Lars-Erik Bjørk wrote: > Does this sound like a plausible theory? Yes it does. Go for it. Good catch and happy hunting! > Lars-Erik Bjørk wrote: > Hi all, > > I have looked at bug#46083 (falcon_pagesize2K fails in > IndexRootPage::addIndexEntry() during recovery) today, and I have > debugged the relevant parts using the log and tablespace files provided > by Olav. > > This is what seems to happen: > > We try to add a node in IndexRootPage::addIndexEntry. Here is a piece of > the code: > > > > /* If the node fits on page, we're done */ > > AddNodeResult result; > > for (;;) > { > result = page->addNode(dbb, key, recordNumber); > > > The node does not fit on the page, and we are told to split the page at > the middle. > > In IndexPage::splitIndexPageMiddle, we search for a node to split on. We > want to chose a node > where the prefix compression is poor. > > > for (; node.node < pageEnd; node.getNext(bucketEnd)) > { > //int l = > node.expandKey(key); > > if (node.offset || node.length) > { > if (node.nextNode > midpoint) > break; > > chain = node.node; > } > > } > > > > Then we check to make sure that this is not the last node on the page, > and if it is, we split on the previous node (prevNode) > (I sent a mail about the calculation of the previous node some weeks > ago, unrelated to this): > > // The split node should never be the last node on the page > if ((UCHAR*) node.nextNode > (UCHAR*) this + length) > node = prevNode; > > What I see in the debugger is that (UCHAR*) node.nextNode == (UCHAR*) > this + length (note *equal*). To my understanding, this *is* the last > node on the page, and the condition should have been '>=' instead of > just '>'. This means that we in this case split on the last node instead > of the second to last node. On the next iteration we will navigate to > this page once again, and try to insert into it. This page is still full > (same length as the last time, we split on the END_BUCKET node I > assume), and we will split it again. This goes on ... and on (I have > looped through 14 of the iterations). > > If I change this condition from '>' to '>=' and also actually update > prevNode in the loop showed earlier, so that it does not refer to the > first node on the page, the node will be inserted on the second > iteration, after the first split. The recovery still segfaults, but on a > different place that I have no knowledge of, further down the recovery > process :) > > Does this sound like a plausible theory? > > /Lars-Erik > > > >