List: Falcon Storage Engine
From: Jim Starkey
Date: February 19 2009 8:12pm
Subject: Re: Index Recovery
Vladislav Vaintroub wrote:
>> -----Original Message-----
>> From: Jim Starkey [mailto:jstarkey@stripped]
>> Sent: Thursday, February 19, 2009 7:55 PM
>> To: Vladislav Vaintroub
>> Cc: 'Falcon'
>> Subject: Re: Index Recovery
>> I think we need to find out what is happening.  Pages shouldn't get
>> lost.  If there's a bug, it should be fixed, not papered over.
> I think the bug is missing atomic logging of a split.
> There are bugs in index logging. I think at least one of them is obvious to me.
> 1. IndexPage::splitPage(). If a crash happens and the next page is flushed, its
> prior pointer points to nowhere. It cannot be fixed simply by adding
> logging to the next page. The newly created page must be logged as well. But if
> it is logged, the btree remains inconsistent, because we get an empty page
> in the middle of the btree, and that breaks its consistency. This
> leads to a rather complicated restructuring of the current code so that the next
> page is released much later, when the split page is released. Which is no
> different from what I propose.
OK, but is this what you're seeing?  This is theoretically possible, but 
with a probability so low as to be negligible.  For this to happen, you 
would have to not only have a crash during an index split, but also to 
have a serial log window overflow that forced one or two of the serial 
log records to disk leaving the other(s) unwritten.

It is trivial to check whether this is the case that you are seeing.  
Unless the current serial log block is the last in the log and also the 
last in the window, this isn't the bug that you are seeing.
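
As a rough illustration of that check, here is a minimal C++ sketch. The 
struct, its fields, and the function name are hypothetical and are not 
taken from the Falcon source; it only assumes that the relevant block 
numbers can be read from the log after the crash.

```cpp
#include <cstdint>

// Hypothetical, simplified view of the serial log state after a crash.
// None of these names come from the Falcon source.
struct SerialLogState {
    std::uint64_t currentBlock;       // block holding the split's log records
    std::uint64_t lastBlockInLog;     // last block present in the serial log
    std::uint64_t lastBlockInWindow;  // last block of the window containing currentBlock
};

// The partially written split is only possible when the current block is
// both the last block in the log and the last block in its window.
bool couldBePartiallyLoggedSplit(const SerialLogState& s) {
    return s.currentBlock == s.lastBlockInLog &&
           s.currentBlock == s.lastBlockInWindow;
}
```

If this returns false for the failing case, the lost page has some other 
cause.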

I don't object to your proposed change, but I don't want a different 
problem to go uninvestigated.
> 2. Another piece of code that looks buggy (I'm not absolutely sure it is,
> and please correct me if I'm wrong) is at the end of
> IndexRootPage::splitIndexPage
> 			splitBdb->mark (transId);
> 			splitPage = (IndexPage*) splitBdb->buffer;
> 			splitPage->parentPage = bdb->pageNumber;
> 			bdb->release(REL_HISTORY);
> 			splitBdb->release(REL_HISTORY);
> Nice idea, changing the parent page and forgetting to log it.
> There could be a bunch of similar errors that are less obvious, and the
> reason for the bugs is the lack of a robust logging strategy. It should not be
> possible to modify a page during the split and forget to log it. This is
> what I propose.
I don't think that is a bug.  An attempted insertion forced a split.  
This is a retry after the split was complete.  The node insertion would 
have already been logged.
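
For reference, the invariant proposed in the quoted text (a page modified 
during a split cannot be released without being logged) could be sketched 
in C++ along the following lines. This is only an illustration under 
assumed names: SplitLog and LoggedPageGuard are hypothetical and do not 
exist in Falcon, and the real Bdb mark/release calls are not modelled.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the serial log: records which pages were logged.
struct SplitLog {
    std::vector<std::int32_t> loggedPages;
    void logPage(std::int32_t pageNumber) { loggedPages.push_back(pageNumber); }
};

// Hypothetical guard: a page marked during a split is logged when the guard
// goes out of scope, so it cannot be released unlogged by accident.
class LoggedPageGuard {
public:
    LoggedPageGuard(SplitLog& log, std::int32_t pageNumber)
        : log_(log), pageNumber_(pageNumber) {}

    // Log the page on release, but only if it was actually modified.
    ~LoggedPageGuard() {
        if (dirty_)
            log_.logPage(pageNumber_);
    }

    LoggedPageGuard(const LoggedPageGuard&) = delete;
    LoggedPageGuard& operator=(const LoggedPageGuard&) = delete;

    // Call before modifying the page, in the role of a mark(transId) call.
    void mark() { dirty_ = true; }

private:
    SplitLog& log_;
    std::int32_t pageNumber_;
    bool dirty_ = false;
};
```

A page that is marked is logged exactly once when its guard is destroyed; 
a page that is only read is not logged at all.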

Jim Starkey
President, NimbusDB, Inc.
978 526-1376
