List: Falcon Storage Engine
From: Jim Starkey
Date: February 19 2009 6:54pm
Subject: Re: Index Recovery
Vladislav Vaintroub wrote:
> Hello,
> I keep getting errors in index recovery (typically a lost parent page of an
> index page: it is not on disk, and I cannot find any info about this page
> in the whole log).
> I have an idea on how to log splits to make recovery work better. Please
> give me your feedback.
>
> The basic idea is that we no longer log single pages but a bunch of
> pages (every page that was changed during the split). And we do it
> atomically, in a single serial log record. And we do not release them until
> they are logged.
>
> Our index page has some links to other pages (next on the same level, prior
> on the same level, and parent), and when splitting, some or all of them are
> modified. Also, a new page is always created (and I believe even 2
> sometimes).
> That means the new record type, which includes several pages, would be
> somewhat heavier than the individual pages we used to log. On the other hand,
> a split should be considered a relatively rare operation; most page updates
> do not overflow.
>
> But the benefits are obvious (to me at least):
> the next/prior chain is consistent, the parent does not lose the child, the
> child does not lose the parent, and we do not need to think about the order
> when logging. We log an atomic operation (split) and there is no way that
> recovery gets an inconsistent index because the server stopped while doing a
> split and while
>
> Please share your thoughts.
>
>   
I think we need to find out what is happening.  Pages shouldn't get 
lost.  If there's a bug, it should be fixed, not papered over.

The parent and both children are all in use while all three are logged.  
I don't understand how they could get out of sync.
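
To make that claim concrete, here is a minimal sketch of "hold all three pages while each is logged". It is not Falcon's code: the Page class, the log_split function, the use of a plain list as the serial log, and the lock ordering by page number are all assumptions made for illustration.

```python
import threading
from contextlib import ExitStack


class Page:
    """Hypothetical index page with its own lock (not a Falcon class)."""

    def __init__(self, number):
        self.number = number
        self.lock = threading.Lock()


def log_split(parent, left, right, log):
    """Log the three pages touched by a split while all of them are held."""
    with ExitStack() as held:
        # Assumption: take the locks in page-number order to avoid deadlock.
        for page in sorted((parent, left, right), key=lambda p: p.number):
            held.enter_context(page.lock)
        # No page is released until all three have been appended to the log.
        for page in (parent, left, right):
            log.append(page.number)
    # Leaving the with-block releases every lock.
```

If the real code follows this shape, nothing else can change the parent or either child between the three log writes, which is why they should not get out of sync.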

Is there any chance that an attempted optimization has short-circuited 
the interlock between log pruning and the flush of the page cache?  We 
do depend on a page having been written when computing the recovery 
point.  If the page cache, in fact, has not been completely written, all 
hell will break loose during recovery.
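
The dependency described above can be sketched with a toy model. This is an illustration under stated assumptions, not Falcon's implementation: PageCache, recovery_point, prune, and the (lsn, page) log tuples are all hypothetical names, and the recovery point is assumed to be the oldest log position belonging to a page that has not yet been written.

```python
class PageCache:
    """Toy page cache: remembers the oldest unwritten change per dirty page."""

    def __init__(self):
        # page number -> log position of its oldest change not yet on disk
        self.first_dirty_lsn = {}

    def mark_dirty(self, page, lsn):
        # Keep the earliest position; later changes do not move it forward.
        self.first_dirty_lsn.setdefault(page, lsn)

    def flush(self, page):
        # Only call this once the page has really been written.
        self.first_dirty_lsn.pop(page, None)


def recovery_point(cache, next_lsn):
    """Oldest log position that recovery would still have to replay from."""
    return min(cache.first_dirty_lsn.values(), default=next_lsn)


def prune(log, cache, next_lsn):
    """Keep only the log records at or after the recovery point."""
    point = recovery_point(cache, next_lsn)
    return [(lsn, page) for lsn, page in log if lsn >= point]
```

In this model, if something calls flush() for a page that has not actually reached disk, recovery_point() moves past records that recovery still needs, and prune() discards them. That would match the symptom of a page that is neither on disk nor anywhere in the log.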

You seem to have an anomaly that shouldn't happen if the other 
mechanisms are working properly.  I strongly suggest you find what the 
actual failure is rather than working around it.
 

-- 
Jim Starkey
President, NimbusDB, Inc.
978 526-1376
