From: Jim Starkey
Date: July 6 2009 6:18pm
Subject: Double Recoveries
List-Archive: http://lists.mysql.com/falcon/785
Message-Id: <4A523FDA.4000704@nimbusdb.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Recovery from a failure during recovery of an otherwise recoverable database is a difficult problem -- and one that should be given careful thought.

Perhaps we should take a cue from the Hippocratic oath: First, do no harm. In the context of a recovery, this means that the database shouldn't be overwritten until both a) we know the recovery will succeed, and b) the overwrite can be restarted if it fails for any reason.

Our normal recoveries are close. Usually, the page cache is big enough to hold all changes until the recovery is logically complete, and it is then written simply and quickly, so failure is highly unlikely.

I think we could do better. Here's a suggestion:

1. Write any and all recovered pages to a separate file during recovery.
2. After the third serial log pass, flush the cache to the separate file, followed by a write of the recovery metadata.
3. After everything required to restart the flush has been written, mark the database header page as recovering.
4. Fsync the disk.
5. Start the copy operation. If the system crashes, the copy operation can be restarted.

This isn't as much hassle as one might think: the separate file is a sequential list of pages, and the metadata is a mapping from (tablespace, page number) to (page number) in the separate file.

This will also make things a great deal easier to debug.
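To make the five steps concrete, here is a minimal Python sketch of the idea. It is not Falcon code. The function names, the 4096-byte page size, the JSON encoding of the metadata, and the use of a separate flag file in place of the database header page are all assumptions made for illustration.

```python
import json
import os

PAGE_SIZE = 4096  # assumed page size, for illustration only


def _fsync_write(path, data):
    """Write bytes to path and force them to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def stage_recovery(side_path, meta_path, pages):
    """Steps 1-2: write recovered pages to a side file, then the metadata.

    pages maps (tablespace, page_number) -> page bytes of length PAGE_SIZE.
    The metadata maps "tablespace:page_number" -> slot in the side file.
    """
    mapping = {}
    with open(side_path, "wb") as side:
        for slot, ((tablespace, page_no), data) in enumerate(sorted(pages.items())):
            if len(data) != PAGE_SIZE:
                raise ValueError("page has the wrong size")
            side.write(data)
            mapping[f"{tablespace}:{page_no}"] = slot
        side.flush()
        os.fsync(side.fileno())
    _fsync_write(meta_path, json.dumps(mapping).encode())


def mark_recovering(flag_path):
    """Steps 3-4: hypothetical stand-in for marking the header page, then fsync."""
    _fsync_write(flag_path, b"recovering")


def copy_pages(side_path, meta_path, flag_path, tablespace_paths):
    """Step 5: copy staged pages into place. Safe to rerun after a crash."""
    if not os.path.exists(flag_path):
        return  # not marked as recovering, so the database is left untouched
    with open(meta_path, "rb") as f:
        mapping = json.loads(f.read().decode())
    with open(side_path, "rb") as side:
        for key, slot in mapping.items():
            tablespace, page_no = key.rsplit(":", 1)
            side.seek(slot * PAGE_SIZE)
            data = side.read(PAGE_SIZE)
            with open(tablespace_paths[tablespace], "r+b") as db:
                db.seek(int(page_no) * PAGE_SIZE)
                db.write(data)
                db.flush()
                os.fsync(db.fileno())
    # Clear the mark only after every page has been forced to disk.
    os.remove(flag_path)
```

Because each staged page is written to a fixed offset and the mark is cleared last, rerunning copy_pages after a crash simply rewrites the same bytes. That is the property step 5 relies on.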