List: Falcon Storage Engine
From: Olav Sandstaa
Date: October 21 2008 7:11pm
Subject: Review request: Handling of exceptions after serial log is in state writeError

Hi,

I have committed the second patch for fixing Bug #39912 "Falcon can
crash after hitting problems with the serial log". This patch adds
code for handling uncaught exceptions that occur after the state of
the serial log has been set to "writeError". The patch is available here:

   http://lists.mysql.com/commits/56745

The following six situations, where MySQL previously would crash due to
uncaught exceptions, are now handled:

1. Rollback to savepoint: handled in StorageConnection::rollbackVerb()
   and in StorageInterface::rollback()

2. Rollback of transaction: handled in StorageConnection::rollback()
   and in StorageInterface::rollback()

3. Commit: handled in StorageConnection::commit() and
   StorageInterface::commit()

4. Commit of implicit transactions: handled in
   StorageConnection::commit(),
   StorageConnection::endImpliciteTransaction() and
   StorageInterface::external_lock()

Cases 1-4 will now return HA_ERR_LOGGING_IMPOSSIBLE to the server
after the serial log becomes "un-writable".

5. Scavenger: uncaught exception when committing updates to the
   cardinalities. This is now handled in
   Database::updateCardinalities(). After this situation has occurred,
   cardinalities will no longer be updated.

6. IO-thread: uncaught exception when writing the check point log
   record. This is now handled in Cache::ioThread(). Checkpoint will
   continue to run but the checkpoint log record will not be written.
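
To illustrate the pattern used for cases 1-4, here is a minimal sketch
(not the code from the patch). All names in it are hypothetical
stand-ins: SerialLog, Connection and handlerCommit() only mimic the
roles of the serial log, StorageConnection and StorageInterface, and
kErrLoggingImpossible is a placeholder for HA_ERR_LOGGING_IMPOSSIBLE.
The idea is that the lower layer lets the exception propagate and the
layer facing the server translates it into an error code:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; these are NOT Falcon's actual definitions.
constexpr int kOk = 0;
constexpr int kErrLoggingImpossible = 1;  // placeholder for HA_ERR_LOGGING_IMPOSSIBLE

// Exception type assumed for a failed serial log write.
struct SerialLogWriteError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Simulated serial log: every write throws once the state is writeError.
struct SerialLog {
    bool writeError = false;

    void write(const std::string& record) {
        if (writeError)
            throw SerialLogWriteError("serial log is in state writeError");
        (void)record;  // a real log would append the record here
    }
};

// Lower layer (role of StorageConnection): lets the exception propagate.
struct Connection {
    SerialLog& log;

    void commit()   { log.write("commit"); }
    void rollback() { log.write("rollback"); }
};

// Boundary helper: run an operation and turn a log failure into an
// error code, so the exception never reaches the server.
template <typename Op>
int guarded(Op op) {
    try {
        op();
        return kOk;
    } catch (const SerialLogWriteError&) {
        return kErrLoggingImpossible;
    }
}

// Upper layer (role of StorageInterface): returns codes, never throws
// SerialLogWriteError.
int handlerCommit(Connection& c)   { return guarded([&] { c.commit(); }); }
int handlerRollback(Connection& c) { return guarded([&] { c.rollback(); }); }
```

With this shape, a commit or rollback issued after the log has failed
returns the error code to the caller instead of terminating the process.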

NOTE: Pay particular attention to the last of these, given that the way
to get out of this situation is to do a successful recovery. This fix
might result in checkpoints that write pages to the database without
having a complete checkpoint log record between them (or at least I
think this might be a possible scenario). Can this cause problems for
the recovery process? (I do not think so, but thought it was good to
mention it :-) ).
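
To make the scenario in the note concrete, here is a small sketch of
the behaviour described for case 6. It is an assumption-laden model,
not the code in Cache::ioThread(): Checkpointer, runOnce() and the
counters are hypothetical names, and page flushing is reduced to a
counter. It shows pages still being written on every checkpoint while
the checkpoint log record is skipped once the log is in writeError:

```cpp
#include <cassert>
#include <stdexcept>

// Exception type assumed for a failed serial log write (hypothetical).
struct SerialLogWriteError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Simulated serial log that refuses writes in the writeError state.
struct SerialLog {
    bool writeError = false;

    void writeCheckpointRecord() {
        if (writeError)
            throw SerialLogWriteError("serial log is in state writeError");
    }
};

// Hypothetical model of the checkpoint part of the IO thread.
struct Checkpointer {
    SerialLog& log;
    int pagesFlushed = 0;    // pages written to the database file
    int recordsWritten = 0;  // checkpoint log records written
    int recordsSkipped = 0;  // checkpoints that ended without a log record

    explicit Checkpointer(SerialLog& l) : log(l) {}

    void runOnce(int dirtyPages) {
        // Pages are flushed regardless of the state of the serial log.
        pagesFlushed += dirtyPages;

        try {
            log.writeCheckpointRecord();
            ++recordsWritten;
        } catch (const SerialLogWriteError&) {
            // Swallow the exception so the thread keeps running;
            // this checkpoint has no log record.
            ++recordsSkipped;
        }
    }
};
```

In this model, consecutive calls to runOnce() after the failure flush
pages with no checkpoint log record between them, which is the
situation the question about recovery refers to.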

Call stacks for all uncaught exceptions scenarios are available in the
bug report.

Concern 1: Wlad thinks that HA_ERR_LOGGING_IMPOSSIBLE is the wrong
error code to return. His main objection was that this is an error
code that is only used by InnoDB in relation to replication. If we
come up with a better error code I can either update the patch before
it gets pushed or do it as a separate patch later. (In any case,
HA_ERR_LOGGING_IMPOSSIBLE is better than returning DuplicateKeyError,
as we did earlier, or crashing the process.)

Concern 2: There are likely more cases where it is possible to get
uncaught exceptions after the serial log is in writeError state, but I
do think I have covered the most frequently occurring ones.

Testing: I have tested that the solution fixes all of the cases above
except case 4, which I saw only once and could not easily reproduce.
The testing has been done by instrumenting the serial log to set its
state to writeError after some time.
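
For anyone wanting to try something similar, the kind of
instrumentation meant here can be sketched as below. This is not the
instrumentation I actually used: InstrumentedSerialLog and failAfter
are hypothetical names, and the sketch triggers on a write count
rather than on elapsed time so that it is deterministic:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical fault-injecting log: switches to writeError after a
// fixed number of successful writes and stays there.
class InstrumentedSerialLog {
public:
    explicit InstrumentedSerialLog(int failAfter) : failAfter_(failAfter) {}

    void write() {
        // Inject the fault once the configured number of writes is reached.
        if (!writeError_ && writes_ >= failAfter_)
            writeError_ = true;

        // The error state is sticky: every later write fails too.
        if (writeError_)
            throw std::runtime_error("serial log is in state writeError");

        ++writes_;
    }

    bool inWriteError() const { return writeError_; }
    int successfulWrites() const { return writes_; }

private:
    int failAfter_;
    int writes_ = 0;
    bool writeError_ = false;
};
```

Constructed with failAfter set to 2, the first two writes succeed and
every write after that throws, which lets each of the code paths above
be exercised with the log already in the error state.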

Please feel free to review and comment on the patch.

Olav
