List:Commits« Previous MessageNext Message »
From:Olav Sandstaa Date:March 28 2009 8:48am
Subject:bzr push into mysql-6.0-falcon-team branch (olav:3083 to 3084) Bug#41665
View as plain text  
 3084 Olav Sandstaa	2009-03-28
      Fix for Bug #41665 Falcon crash in Transaction::waitForTransaction 
      
      This crash occurred in the deadlock detector code. The current thread
      A is traversing the waitingFor list while another thread C has just
      aborted (or possibly committed) and is releasing its reference count
      on the transaction state object. A third thread, B, that is just in
      front of C in the waitingFor list is waken up, and also releases its
      reference count on C. If A is inspecting object C as this happens,
      this object is deleted - and can result in thread A crashing.
      
      The fix to Transaction::waitForTransaction(): Before releasing the
      reference count on the transaction state object we aquire an exclusive
      lock on the active transaction list.  This will avoid that the
      transaction state object is released and deleted while another thread
      is transversing the waitingFor list.
      
      Also added a missing call to transState->release() in the catch block.
     @ storage/falcon/Transaction.cpp
        Fix to Transaction::waitForTransaction(): Before releasing the
        reference count on the transaction state object we aquire an exclusive
        lock on the active transaction list.  The cause for this is that
        during the deadlock detektion code another thread might have used this
        transaction state object's waitingFor pointer to navigate to the next
        transaction state object - and this thread might be the only one that
        has a reference count on that transaction state object.  This will
        avoid that the transaction state object is released and deleted while
        another thread is transversing the waitingFor list.

    modified:
      storage/falcon/Transaction.cpp
 3083 Kevin Lewis	2009-03-28
      By adding a temporary assert in RecordVersion::getPriorVersion() that checks if the thread has a CycleLock, I was able to find 6 more call stacks that needed a CycleLock.  This assert is not in this patch, but the 6 new CycleLocks are, along with a change in Database::start()  that creates the CycleManager sooner and a change in StorageDatabase that consistently collects the database from the connection instead of the table.

    modified:
      storage/falcon/Connection.cpp
      storage/falcon/Database.cpp
      storage/falcon/StorageDatabase.cpp
      storage/falcon/Table.cpp
=== modified file 'storage/falcon/Transaction.cpp'
--- a/storage/falcon/Transaction.cpp	2009-03-27 13:43:10 +0000
+++ b/storage/falcon/Transaction.cpp	2009-03-28 08:44:03 +0000
@@ -1083,11 +1083,15 @@ State Transaction::waitForTransaction(Tr
 			break;
 			}
 
+	// Release the lock on the active transaction list because we will
+	// possibly be blocked and will re-take the lock exclusively
+
+	syncActiveTransactions.unlock();
+
 	if (!(*deadlock))
 		{
 		try
 			{
-			syncActiveTransactions.unlock();
 			CycleLock *cycleLock = CycleLock::unlock();
 			transState->waitForTransaction();
 			
@@ -1098,6 +1102,11 @@ State Transaction::waitForTransaction(Tr
 			{
 			if (!COMPARE_EXCHANGE_POINTER(&transactionState->waitingFor, transState, NULL))
 				FATAL("waitingFor was not %p", transState);
+
+			// See comments about this locking further down
+
+			syncActiveTransactions.lock(Exclusive);
+			transState->release();
 				
 			throw;
 			}
@@ -1106,6 +1115,18 @@ State Transaction::waitForTransaction(Tr
 	if (!COMPARE_EXCHANGE_POINTER(&transactionState->waitingFor, transState, NULL))
 		FATAL("waitingFor was not %p", transState);
 
+	// Before releasing the reference count on the transaction state object we
+	// need to aquire a exclusive lock on the active transaction list. The 
+	// cause for this is that another thread might have used this 
+	// transaction state object's waitingFor pointer to navigate to the next 
+	// transaction state object - and this thread might be the only one that 
+	// has a reference count on that transaction state object. So to avoid that 
+	// the transaction state object is released and deleted while another 
+	// thread is transversing the waitingFor list (code above) with a shared
+	// lock on it we lock it exclusively.
+
+	syncActiveTransactions.lock(Exclusive);
+
 	state = (State) transState->state;
 	transState->release();
 


Attachment: [text/bzr-bundle] bzr/olav@sun.com-20090328084403-ojb0712tyxlemko4.bundle
Thread
bzr push into mysql-6.0-falcon-team branch (olav:3083 to 3084) Bug#41665Olav Sandstaa28 Mar