Hi Vlad,
Thanks for fixing this high priority bug. The patch looks very good and
I have only minor comments:
Table.cpp:
* The new method waitForWriteComplete() is declared to not return any
value but in the implementation you have included a return statement.
Transaction.cpp:
* Question: Before your change to Transaction::commit() all records
were updated first and then all triggers were fired. After your change
you do both in a single loop. This is an area where I neither know the
code or what is correct but I have some vague memories about that an
AFTER trigger should run "after all operations have been done to the
involved records". Does your change here change semantic for triggers?
As I said this is something I do not know very much about so I am just
asking...
* Question: You have moved updates of cardinalities outside of the
region where you had a lock on
transactionManager->committedTransactions.syncObject. Does this increase
the likelihood that the cardinalities can be updated concurrently by
multiple threads? (and if yes, does it really matter?)
* Transaction;;hadRecords: the have text passed to the Sync object is
"Transaction::releaseDependency". Should this be
"Transaction::hasRecords" instead?
TransactionManager.cpp:
* In the method waitForWriteComplete() I the string added to the Sync
object should be "TransactionManager::waitForWriteComplete", not just
"waitForWriteComplete".
If you think the answer to the question I had about order of updates and
execution of triggers is "no problem" then this patch is "OK to push".
Regards,
Olav
Vladislav Vaintroub wrote:
> #At file:///C:/bzr/mysql-6.0-falcon-team/
>
> 2772 Vladislav Vaintroub 2008-08-11
> Bug#22165 - crash while doing ALTER in parallel with another transactions
> Eliminated races.
>
> Race condition 1.
> ALTER/DROP/TRUNCATE have to ensure that table has no committed records
> that are not yet "write complete". Otherwise crash is possible if table is
> deleted, but then accessed during Transaction::writeComplete() in
> Record::poke()
>
> It is now solved, Storage::external_lock will wait for all transaction on
> the table to become write complete, if there is no uncommited records.
>
> Race condition 2.
> In Transaction::commit(), Transaction state is set to Committed to early
> while the transaction is still in the active list. This can produce incorrect
> results in Transaction::hasUncommitedRecords, DROP is not blocked.
> When later Transaction::commit() traverses the record list, it can crash
> accessing freed memory in Table::updateRecord().
>
> This is fixed and transaction state is changed during the transition from
> active list to committed list.
> modified:
> mysql-test/suite/falcon/t/disabled.def
> storage/falcon/Database.cpp
> storage/falcon/Database.h
> storage/falcon/StorageTable.cpp
> storage/falcon/StorageTable.h
> storage/falcon/Table.cpp
> storage/falcon/Table.h
> storage/falcon/Transaction.cpp
> storage/falcon/Transaction.h
> storage/falcon/TransactionManager.cpp
> storage/falcon/TransactionManager.h
> storage/falcon/ha_falcon.cpp
>
> per-file messages:
> storage/falcon/Database.cpp
> new function waitForWriteComplete
> storage/falcon/Database.h
> new function waitForWriteComplete
> storage/falcon/StorageTable.cpp
> new function waitForWriteComplete
> storage/falcon/StorageTable.h
> new function waitForWriteComplete
> storage/falcon/Table.cpp
> new function waitForWriteComplete
> storage/falcon/Table.h
> new function waitForWriteComplete
> storage/falcon/Transaction.cpp
> Transaction state was set to Committed to early, allowing for DROP to process in
> paralell
> moved state change to later point, when transaction is removed from active list
> and added
> to committed list.
>
> Also, removed unnecessary loops : record list is now traversed
> only once in Transaction::commit() ,not 2 or 3 times
>
> Additionally, Transaction::hasRecords is now lock-protected to
> avoid races with writeComplete() and dropTable()
>
> Renamed hasUncommittedRecords() to hasRecords() - we're sometimes interested
> in committed no-yet-write-complete transactions
> storage/falcon/Transaction.h
> hasUncommittedRecords->hasRecords
> storage/falcon/TransactionManager.cpp
> new function TransactionManager::waitForWriteComplete() to
> wait for all transactions referencing records from given
> table to become "write complete"
> storage/falcon/TransactionManager.h
> new function waitForWriteComplete()
> storage/falcon/ha_falcon.cpp
> in StorageInterface::external_lock, if there is no uncommitted records
> for the table, also wait while committed records will become write
> complete. This is done to avoid race with Transaction::writeComplete()
> === modified file 'mysql-test/suite/falcon/t/disabled.def'
> --- a/mysql-test/suite/falcon/t/disabled.def 2008-08-07 21:11:55 +0000
> +++ b/mysql-test/suite/falcon/t/disabled.def 2008-08-11 13:22:53 +0000
> @@ -12,4 +12,3 @@
>
> falcon_bug_28095_I : Bug#xxxxx 2008-04-22 hakank Disabled until soft restart is in
> main tree
> falcon_bug_28095_II : Bug#xxxxx 2008-03-22 hakank Disabled until soft restart is in
> main tree
> -falcon_bug_22165 : Bug#22165 2008-08-07 wlad Disabled temporarily, because it
> crashes. Work in progress
>
> === modified file 'storage/falcon/Database.cpp'
> --- a/storage/falcon/Database.cpp 2008-07-17 13:52:17 +0000
> +++ b/storage/falcon/Database.cpp 2008-08-11 13:22:53 +0000
> @@ -2282,6 +2282,11 @@ bool Database::hasUncommittedRecords(Tab
> return transactionManager->hasUncommittedRecords(table, transaction);
> }
>
> +void Database::waitForWriteComplete(Table *table)
> +{
> + transactionManager->waitForWriteComplete(table);
> +}
> +
> void Database::commitByXid(int xidLength, const UCHAR* xid)
> {
> serialLog->commitByXid(xidLength, xid);
>
> === modified file 'storage/falcon/Database.h'
> --- a/storage/falcon/Database.h 2008-05-09 19:58:50 +0000
> +++ b/storage/falcon/Database.h 2008-08-11 13:22:53 +0000
> @@ -200,6 +200,7 @@ public:
> virtual void createDatabase (const char *filename);
> void renameTable(Table* table, const char* newSchema, const char* newName);
> bool hasUncommittedRecords(Table* table, Transaction *transaction);
> + void waitForWriteComplete(Table *table);
> void validateCache(void);
> int recoverGetNextLimbo(int xidSize, unsigned char *xid);
> void commitByXid(int xidLength, const UCHAR* xid);
>
> === modified file 'storage/falcon/StorageTable.cpp'
> --- a/storage/falcon/StorageTable.cpp 2008-08-04 15:53:52 +0000
> +++ b/storage/falcon/StorageTable.cpp 2008-08-11 13:22:53 +0000
> @@ -565,6 +565,11 @@ int StorageTable::alterCheck(void)
> return 0;
> }
>
> +void StorageTable::waitForWriteComplete(void)
> +{
> + share->table->waitForWriteComplete();
> +}
> +
> void StorageTable::unlockRow(void)
> {
> if (recordLocked)
>
> === modified file 'storage/falcon/StorageTable.h'
> --- a/storage/falcon/StorageTable.h 2008-08-04 15:53:52 +0000
> +++ b/storage/falcon/StorageTable.h 2008-08-11 13:22:53 +0000
> @@ -61,6 +61,7 @@ public:
> void transactionEnded(void);
> void setRecord(Record* record, bool locked);
> int alterCheck(void);
> + void waitForWriteComplete();
> void clearAlter(void);
> bool setAlter(void);
>
>
> === modified file 'storage/falcon/Table.cpp'
> --- a/storage/falcon/Table.cpp 2008-08-07 21:11:55 +0000
> +++ b/storage/falcon/Table.cpp 2008-08-11 13:22:53 +0000
> @@ -3331,6 +3331,11 @@ bool Table::hasUncommittedRecords(Transa
> return database->hasUncommittedRecords(this, transaction);
> }
>
> +void Table::waitForWriteComplete()
> +{
> + return database->waitForWriteComplete(this);
> +}
> +
> RecordVersion* Table::lockRecord(Record* record, Transaction* transaction)
> {
> Record *current = fetch(record->recordNumber);
>
> === modified file 'storage/falcon/Table.h'
> --- a/storage/falcon/Table.h 2008-04-22 21:10:23 +0000
> +++ b/storage/falcon/Table.h 2008-08-11 13:22:53 +0000
> @@ -185,6 +185,7 @@ public:
> int getFormatVersion();
> void validateAndInsert(Transaction *transaction, RecordVersion *record);
> bool hasUncommittedRecords(Transaction* transaction);
> + void waitForWriteComplete();
> void checkAncestor(Record* current, Record* oldRecord);
> int64 estimateCardinality(void);
> void optimize(Connection *connection);
>
> === modified file 'storage/falcon/Transaction.cpp'
> --- a/storage/falcon/Transaction.cpp 2008-07-31 10:04:30 +0000
> +++ b/storage/falcon/Transaction.cpp 2008-08-11 13:22:53 +0000
> @@ -275,38 +275,43 @@ void Transaction::commit()
> releaseRecordLocks();
>
> database->serialLog->preCommit(this);
> - state = Committed;
> syncActive.unlock();
>
> +
> +
> for (RecordVersion *record = firstRecord; record; record = record->nextInTrans)
> - if (!record->isSuperceded() && record->state != recLock)
> - record->format->table->updateRecord (record);
> + {
> + Table * table = record->format->table;
>
> - if (commitTriggers)
> - for (RecordVersion *record = firstRecord; record; record =
> record->nextInTrans)
> - if (!record->isSuperceded() && record->state != recLock)
> - record->format->table->postCommit (this, record);
> + if (!record->isSuperceded() && record->state != recLock)
> + {
> + table->updateRecord (record);
> + if (commitTriggers)
> + table->postCommit (this, record);
> + }
>
> + if (!record->getPriorVersion())
> + ++table->cardinality;
> + if (record->state == recDeleted && table->cardinality > 0)
> + --table->cardinality;
> + }
> +
> releaseDependencies();
> database->flushInversion(this);
>
> + // Transfer transaction from active list to committed list, set committed state
> Sync syncCommitted(&transactionManager->committedTransactions.syncObject,
> "Transaction::commit(2)");
> - syncCommitted.lock(Exclusive);
> -
> Sync
> syncActiveTransactions(&transactionManager->activeTransactions.syncObject,
> "Transaction::commit(3)");
> + syncCommitted.lock(Exclusive);
> syncActiveTransactions.lock(Exclusive);
> +
> transactionManager->activeTransactions.remove(this);
> - syncActiveTransactions.unlock();
> -
> - for (RecordVersion *record = firstRecord; record; record = record->nextInTrans)
> - {
> - if (!record->getPriorVersion())
> - ++record->format->table->cardinality;
> - if (record->state == recDeleted &&
> record->format->table->cardinality > 0)
> - --record->format->table->cardinality;
> - }
> transactionManager->committedTransactions.append(this);
> + state = Committed;
> +
> + syncActiveTransactions.unlock();
> syncCommitted.unlock();
> +
> database->commit(this);
>
> delete [] xid;
> @@ -925,8 +930,11 @@ void Transaction::truncateTable(Table* t
> ptr = &rec->nextInTrans;
> }
>
> -bool Transaction::hasUncommittedRecords(Table* table)
> +bool Transaction::hasRecords(Table* table)
> {
> + // This lock is to avoid race with writeComplete
> + Sync sync(&syncIndexes, "Transaction::releaseDependency");
> + sync.lock(Exclusive);
> for (RecordVersion *rec = firstRecord; rec; rec = rec->nextInTrans)
> if (rec->format->table == table)
> return true;
>
> === modified file 'storage/falcon/Transaction.h'
> --- a/storage/falcon/Transaction.h 2008-07-24 08:45:03 +0000
> +++ b/storage/falcon/Transaction.h 2008-08-11 13:22:53 +0000
> @@ -105,7 +105,7 @@ public:
> State waitForTransaction (Transaction *transaction, TransId transId, bool
> *deadlock);
> void dropTable(Table* table);
> void truncateTable(Table* table);
> - bool hasUncommittedRecords(Table* table);
> + bool hasRecords(Table* table);
> void writeComplete(void);
> void releaseDependency(void);
> int createSavepoint();
>
> === modified file 'storage/falcon/TransactionManager.cpp'
> --- a/storage/falcon/TransactionManager.cpp 2008-07-24 08:45:03 +0000
> +++ b/storage/falcon/TransactionManager.cpp 2008-08-11 13:22:53 +0000
> @@ -27,6 +27,7 @@
> #include "Log.h"
> #include "LogLock.h"
> #include "Synchronize.h"
> +#include "Thread.h"
>
> static const int EXTRA_TRANSACTIONS = 10;
>
> @@ -168,12 +169,40 @@ bool TransactionManager::hasUncommittedR
> syncTrans.lock (Shared);
>
> for (Transaction *trans = activeTransactions.first; trans; trans = trans->next)
> - if (trans != transaction && trans->isActive() &&
> trans->hasUncommittedRecords(table))
> + if (trans != transaction && trans->isActive() &&
> trans->hasRecords(table))
> return true;
>
> return false;
> }
>
> +// Wait until all committed records for a table are purged by gophers
> +// (their transaction become write complete)
> +void TransactionManager::waitForWriteComplete(Table* table)
> +{
> + for(;;)
> + {
> + bool again = false;
> + Sync committedTrans (&committedTransactions.syncObject,
> + "waitForWriteComplete");
> + committedTrans.lock (Shared);
> +
> + for (Transaction *trans = committedTransactions.first; trans;
> + trans = trans->next)
> + {
> + if (trans->hasRecords(table)&& trans->writePending)
> + {
> + again = true;
> + break;
> + }
> + }
> +
> + if(!again)
> + return;
> +
> + committedTrans.unlock();
> + Thread::getThread("waitForWriteComplete")->sleep(500);
> + }
> +}
> void TransactionManager::commitByXid(int xidLength, const UCHAR* xid)
> {
> Sync sync (&activeTransactions.syncObject, "TransactionManager::commitByXid");
>
> === modified file 'storage/falcon/TransactionManager.h'
> --- a/storage/falcon/TransactionManager.h 2008-04-24 17:26:34 +0000
> +++ b/storage/falcon/TransactionManager.h 2008-08-11 13:22:53 +0000
> @@ -35,6 +35,7 @@ public:
> void dropTable(Table* table, Transaction* transaction);
> void truncateTable(Table* table, Transaction* transaction);
> bool hasUncommittedRecords(Table* table, Transaction* transaction);
> + void waitForWriteComplete(Table *table);
> void commitByXid(int xidLength, const UCHAR* xid);
> void rollbackByXid(int xidLength, const UCHAR* xid);
> void print(void);
>
> === modified file 'storage/falcon/ha_falcon.cpp'
> --- a/storage/falcon/ha_falcon.cpp 2008-08-04 15:53:52 +0000
> +++ b/storage/falcon/ha_falcon.cpp 2008-08-11 13:22:53 +0000
> @@ -1893,6 +1893,7 @@ int StorageInterface::external_lock(THD
> DBUG_RETURN(error(ret));
> }
> }
> + storageTable->waitForWriteComplete();
> break;
> default:
> break;
>
>
>