From: Olav Sandstaa Date: December 10 2008 12:45am Subject: Crash with new dependency manager List-Archive: http://lists.mysql.com/falcon/291 Message-Id: <493F1137.8010705@sun.com> MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset=UTF-8 Content-Transfer-Encoding: 8BIT Kevin, All, In the previous email I sent about the new dependency manager I mentioned that I had seen a crash with the new dependency manager. The crash has occurred only once in the tests that I have run, on a Windows machine but by running the failing test (falcon.falcon_bug_34351_C) I have been able to reproduce it once. I sending this email to hear if anyone has seen a similar crash or have ideas of what has gone wrong. I have not spent to much time on it yet so I might understand it better when I have worked more on it. The crash is an assert in Table.cpp after calling Transaction::getRelativeState(): [Falcon] Error: assertion (IS_CONSISTENT_READ(transaction->isolationLevel)) failed at line 3532 in file Table.cpp The call stack is: [13] Error::assertionFailed(text = 0x157aa24 "IS_CONSISTENT_READ(transaction->isolationLevel)", fileName = 0x157aa54 "Table.cpp", line = 3532), line 78 in "Error.cpp" [14] Table::fetchForUpdate(this = 0x1cc0ed0, transaction = 0x1cb4dc0, source = 0x43030f8, usingIndex = true), line 3532 in "Table.cpp" [15] StorageDatabase::nextIndexed(this = 0x1d108d8, storageTable = 0x1cc2f08, recordBitmap = 0x1d13098, recordNumber = 14808, lockForUpdate = true), line 428 in "StorageDatabase.cpp" [16] StorageTable::nextIndexed(this = 0x1cc2f08, recordNumber = 0, lockForUpdate = true), line 171 in "StorageTable.cpp" [17] StorageInterface::index_next(this = 0x40f2f68, buf = 0x40f3228 "�"), line 1687 in "ha_falcon.cpp" [18] StorageInterface::read_range_first(this = 0x40f2f68, start_key = 0x40f3038, end_key = 0x40f3058, eq_range_arg = true, sorted = false), line 1655 in "ha_falcon.cpp" [19] handler::multi_range_read_next(this = 0x40f2f68, range_info = 0xfffffd7ffd6e16e0), line 4258 in "handler.cc" [20] QUICK_RANGE_SELECT::get_next(this = 0x1d1d4b8), line 8596 in "opt_range.cc" [21] rr_quick(info = 0xfffffd7ffd6e1d48), line 322 in "records.cc" [22] mysql_delete(thd = 0x3cbc4f0, table_list = 0x40f66e8, conds = 0x40f6e88, order = 0x40f5d80, limit = 18446744073709551615U, options = 0, reset_auto_increment = false), line 284 in "sql_delete.cc" [23] mysql_execute_command(thd = 0x3cbc4f0), line 3237 in "sql_parse.cc" [24] sp_instr_stmt::exec_core(this = 0x40f6ff8, thd = 0x3cbc4f0, nextp = 0xfffffd7ffd6e4668), line 2910 in "sp_head.cc" [25] sp_lex_keeper::reset_lex_and_exec_core(this = 0x40f7038, thd = 0x3cbc4f0, nextp = 0xfffffd7ffd6e4668, open_tables = false, instr = 0x40f6ff8), line 2734 in "sp_head.cc" [26] sp_instr_stmt::execute(this = 0x40f6ff8, thd = 0x3cbc4f0, nextp = 0xfffffd7ffd6e4668), line 2847 in "sp_head.cc" [27] sp_head::execute(this = 0x40e86e8, thd = 0x3cbc4f0), line 1241 in "sp_head.cc" [28] sp_head::execute_procedure(this = 0x40e86e8, thd = 0x3cbc4f0, args = 0x3cbe950), line 1972 in "sp_head.cc" [29] mysql_execute_command(thd = 0x3cbc4f0), line 4234 in "sql_parse.cc" [30] mysql_parse(thd = 0x3cbc4f0, inBuf = 0x3cc6f98 "CALL p1()", length = 9U, found_semicolon = 0xfffffd7ffd6e7be8), line 5737 in "sql_parse.cc" [31] dispatch_command(command = COM_QUERY, thd = 0x3cbc4f0, packet = 0x3cbef41 "CALL p1()", packet_length = 9U), line 1006 in "sql_parse.cc" [32] do_command(thd = 0x3cbc4f0), line 689 in "sql_parse.cc" [33] handle_one_connection(arg = 0x3cbc4f0), line 1156 in "sql_connect.cc" [34] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff10fd7b [35] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff10ffb0 The code looks like (Table.cpp line ~3532): // We may need to lock the record State state = transaction->getRelativeState(record, WAIT_IF_ACTIVE); switch (state) { case CommittedInvisible: // CommittedInvisible only happens for consistent read. ASSERT(IS_CONSISTENT_READ(transaction->isolationLevel)); // <------- CRASH HERE record->release(); Log::debug("Table::fetchForUpdate: Update Conflict: TransId=%d, RecordNumber=%d, Table %s.%s\n", transaction->transactionId, record->recordNumber, schemaName, name); throw SQLError(UPDATE_CONFLICT, "update conflict in table %s.%s", schemaName, name); Before you run to see what getRelativeState() does, please be aware that this has been updated due to the new dependency manager (and a bug fixed by Kevin) and that the version in the falcon tree is not the current version. So I have attached a version of the code for it, both as text futher down and as an attachment (which might be removed by the list softwae). When the crash happens, the isolationLevel is: (dbx) print transaction->isolationLevel transaction->isolationLevel = 4 or TRANSACTION_WRITE_COMMITTED = 4 The record (after casting it to a RecordVersion object) that is used as argument to getRelativeState(record...) looks like: (dbx) print -r *((RecordVersion*)record) *(class RecordVersion *) record = { RecordVersion::Record::data = { RecordVersion::Record::record = 0x4303198 "^F" } RecordVersion::Record::useCount = 2 RecordVersion::Record::format = 0x1d4ce90 RecordVersion::Record::recordNumber = 14807 RecordVersion::Record::size = 158 RecordVersion::Record::generation = 1ULL RecordVersion::Record::highWater = 1 RecordVersion::Record::encoding = '\004' RecordVersion::Record::state = '\0' RecordVersion::Record::active = '\001' RecordVersion::priorVersion = (nil) RecordVersion::virtualOffset = 15874157ULL RecordVersion::transaction = (nil) RecordVersion::nextInTrans = (nil) RecordVersion::prevInTrans = (nil) RecordVersion::transactionId = 59357U RecordVersion::savePointId = 0 RecordVersion::superceded = false } In this RecordVersion object the transaction pointer it NULL so the second call to getRelativeState should look like this: getRelativeState(NULL; 59357, ...) And the following code in getRelativeState should be executed: // A record may still have the transId even after the trans itself has been deleted. if (!transaction) { // All calls to getRelativeState are for the purpose of writing. // So only ConsistentRead can get CommittedInvisible. if (IS_CONSISTENT_READ(isolationLevel)) { // If the transaction is no longer around, and the record is, // then it must be committed. // Check if the transaction started after us. // With the old dependency manager this test might have been // hit but with the new dependency manager this will never // happen. Still leave it in until further evaluation. if (transactionId < transId) return CommittedInvisible; // Be sure it was not active when we started. // The dependency manager ensures that transactions that were // active at the time this transaction started will not be // deleted at least not until also we are committed. // Since the transaction pointer is NULL here, // the transaction is not deleted and hence was not active at // the time we started. } return CommittedVisible; } Since the isolationLevel is TRANSACTION_WRITE_COMMITTED this code should return CommittedVisible. The transactionId in the code above is: (dbx) print transaction->transactionId transaction->transactionId = 59361U and transId is (hopefully) 59357 so even if we got passed the second if test (on isolation level) the if-test on transaction Id would not be true. If we got even further down the code to: if (transaction->isActive()) { if (flags & DO_NOT_WAIT) return Active; bool isDeadlock; waitForTransaction(transaction, 0 , &isDeadlock); if (isDeadlock) return Deadlock; return WasActive; // caller will need to re-fetch } The transaction's state is: (dbx) print transaction->state transaction->state = 0 so we should return either DeadLock or WasActive.... Any ideas how this happened? It might be obvious, I might be too tired or something is really wrong. Any suggestions on what has happened? NOTE: This crash happens on a version that does not have any of the two last performance patches includes, so there is no re-using of transaction objects or CAS allocations involved. Olav ==================================================================================== Copy of the two Transaction::getRelativeState() methods: /*** @brief Get the relative state between this transaction and the transaction associated with a record version. ***/ State Transaction::getRelativeState(Record* record, uint32 flags) { // If this is a Record object it has no assosiated transaction // and is always visible if (!record->isVersion()) { return CommittedVisible; } blockingRecord = record; State state = getRelativeState(record->getTransaction(), record->getTransactionId(), flags); blockingRecord = NULL; return state; } /*** @brief Get the relative state between this transaction and another. ***/ State Transaction::getRelativeState(Transaction *transaction, TransId transId, uint32 flags) { if (transactionId == transId) return Us; // A record may still have the transId even after the trans itself has been deleted. if (!transaction) { // All calls to getRelativeState are for the purpose of writing. // So only ConsistentRead can get CommittedInvisible. if (IS_CONSISTENT_READ(isolationLevel)) { // If the transaction is no longer around, and the record is, // then it must be committed. // Check if the transaction started after us. // With the old dependency manager this test might have been // hit but with the new dependency manager this will never // happen. Still leave it in until further evaluation. if (transactionId < transId) return CommittedInvisible; // Be sure it was not active when we started. // The dependency manager ensures that transactions that were // active at the time this transaction started will not be // deleted at least not until also we are committed. // Since the transaction pointer is NULL here, // the transaction is not deleted and hence was not active at // the time we started. } return CommittedVisible; } if (transaction->isActive()) { if (flags & DO_NOT_WAIT) return Active; bool isDeadlock; waitForTransaction(transaction, 0 , &isDeadlock); if (isDeadlock) return Deadlock; return WasActive; // caller will need to re-fetch } if (transaction->state == Committed) { // Return CommittedVisible if the other trans has a lower TransId and // it was committed when we started. if (visible (transaction, transId, FOR_WRITING)) return CommittedVisible; return CommittedInvisible; } return (State) transaction->state; }