List:Falcon Storage Engine« Previous MessageNext Message »
From:Olav Sandstaa Date:December 10 2008 2:29pm
Subject:Re: Crash with new dependency manager
View as plain text  
Hi,

It seems like this might not be a regression introduced by changes done 
with the new dependency manager. By running the test 4059 times with the 
code from the falcon-team tree that contains the old dependency manger I 
was able to reproduce the crash. I have filed this problem as Bug #41357 
"falcon.falcon_bug_34351_C fails with assertion IS_CONSISTENT_READ"

Olav


Olav Sandstaa wrote:
> Kevin, All,
>
> In the previous email I sent about the new dependency manager I
> mentioned that I had seen a crash with the new dependency manager. The
> crash has occurred only once in the tests that I have run, on a Windows
> machine but by running the failing test (falcon.falcon_bug_34351_C) I
> have been able to reproduce it once. I sending this email to hear if
> anyone has seen a similar crash or have ideas of what has gone wrong. I
> have not spent to much time on it yet so I might understand it better
> when I have worked more on it.
>
> The crash is an assert in Table.cpp after calling
> Transaction::getRelativeState():
>
> [Falcon] Error: assertion
> (IS_CONSISTENT_READ(transaction->isolationLevel)) failed at line 3532 in
> file Table.cpp
>
> The call stack is:
>
>
> [13] Error::assertionFailed(text = 0x157aa24
> "IS_CONSISTENT_READ(transaction->isolationLevel)", fileName = 0x157aa54
> "Table.cpp", line = 3532), line 78 in "Error.cpp"
> [14] Table::fetchForUpdate(this = 0x1cc0ed0, transaction = 0x1cb4dc0,
> source = 0x43030f8, usingIndex = true), line 3532 in "Table.cpp"
> [15] StorageDatabase::nextIndexed(this = 0x1d108d8, storageTable =
> 0x1cc2f08, recordBitmap = 0x1d13098, recordNumber = 14808, lockForUpdate
> = true), line 428 in "StorageDatabase.cpp"
> [16] StorageTable::nextIndexed(this = 0x1cc2f08, recordNumber = 0,
> lockForUpdate = true), line 171 in "StorageTable.cpp"
> [17] StorageInterface::index_next(this = 0x40f2f68, buf = 0x40f3228
> "�"), line 1687 in "ha_falcon.cpp"
> [18] StorageInterface::read_range_first(this = 0x40f2f68, start_key =
> 0x40f3038, end_key = 0x40f3058, eq_range_arg = true, sorted = false),
> line 1655 in "ha_falcon.cpp"
> [19] handler::multi_range_read_next(this = 0x40f2f68, range_info =
> 0xfffffd7ffd6e16e0), line 4258 in "handler.cc"
> [20] QUICK_RANGE_SELECT::get_next(this = 0x1d1d4b8), line 8596 in
> "opt_range.cc"
> [21] rr_quick(info = 0xfffffd7ffd6e1d48), line 322 in "records.cc"
> [22] mysql_delete(thd = 0x3cbc4f0, table_list = 0x40f66e8, conds =
> 0x40f6e88, order = 0x40f5d80, limit = 18446744073709551615U, options =
> 0, reset_auto_increment = false), line 284 in "sql_delete.cc"
> [23] mysql_execute_command(thd = 0x3cbc4f0), line 3237 in "sql_parse.cc"
> [24] sp_instr_stmt::exec_core(this = 0x40f6ff8, thd = 0x3cbc4f0, nextp =
> 0xfffffd7ffd6e4668), line 2910 in "sp_head.cc"
> [25] sp_lex_keeper::reset_lex_and_exec_core(this = 0x40f7038, thd =
> 0x3cbc4f0, nextp = 0xfffffd7ffd6e4668, open_tables = false, instr =
> 0x40f6ff8), line 2734 in "sp_head.cc"
> [26] sp_instr_stmt::execute(this = 0x40f6ff8, thd = 0x3cbc4f0, nextp =
> 0xfffffd7ffd6e4668), line 2847 in "sp_head.cc"
> [27] sp_head::execute(this = 0x40e86e8, thd = 0x3cbc4f0), line 1241 in
> "sp_head.cc"
> [28] sp_head::execute_procedure(this = 0x40e86e8, thd = 0x3cbc4f0, args
> = 0x3cbe950), line 1972 in "sp_head.cc"
> [29] mysql_execute_command(thd = 0x3cbc4f0), line 4234 in "sql_parse.cc"
> [30] mysql_parse(thd = 0x3cbc4f0, inBuf = 0x3cc6f98 "CALL p1()", length
> = 9U, found_semicolon = 0xfffffd7ffd6e7be8), line 5737 in "sql_parse.cc"
> [31] dispatch_command(command = COM_QUERY, thd = 0x3cbc4f0, packet =
> 0x3cbef41 "CALL p1()", packet_length = 9U), line 1006 in "sql_parse.cc"
> [32] do_command(thd = 0x3cbc4f0), line 689 in "sql_parse.cc"
> [33] handle_one_connection(arg = 0x3cbc4f0), line 1156 in 
> "sql_connect.cc"
> [34] _thr_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff10fd7b
> [35] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff10ffb0
>
>
> The code looks like (Table.cpp line ~3532):
>
> // We may need to lock the record
>
> State state = transaction->getRelativeState(record, WAIT_IF_ACTIVE);
>
> switch (state)
> {
> case CommittedInvisible:
> // CommittedInvisible only happens for consistent read.
>
> ASSERT(IS_CONSISTENT_READ(transaction->isolationLevel)); // <-------
> CRASH HERE
> record->release();
> Log::debug("Table::fetchForUpdate: Update Conflict: TransId=%d,
> RecordNumber=%d, Table %s.%s\n",
> transaction->transactionId, record->recordNumber, schemaName, name);
> throw SQLError(UPDATE_CONFLICT, "update conflict in table %s.%s",
> schemaName, name);
>
>
> Before you run to see what getRelativeState() does, please be aware that
> this has been updated due to the new dependency manager (and a bug fixed
> by Kevin) and that the version in the falcon tree is not the current
> version. So I have attached a version of the code for it, both as text
> futher down and as an attachment (which might be removed by the list
> softwae).
>
> When the crash happens, the isolationLevel is:
>
> (dbx) print transaction->isolationLevel
> transaction->isolationLevel = 4
>
> or TRANSACTION_WRITE_COMMITTED = 4
>
> The record (after casting it to a RecordVersion object) that is used as
> argument to getRelativeState(record...) looks like:
>
> (dbx) print -r *((RecordVersion*)record)
> *(class RecordVersion *) record = {
> RecordVersion::Record::data = {
> RecordVersion::Record::record = 0x4303198 "^F"
> }
> RecordVersion::Record::useCount = 2
> RecordVersion::Record::format = 0x1d4ce90
> RecordVersion::Record::recordNumber = 14807
> RecordVersion::Record::size = 158
> RecordVersion::Record::generation = 1ULL
> RecordVersion::Record::highWater = 1
> RecordVersion::Record::encoding = '\004'
> RecordVersion::Record::state = '\0'
> RecordVersion::Record::active = '\001'
> RecordVersion::priorVersion = (nil)
> RecordVersion::virtualOffset = 15874157ULL
> RecordVersion::transaction = (nil)
> RecordVersion::nextInTrans = (nil)
> RecordVersion::prevInTrans = (nil)
> RecordVersion::transactionId = 59357U
> RecordVersion::savePointId = 0
> RecordVersion::superceded = false
> }
>
> In this RecordVersion object the transaction pointer it NULL so the
> second call to getRelativeState should look like this:
>
> getRelativeState(NULL; 59357, ...)
>
> And the following code in getRelativeState should be executed:
>
> // A record may still have the transId even after the trans itself has
> been deleted.
>
> if (!transaction)
> {
> // All calls to getRelativeState are for the purpose of writing.
> // So only ConsistentRead can get CommittedInvisible.
>
> if (IS_CONSISTENT_READ(isolationLevel))
> {
> // If the transaction is no longer around, and the record is,
> // then it must be committed.
>
> // Check if the transaction started after us.
> // With the old dependency manager this test might have been
> // hit but with the new dependency manager this will never
> // happen. Still leave it in until further evaluation.
>
> if (transactionId < transId)
> return CommittedInvisible;
>
> // Be sure it was not active when we started.
>
> // The dependency manager ensures that transactions that were
> // active at the time this transaction started will not be
> // deleted at least not until also we are committed.
> // Since the transaction pointer is NULL here,
> // the transaction is not deleted and hence was not active at
> // the time we started.
> }
>
> return CommittedVisible;
> }
>
> Since the isolationLevel is TRANSACTION_WRITE_COMMITTED this code should
> return CommittedVisible.
>
> The transactionId in the code above is:
>
> (dbx) print transaction->transactionId
> transaction->transactionId = 59361U
>
> and transId is (hopefully) 59357
>
> so even if we got passed the second if test (on isolation level) the
> if-test on transaction Id would not be true.
>
> If we got even further down the code to:
>
> if (transaction->isActive())
> {
> if (flags & DO_NOT_WAIT)
> return Active;
>
> bool isDeadlock;
> waitForTransaction(transaction, 0 , &isDeadlock);
>
> if (isDeadlock)
> return Deadlock;
>
> return WasActive; // caller will need to re-fetch
> }
>
> The transaction's state is:
>
> (dbx) print transaction->state
> transaction->state = 0
>
> so we should return either DeadLock or WasActive....
>
> Any ideas how this happened? It might be obvious, I might be too tired 
> or something is really wrong.
>
> Any suggestions on what has happened?
>
> NOTE: This crash happens on a version that does not have any of the two
> last performance patches includes, so there is no re-using of
> transaction objects or CAS allocations involved.
>
> Olav
>
> ====================================================================================
> 
>
>
> Copy of the two Transaction::getRelativeState() methods:
>
> /***
> @brief Get the relative state between this transaction and
> the transaction associated with a record version.
> ***/
>
> State Transaction::getRelativeState(Record* record, uint32 flags)
> {
> // If this is a Record object it has no assosiated transaction
> // and is always visible
>
> if (!record->isVersion())
> {
> return CommittedVisible;
> }
>
> blockingRecord = record;
> State state = getRelativeState(record->getTransaction(),
> record->getTransactionId(), flags);
> blockingRecord = NULL;
>
> return state;
> }
>
> /***
> @brief Get the relative state between this transaction and another.
> ***/
>
> State Transaction::getRelativeState(Transaction *transaction, TransId
> transId, uint32 flags)
> {
> if (transactionId == transId)
> return Us;
>
> // A record may still have the transId even after the trans itself has
> been deleted.
>
> if (!transaction)
> {
> // All calls to getRelativeState are for the purpose of writing.
> // So only ConsistentRead can get CommittedInvisible.
>
> if (IS_CONSISTENT_READ(isolationLevel))
> {
> // If the transaction is no longer around, and the record is,
> // then it must be committed.
>
> // Check if the transaction started after us.
> // With the old dependency manager this test might have been
> // hit but with the new dependency manager this will never
> // happen. Still leave it in until further evaluation.
>
> if (transactionId < transId)
> return CommittedInvisible;
>
> // Be sure it was not active when we started.
>
> // The dependency manager ensures that transactions that were
> // active at the time this transaction started will not be
> // deleted at least not until also we are committed.
> // Since the transaction pointer is NULL here,
> // the transaction is not deleted and hence was not active at
> // the time we started.
> }
>
> return CommittedVisible;
> }
>
> if (transaction->isActive())
> {
> if (flags & DO_NOT_WAIT)
> return Active;
>
> bool isDeadlock;
> waitForTransaction(transaction, 0 , &isDeadlock);
>
> if (isDeadlock)
> return Deadlock;
>
> return WasActive; // caller will need to re-fetch
> }
>
> if (transaction->state == Committed)
> {
> // Return CommittedVisible if the other trans has a lower TransId and
> // it was committed when we started.
>
> if (visible (transaction, transId, FOR_WRITING))
> return CommittedVisible;
>
> return CommittedInvisible;
> }
>
> return (State) transaction->state;
> }
>
>
>
>
>
>
>

Thread
Crash with new dependency managerOlav Sandstaa10 Dec
  • Re: Crash with new dependency managerOlav Sandstaa10 Dec