From: John Embretsen Date: June 17 2009 12:55pm Subject: Weekly Falcon bazaar merge List-Archive: http://lists.mysql.com/falcon/766 Message-Id: <4A38E7BF.7050307@Sun.COM> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="Boundary_(ID_VdPCTTmNyoas+oi9WweZEA)" --Boundary_(ID_VdPCTTmNyoas+oi9WweZEA) Content-type: text/plain; CHARSET=US-ASCII Content-transfer-encoding: 7BIT Hi falconers, I just finished a merge of our bazaar branches mysql-6.0-falcon-team (private) --> mysql-6.0-falcon (public) Commit comments from the 22 merged revisions are attached to this E-mail. Due to changes introduced by the new MySQL development- and release model [1, 2] we did not do any other big merges the past couple of weeks. As always, please test your code tree for unusual problems such as compile errors or the like. The latest merge will appear in the public falcon branch on Launchpad within a couple of hours. You can create your own local branch from launchpad: bzr branch lp:~mysql/mysql-server/mysql-6.0-falcon (see https://code.launchpad.net/~mysql/mysql-server/mysql-6.0-falcon) [1]: http://forge.mysql.com/wiki/Development_Cycle [2]: http://forge.mysql.com/wiki/The_New_MySQL_Release_Model Cheers and happy compiling, -- John H. Embretsen QA Engineer, Sun Microsystems --Boundary_(ID_VdPCTTmNyoas+oi9WweZEA) Content-type: text/plain; name=merged-from-falcon-team-to-falcon.txt Content-transfer-encoding: 7BIT Content-disposition: inline; filename=merged-from-falcon-team-to-falcon.txt Merged 22 revisions to mysql-6.0-falcon branch: ------------------------------------------------------------ revno: 2729 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Fri 2009-06-12 13:17:19 -0500 message: Bug#43554 - This is an unorthodox, special case fix for a bug that originates from different assumptions about how a storage enginge keeps track of locks. Falcon will reuse a record lock if it is asked for again within the same transaction. MySQL server assumes that the storage engine can lock a record twice and unlock it once, leaving it locked for the future of that transaction. Falcon will not unlock a record if it was locked at an earlier savepoint. But the server, in the test case for this bug, locks the record twice at the same savepoint. To make up for this bad assumption in the server, this patch will lower the savepoint ID by 1 if a lock request comes in a second time at the same savepoint. This will hold a lock slightly longer within the transaction, which if there is a condition where it is not intended, is not as bad as giving up an intended lock. ------------------------------------------------------------ revno: 2728 committer: lars-erik.bjork branch nick: clean timestamp: Fri 2009-06-12 16:47:06 +0200 message: Correction of erroneous assignment in IndexKey::setKey(...). Value was assigned to Input parameter instead of member variable. The different IndexKey::setKey(..) methods are now also grouped together for readability Modified file 'storage/falcon/IndexKey.cpp' Modified file 'storage/falcon/IndexKey.h' ------------------------------------------------------------ revno: 2727 committer: Olav Sandstaa branch nick: falcon-bug45297 timestamp: Fri 2009-06-12 15:34:24 +0200 message: In Bug#44744 and Bug#45297 a crash occures during recovery. In both these cases the crash occurs when a record in a data page is attempted updated and the record is not already present in the page. This patch does not solve this problem but will make the recovery fail when the invalid update to the data page occurs instead of failing later when the invalid data is used. Common for both recovery crashes is that DataPage::updateRecord() is called with a lineNumber that is equal to the page's maxLine. This is not a correct use of this function as DataPage::updateRecord should only be called for already existing records (ie. lineNumber should be less than maxLine). Through repeatedly running recovery I see several cases where this happens but does not lead to a crash but instead likely leads to an inconsistency. To avoid this inconsistency and to make the recovery fail immediately instead of when the invalid data is used this patch adds an ASSERT that verify that the lineNumber is valid. ------------------------------------------------------------ revno: 2726 committer: Bjorn Munch branch nick: falc-60 timestamp: Thu 2009-06-11 14:04:02 +0200 message: Bug #42642 MTRv2 restricts valid values for --default-storage-engine option Convert provided option to lower case ------------------------------------------------------------ revno: 2725 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-06-10 18:54:03 -0500 message: Bug#36631 - Make sure all serial log windows are almost 1 Mb even when the falcon-record-chill-threshold is set below 1 Mb. SRLUpdateRecords::append() steps through the records in a transaction with two embedded loops. The inner loop has two places where execution can break out. When it does, and there are still more records to process, writeWindow will be flushed and a new writeWindow will be created. One of these break conditions is appropriate for starting a new window and the other is not. The first condition has this comnment. // If this is a different savepoint, break out of the inner loop // and start another serial log record There should be a different serial log record for each savepoint, but it does not need to start a new writeWindow. When large transactions must chill records changed at different savepoints, this code path was causing the serial log windows to use less that the full 1 Mb available in the buffer. This waisted space and increased the chances that the previous patch for this bug occurred. The second break in the inner loop has this comment; // Ensure record fits within current window The code is designed to predict if a record will not fit in the current window. If not, it would explicitly flush the old writeWindow, start a new writeWindow, and put the current record there. The ability to break to a new window at any time is built into putdata, but for the purposes af thawing, we need to be able to thaw an entire record from the same serialLogRecord. To distinguish between thew two purposes of breaking out of the inner loop, the code now uses a boolean called forceNewWindow. ------------------------------------------------------------ revno: 2724 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-06-10 10:05:08 -0500 message: Fix compiler warnings on Linux ------------------------------------------------------------ revno: 2723 [merge] committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-06-10 09:57:20 -0500 message: Merge ------------------------------------------------------------ revno: 2722 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-06-10 02:36:02 -0500 message: Bug#36631 - Make sure that SerialLogControl::currentLength is always set correctly. If it is too short and a buffer needs to be re-read, not enough of the serial log will be read back inot the buffer. When it gets to the end of the current length and calls SerialLogControl::nextRecord() it will hit an assert when it reads the previous contents of that buffer, which is now junk. The problem is that the block length indicates a larger window than the currentLength. SerialLogControl::currentLength is now updated in each location that the current writePtr is extended. This is three more places than before. In addition, some comments and some just-in-case code is added to SRLUpdateRecords::thaw(). Since this thaw is not calling setWindow(), but only doing a subset of that function, the task of clearing anddeactivating the old inputWindow that is done in setWindow() is repeated here. ------------------------------------------------------------ revno: 2721 committer: John H. Embretsen branch nick: mysql-6.0-falcon-team-pbfix timestamp: Thu 2009-06-04 16:00:27 +0200 message: Fix accidental change to falcon_team test suite's "experimental" status in PB2 committed with revision alik-20090428081350-qonhsqrsamqeznj0. See bug 43982 for further justification. ------------------------------------------------------------ revno: 2720 [merge] committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Thu 2009-06-04 08:02:19 -0500 message: Merge ------------------------------------------------------------ revno: 2719 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Thu 2009-06-04 07:55:34 -0500 message: Bug#43344 - One more cleanup. RecordVersion::hasData() should act like RecordVersion::getRecordData() by testing the recordData returned from thaw() instead of calling isChilled() which tests data.record again. The record can immediately get re-chilled. And that actually happened in a pushbuld test. ------------------------------------------------------------ revno: 2718 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-06-03 16:01:11 -0500 message: Bug#44911 - More follow-on changes. This assures that there will not be a infinite loop in the case where a lock record is returned from Table::fetch and the transaction becomes CommittedVisible soon after. This should only occur once. The next time, it has to be a different lock record. If the same record is read again, something is fataly wrong. It happened once while I was making some changes for 43344 because of a mistake. So instead of looping infinitely reading the same record, this patch will issue a fatal error. ------------------------------------------------------------ revno: 2717 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-06-03 15:49:27 -0500 message: Bug#43961 & Bug#44223 Both of these bugs are caused by a record that gets queued for delete while still chained up to the transaction. Then during Transaction::rolback (44223) a record to rollback is queued for delete a second time. Or during Transaction::rollbackSavepoint (43961), the code assumes that this record on the transaction and savepoint is still the base record since it has Record::superceded=false, and tries to call insertIntoTree to rerplace it with its prior. But this record is not the base record either. This can happen in Table::update(Transaction * transaction, Record *orgRecord, Stream *stream); If this update replaces a lock record with an updated record and an exception occurred, it may or may not take the record off the transaction and the tree, but it would always queue the lock record for delete. This code has gone through some iterations lately. But in the most recent code, I was able to reporduce 44223. This patch cleans it up even more by distinguishing whether a record is actually inserted into the tree and added to the transaction. It undoes those actions only if they were done. Then if the update started with a lock record, that recordversion will remain on the tree with superceded=false and it will not be queued for delete. It correctly uses the 'wasLock' variable instead of the 'updated' or 'wasUpdated' flag. ------------------------------------------------------------ revno: 2716 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-06-03 13:15:32 -0500 message: Bug#43344 Code review fix. The previous patch lost the negative on isALock(). This bug would limit the effectiveness of pruning old versions. In addition, this patch limits the use of Record::state even further by asserting that it is always recNormal before setting it to something, and setting it back to recNormal when that state is completed. It assures that these temporary states do not overlap. And since recChilled overlaps with recInserting, recChilled is not used anymore. Instead recInserting and a new state, recUpdating, are checked to prevent the current record from being chilled. ------------------------------------------------------------ revno: 2715 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Tue 2009-06-02 20:36:57 -0500 message: Bug#43344 - Reduce Falcon's dependency on Record::state since some of them are temporary states that overwrite other states. Record.cpp, Record.h, RecordVersion.cpp & RecordVersion.h - - - - - - - - - - - - - - - Replace states recDeleted and recLock with values for Record::data.record of recordIsALock (-2) and recordIsDeleted(-3). This makes it much more deterministic to tell what kind of record this is. Other states were found to be not needed since they were not being checked. They include recOnDisk, recRollback, recUnlocked, recDeleting,and recPruning. The initial state of zero previously called recData is now called recNormal. A new value is used for data.record when the data buffer has been deleted prior to the object being scavenged. It is called recordDataIsDeleted (-4). New functions are added to determine intrinsic record types and status including isAllocated, isALock(), findRecordData(), hasNoData(), setLock() & setDeleted(). ExchangeData is separated from deleteData() so that it can be used more flexibly. validateData is deleted since it is not used. In several places, deleteData() is used before putting something else into the data buffer. Now, it is rearranged so that the ne data buffer is created first and the two are exhanged atomically. This continues the effort to change data.record atomically, even where it contains the non-pointer values. In the process of analyzing this code, several new ASSERTS are added. These were tested as much as possible, but pushbuild and other DBT2 runs will need to completely verify them. Index.cpp, RecordScavenge.cpp, SRLUpdateRecords.cpp, StorageDatabase.cpp, Table.cpp, Transaction.cpp - - - - - - - - - - - - - - - The functions isALock() and isDeleted() are used everywhere the state was checked for recLocked and recDeleted. Table.cpp - - - - - - - - - - - - - - - One of the two flavors of Table::update() had two sections of code that tried to set 'oldRecord' These were combined into more compact code and comments were added. ------------------------------------------------------------ revno: 2714 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Tue 2009-06-02 10:25:59 -0500 message: Code cleanup Delete extra tabs. Add comments. Rename boolean updated to wasUpdated. Rename one of the 3 Table::deleteRecord functions to Table::logDeletedRecord() since it is a serial log action. Added some logging information to Transaction::thawAll() ------------------------------------------------------------ revno: 2713 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Tue 2009-06-02 09:40:14 -0500 message: Bug#44911 - Suppose a call to fetchForUpdate is called for a base lockRecord returned from Table::fetch, but before it is evaluated, the lockRecord is superceded by an update record and the transaction is committed. FetchForUpdate will need to re-fetch the new committed record as if getRelativeState() had returned WasActive. But getRelativeState() returned CommittedVisible. That would have caused this FATAL condition that a "Committed non-deleted record has no data buffer". So a check for (source->state == recLock) needs to be done in order to re-fetch the base record. A lockRecord can still be linked to a transaction after the commit if it was updated at a higher savepoint. But they are always superceded with newer records. Regular lockRecords are unlinked from transactions during the commit in the call to Transaction::releaseRecordLocks(). So if a lock record is the base record and the transaction turns out to be committed, then the transaction must have been committed after the fetch. And it would only be CommittedVisible if the isolation was ReadCommitted or WriteCommitted. It would not happen with ConsistentRead. ------------------------------------------------------------ revno: 2712 committer: lars-erik.bjork branch nick: clean timestamp: Mon 2009-05-25 13:10:37 +0200 message: This is the second of two patches for bug#43630 A SELECT using ORDER BY and LIMIT sometimes returns no rows When we look at the first index entry of a page, the following piece of code is executed: for (;; first = true) { if (first) { first = false; recordNumber = node.getNumber(); if (recordNumber >= 0) { return recordNumber; } We return the recordNumber without setting up indexKey to reflect the first index entry on the page. This will only work in two scenarioes: 1: If the key value of the END_BUCKET node on the previous page is the same as the key value of the first index node on the current page 2: If this is the first page we look at during the search With a lot of UPDATE/DELETE statements, the prerequisites for scenario 1 to be correct, may easily be broken, and we will get wrong results. The fix is to call node.expandKey(&indexKey) if this is the first node on the page. This is however not necessary if it is the first page we look at during the search, as this has already been done in IndexRootPage::positionIndex ------------------------------------------------------------ revno: 2711 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Thu 2009-05-21 10:21:20 -0500 message: Clean up extra tabs ------------------------------------------------------------ revno: 2710 [merge] committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-05-20 13:21:09 -0500 message: Merge ------------------------------------------------------------ revno: 2709 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Wed 2009-05-20 11:58:14 -0500 message: Bug#44946 Code review changes; It is not necessary to us CAS to update the Record::highWater. Explained in comments. Added other comments and compiler warning fixes. ------------------------------------------------------------ revno: 2708 committer: Kevin Lewis branch nick: mysql-6.0-falcon-team timestamp: Tue 2009-05-19 23:46:31 -0500 message: Bug#44946 - ASSERT(vector[index] < size); is hit. -------------------------------- Record.cpp & Record.h Protect Record::highWater from being incremented in lock-step with another thread also running Record::getEncodedValue(). Do this by using a local copy of highWater and updating it by CAS only if it is higher than what is there. We assume that multiple threads can update the vector at the same time since they are all writing the same values to the vector. If multiple threads thawing the same record rarely ever happens, there will not be too much performance cost due to cache line bounces. And this patch allows it safely. In order to use CAS, Record::highWater must change from a short to an INTERLOCK_TYPE. Change the assert to a FATAL message if (vector[index] > size - getSize()). The size includes the bytes used by the Record or RecordVersion object. In Record::setEncodedRecord(), 'size' only needs to be adjusted for non-interlocked calls. The interlocked calls are done during a thaw. When a record is chilled, the size variable is left the same. -------------------------------- RecordVersion.cpp & SRLUpdateRecords.cpp RecordVersion::thaw() should also check the virtualOffset before calling Transaction::thaw(). And if the Transaction cannot thaw, the virtualOffset should be immediately set to zero. Add ASSERTs to RecordVersion::hasRecord() and RecordVersion::getRecordData() to make sure that there are no more 'unthawable' records. -------------------------------- Transaction.cpp & Transaction.h In order to prevent 'unthawable' records, make sure all records attached to a transaction are thawed by the gopher thread when the transaction is fully complete. Do this by adding a ne function; Transaction::thawAll() --Boundary_(ID_VdPCTTmNyoas+oi9WweZEA)--