From: Kevin Lewis Date: November 19 2008 10:35pm Subject: Re: New dependency manager and Transaction::getRelativeState List-Archive: http://lists.mysql.com/falcon/199 Message-Id: <492494A2.8040001@sun.com> MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Content-Transfer-Encoding: 7BIT Olav, comments below; ======================================== > Olav wrote; > Do you have any opinion about how I should proceed with this? I think we > both agree that this probably is an existing bug, and I think that > solving this might make this part of using the dependency manager > simpler. One alternative is that I continue the work on the dependency > manager making it work identical to the old one in this case and then > later after we have solved this bug do any simplifications that can be > done? An alternative is that that I started to focus on this bug but I > am a bit worried that that might take some time both to understand and > fix (and take away focus from the dependency manager). > > Another option for me is to spend some time writing some tests for this > area to check and verify that the both the old and new dependency > manager works for this parts of the code. Philip has asked a few times > about how we plan to test this area and by spending some time on testing > I could get a better overview of what kind of testing that is useful. > > I also try to get some progress on some of my other more trivial bugs > when I feel for doing something that needs less concentration and > effort. And it feels good to get progress on the bug count too. > > Let me know if you have any specific opinion of what I should do if not > I probably will continue working as planned on the dependency manager, > some testing and some other bugs. > > Is this new bug something that can be discuss in Friday's meeting or > will it be to detailed for you to bring up? ======================================== > Olav wrote; > The new dependency manager uses the "startEvent" (start time) and the > "endEvent" (commit time) for each transaction to determine relationships > (and thus dependency) between two transaction. The main test for this is: > > if (transaction A's endEvent < transaction B's startEvent) { // > Transaction A committed before B even started > // A and B was never active on the same time > // B would not have any dependencies on A (using the old > dependency manage) > // A can be clean/removed/deleted > } Note that you cannot remove transaction A until it's endEvent is < the startEvent of the 'oldestActiveTransaction'. In fact, without the dependency pointers between transactions to prevent this, Transaction::commitRecords() will probably not actually be able to break that linkage between records and transaction at Transaction::writeComplete(). It will have to be done later if there are 'dependencies'. The current code uses that dependency counter to release these older transactions when the dependencies pointers finally go away. Fortunately, the current code also has a call to TransactionManager::purgeTransactions() in the scavenger. You will need to rely on this to clean up these old committed transactions that still have records attached to them. That means that the work I am doing is now critical for the success of your new dependency management code. My new scavenger starts up on a load based event and will call purgeTransactions for you much more often. The current code will only call it every 30 seconds by default. At least it does not wait until the scavenge threshold is reached in the record cache like it does before releasing old records! > This test is basically replacing most of the tests for "dependencies == > 0" in the old dependency manager. > > The problem with the code in Transaction::getRelativeState() is that we > do not always have the transaction object for transaction A (only the > transaction id of it). This must change. A record needs to keep the transaction pointer until the transaction is visible to all. The alternative is to add the endEvent to each and every RecordVersion, which I agree is generally not what we want. You have found an exception to this with Table::unlockRecord(). I am working on a solution to this exception and will provide it to you tomorrow. Even if you add the endEvent to each Recordversion, we need to fix this problem that a lock record can get state=recUnlock. This state is not expected by many places in the code. > I have a solution for this where B commits (in > Transaction::commit()) it copies it "endEvent" number into all > RecordVersion objects. This makes B's endEvent available in > Transaction::getRelativeState() also when we do not have the pointer to > the transaction object. This makes it possible to replace the for-loop > that goes through the states array with a similar test using the > endEvent for the "other" and the startEvent of "this" transaction for > determining if the "other" transaction was active or not when "this" > transaction started. > > It was actully by this approach that I found out that we had cases where > the pointer to transaction was NULL while the transaction in fact was > not committed. The way I saw this was that Transaction::getReletiveState > got called and the endEvent was zero (e.g. the transaction had not > committed). So I can actually by looking at the endEvent number I have > introduced in the RecordVersion object both determine if the "other" > transaction was active when "this" transaction started (test above) and > if it has committed or not (endEvent equal to zero). > > But... I do not like having to extend the use of the endEvent transition > event into the RecordVersion objects... So if we could fix the existing > bug that we found yesterday it would have been great if we just could > have declared the for-loop that tests the states array in > Transaction::getRelativeState() as "dead code". There is another problem with putting the endEvent into each and every record. It takes a while to do. It would have to be done at Transaction::commit. Transaction::getRelativeState is supposed to wait on another active transaction until it commits. But if it sees a number in endEvent, it will assume that the transaction is committed already. This may cause some race conditions. It is much better to go directly to the transaction object to see its state. I talked to Jim today about how to fix this. He confirmed my solution which is to get rid of the recUnlock state and leave detached lock records attached to the transaction when they are 'unlocked'. It will still replace this record on the recordLeaf so that future Table::fetch calls will find the lock record's priorVersion. In addition, I need to make sure that any code caught looking at one of these lock records has the wherewithall to look at the prior. And if it waits, it MUST do another Table::fetch(). I'll get this code to you as soon as I can. Kevin