List:Falcon Storage Engine« Previous MessageNext Message »
From:Olav Sandstaa Date:November 19 2008 2:19pm
Subject:Re: New dependency manager and Transaction::getRelativeState
View as plain text  
Kevin,

Thanks a lot for looking into this and for providing a very detailed 
explanation of what happens. I agree that this is likely an existing bug 
in the code.

Comments below:

Kevin Lewis wrote:
> Olav,
>
> Thanks for the tip on running falcon_bug_22165 repeatably.  It stopped 
> at this line on the third run;
>
>     for (int n = 0; n < numberStates; ++n)
>         if (states [n].transactionId == transId)
>             return CommittedInvisible;           <-- break here
>
> Transaction 8988 is calling Table::fetchForUpdate which has read a 
> base record and is calling getRelativeState to determine if it is 
> visible. The base record was owned by transaction 8979 which is 
> currently the oldest active transaction.  But this records state=7
>
> static const int recUnlocked = 7;  // record is being unlocked
>
> You will notice that Table::unlockRecord() calls removeRecord(record) 
> which sets the transaction pointer to NULL.  That is important to do.
>
> This base record for 8979 is not visible to 8988 since it is a lock 
> record that is being backed out.  The dependency is tracked by the 
> states array, but we have to figure out another way to do it.
>
> I have repeated this breakpoint many times and it is always a record 
> that has state = recUnlocked.
>
> I think this case is actually an existing bug!  It currently returns 
> CommittedInvisible or CommittedVisible depending on whether it is 
> consistent read.  But this record is NOT committed.  The code assumes 
> that if Recordversion::transaction is NULL, the transaction must be 
> committed.  This transaction could be rolling back and undoing all 
> locks, or it could be committing and releasing lock records that were 
> not changed.
>
> The code should go into the (transaction->isActive() section and wait 
> for that transaction.  I'm not sure what to do about this, but let me 
> think about it a little.
>
> However, my original proposal still is valid.  We need to prevent 
> these lock records from being seen while the transaction is NULL, or 
> handle it correctly.

I have code for the new dependency manager that handles this similarly 
to what the existing code (and thus have the same bug as the existing 
code) but I am not quite satisfied with the implementation that I 
currently have (as pointed out in the "issue 1" in the first email I 
sent about this yesterday).

The new dependency manager uses the "startEvent" (start time) and the 
"endEvent" (commit time) for each transaction to determine relationships 
(and thus dependency) between two transaction. The main test for this is:

    if (transaction A's endEvent < transaction B's startEvent) { // 
Transaction A committed before B even started
       // A and B was never active on the same time
       // B would not have any dependencies on A (using the old 
dependency manage)
       // A can be clean/removed/deleted
  }

This test is basically replacing most of the tests for "dependencies == 
0" in the old dependency manager.

The problem with the code in Transaction::getRelativeState() is that we 
do not always have the transaction object for transaction A (only the 
transaction id of it). I have a solution for this where B commits (in 
Transaction::commit()) it copies it "endEvent" number into all 
RecordVersion objects. This makes B's endEvent available in 
Transaction::getRelativeState() also when we do not have the pointer to 
the transaction object. This makes it possible to replace the for-loop 
that goes through the states array with a similar test using the 
endEvent for the "other" and the startEvent of "this" transaction for 
determining if the "other" transaction was active or not when "this" 
transaction started.

It was actully by this approach that I found out that we had cases where 
the pointer to transaction was NULL while the transaction in fact was 
not committed. The way I saw this was that Transaction::getReletiveState 
got called and the endEvent was zero (e.g. the transaction had not 
committed). So I can actually by looking at the endEvent number I have 
introduced in the RecordVersion object both determine if the "other" 
transaction was active when "this" transaction started (test above) and 
if it has committed or not (endEvent equal to zero).

But... I do not like having to extend the use of the endEvent transition 
event into the RecordVersion objects... So if we could fix the existing 
bug that we found yesterday it would have been great if we just could 
have declared the for-loop that tests the states array in 
Transaction::getRelativeState() as "dead code".

Olav


Thread
New dependency manager and Transaction::getRelativeStateOlav Sandstaa18 Nov
  • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis18 Nov
    • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis18 Nov
      • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa18 Nov
      • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa18 Nov
        • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis19 Nov
          • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa19 Nov
            • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis19 Nov
              • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa20 Nov