List:Falcon Storage Engine« Previous MessageNext Message »
From:Kevin Lewis Date:November 19 2008 10:35pm
Subject:Re: New dependency manager and Transaction::getRelativeState
View as plain text  
Olav, comments below;

========================================
 > Olav wrote;
 > Do you have any opinion about how I should proceed with this? I think we
 > both agree that this probably is an existing bug, and I think that
 > solving this might make this part of using the dependency manager
 > simpler. One alternative is that I continue the work on the dependency
 > manager making it work identical to the old one in this case and then
 > later after we have solved this bug do any simplifications that can be
 > done? An alternative is that that I started to focus on this bug but I
 > am a bit worried that that might take some time both to understand and
 > fix (and take away focus from the dependency manager).
 >
 > Another option for me is to spend some time writing some tests for this
 > area to check and verify that the both the old and new dependency
 > manager works for this parts of the code. Philip has asked a few times
 > about how we plan to test this area and by spending some time on testing
 > I could get a better overview of what kind of testing that is useful.
 >
 > I also try to get some progress on some of my other more trivial bugs
 > when I feel for doing something that needs less concentration and
 > effort. And it feels good to get progress on the bug count too.
 >
 > Let me know if you have any specific opinion of what I should do if not
 > I probably will continue working as planned on the dependency manager,
 > some testing and some other bugs.
 >
 > Is this new bug something that can be discuss in Friday's meeting or
 > will it be to detailed for you to bring up?
========================================
 > Olav wrote;
> The new dependency manager uses the "startEvent" (start time) and the 
> "endEvent" (commit time) for each transaction to determine relationships 
> (and thus dependency) between two transaction. The main test for this is:
> 
>    if (transaction A's endEvent < transaction B's startEvent) { // 
> Transaction A committed before B even started
>       // A and B was never active on the same time
>       // B would not have any dependencies on A (using the old 
> dependency manage)
>       // A can be clean/removed/deleted
>  }

Note that you cannot remove transaction A until it's endEvent is < the 
startEvent of the 'oldestActiveTransaction'.  In fact, without the 
dependency pointers between transactions to prevent this, 
Transaction::commitRecords() will probably not actually be able to break 
that linkage between records and transaction at 
Transaction::writeComplete().  It will have to be done later if there 
are 'dependencies'.  The current code uses that dependency counter to 
release these older transactions when the dependencies pointers finally 
go away.  Fortunately, the current code also has a call to 
TransactionManager::purgeTransactions() in the scavenger.  You will need 
to rely on this to clean up these old committed transactions that still 
have records attached to them.

That means that the work I am doing is now critical for the success of 
your new dependency management code.  My new scavenger starts up on a 
load based event and will call purgeTransactions for you much more 
often.  The current code will only call it every 30 seconds by default. 
  At least it does not wait until the scavenge threshold is reached in 
the record cache like it does before releasing old records!

> This test is basically replacing most of the tests for "dependencies == 
> 0" in the old dependency manager.
> 
> The problem with the code in Transaction::getRelativeState() is that we 
> do not always have the transaction object for transaction A (only the 
> transaction id of it). 

This must change.  A record needs to keep the transaction pointer until 
the transaction is visible to all.  The alternative is to add the 
endEvent to each and every RecordVersion, which I agree is generally not 
what we want.

You have found an exception to this with Table::unlockRecord().  I am 
working on a solution to this exception and will provide it to you 
tomorrow.  Even if you add the endEvent to each Recordversion, we need 
to fix this problem that a lock record can get state=recUnlock.  This 
state is not expected by many places in the code.

> I have a solution for this where B commits (in 
> Transaction::commit()) it copies it "endEvent" number into all 
> RecordVersion objects. This makes B's endEvent available in 
> Transaction::getRelativeState() also when we do not have the pointer to 
> the transaction object. This makes it possible to replace the for-loop 
> that goes through the states array with a similar test using the 
> endEvent for the "other" and the startEvent of "this" transaction for 
> determining if the "other" transaction was active or not when "this" 
> transaction started.
 >
> It was actully by this approach that I found out that we had cases where 
> the pointer to transaction was NULL while the transaction in fact was 
> not committed. The way I saw this was that Transaction::getReletiveState 
> got called and the endEvent was zero (e.g. the transaction had not 
> committed). So I can actually by looking at the endEvent number I have 
> introduced in the RecordVersion object both determine if the "other" 
> transaction was active when "this" transaction started (test above) and 
> if it has committed or not (endEvent equal to zero).
> 
> But... I do not like having to extend the use of the endEvent transition 
> event into the RecordVersion objects... So if we could fix the existing 
> bug that we found yesterday it would have been great if we just could 
> have declared the for-loop that tests the states array in 
> Transaction::getRelativeState() as "dead code".

There is another problem with putting the endEvent into each and every 
record.  It takes a while to do. It would have to be done at 
Transaction::commit.  Transaction::getRelativeState is supposed to wait 
on another active transaction until it commits.  But if it sees a number 
in endEvent, it will assume that the transaction is committed already. 
  This may cause some race conditions.  It is much better to go directly 
to the transaction object to see its state.

I talked to Jim today about how to fix this.  He confirmed my solution 
which is to get rid of the recUnlock state and leave detached lock 
records attached to the transaction when they are 'unlocked'.  It will 
still replace this record on the recordLeaf so that future Table::fetch 
calls will find the lock record's priorVersion.

In addition, I need to make sure that any code caught looking at one of 
these lock records has the wherewithall to look at the prior.  And if it 
waits, it MUST do another Table::fetch().

I'll get this code to you as soon as I can.

Kevin
Thread
New dependency manager and Transaction::getRelativeStateOlav Sandstaa18 Nov
  • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis18 Nov
    • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis18 Nov
      • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa18 Nov
      • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa18 Nov
        • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis19 Nov
          • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa19 Nov
            • Re: New dependency manager and Transaction::getRelativeStateKevin Lewis19 Nov
              • Re: New dependency manager and Transaction::getRelativeStateOlav Sandstaa20 Nov