List:Falcon Storage Engine« Previous MessageNext Message »
From:Kevin Lewis Date:December 3 2008 4:48pm
Subject:Re: Bug 41194
View as plain text  
Ann W. Harrison wrote:
> Kevin,
> 
> I'm concerned about your second fix to bug 41194.
> 
>   "The previous patch [which delayed switching the transaction
>    state to committed] does not work.  What happens is that the
>    gopher thread immediately starts processing the serial log
>    commit record and gets all the way to Transaction::writeComplete
>    before the Transaction::commit even gets to the state = committed;
>    line.  So then it asserts in writeComplete that (state == committed).
> 
>   "The call to database->commit(this); does not have to occur before
>    switching the transaction from activeTransaction list to
>    comittedTransactions.  But it does need to be called before other
>    waiting transactions are signaled that this transaction is
>    committed."
> 
> Obviously, the assertion you saw in the gopher thread is a problem,
> but I worry that your solution introduces another possible failure
> that will be harder to detect.  Consider a transaction A that's
> starting to commit.  It moves itself to the committed list.
> Transaction B starts.  B has never waited for A, but reads a
> record that A created and takes some action based on that record,
> and commits itself. Then A's call to database->commit(this) fails.
> Unless there's a lock that I haven't seen that keeps B from
> starting or committing while A is between moving itself to the
> committed list and actually writing its commit to the serial
> log, there's a problem.

The lock that prevents action being taken on a record being committed by 
transaction A is Transaction::syncIsActive. 
Transaction::waitForTransaction() gets a shared lock on it if 
Transaction::state == active.  So now we have this order

1) synchronously move trans from Active to committed lists
2) Transaction::state = committed
3) Durability - write committed record to serial log.
4) Transaction::syncIsActive.unlock()

There is admittedly a gap between 2 and 4 where another transaction can 
take action on a record that is "committed" but not yet durable.

 > An alternate solution might be to have the gopher check that
 > the transaction either no longer exists or has a state of
 > committed before it starts to move changes out of the serial
 > log.

Good idea.  The gopher thread can get a quick shared lock on 
Transaction::syncIsActive before processing it.  Since the gopher is in 
the background the performance cost is acceptable.  And the transaction 
in the serial log with writePending == true has a predictable path 
bewteen Transactin::state == active to committed.  So that wait is 
deterministic.

I'll try it.

It is still a good idea to do #3-durability before #4-signal, right?
It was the other way around.
Thread
Bug 41194Ann W. Harrison3 Dec
  • Re: Bug 41194Kevin Lewis3 Dec
    • Re: Bug 41194Ann W. Harrison3 Dec
      • RE: Bug 41194Xuekun Hu4 Dec
  • Re: Bug 41194Kevin Lewis4 Dec
    • RE: Bug 41194Xuekun Hu4 Dec
  • Re: Bug 41194Ann W. Harrison4 Dec
    • RE: Bug 41194Xuekun Hu5 Dec
  • Re: Bug 41194Kevin Lewis4 Dec
  • Re: Bug 41194Kevin Lewis5 Dec
    • RE: Bug 41194Xuekun Hu5 Dec