List:Falcon Storage Engine« Previous MessageNext Message »
From:Olav Sandstaa Date:February 25 2009 11:13pm
Subject:Re: Problems with record visibility and how it is computed
View as plain text  
Hi,

I need to get back to work on this again and try to reach an agreement 
for a strategy for how we should solve the problem with the 
RecordVersion's transaction pointer (I have been distracted by numerous 
low-level synchronization issues showing up in pushbuild tests and now 
really feels for doing something more constructive :-) ).

Thanks to Jim, Ann and Kevin for a lot of feedback and good ideas. I 
have read this email thread again and think only two of the suggested 
alternatives still have not be "voted down":

6. Cycle locking (proposed by Jim): Jim, thanks for explaining it and 
providing code showing how it can be implemented. I *think* that I 
understand it and how this can be used for solving this problem. Still, 
I share Kevin's concern that we might run into issues that makes this 
complex in some situations.

8. A transaction state vector (implemented as a sparse array?) (proposed 
by Ann): I understand the basic idea - and think it is good. Still, as 
pointed out by Ann, we need a way for releasing parts of the array when 
"the entry is no longer needed". Which again leads us back to: when will 
this be safe to do?

In addition to these I also have two new possible solutions:

9. (This is partly based on something Ann wrote: Make the transasction 
pointer indirect combined with my original proposal number 5: let the 
transaction objects live as long as there are record version pointing to 
them): Introduce a new "Transaction state object" that contains as a 
minimum the "commitId" but most likely also the "transactionId" and "state":

     1. Each time when creating a transaction object, we create a 
transaction state object and a pointer to the transaction state object 
from the transaction object.
     2. When we commit: the commitId is written to the transaction state 
object (not the transaction object). So the commit is the update to the 
transaction state object.
     3. Each time we create a record version object it has a pointer to 
the transaction state object (and if needed also to the transaction object)
     4. Each time we add a record to a transaction (ie. update a record) 
we increment a useCount in the transaction state object: This should not 
need any extra locking (I think)
     5. When we need to compute visibility of records we only need to 
use the transaction state object. This will have at least the same life 
span as the record version objects pointing to it (thus the pointer is 
guaranteed to be valid)
     6. When the scavenger removes a recordversion it decrement the 
useCount in the transaction state object (I think this can be done 
safely without locking given that we have one scavenger)
     7. When the useCount reaches 0 we delete the transaction state object.

   Issue/question: is there any other places in the code that will 
access this transaction state object than by accessing it from the 
record version object?

    I assume when normal transactions accesses record version objects 
they are "locked" and this will implicitely also lock the transaction 
state object when normal transactions access it?

10. This is an extention to alternative 4 (duplicate the commitId in 
every recordversion object - which was correctly voted down due to 
issues with how to update all these atomicly in the commit):

      New proposal:

         0. Add a commitId field to every recordversion object.
         1. Update this field lazily some time *after* the commit, ie. 
this is not part of the "atomic commit". For instance it can be done in 
Transaction::commit after the commit is finished or in 
Transaction::commitRecords() or by a new "transaction scavenger". The 
purpose is that the commitId is propagated to all recordversion objects 
shortly after the commit has been executed.
         2. Introduce a "transaction scavenger" that is deleting old 
transaction objects no less than (e.g) 30 seconds after commit (or 
rather: no less than 30 seconds after the operation in step 30 has been 
done).

      This gives us a record version object with the following property:

         -in a short period after commit: the commitId is only available 
in the Transaction object.
         -after this and until 30 seconds after commit: commitId is 
available in both the recordVersion object and in the Transaction object.
         -after 30 seconds: commitId is ONLY available in the record 
version object.

      This gives us the following code to get the commitId for a 
recordVersion object:

          uint32 RecordVersion::getCommitId()
              {
               if (commitId != 0)
                    return commitId;
               else
                    return transactionptr->commitId;
               }

        I think this will work in practice if we can assume:

            -processors do not cache data in their cache for 30 seconds 
without updating their content from memory (ie. updates to the commitId 
in a recordversion becomes visible to all processors within 30 seconds)
            -a thread does not get "stalled" in the above code for 30 
seconds.
            -nobody put a breakpoing on line 4 above and goes for a cup 
of coffee and then files this as a bug

       The nice thing about this last proposal is that it (a) does not 
introduce any new data structure (or synchronization strategy) and (b) 
that it does not include any new locking (I think) and (c) we can delete 
transaction objects after 30 seconds because nobody will use the pointer 
to them after about 100 ms after the commit took place (the bad thing is 
that if any of the assumptions above is not valid then this has a tiny 
theoretical probability for failing).

Feel free to make proposals on which alternative I should start to work 
on - or if you have better/alternative suggestions for how to solve this.

Best regards,
Olav


Ann W. Harrison wrote:
> Jim,
>
> Talking about keeping the size of a vector of transaction
> commitId's indexed by the transactionId reasonably dense...
>
>> Don't assign the id until the transaction actually does an update.
>
> OK, so the Transaction object has four identifiers .... first, its
> address, second its startId, third its updateId, and fourth its
> commitId.  The address is in general use, the startId determines
> what it sees, the updateId is written into its record versions,
> and its commitId determines who can see its changes...  did I
> get that right?
>
>>
>> We could probably make a sparse array that was updated exclusively 
>> with interlocked CAS -- something to think about when the dog and a 
>> fox get into a middle of the night bark-fest.
>
> Yes, I was thinking along those lines ... and that "vector" changes
> on transaction commit - thus once per updating transaction, not once
> per record or once per record view...
>
>
> Cheers,
>
> Ann

Thread
Problems with record visibility and how it is computedOlav Sandstaa21 Jan
  • search for null values in indexed columnsVladislav Vaintroub21 Jan
    • Re: search for null values in indexed columnsLars-Erik Bjørk21 Jan
    • Re: search for null values in indexed columnsKevin Lewis21 Jan
      • RE: search for null values in indexed columnsVladislav Vaintroub21 Jan
        • Re: search for null values in indexed columnsKevin Lewis21 Jan
      • Re: search for null values in indexed columnsJames Day23 Jan
  • Re: Problems with record visibility and how it is computedOlav Sandstaa23 Jan
    • Re: Problems with record visibility and how it is computedJim Starkey23 Jan
      • Re: Problems with record visibility and how it is computedKevin Lewis28 Jan
        • Re: Problems with record visibility and how it is computedJim Starkey28 Jan
          • Re: Problems with record visibility and how it is computedKevin Lewis28 Jan
            • Cycle Locking (was Problems with record visibility and how it iscomputed)Jim Starkey28 Jan
            • Re: Problems with record visibility and how it is computedAnn W. Harrison28 Jan
              • Re: Problems with record visibility and how it is computedJim Starkey28 Jan
                • Re: Problems with record visibility and how it is computedAnn W. Harrison28 Jan
                  • Re: Problems with record visibility and how it is computedOlav Sandstaa26 Feb
                    • New Transaction State object (Was: Problems with record visibility andhow it is computed)Olav Sandstaa16 Mar
                      • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Kevin Lewis16 Mar
                      • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Jim Starkey16 Mar
                        • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Olav Sandstaa19 Mar
                          • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Jim Starkey19 Mar
                            • RE: New Transaction State object (Was: Problems with record visibilityand how it is computed)Vladislav Vaintroub20 Mar
                            • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Olav Sandstaa20 Mar
                              • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Jim Starkey20 Mar
          • Re: Problems with record visibility and how it is computedOlav Sandstaa28 Jan
            • Re: Problems with record visibility and how it is computedJim Starkey28 Jan
              • Another Idea for Transaction Lifetime ControlJim Starkey28 Jan
                • Re: Another Idea for Transaction Lifetime ControlJim Starkey29 Jan
              • RE: Problems with record visibility and how it is computedXuekun Hu4 Feb
          • Re: Problems with record visibility and how it is computedAnn W. Harrison28 Jan
  • Quick question on row countsKeith Murphy24 Jan
    • Re: Quick question on row countsJim Starkey25 Jan
      • Re: Quick question on row countsKeith Murphy25 Jan