List:Falcon Storage Engine« Previous MessageNext Message »
From:Olav Sandstaa Date:January 28 2009 7:17pm
Subject:Re: Problems with record visibility and how it is computed
View as plain text  
Hi Jim and Kevin,

Thanks for commenting and providing alternatives. See some comments inline.

Jim Starkey wrote:
> Kevin Lewis wrote:
>> Jim Starkey wrote:
>>> Olav Sandstaa wrote:
>>>> Hi,
>>>>
>>>> Here is a quick summary of some possible solutions for how to solve 
>>>> this problem that have popped up the last days.
>>>>
>>>> First a summary of what the problem is: when determining if a 
>>>> RecordVersion should be visible for a transaction we check the 
>>>> RecordVersion's pointer to the transaction object. If this is 
>>>> non-null we access several member variables of the transaction 
>>>> object. At least one of the inconsistencies that I have seen occurs 
>>>> when the transaction object is deleted (and overwritten) just as we 
>>>> are using its member variables for determining the visibility of 
>>>> the transaction.
>>>>
>>>> The list of more or less random ideas for how this can be solved:
>>>>
>>>> 1. We can stop using the pointer to the transaction and instead 
>>>> locate the transaction by searching for it using the transaction 
>>>> id. Disadvantage of this: a) searching for the correct transaction  
>>>> could become costly. b) we would likely have to  introduce an extra 
>>>> data structure for locating transaction (e.g. a hash table) and c) 
>>>> we would need to  acquire the shared lock on the 
>>>> Transaction::committedList.
>>> We already have somebody who does that, 
>>> TransactionManager::findTransaction.  It doesn't use a hash table 
>>> (it could), but whether it does or doesn't, it requires a shared 
>>> lock and release.  That's way too expensive for checking record 
>>> visibility.
>> I agree that this is too expensive.
>>
>>>> 2. We can avoid that the transaction object get deleted "behind our 
>>>> back" by acquiring the shared lock on the committed transaction 
>>>> list each time we call Transaction::getRelativeState and 
>>>> Transaction::visible(). Disadvantage: a) a lot of access to this 
>>>> shared lock, potential contention
>>> I agree.  Squared.
>> Too much contention on the committed transaction list.
>>
>>>> 3. An alternative to 2: we introduce a new shared lock for just 
>>>> this purpose to avoid the contention on the committed transaction 
>>>> list lock. When purging transactions we must acquire an exclusive 
>>>> lock on this SyncObject (and we have to reduce the frequency of 
>>>> checking for purging of transactions). Disadvantage: a new shared 
>>>> lock that needs to be used by both the code checking for visibility 
>>>> and for purging of transactions - could still become a hot-spot (I 
>>>> do not think this is very different from cycle locking, but I have 
>>>> not thought too much about it yet).
>>> I think we need to avoid both locks and interlocked instructions 
>>> (increments and decrements) if we're going to keep performance up.
>> This suggestion is about a SyncObject in each transaction that is 
>> held with a shared lock whenever another thread has a reference to 
>> it. Normally, this is done in Falcon with a useCount and interlocked 
>> instructions.  The Transaction class does have a useCount.  Why not 
>> use it?  This would be idea 3A.  Transaction::addRef() could be 
>> called for every RecordVersion added to the transaction in 
>> Transactino::addRecord().  Then also for every stack based use of 
>> that pointer, an addref and release should be called.  I think this 
>> might work without too much cost.

Idea 3 was not actually about having a new SyncObject in each 
transaction but an additional sync object covering the entire 
transaction list. The problem about 3A and using the exiting 
Transaction::addRef() is that I am not sure how we should ensure that it 
would be safe to call it?

>
> There isn't much difference in cost between a lock(Shared)/unlock and 
> addRef/release.  Each requires a pair of interlocked instructions.

Thanks for the clearification about the cost. I expected the interlock 
instruction to be relatively cheaper than a shared lock.

>>
>>>> 4. Avoid having to access the transaction object: "Duplicate" the 
>>>> commit information (one integer) in every RecordVersion object. So 
>>>> when committing a transaction, we update every record version with 
>>>> the commitId of the transaction and all RecordVersion objects: 
>>>> Disadvantages: a) duplication of information b) issues with doing 
>>>> the commit as an "atomic operation".
>>> This is doable and would be my second choice.  Making it visible 
>>> across processors might be tricky.
>> I think this is too expensive.  It makes the commit take longer 
>> because the list of records attached to the committing transaction 
>> need to be marked with this commitId.  AND, this needs to happen 
>> while both transaction lists are locked because that is where the 
>> commitId is assigned to the transaction.

I agree with Kevin that this would likely be too expensive.
>
> I'm going to back off on my assessment of doable.  Let's move this to 
> the back burner for now.
>
>>
>>
>>>> 5. "Make it safe to access the transaction pointer and transaction 
>>>> object": By never deleting the transaction object until all record 
>>>> objects pointing to it have been deleted. This would also simplify 
>>>> some of the code since we would always have the transaction 
>>>> available. Disadvantages: a) foot print would increase since we 
>>>> would potentially have a lot of old transaction objects laying 
>>>> around (and we do not want to introduce chilling or backlogging of 
>>>> transaction objects :-) ) b) The current scavenger (or a new 
>>>> transaction scavenger?) would need to be able to determine when a 
>>>> transaction object was ready for purging. I do not know this part 
>>>> of the code, but it might be solvable by having a "record version 
>>>> counter" in each transaction object. c) I do not know what extra 
>>>> this would cost but I think it should be possible to do without any 
>>>> locking.
>>> That's a lot of dead transaction objects.  And it isn't just records 
>>> that point to the transaction but also threads that picked up the 
>>> transaction pointer and are waiting for someone to purge the 
>>> transaction so they can crash Falcon.  Evil little suckers, threads.
>> Well, this sounds doable to me also.  Records would be detached from 
>> these transactions by the scavenger, mostly.  The exception is 
>> releaseSavepoint() and rollbackSavepoint().  So records, while they 
>> can still be used, will always have that transaction pointer.  It 
>> could be set to null in code that assures that no other thread is 
>> using it. Pruning is like that.  And records that are retired use an 
>> exclusive lock on the recordLeaf and require the Record::useCount to 
>> be 1. So I think we can exclude those other pesky threads with this 
>> solution.
> Does that actually work?  How does the scavenger know that another 
> thread isn't using a record transaction pointer so it can delete the 
> transaction object?

I think it should be safe. Afterall the scavenger knows when to delete 
the record objects so there should be no transactions that is currently 
using the record version object or the pointer from it? (but I might be 
wrong).

>>
>> Now, will the committed transaction list get too long?  Maybe.  But 
>> this might also allow us to retire records sooner.  The current 
>> scavenger cannot retire Records until after they have been separated 
>> from their transaction (purgeTransactions).  But this would allow the 
>> scavenger to do the separating, and retire records that are just not 
>> needed by any other active transaction.  So this solution has the 
>> benefit of allowing records to be retired sooner.  See the email 
>> chain titled "Reluctant Scavenger" for a possible example of this 
>> problem.

Thanks for chiming in, Kevin. This is still my favorite - if it does not 
take too much extra memory....

>>
>>>> 6. Kevin/Jim's proposal to use a cycle locking - I have not had the 
>>>> time to consider this in detail but it might perform similar to 
>>>> alternative 3 above.
>>> OK, I've got cycle locking on the brain.  It doesn't cure everything 
>>> -- it does nothing about the Palestine problem, for example.  But 
>>> it's a cheap, non-blocking solution to short duration pointer 
>>> validity.  It 1) makes it possible to compress intermediate 
>>> versions, and 2) papers over problems of transaction pointer 
>>> validation.  Since everyone understands that the typical software 
>>> sequence is 0 ... 1 ... 2 ... infinite, there are clearly other 
>>> places that a are just screaming for cycle locking.
>> But this change is a big deal.  I plan to write up a separate email 
>> with  lots of questions about this.
>
> I don't think it is a big deal at all, so I look forward to your email.

Me too :-) Whether we do this or not it will be useful to explore the 
idea and know better about both how it works and get an idea about how 
to implement it and what it will take to implement it. From the basic 
idea presented by Jim I do not see that it should take that much effort?

> One thing that does need to be added, however, is a mechanism to 
> release a cycle lock before going to sleep for a protracted interval, 
> e.g. waiting on another transaction.  I've got a version of CycleLock 
> that uses thread specific data to do this.
>>
>> 7.  Yes, I have another alternative.  Probably the easiest of all. 
>> Comment out this line;
>>
>> storage\falcon\Transaction.cpp(292):
>>     transactionManager->purgeTransactionsWithLocks();
>>
>> This is new with Olav's dependency manager change and it causes 
>> transactions to be purged much sooner than a scavenger cycle.  It 
>> extends the CPU cycles during the critical/serialized part of every 
>> commit and may actually be to blame for a recent performance slowdown 
>> in DBT2. (that is a big maybe! Not sure about that at all).  But if 
>> we leave the transactions around until the scavenge cycle, this error 
>> might not happen any more...
>
> I don't understand how this fixes the problem.  Doesn't this leave the 
> race between picking up a transaction pointer from a record and 
> deleting the transaction object persist?  I'll grant you that it is 
> less likely, maybe even highly improbable.  But you need to 
> demonstrate it works in all circumstances.

I do not think this fixes the problem. The scavenger can run at "any 
time" so in theory it can delete transaction object just as early as the 
thread doing a commit (it just happens to do this very seldom). So I do 
not think it eliminates the problem, just reduces the likelyhood.

The same test as I was exploring also fails with the old dependency 
manager with the same assert/symptoms.

>>
>>>>
>>>> I welcome comments and alternative suggestions.
>>>>
>>>> My current "favorites" this far is 5 (simple, safe and almost 
>>>> lockfree - but uses way more memory) and 6 (because it is a great 
>>>> and general idea).
>>>>
>>>>
>>> Hmmm.   Isn't that just throwing memory at the problem????  (where 
>>> have I heard that before?).
>>>
>>> I like 6, of course.
>>
>> I would like to try 7 first, then 5, then 3A, and then maybe 6.
>>
>
> The theoretical problem is that there generally accessible pointers to 
> transaction objects, so deleting the transaction object is 
> problematic.  I think the only solutions involve:
>
>    * Eliminating the pointer to the transaction

Or let the transaction object live as long as the pointer..... 
(alternative 6 above)

>    * A reference count of the transaction object held by the record 
> object
>    * A lock on the transaction object held by the record object
>    * A lock on transaction existence for the duration of record
>      visibility cycle
>
> I think the most fruitful place to look for a solution is the last 
> alternative.  The real question is the granularity of the lock (or 
> implied lock).  An ad hoc lock (nice rhyme) is too expensive.  
> "Forever" is cheap but hard on memory.  The scavenge cycle is 
> tempting, but doesn't actually solve the problem.
>
> Maybe the first alternative has an answer.  Suppose we had a reusable 
> transaction id as an index into a transaction state vector.  That 
> would tie down only a byte per transaction rather than a large 
> object.  We would still need a mechanism to recycle the transaction 
> ids, however.  Something might take a day and think about this...

I agree this could be an idea but need to think about it.

Thanks for providing comments and suggestions.

Olav

Thread
Problems with record visibility and how it is computedOlav Sandstaa21 Jan
  • search for null values in indexed columnsVladislav Vaintroub21 Jan
    • Re: search for null values in indexed columnsLars-Erik Bjørk21 Jan
    • Re: search for null values in indexed columnsKevin Lewis21 Jan
      • RE: search for null values in indexed columnsVladislav Vaintroub21 Jan
        • Re: search for null values in indexed columnsKevin Lewis21 Jan
      • Re: search for null values in indexed columnsJames Day23 Jan
  • Re: Problems with record visibility and how it is computedOlav Sandstaa23 Jan
    • Re: Problems with record visibility and how it is computedJim Starkey23 Jan
      • Re: Problems with record visibility and how it is computedKevin Lewis28 Jan
        • Re: Problems with record visibility and how it is computedJim Starkey28 Jan
          • Re: Problems with record visibility and how it is computedKevin Lewis28 Jan
            • Cycle Locking (was Problems with record visibility and how it iscomputed)Jim Starkey28 Jan
            • Re: Problems with record visibility and how it is computedAnn W. Harrison28 Jan
              • Re: Problems with record visibility and how it is computedJim Starkey28 Jan
                • Re: Problems with record visibility and how it is computedAnn W. Harrison28 Jan
                  • Re: Problems with record visibility and how it is computedOlav Sandstaa26 Feb
                    • New Transaction State object (Was: Problems with record visibility andhow it is computed)Olav Sandstaa16 Mar
                      • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Kevin Lewis16 Mar
                      • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Jim Starkey16 Mar
                        • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Olav Sandstaa19 Mar
                          • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Jim Starkey19 Mar
                            • RE: New Transaction State object (Was: Problems with record visibilityand how it is computed)Vladislav Vaintroub20 Mar
                            • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Olav Sandstaa20 Mar
                              • Re: New Transaction State object (Was: Problems with record visibilityand how it is computed)Jim Starkey20 Mar
          • Re: Problems with record visibility and how it is computedOlav Sandstaa28 Jan
            • Re: Problems with record visibility and how it is computedJim Starkey28 Jan
              • Another Idea for Transaction Lifetime ControlJim Starkey28 Jan
                • Re: Another Idea for Transaction Lifetime ControlJim Starkey29 Jan
              • RE: Problems with record visibility and how it is computedXuekun Hu4 Feb
          • Re: Problems with record visibility and how it is computedAnn W. Harrison28 Jan
  • Quick question on row countsKeith Murphy24 Jan
    • Re: Quick question on row countsJim Starkey25 Jan
      • Re: Quick question on row countsKeith Murphy25 Jan