Olav Sandstaa wrote:
> I still have one question about this method. My question is not about
> your change but since I anyway looked at this code. The code with your
> change now looks like:
>
> // Set the priorVersion to NULL and return its pointer.
> // The caller is responsible for releasing the associated useCount.
>
> Record* RecordVersion::clearPriorVersion(void)
> {
>     Record * prior = priorVersion;
>     if (prior && prior->useCount == 1) {
>         if (COMPARE_EXCHANGE_POINTER(&priorVersion, prior, NULL))
>             return prior;
>     }
>
>     return NULL;
> }
>
> If I understand your comment above correctly, this code might be run
> concurrently with other threads that update the priorVersion pointer.
> Correct? So your change ensures that the setting of priorVersion to NULL
> only succeeds (one time) when the code works with the last updated value
> for the priorVersion (the CAS ensures we work on the "correct"
> value of priorVersion). This is an improvement over the previous version
> where we "successfully" could have multiple threads doing the "set to
> NULL and return priorVersion" successfully in parallel (and potentially
> overwriting any updated versions of priorVersion?).
>
> My question is for this scenario (assuming multiple threads can do
> updates of priorVersion concurrently): what happens if a thread A calls
> clearPriorVersion() and does the "Record * prior = priorVersion;" and then
> another thread changes the priorVersion pointer (using CAS) - and
> after this thread A does the "prior && prior->useCount == 1" test. Are
> we at this time sure that prior points to a valid Record/RecordVersion
> object? Is this a scenario that will not happen? Or do we have other
> measures in place that ensures that the Record/RecordVersion object can
> not be deleted?
Olav,
The theory about clearPriorVersion is that it is only done by a pruning
scavenger thread. And it is only done when the useCount has gone down to
1. That one useCount is for the superseding record pointing back to it
through its priorRecord chain. Once a record is older than the oldest
visible record version, there is no known way for the useCount to
increase. This is my dilemma. I don't know how two threads could be
messing with the priorRecord pointer of the oldest visible record
version. These deterministic exchanges will help us find them if that
is the case.
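To make the idea concrete, here is a minimal sketch of what I mean by a
deterministic exchange. It is not the Falcon code: SketchRecord and the
lostRaces counter are hypothetical names, and std::atomic stands in for
the COMPARE_EXCHANGE_POINTER macro. It also does not answer your question
about whether prior is still a valid object when it is dereferenced; it
only shows how a lost exchange becomes visible instead of silent.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for Record/RecordVersion; not the real Falcon classes.
struct SketchRecord {
    std::atomic<int> useCount{1};
    std::atomic<SketchRecord*> priorVersion{nullptr};
};

// Hypothetical diagnostic counter: how often the exchange lost a race.
std::atomic<int> lostRaces{0};

// Detach and return the prior version only if it is still the pointer we
// sampled and its useCount is 1. Returns nullptr otherwise.
// The caller is responsible for releasing the associated useCount.
SketchRecord* clearPriorVersion(SketchRecord& rec)
{
    SketchRecord* prior = rec.priorVersion.load();
    assert(prior != &rec);   // a record is never its own prior version

    if (prior && prior->useCount.load() == 1) {
        SketchRecord* expected = prior;
        if (rec.priorVersion.compare_exchange_strong(expected, nullptr))
            return prior;

        // Someone changed priorVersion between the load and the exchange.
        // By the theory above this should not happen, so count it.
        ++lostRaces;
    }

    return nullptr;
}
```

With something like this, a non-zero lostRaces would be the evidence that
a second thread is touching the pointer.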
It is possible for a visible base record to get a lockRecord superseding
it, then an updated record in that transaction, then another updated
record in that same transaction. This causes a detached record version
to be put into purgatory with a priorRecord pointer and associated
useCount on that visible record.
Then suppose the whole transaction gets rolled back. Now the updated
record version and possibly even the lockRecord may get detached and put
into purgatory with their own pointers.
Then suppose newer record versions are committed several times and that
old visible record becomes older than the oldest visible record. It may
still have a useCount > 1 because it has these detached record versions
still in purgatory pointing back to it. But the only way for that
useCount to go is down. It cannot go back up. No fetchVersion call
will look at this old record because a newer record is visible. This is
why the pruneRecords function is checking that useCount. When all the
records at the end of a chain have useCount 1, they can be pruned. And
I do not know how those useCounts can go back up. Just down.
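The useCount rule for pruning can be sketched roughly as follows. This is
only an illustration under my own simplifications: ChainRecord and
findPrunableTail are hypothetical names, not the actual pruneRecords
code, and there is no locking or purgatory handling in it. It just finds
the point from which every record to the end of the chain has useCount 1.

```cpp
#include <cassert>

// Hypothetical, simplified record; not the real Falcon Record class.
struct ChainRecord {
    int useCount;
    ChainRecord* priorVersion;   // next older version, or nullptr
};

// Walk the versions older than oldestVisible and return the newest record
// from which every record down to the end of the chain has useCount == 1.
// Returns nullptr if the end of the chain is still referenced from
// somewhere else (for example by a detached version in purgatory).
ChainRecord* findPrunableTail(ChainRecord* oldestVisible)
{
    assert(oldestVisible != nullptr);

    ChainRecord* tail = nullptr;

    for (ChainRecord* rec = oldestVisible->priorVersion; rec; rec = rec->priorVersion) {
        if (rec->useCount == 1) {
            if (!tail)
                tail = rec;      // start of a candidate run
        } else {
            tail = nullptr;      // run broken; only a later run can reach the end
        }
    }

    return tail;
}
```

Because the counts of these old records only go down, a record that is
not prunable on one pass can become prunable on a later one, but not the
other way around.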
So the code is now more deterministic in changing priorRecord pointers
because there is no other protection except a known algorithm. This
will help us find what is unknown.
Kevin