Olav,
We will have time after Beta to consider implementing cycle managers for
record verions, transactions, and whatever else. But that is purely a
performance improvement. I have added the descriptions that I have in
email form from Jim to the new Future section of the Falcon home page.
I think it is a great idea which will helpt our high concurrency
response time and even throughput. But let's do it after Beta.
Kevin
Olav Sandstaa wrote:
> Hi Jim,
>
> Thanks for your comments and suggestions. See inline for my respons:
>
> Jim Starkey wrote:
>> Olav Sandstaa wrote:
>>>
>>> 2. Introduce CAS allocation and batch-allocation of transaction objects:
>>>
>>> -This patch reduces the number of times an exclusive lock is
>>> needed on the active transaction list when allocating a transaction
>>> object. The allocation is now done by searching the list for an
>>> "available" (free) transaction object. This is done when only having
>>> a shared lock on the list and the allocation is done by using a CAS
>>> (compare-and-swap) operation.
>>>
>>> -If not available transaction is found in the active list, an
>>> exclusive lock is set and a new transaction object is allocated. To
>>> increase the likelihood for the next thread searching for an
>>> available transaction, we allocates ten extra free transaction
>>> objects since we anyway have the exclusive lock on the list.
>>>
>>> -The "drawback" about introducing this is that the active
>>> transaction list no longer is guaranteed to be sorted by
>>> transactionId. This complicates the test for determining when a
>>> transaction object can be purged from the committed transaction list.
>>> Until this introduction it was enough to compare the transaction's
>>> commitId (endEvent) with the transactionId (startEvent) of the first
>>> transaction in the active list. After this change we have to search
>>> the active transaction list in order to find the oldest active
>>> transaction (and do this while other threads are doing CAS on it is a
>>> bit, well.. (Yes, Jim, I agree with you about non-blocking,
>>> concurrent data structures....) (and I am actually considering to put
>>> an exclusive lock on it while searching for the oldest transaction...
>>> .- and I am considering to do purging less frequently - but that
>>> might be a topic of another discussion...)
>>>
>>> -The saving of this patch is that we in most cases is able to
>>> allocate a transaction object without an exclusive lock on the active
>>> transaction list.
>>>
>>> - The patch is available here:
>>>
>>> http://lists.mysql.com/commits/61009
>> Olav, have you consider pre-assigning transaction ids to the extra
>> transactions added to the list? That would keep the transaction ids
>> in order and eliminate the hassle about the oldest active transaction.
>
> I am not sure if that would work correctly, at least not after we
> "joined" the transactionId and the startEvent as the same number. We are
> using the transactionId in "visibility" checks like this example:
>
> // If the other transaction committed after we started then it should
> not
> // be visible to us
>
> if (transaction->commitId > transactionId)
> return false;
>
> By preallocating transactionId we could/would possibly change the
> visibility of transactions: two following transactions on the same
> connection where the second transaction would not see the changes done
> by the previous transacation....
> (this is after midnight so I might not think clearly).
>
>>
>> If you were to add cycle locking, you could let the cycle manager
>> remove completed, no-commit transactions from the middle of the active
>> transaction list and move them to the end of the list as available
>> transactions. (If you added two cycle managers, you could have
>> bicycles or maybe outboard motors)
>
> :-)
>
> Could be a good alternative but I am not quite sure how this would work.
> Do you have an example or explanation of cycle locking ? (if it was not
> this late I could probably have found an example myself).
>
>>>
>>>
>>> 3. Re-use read-only transaction objects.
>>>
>>> -This patch changes Transaction::commitNoUpdates() to not delete
>>> the transaction objects but let them stay in the active transaction
>>> list marked as "available" (so they can be picked up and re-used by
>>> the code used in patch 2). Basically, commitNoUpdates() are back as
>>> it where when I started on this rewrite.
>>>
>>> -The advantage of this is that we now are able to run read-only
>>> transaction without having to set exclusive locks on the active
>>> transaction list neither at start of the transaction nor at commit.
>>> As an example: for each of the patches I have made for the new
>>> dependency manager I have run a ten hour stress test. Before this
>>> latest patch this test allocates about 225 million transaction
>>> objects. With the latest patch it allocates in total about 140
>>> transaction objects (yes, it is a read-only test).
>>>
>>> -The patch is ready, but not yet committed.
>>>
>>> The effect of these patches are shown in the attached graph for a
>>> "singel-record select test" (yes, I know it is not representative but
>>> I selected this test because it is simple and a test where the
>>> dependency manager's allocation of transaction objects is really
>>> stressed and it will have huge impact on performance):
>>>
>>> -Red line: performance of old transaction manager
>>> -Green and blue: new implementation without any optimizations for
>>> read-only transactions
>>> -Purple: patch 1 above: shorter code path, no exclusive lock on
>>> committed list
>>> -Cyan: patch 2 above: batch allocation of transaction objects and
>>> CAS allocation
>>> -Brown: patch 3 above: reuse of transaction objects.
>>>
>>> So far so good, and hadn't it been for this morning a single test
>>> failed on a Windows machine that is most likely related to changes in
>>> the dependency manager, well time for debugging..... (more info about
>>> this is a separate email)...
>>>
>>> Any questions and comments are as usual welcome (as long as they do
>>> not involve sorting of lists using CAS....)
>>>
>> Leaving committed no-update transactions in the active transaction
>> list for a cycle manager to move to the end is simple, fast, elegant,
>> and subversive.
>>
>
>
> I will consider it. But I do not think the current implementation is too
> bad either. We only need to find "the oldest active" when we want to
> purge old committed transactions. This we do in two cases in the new
> implementation:
>
> 1. For each commit of an update transasaction
>
> 2. Every 30 second (or so) by the scavenger.
>
> Only one of these would stricly be necessary (I think). The impact of
> having to read the active list is largest for the purge done by
> commit(). To reduce the impact of it one simple way would be to not do
> this on every commit but on only every N commit.
>
> But I would like to learn how a cycle manager would work and consider
> implement it.
>
> Olav
>