From: Olav Sandstaa Date: February 25 2009 11:13pm Subject: Re: Problems with record visibility and how it is computed List-Archive: http://lists.mysql.com/falcon/591 Message-Id: <49A5D09E.7040605@sun.com> MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Content-Transfer-Encoding: 7BIT Hi, I need to get back to work on this again and try to reach an agreement for a strategy for how we should solve the problem with the RecordVersion's transaction pointer (I have been distracted by numerous low-level synchronization issues showing up in pushbuild tests and now really feels for doing something more constructive :-) ). Thanks to Jim, Ann and Kevin for a lot of feedback and good ideas. I have read this email thread again and think only two of the suggested alternatives still have not be "voted down": 6. Cycle locking (proposed by Jim): Jim, thanks for explaining it and providing code showing how it can be implemented. I *think* that I understand it and how this can be used for solving this problem. Still, I share Kevin's concern that we might run into issues that makes this complex in some situations. 8. A transaction state vector (implemented as a sparse array?) (proposed by Ann): I understand the basic idea - and think it is good. Still, as pointed out by Ann, we need a way for releasing parts of the array when "the entry is no longer needed". Which again leads us back to: when will this be safe to do? In addition to these I also have two new possible solutions: 9. (This is partly based on something Ann wrote: Make the transasction pointer indirect combined with my original proposal number 5: let the transaction objects live as long as there are record version pointing to them): Introduce a new "Transaction state object" that contains as a minimum the "commitId" but most likely also the "transactionId" and "state": 1. Each time when creating a transaction object, we create a transaction state object and a pointer to the transaction state object from the transaction object. 2. When we commit: the commitId is written to the transaction state object (not the transaction object). So the commit is the update to the transaction state object. 3. Each time we create a record version object it has a pointer to the transaction state object (and if needed also to the transaction object) 4. Each time we add a record to a transaction (ie. update a record) we increment a useCount in the transaction state object: This should not need any extra locking (I think) 5. When we need to compute visibility of records we only need to use the transaction state object. This will have at least the same life span as the record version objects pointing to it (thus the pointer is guaranteed to be valid) 6. When the scavenger removes a recordversion it decrement the useCount in the transaction state object (I think this can be done safely without locking given that we have one scavenger) 7. When the useCount reaches 0 we delete the transaction state object. Issue/question: is there any other places in the code that will access this transaction state object than by accessing it from the record version object? I assume when normal transactions accesses record version objects they are "locked" and this will implicitely also lock the transaction state object when normal transactions access it? 10. This is an extention to alternative 4 (duplicate the commitId in every recordversion object - which was correctly voted down due to issues with how to update all these atomicly in the commit): New proposal: 0. Add a commitId field to every recordversion object. 1. Update this field lazily some time *after* the commit, ie. this is not part of the "atomic commit". For instance it can be done in Transaction::commit after the commit is finished or in Transaction::commitRecords() or by a new "transaction scavenger". The purpose is that the commitId is propagated to all recordversion objects shortly after the commit has been executed. 2. Introduce a "transaction scavenger" that is deleting old transaction objects no less than (e.g) 30 seconds after commit (or rather: no less than 30 seconds after the operation in step 30 has been done). This gives us a record version object with the following property: -in a short period after commit: the commitId is only available in the Transaction object. -after this and until 30 seconds after commit: commitId is available in both the recordVersion object and in the Transaction object. -after 30 seconds: commitId is ONLY available in the record version object. This gives us the following code to get the commitId for a recordVersion object: uint32 RecordVersion::getCommitId() { if (commitId != 0) return commitId; else return transactionptr->commitId; } I think this will work in practice if we can assume: -processors do not cache data in their cache for 30 seconds without updating their content from memory (ie. updates to the commitId in a recordversion becomes visible to all processors within 30 seconds) -a thread does not get "stalled" in the above code for 30 seconds. -nobody put a breakpoing on line 4 above and goes for a cup of coffee and then files this as a bug The nice thing about this last proposal is that it (a) does not introduce any new data structure (or synchronization strategy) and (b) that it does not include any new locking (I think) and (c) we can delete transaction objects after 30 seconds because nobody will use the pointer to them after about 100 ms after the commit took place (the bad thing is that if any of the assumptions above is not valid then this has a tiny theoretical probability for failing). Feel free to make proposals on which alternative I should start to work on - or if you have better/alternative suggestions for how to solve this. Best regards, Olav Ann W. Harrison wrote: > Jim, > > Talking about keeping the size of a vector of transaction > commitId's indexed by the transactionId reasonably dense... > >> Don't assign the id until the transaction actually does an update. > > OK, so the Transaction object has four identifiers .... first, its > address, second its startId, third its updateId, and fourth its > commitId. The address is in general use, the startId determines > what it sees, the updateId is written into its record versions, > and its commitId determines who can see its changes... did I > get that right? > >> >> We could probably make a sparse array that was updated exclusively >> with interlocked CAS -- something to think about when the dog and a >> fox get into a middle of the night bark-fest. > > Yes, I was thinking along those lines ... and that "vector" changes > on transaction commit - thus once per updating transaction, not once > per record or once per record view... > > > Cheers, > > Ann