Kevin Lewis wrote:
> Jim,
>
> Cycle locking addresses the continued existence of an object while there
> may be pointers to it. The problem we have here is that there is a
> pointer within the record to a data buffer. That pointer can go away at
> any time, even an instant after a call to hasData(). It takes two steps
> to use the data buffer; 1) Ask whether the record is chilled or
> hasData(), 2) Use the data buffer.
>
> We need to surround these with some kind of SyncObject. Otherwise,
> there is no guarantee that a data buffer will be available, even if you
> check for its existance first. Cycle locking protects objects tat
> threads have pointers to. We need to protect the pointer.
>
> Kevin
Kevin,
I find cycle locking to be very intriguing, and have been meaning to
explore it further, but, as you pointed out, it does not appear to be a
solution here (btw, I like your comment, "The act of chilling a record
is very intrusive and dangerous...").
I briefly considered the value of a separate chill/thaw task that could
be signaled like a scavenge from a low-level record alloc, but that
won't resolve this issue. The thaw task still needs syncWrite and the
chill still needs the syncPrior.
I think this is more about task priority than locking order. A record
thaw is critical because the current task cannot continue and very well
may have other dependencies. A chill operation is a much lower priorty
task with no direct dependencies.
The timeout approach effectively assigns a lower priority to the chill
task by deferring to the thaw task (or whatever is operating on the
record chain.) It is not a pretty solution because it is contrary to the
Falcon locking paradigm. Does the paradigm need to be updated to
accommodate task priority?
> Jim Starkey wrote:
>> I suggest you consider cycle locking rather than a syncPrior lock.
>> The basic idea is this:
>>
>> 1. A thread takes a shared lock on the cycle lock for the duration of
>> an engine operation.
>> 2. Intermediate and trailing records are removed from record chains
>> with an interlocked CAS. The removed records are insert, again
>> with an interocked CAS, into a "record purgatory".
>> 3. After a suitable period of waiting, the cycle manager flips cycle
>> locks, waits for an exclusive lock on the previous cycle lock,
>> then deletes records in record purgatory.
>>
>> In effect, this extends the lifetime of otherwise deleted records to
>> the duration of the cycle. When the cycle is complete (cycle manager
>> gets an exclusive lock on the cycle's lock), there will be no pointers
>> to the deleted records, and they can be safely deleted. In the
>> interim, however, "deleted" record's prior pointer and state remain
>> valid.
>>
>> This gets rid of syncPrior and a lot addRef/releases.
>>
>> A skeleton for a cycle manager is something like this:
>>
>> void XXX::cycleManager(void)
>> {
>> Thread *thread = Thread::getThread("XXX::cycleManager");
>> while (!shutdownServer)
>> {
>> thread->sleep(CYCLE_SLEEP * 1000);
>> SyncObject *prior = currentCycle;
>> currentCycle = (prior == &cycle1) ? &cycle2 : &cycle1;
>> ++cycleNumber;
>> Sync sync(prior, "XXX::cycleManager");
>> sync.lock(Exclusive);
>> sync.unlock();
>> agingCycle(cycleType); // this is where
>> post cycle work
>> gets done
>> }
>> cycleThread = NULL;
>> }
>>
>> Kevin Lewis wrote:
>>> Chris,
>>>
>>> I thought a lot about this solution the last couple of days. Using a
>>> timeout as a standard expected event in handling a concurrency issue
>>> is not the best approach, I'm sure Jim would agree.
>>>
>>> But there are extenuating circumstances. You have shown that we need
>>> to protect other threads that call hasData() from calling that
>>> concurrently with an active chill thread because it could be there in
>>> one instant and not the next. The act of chilling a record is very
>>> intrusive and dangerous without a record mutex. But using a record
>>> mutex is against Jim's MVCC philosophy for many reasons which need
>>> not be listed here. We just can't do it. So testing and using the
>>> syncPrior as an alternative to the record mutex is OK, I think.
>>> Especially since there is no hard requirement to chill any particular
>>> record.
>>>
>>> Since the action on the record is not required, and chilling should
>>> not happen very often, I think a timeout and move-on approach is
>>> acceptable here.
>>>
>>> Especially since I cannot think of a better approach! :)
>>>
>>> Jim, this is your invitation to spark a better idea, or think of a
>>> better approach. Note that this use of syncPrior during a chill
>>> actually works, even though it is not pretty.
>>>
>>> Kevin
>>>
>>> Christopher Powers wrote:
>>>> ...call me today 612-729-1519 or my cell, 612-616-1069. I'd like to
>>>> discuss the changes I did for garbage collect.
>>>>
>>>> I pushed the fixes to mysql-6.0-falcon-chris on PB2, and *both*
>>>> falcon_online_alter and falcon_chill_thaw passed. I find this to be
>>>> very encouraging.
>>>>
>>>> Using syncPrior in chill was not straightforward solution because of
>>>> a deadlock with a low-level thaw, but what I have works.
>>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> Subject:
>>> bzr commit into mysql-6.0-falcon-team branch
>>> (christopher.powers:2964) Bug#42131
>>> From:
>>> Christopher Powers <Christopher.Powers@stripped>
>>> Date:
>>> Sat, 17 Jan 2009 02:23:04 -0600 (CST)
>>> To:
>>> commits@stripped
>>>
>>> To:
>>> commits@stripped
>>>
>>>
>>> #At file:///home/cpowers/work/dev/dev-03/mysql/
>>>
>>> 2964 Christopher Powers 2009-01-17
>>> Bug #42131 "falcon_backlog test crashes in
>>> Record::getEncodedValue"
>>> Transaction:chill() must get an exclusive syncPrior lock
>>> before deleting record data.
>>> modified:
>>> storage/falcon/Record.h
>>> storage/falcon/SRLUpdateRecords.cpp
>>> storage/falcon/SRLUpdateRecords.h
>>> storage/falcon/Table.h
>>>
>>> per-file messages:
>>> storage/falcon/Record.h
>>> Disable CHECK_RECORD_ACTIVITY. This flag has not been maintained,
>>> and causes
>>> invalid fatal assertions.
>>> storage/falcon/SRLUpdateRecords.cpp
>>> SRLUpdateRecords::chill() now gets an exclusive syncPrior lock
>>> before
>>> deleting the record data of a record being chilled.
>>> To prevent a deadlock with concurrent record thaws, abandon
>>> the chill for
>>> any record for which the syncPrior cannot be acquired within 50ms.
>>> The presumption is that if syncPrior is held for a record
>>> chain, then
>>> all records should remain fully intact.
>>> storage/falcon/SRLUpdateRecords.h
>>> Change return type of SRLUpdateRecords::chill() to boolean
>>> storage/falcon/Table.h
>>> Increase size of syncPrior and syncThaw hash table
>>> === modified file 'storage/falcon/Record.h'
>>> --- a/storage/falcon/Record.h 2009-01-09 22:00:52 +0000
>>> +++ b/storage/falcon/Record.h 2009-01-17 08:22:44 +0000
>>> @@ -27,7 +27,7 @@
>>> #define ALLOCATE_RECORD(n) (char*) MemMgrRecordAllocate (n,
>>> __FILE__, __LINE__)
>>> #define DELETE_RECORD(record) MemMgrRecordDelete (record);
>>>
>>> -#define CHECK_RECORD_ACTIVITY
>>> +//#define CHECK_RECORD_ACTIVITY
>>>
>>> #include "SynchronizationObject.h"
>>>
>>>
>>> === modified file 'storage/falcon/SRLUpdateRecords.cpp'
>>> --- a/storage/falcon/SRLUpdateRecords.cpp 2008-11-19 17:00:02 +0000
>>> +++ b/storage/falcon/SRLUpdateRecords.cpp 2009-01-17 08:22:44 +0000
>>> @@ -28,6 +28,7 @@
>>> #include "Sync.h"
>>> #include "SerialLogWindow.h"
>>> #include "Format.h"
>>> +#include "SQLError.h"
>>>
>>> SRLUpdateRecords::SRLUpdateRecords(void)
>>> {
>>> @@ -37,8 +38,19 @@ SRLUpdateRecords::~SRLUpdateRecords(void
>>> {
>>> }
>>>
>>> -void SRLUpdateRecords::chill(Transaction *transaction, RecordVersion
>>> *record, uint dataLength)
>>> +bool SRLUpdateRecords::chill(Transaction *transaction, RecordVersion
>>> *record, uint dataLength)
>>> {
>>> + Sync syncPrior(record->getSyncPrior(), "SRLUpdateRecords::chill");
>>> + + try
>>> + {
>>> + syncPrior.lock(Exclusive, 50);
>>> + }
>>> + catch (...)
>>> + {
>>> + return false;
>>> + }
>>> + // Record data has been written to the serial log, so
>>> release the data
>>> // buffer and set the state accordingly
>>> @@ -55,6 +67,8 @@ void SRLUpdateRecords::chill(Transaction
>>> transaction->totalRecordData -= dataLength;
>>> else
>>> transaction->totalRecordData = 0;
>>> + + return true;
>>> }
>>>
>>> int SRLUpdateRecords::thaw(RecordVersion *record, bool *thawed)
>>> @@ -200,15 +214,16 @@ void SRLUpdateRecords::append(Transactio
>>> {
>>> int chillBytes = record->getEncodedSize();
>>>
>>> - chill(transaction, record, chillBytes);
>>> - - log->chilledRecords++;
>>> - log->chilledBytes += chillBytes;
>>> + if (chill(transaction, record, chillBytes))
>>> + {
>>> + log->chilledRecords++;
>>> + log->chilledBytes += chillBytes;
>>> -
>>> ASSERT(transaction->thawedRecords > 0);
>>> + //ASSERT(transaction->thawedRecords > 0);
>>>
>>> - if (transaction->thawedRecords)
>>> - transaction->thawedRecords--;
>>> + if (transaction->thawedRecords)
>>> + transaction->thawedRecords--;
>>> + }
>>> }
>>> else
>>> {
>>> @@ -253,9 +268,11 @@ void SRLUpdateRecords::append(Transactio
>>> if (chillRecords && record->state !=
>>> recDeleted)
>>> {
>>> - chill(transaction, record, stream.totalLength);
>>> - chilledRecordsWindow++;
>>> - chilledBytesWindow += stream.totalLength;
>>> + if (chill(transaction, record, stream.totalLength))
>>> + {
>>> + chilledRecordsWindow++;
>>> + chilledBytesWindow += stream.totalLength;
>>> + }
>>> }
>>> } // next record version
>>>
>>> === modified file 'storage/falcon/SRLUpdateRecords.h'
>>> --- a/storage/falcon/SRLUpdateRecords.h 2008-11-14 02:30:11 +0000
>>> +++ b/storage/falcon/SRLUpdateRecords.h 2009-01-17 08:22:44 +0000
>>> @@ -33,7 +33,7 @@ public:
>>> virtual void read(void);
>>> virtual void pass2(void);
>>> void append(Transaction *transaction, RecordVersion
>>> *records, bool chillRecords = false);
>>> - void chill(Transaction *transaction, RecordVersion
>>> *record, uint dataLength);
>>> + bool chill(Transaction *transaction, RecordVersion
>>> *record, uint dataLength);
>>> int thaw(RecordVersion *record, bool *thawed);
>>> const UCHAR *data;
>>>
>>> === modified file 'storage/falcon/Table.h'
>>> --- a/storage/falcon/Table.h 2009-01-15 20:29:54 +0000
>>> +++ b/storage/falcon/Table.h 2009-01-17 08:22:44 +0000
>>> @@ -42,8 +42,8 @@ static const int PostCommit = 128;
>>>
>>> static const int BL_SIZE = 128;
>>> static const int FORMAT_HASH_SIZE = 20;
>>> -static const int SYNC_VERSIONS_SIZE = 16;
>>> -static const int SYNC_THAW_SIZE = 16;
>>> +static const int SYNC_VERSIONS_SIZE = 32; // 16;
>>> +static const int SYNC_THAW_SIZE = 32; // 16;
>>>
>>> #define FOR_FIELDS(field,table) {for (Field *field=table->fields;
>>> field; field = field->next){
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> Subject:
>>> Re: Bug#36631, "Assertion in SerialLogControl::nextRecord"
>>> From:
>>> Christopher Powers <Christopher.Powers@stripped>
>>> Date:
>>> Fri, 16 Jan 2009 00:47:06 -0600
>>> To:
>>> Kevin Lewis <klewis@stripped>
>>>
>>> To:
>>> Kevin Lewis <klewis@stripped>
>>> CC:
>>> FalconDev <falcon@stripped>
>>>
>>>
>>> Christopher Powers wrote:
>>>> Kevin,
>>>>
>>>> This is the reopened chill/thaw stress test bug.
>>>>
>>>> I can easily reproduce an assert in Record::getEncodedValue() that
>>>> always occurs during a scavenge/record pruning (stack attached.)
>>>>
>>>> At the time of the assert, the record state is 'recChilled', so that
>>>> explains why getEncodedValue() fails. It appears that the record
>>>> 'oldie' in Index::duplicateKey() gets chilled while it's being
>>>> accessed.
>>>>
>>>> Note that hasRecord() should have thawed the record. I added a
>>>> second thaw to verify this, but the breakpoint never hit, so I think
>>>> the record is being chilled during makeKey().
>>>>
>>>>
>>>> bool Index::duplicateKey(IndexKey *key, Record * record)
>>>> {
>>>> for (Record *oldie = record; oldie; oldie =
> oldie->getPriorVersion())
>>>> if (oldie->hasRecord()) <<<< will thaw record
>>>> {
>>>> IndexKey oldKey(this);
>>>> if (oldie->state == recChilled)
>>>> oldie->thaw(); <<<< breakpoint never hit
>>>> makeKey (oldie, &oldKey); <<<<
> assertion
>>>> here--chill< during makeKey??
>>>>
>>>> if (oldKey.isEqual(key))
>>>> return true;
>>>> }
>>>> return false;
>>>> }
>>>>
>>> ------------------------------------------------------------------------
>>>
>>>
>>
>>