List:Falcon Storage Engine« Previous MessageNext Message »
From:Kevin Lewis Date:January 20 2009 1:35am
Subject:Re: When you have a chance...
View as plain text  
Jim,

Cycle locking addresses the continued existence of an object while there 
may be pointers to it.  The problem we have here is that there is a 
pointer within the record to a data buffer.  That pointer can go away at 
any time, even an instant after a call to hasData().  It takes two steps 
to use the data buffer;  1) Ask whether the record is chilled or 
hasData(),  2) Use the data buffer.

We need to surround these with some kind of SyncObject.  Otherwise, 
there is no guarantee that a data buffer will be available, even if you 
check for its existance first.  Cycle locking protects objects tat 
threads have pointers to.  We need to protect the pointer.

Kevin

Jim Starkey wrote:
> I suggest you consider cycle locking rather than a syncPrior lock.  The 
> basic idea is this:
> 
>   1. A thread takes a shared lock on the cycle lock for the duration of
>      an engine operation.
>   2. Intermediate and trailing records are removed from record chains
>      with an interlocked CAS.  The removed records are insert, again
>      with an interocked CAS, into a "record purgatory".
>   3. After a suitable period of waiting, the cycle manager flips cycle
>      locks, waits for an exclusive lock on the previous cycle lock,
>      then deletes records in record purgatory.
> 
> In effect, this extends the lifetime of otherwise deleted records to the 
> duration of the cycle.  When the cycle is complete (cycle manager gets 
> an exclusive lock on the cycle's lock), there will be no pointers to the 
> deleted records, and they can be safely deleted.  In the interim, 
> however, "deleted" record's prior pointer and state remain valid.
> 
> This gets rid of syncPrior and a lot addRef/releases.
> 
> A skeleton for a cycle manager is something like this:
> 
>    void XXX::cycleManager(void)
>    {
>        Thread *thread = Thread::getThread("XXX::cycleManager");
>              while (!shutdownServer)
>            {
>            thread->sleep(CYCLE_SLEEP * 1000);
>            SyncObject *prior = currentCycle;
>            currentCycle = (prior == &cycle1) ? &cycle2 : &cycle1;
>            ++cycleNumber;
>            Sync sync(prior, "XXX::cycleManager");
>            sync.lock(Exclusive);
>            sync.unlock();
>                          agingCycle(cycleType);      // this is where 
> post cycle work
>    gets done
>                      }
>                  cycleThread = NULL;
>    }
> 
> Kevin Lewis wrote:
>> Chris,
>>
>> I thought a lot about this solution the last couple of days.  Using a 
>> timeout as a standard expected event in handling a concurrency issue 
>> is not the best approach, I'm sure Jim would agree.
>>
>> But there are extenuating circumstances.  You have shown that we need 
>> to protect other threads that call hasData() from calling that 
>> concurrently with an active chill thread because it could be there in 
>> one instant and not the next. The act of chilling a record is very 
>> intrusive and dangerous without a record mutex.  But using a record 
>> mutex is against Jim's MVCC philosophy for many reasons which need not 
>> be listed here. We just can't do it.  So testing and using the 
>> syncPrior as an alternative to the record mutex is OK, I think.  
>> Especially since there is no hard requirement to chill any particular 
>> record.
>>
>> Since the action on the record is not  required, and chilling should 
>> not happen very often, I think a timeout and move-on approach is 
>> acceptable here.
>>
>> Especially since I cannot think of a better approach! :)
>>
>> Jim, this is your invitation to spark a better idea, or think of a 
>> better approach.  Note that this use of syncPrior during a chill 
>> actually works, even though it is not pretty.
>>
>> Kevin
>>
>> Christopher Powers wrote:
>>> ...call me today 612-729-1519 or my cell, 612-616-1069. I'd like to 
>>> discuss the changes I did for garbage collect.
>>>
>>> I pushed the fixes to mysql-6.0-falcon-chris on PB2, and *both* 
>>> falcon_online_alter and falcon_chill_thaw passed. I find this to be 
>>> very encouraging.
>>>
>>> Using syncPrior in chill was not straightforward solution because of 
>>> a deadlock with a low-level thaw, but what I have works.
>>>
>>
>> ------------------------------------------------------------------------
>>
>> Subject:
>> bzr commit into mysql-6.0-falcon-team branch (christopher.powers:2964) 
>> Bug#42131
>> From:
>> Christopher Powers <Christopher.Powers@stripped>
>> Date:
>> Sat, 17 Jan 2009 02:23:04 -0600 (CST)
>> To:
>> commits@stripped
>>
>> To:
>> commits@stripped
>>
>>
>> #At file:///home/cpowers/work/dev/dev-03/mysql/
>>
>>  2964 Christopher Powers    2009-01-17
>>       Bug #42131 "falcon_backlog test crashes in Record::getEncodedValue"
>>             Transaction:chill() must get an exclusive syncPrior lock 
>> before deleting record data.
>> modified:
>>   storage/falcon/Record.h
>>   storage/falcon/SRLUpdateRecords.cpp
>>   storage/falcon/SRLUpdateRecords.h
>>   storage/falcon/Table.h
>>
>> per-file messages:
>>   storage/falcon/Record.h
>>     Disable CHECK_RECORD_ACTIVITY. This flag has not been maintained, 
>> and causes
>>     invalid fatal assertions.
>>   storage/falcon/SRLUpdateRecords.cpp
>>     SRLUpdateRecords::chill() now gets an exclusive syncPrior lock before
>>     deleting the record data of a record being chilled.
>>         To prevent a deadlock with concurrent record thaws, abandon 
>> the chill for
>>     any record for which the syncPrior cannot be acquired within 50ms.
>>         The presumption is that if syncPrior is held for a record 
>> chain, then
>>     all records should remain fully intact.
>>   storage/falcon/SRLUpdateRecords.h
>>     Change return type of SRLUpdateRecords::chill() to boolean
>>   storage/falcon/Table.h
>>     Increase size of syncPrior and syncThaw hash table
>> === modified file 'storage/falcon/Record.h'
>> --- a/storage/falcon/Record.h    2009-01-09 22:00:52 +0000
>> +++ b/storage/falcon/Record.h    2009-01-17 08:22:44 +0000
>> @@ -27,7 +27,7 @@
>>  #define ALLOCATE_RECORD(n)        (char*) MemMgrRecordAllocate (n, 
>> __FILE__, __LINE__)
>>  #define DELETE_RECORD(record)    MemMgrRecordDelete (record);
>>  
>> -#define CHECK_RECORD_ACTIVITY
>> +//#define CHECK_RECORD_ACTIVITY
>>  
>>  #include "SynchronizationObject.h"
>>  
>>
>> === modified file 'storage/falcon/SRLUpdateRecords.cpp'
>> --- a/storage/falcon/SRLUpdateRecords.cpp    2008-11-19 17:00:02 +0000
>> +++ b/storage/falcon/SRLUpdateRecords.cpp    2009-01-17 08:22:44 +0000
>> @@ -28,6 +28,7 @@
>>  #include "Sync.h"
>>  #include "SerialLogWindow.h"
>>  #include "Format.h"
>> +#include "SQLError.h"
>>  
>>  SRLUpdateRecords::SRLUpdateRecords(void)
>>  {
>> @@ -37,8 +38,19 @@ SRLUpdateRecords::~SRLUpdateRecords(void
>>  {
>>  }
>>  
>> -void SRLUpdateRecords::chill(Transaction *transaction, RecordVersion 
>> *record, uint dataLength)
>> +bool SRLUpdateRecords::chill(Transaction *transaction, RecordVersion 
>> *record, uint dataLength)
>>  {
>> +    Sync syncPrior(record->getSyncPrior(), "SRLUpdateRecords::chill");
>> +   
>> +    try
>> +        {
>> +        syncPrior.lock(Exclusive, 50);
>> +        }
>> +    catch (...)
>> +        {
>> +        return false;
>> +        }
>> +   
>>      // Record data has been written to the serial log, so release the 
>> data
>>      // buffer and set the state accordingly
>>     
>> @@ -55,6 +67,8 @@ void SRLUpdateRecords::chill(Transaction
>>          transaction->totalRecordData -= dataLength;
>>      else
>>          transaction->totalRecordData = 0;
>> +       
>> +    return true;
>>  }
>>  
>>  int SRLUpdateRecords::thaw(RecordVersion *record, bool *thawed)
>> @@ -200,15 +214,16 @@ void SRLUpdateRecords::append(Transactio
>>                      {
>>                      int chillBytes = record->getEncodedSize();
>>  
>> -                    chill(transaction, record, chillBytes);
>> -                   
>> -                    log->chilledRecords++;
>> -                    log->chilledBytes += chillBytes;
>> +                    if (chill(transaction, record, chillBytes))
>> +                        {
>> +                        log->chilledRecords++;
>> +                        log->chilledBytes += chillBytes;
>>                     
>> -                    ASSERT(transaction->thawedRecords > 0);
>> +                        //ASSERT(transaction->thawedRecords > 0);
>>  
>> -                    if (transaction->thawedRecords)
>> -                        transaction->thawedRecords--;
>> +                        if (transaction->thawedRecords)
>> +                            transaction->thawedRecords--;
>> +                        }
>>                      }
>>                  else
>>                      {
>> @@ -253,9 +268,11 @@ void SRLUpdateRecords::append(Transactio
>>             
>>              if (chillRecords && record->state != recDeleted)
>>                  {
>> -                chill(transaction, record, stream.totalLength);
>> -                chilledRecordsWindow++;
>> -                chilledBytesWindow += stream.totalLength;
>> +                if (chill(transaction, record, stream.totalLength))
>> +                    {
>> +                    chilledRecordsWindow++;
>> +                    chilledBytesWindow += stream.totalLength;
>> +                    }
>>                  }
>>              } // next record version
>>         
>>
>> === modified file 'storage/falcon/SRLUpdateRecords.h'
>> --- a/storage/falcon/SRLUpdateRecords.h    2008-11-14 02:30:11 +0000
>> +++ b/storage/falcon/SRLUpdateRecords.h    2009-01-17 08:22:44 +0000
>> @@ -33,7 +33,7 @@ public:
>>      virtual void    read(void);
>>      virtual void    pass2(void);
>>      void            append(Transaction *transaction, RecordVersion 
>> *records, bool chillRecords = false);
>> -    void            chill(Transaction *transaction, RecordVersion 
>> *record, uint dataLength);
>> +    bool            chill(Transaction *transaction, RecordVersion 
>> *record, uint dataLength);
>>      int                thaw(RecordVersion *record, bool *thawed);
>>     
>>      const UCHAR        *data;
>>
>> === modified file 'storage/falcon/Table.h'
>> --- a/storage/falcon/Table.h    2009-01-15 20:29:54 +0000
>> +++ b/storage/falcon/Table.h    2009-01-17 08:22:44 +0000
>> @@ -42,8 +42,8 @@ static const int PostCommit    = 128;
>>  
>>  static const int BL_SIZE            = 128;
>>  static const int FORMAT_HASH_SIZE    = 20;
>> -static const int SYNC_VERSIONS_SIZE    = 16;
>> -static const int SYNC_THAW_SIZE        = 16;
>> +static const int SYNC_VERSIONS_SIZE    = 32; // 16;
>> +static const int SYNC_THAW_SIZE        = 32; // 16;
>>  
>>  #define FOR_FIELDS(field,table)    {for (Field *field=table->fields; 
>> field; field = field->next){
>>  
>>
>>
>>  
>> ------------------------------------------------------------------------
>>
>> Subject:
>> Re: Bug#36631, "Assertion in SerialLogControl::nextRecord"
>> From:
>> Christopher Powers <Christopher.Powers@stripped>
>> Date:
>> Fri, 16 Jan 2009 00:47:06 -0600
>> To:
>> Kevin Lewis <klewis@stripped>
>>
>> To:
>> Kevin Lewis <klewis@stripped>
>> CC:
>> FalconDev <falcon@stripped>
>>
>>
>> Christopher Powers wrote:
>>> Kevin,
>>>
>>> This is the reopened chill/thaw stress test bug.
>>>
>>> I can easily reproduce an assert in Record::getEncodedValue() that 
>>> always occurs during a scavenge/record pruning (stack attached.)
>>>
>>> At the time of the assert, the record state is 'recChilled', so that 
>>> explains why getEncodedValue() fails. It appears that the record 
>>> 'oldie' in Index::duplicateKey() gets chilled while it's being accessed.
>>>
>>> Note that hasRecord() should have thawed the record. I added a second 
>>> thaw to verify this, but the breakpoint never hit, so I think the 
>>> record is being chilled during makeKey().
>>>
>>>
>>> bool Index::duplicateKey(IndexKey *key, Record * record)
>>> {
>>>   for (Record *oldie = record; oldie; oldie = oldie->getPriorVersion())
>>>     if (oldie->hasRecord())  <<<< will thaw record
>>>       {
>>>       IndexKey oldKey(this);
>>>                  if (oldie->state == recChilled)
>>>         oldie->thaw();     <<<< breakpoint never hit
>>>                      makeKey (oldie, &oldKey); <<<< assertion
> 
>>> here--chill< during makeKey??
>>>
>>>       if (oldKey.isEqual(key))
>>>         return true;
>>>       }
>>>   return false;
>>> }
>>>
>> ------------------------------------------------------------------------
>>
>>
> 
> 
Thread
Re: When you have a chance...Kevin Lewis19 Jan
  • Re: When you have a chance...Jim Starkey19 Jan
    • Re: When you have a chance...Kevin Lewis20 Jan
      • Re: When you have a chance...Christopher Powers20 Jan
      • Re: When you have a chance...Jim Starkey20 Jan
        • Re: When you have a chance...Kevin Lewis21 Jan
      • Re: When you have a chance...Jim Starkey21 Jan
        • Re: When you have a chance...Kevin Lewis21 Jan
          • RE: When you have a chance...Vladislav Vaintroub21 Jan
            • Re: When you have a chance...Jim Starkey21 Jan
            • Re: When you have a chance...Kevin Lewis21 Jan