List:Falcon Storage Engine« Previous MessageNext Message »
From:Jim Starkey Date:January 19 2009 5:34pm
Subject:Re: When you have a chance...
View as plain text  
I suggest you consider cycle locking rather than a syncPrior lock.  The 
basic idea is this:

   1. A thread takes a shared lock on the cycle lock for the duration of
      an engine operation.
   2. Intermediate and trailing records are removed from record chains
      with an interlocked CAS.  The removed records are insert, again
      with an interocked CAS, into a "record purgatory".
   3. After a suitable period of waiting, the cycle manager flips cycle
      locks, waits for an exclusive lock on the previous cycle lock,
      then deletes records in record purgatory.

In effect, this extends the lifetime of otherwise deleted records to the 
duration of the cycle.  When the cycle is complete (cycle manager gets 
an exclusive lock on the cycle's lock), there will be no pointers to the 
deleted records, and they can be safely deleted.  In the interim, 
however, "deleted" record's prior pointer and state remain valid.

This gets rid of syncPrior and a lot addRef/releases.

A skeleton for a cycle manager is something like this:

    void XXX::cycleManager(void)
    {
        Thread *thread = Thread::getThread("XXX::cycleManager");
       
        while (!shutdownServer)
            {
            thread->sleep(CYCLE_SLEEP * 1000);
            SyncObject *prior = currentCycle;
            currentCycle = (prior == &cycle1) ? &cycle2 : &cycle1;
            ++cycleNumber;
            Sync sync(prior, "XXX::cycleManager");
            sync.lock(Exclusive);
            sync.unlock();
               
            agingCycle(cycleType);      // this is where post cycle work
    gets done
           
            }
           
        cycleThread = NULL;
    }

Kevin Lewis wrote:
> Chris,
>
> I thought a lot about this solution the last couple of days.  Using a 
> timeout as a standard expected event in handling a concurrency issue 
> is not the best approach, I'm sure Jim would agree.
>
> But there are extenuating circumstances.  You have shown that we need 
> to protect other threads that call hasData() from calling that 
> concurrently with an active chill thread because it could be there in 
> one instant and not the next. The act of chilling a record is very 
> intrusive and dangerous without a record mutex.  But using a record 
> mutex is against Jim's MVCC philosophy for many reasons which need not 
> be listed here. We just can't do it.  So testing and using the 
> syncPrior as an alternative to the record mutex is OK, I think.  
> Especially since there is no hard requirement to chill any particular 
> record.
>
> Since the action on the record is not  required, and chilling should 
> not happen very often, I think a timeout and move-on approach is 
> acceptable here.
>
> Especially since I cannot think of a better approach! :)
>
> Jim, this is your invitation to spark a better idea, or think of a 
> better approach.  Note that this use of syncPrior during a chill 
> actually works, even though it is not pretty.
>
> Kevin
>
> Christopher Powers wrote:
>> ...call me today 612-729-1519 or my cell, 612-616-1069. I'd like to 
>> discuss the changes I did for garbage collect.
>>
>> I pushed the fixes to mysql-6.0-falcon-chris on PB2, and *both* 
>> falcon_online_alter and falcon_chill_thaw passed. I find this to be 
>> very encouraging.
>>
>> Using syncPrior in chill was not straightforward solution because of 
>> a deadlock with a low-level thaw, but what I have works.
>>
>
> ------------------------------------------------------------------------
>
> Subject:
> bzr commit into mysql-6.0-falcon-team branch (christopher.powers:2964) 
> Bug#42131
> From:
> Christopher Powers <Christopher.Powers@stripped>
> Date:
> Sat, 17 Jan 2009 02:23:04 -0600 (CST)
> To:
> commits@stripped
>
> To:
> commits@stripped
>
>
> #At file:///home/cpowers/work/dev/dev-03/mysql/
>
>  2964 Christopher Powers	2009-01-17
>       Bug #42131 "falcon_backlog test crashes in Record::getEncodedValue"
>       
>       Transaction:chill() must get an exclusive syncPrior lock before deleting record
> data.
> modified:
>   storage/falcon/Record.h
>   storage/falcon/SRLUpdateRecords.cpp
>   storage/falcon/SRLUpdateRecords.h
>   storage/falcon/Table.h
>
> per-file messages:
>   storage/falcon/Record.h
>     Disable CHECK_RECORD_ACTIVITY. This flag has not been maintained, and causes
>     invalid fatal assertions.
>   storage/falcon/SRLUpdateRecords.cpp
>     SRLUpdateRecords::chill() now gets an exclusive syncPrior lock before
>     deleting the record data of a record being chilled.
>     
>     To prevent a deadlock with concurrent record thaws, abandon the chill for
>     any record for which the syncPrior cannot be acquired within 50ms.
>     
>     The presumption is that if syncPrior is held for a record chain, then
>     all records should remain fully intact.
>   storage/falcon/SRLUpdateRecords.h
>     Change return type of SRLUpdateRecords::chill() to boolean
>   storage/falcon/Table.h
>     Increase size of syncPrior and syncThaw hash table
> === modified file 'storage/falcon/Record.h'
> --- a/storage/falcon/Record.h	2009-01-09 22:00:52 +0000
> +++ b/storage/falcon/Record.h	2009-01-17 08:22:44 +0000
> @@ -27,7 +27,7 @@
>  #define ALLOCATE_RECORD(n)		(char*) MemMgrRecordAllocate (n, __FILE__, __LINE__)
>  #define DELETE_RECORD(record)	MemMgrRecordDelete (record);
>  
> -#define CHECK_RECORD_ACTIVITY
> +//#define CHECK_RECORD_ACTIVITY
>  
>  #include "SynchronizationObject.h"
>  
>
> === modified file 'storage/falcon/SRLUpdateRecords.cpp'
> --- a/storage/falcon/SRLUpdateRecords.cpp	2008-11-19 17:00:02 +0000
> +++ b/storage/falcon/SRLUpdateRecords.cpp	2009-01-17 08:22:44 +0000
> @@ -28,6 +28,7 @@
>  #include "Sync.h"
>  #include "SerialLogWindow.h"
>  #include "Format.h"
> +#include "SQLError.h"
>  
>  SRLUpdateRecords::SRLUpdateRecords(void)
>  {
> @@ -37,8 +38,19 @@ SRLUpdateRecords::~SRLUpdateRecords(void
>  {
>  }
>  
> -void SRLUpdateRecords::chill(Transaction *transaction, RecordVersion *record, uint
> dataLength)
> +bool SRLUpdateRecords::chill(Transaction *transaction, RecordVersion *record, uint
> dataLength)
>  {
> +	Sync syncPrior(record->getSyncPrior(), "SRLUpdateRecords::chill");
> +	
> +	try
> +		{
> +		syncPrior.lock(Exclusive, 50);
> +		}
> +	catch (...)
> +		{
> +		return false;
> +		}
> +	
>  	// Record data has been written to the serial log, so release the data
>  	// buffer and set the state accordingly
>  	
> @@ -55,6 +67,8 @@ void SRLUpdateRecords::chill(Transaction
>  		transaction->totalRecordData -= dataLength;
>  	else
>  		transaction->totalRecordData = 0;
> +		
> +	return true;
>  }
>  
>  int SRLUpdateRecords::thaw(RecordVersion *record, bool *thawed)
> @@ -200,15 +214,16 @@ void SRLUpdateRecords::append(Transactio
>  					{
>  					int chillBytes = record->getEncodedSize();
>  
> -					chill(transaction, record, chillBytes);
> -					
> -					log->chilledRecords++;
> -					log->chilledBytes += chillBytes;
> +					if (chill(transaction, record, chillBytes))
> +						{
> +						log->chilledRecords++;
> +						log->chilledBytes += chillBytes;
>  					
> -					ASSERT(transaction->thawedRecords > 0);
> +						//ASSERT(transaction->thawedRecords > 0);
>  
> -					if (transaction->thawedRecords)
> -						transaction->thawedRecords--;
> +						if (transaction->thawedRecords)
> +							transaction->thawedRecords--;
> +						}
>  					}
>  				else
>  					{
> @@ -253,9 +268,11 @@ void SRLUpdateRecords::append(Transactio
>  			
>  			if (chillRecords && record->state != recDeleted)
>  				{
> -				chill(transaction, record, stream.totalLength);
> -				chilledRecordsWindow++;
> -				chilledBytesWindow += stream.totalLength;
> +				if (chill(transaction, record, stream.totalLength))
> +					{
> +					chilledRecordsWindow++;
> +					chilledBytesWindow += stream.totalLength;
> +					}
>  				}
>  			} // next record version
>  		
>
> === modified file 'storage/falcon/SRLUpdateRecords.h'
> --- a/storage/falcon/SRLUpdateRecords.h	2008-11-14 02:30:11 +0000
> +++ b/storage/falcon/SRLUpdateRecords.h	2009-01-17 08:22:44 +0000
> @@ -33,7 +33,7 @@ public:
>  	virtual void	read(void);
>  	virtual void	pass2(void);
>  	void			append(Transaction *transaction, RecordVersion *records, bool chillRecords =
> false);
> -	void			chill(Transaction *transaction, RecordVersion *record, uint dataLength);
> +	bool			chill(Transaction *transaction, RecordVersion *record, uint dataLength);
>  	int				thaw(RecordVersion *record, bool *thawed);
>  	
>  	const UCHAR		*data;
>
> === modified file 'storage/falcon/Table.h'
> --- a/storage/falcon/Table.h	2009-01-15 20:29:54 +0000
> +++ b/storage/falcon/Table.h	2009-01-17 08:22:44 +0000
> @@ -42,8 +42,8 @@ static const int PostCommit	= 128;
>  
>  static const int BL_SIZE			= 128;
>  static const int FORMAT_HASH_SIZE	= 20;
> -static const int SYNC_VERSIONS_SIZE	= 16;
> -static const int SYNC_THAW_SIZE		= 16;
> +static const int SYNC_VERSIONS_SIZE	= 32; // 16;
> +static const int SYNC_THAW_SIZE		= 32; // 16;
>  
>  #define FOR_FIELDS(field,table)	{for (Field *field=table->fields; field; field =
> field->next){
>  
>
>
>   
>
> ------------------------------------------------------------------------
>
> Subject:
> Re: Bug#36631, "Assertion in SerialLogControl::nextRecord"
> From:
> Christopher Powers <Christopher.Powers@stripped>
> Date:
> Fri, 16 Jan 2009 00:47:06 -0600
> To:
> Kevin Lewis <klewis@stripped>
>
> To:
> Kevin Lewis <klewis@stripped>
> CC:
> FalconDev <falcon@stripped>
>
>
> Christopher Powers wrote:
>> Kevin,
>>
>> This is the reopened chill/thaw stress test bug.
>>
>> I can easily reproduce an assert in Record::getEncodedValue() that 
>> always occurs during a scavenge/record pruning (stack attached.)
>>
>> At the time of the assert, the record state is 'recChilled', so that 
>> explains why getEncodedValue() fails. It appears that the record 
>> 'oldie' in Index::duplicateKey() gets chilled while it's being accessed.
>>
>> Note that hasRecord() should have thawed the record. I added a second 
>> thaw to verify this, but the breakpoint never hit, so I think the 
>> record is being chilled during makeKey().
>>
>>
>> bool Index::duplicateKey(IndexKey *key, Record * record)
>> {
>>   for (Record *oldie = record; oldie; oldie = oldie->getPriorVersion())
>>     if (oldie->hasRecord())  <<<< will thaw record
>>       {
>>       IndexKey oldKey(this);
>>                  if (oldie->state == recChilled)
>>         oldie->thaw();     <<<< breakpoint never hit
>>                      makeKey (oldie, &oldKey); <<<< assertion 
>> here--chill< during makeKey??
>>
>>       if (oldKey.isEqual(key))
>>         return true;
>>       }
>>   return false;
>> }
>>
> ------------------------------------------------------------------------
>
>

Thread
Re: When you have a chance...Kevin Lewis19 Jan
  • Re: When you have a chance...Jim Starkey19 Jan
    • Re: When you have a chance...Kevin Lewis20 Jan
      • Re: When you have a chance...Christopher Powers20 Jan
      • Re: When you have a chance...Jim Starkey20 Jan
        • Re: When you have a chance...Kevin Lewis21 Jan
      • Re: When you have a chance...Jim Starkey21 Jan
        • Re: When you have a chance...Kevin Lewis21 Jan
          • RE: When you have a chance...Vladislav Vaintroub21 Jan
            • Re: When you have a chance...Jim Starkey21 Jan
            • Re: When you have a chance...Kevin Lewis21 Jan