List: Falcon Storage Engine
From: Jim Starkey
Date: January 20, 2009 7:25pm
Subject: Re: Are multiple serial logs feasible or profitable?
Hu, Xuekun wrote:
> Hi, Jim
> I'm thinking of implementing the SRLStream subclass. I'm asking for coding
> suggestions before I really start to write code. :-)
> 1. Since recordTableSpaceId, recordNumber, sectionId and record must go into the window
> as one whole record body, the serial log window must be flushed first if the free space
> is not enough. Currently, since only one complete SRL record is built, how do I separate
> the different sets of whole record bodies? My thinking is to put them into different
> segments, with each whole record body in one single segment. Right?
> 2. When writing to the serial log, there is still a loop to write the set of record bodies
> (one by one), since we need to judge whether the set's record size exceeds the free
> space in the window and also need to update each record's virtualOffset. Right?
Take a look at SerialLog::putData.  If the record being built overflows 
the serial log window, the window is flushed without the record, and the 
partial record is copied to a new window.  This works as long as the 
total record doesn't exceed the window size.
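The flush-and-carry behavior described above can be sketched roughly like this. This is illustrative C++ only; the names, the tiny window size, and the bookkeeping are assumptions for the sketch, not Falcon's actual SerialLog::putData:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

struct SerialLogSketch {
    static const int WINDOW_SIZE = 64;   // tiny for illustration; real windows are far larger
    uint8_t buffer[WINDOW_SIZE];
    int     used = 0;                    // bytes in the current window
    int     recordStart = 0;             // where the record being built began
    int     windowsFlushed = 0;

    void flush(int length) {             // pretend to write buffer[0..length) to disk
        ++windowsFlushed;
    }

    void startRecord() { recordStart = used; }

    // Works only while the whole record fits in one window, as noted above.
    void putData(int length, const uint8_t *data) {
        if (used + length > WINDOW_SIZE) {
            // Save the partial record, flush the window *without* it,
            // then carry the partial record into the new window.
            int partial = used - recordStart;
            std::vector<uint8_t> saved(buffer + recordStart, buffer + used);
            flush(recordStart);
            memcpy(buffer, saved.data(), partial);
            used = partial;
            recordStart = 0;
        }
        memcpy(buffer + used, data, length);
        used += length;
    }
};
```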

You could take advantage of this by keeping the separate stream in which 
the SRL record is being built safely below the window size.  But keep 
in mind that once you execute the START_RECORD macro, you've got the 
exclusive lock on the serial log.

If you write the entire record to a separate stream, you can probably 
eliminate the relatively expensive estimate of the size of the next 
record version in favor of a cruder test of each field at max length.  
Writing a serial log block a few bytes short of optimal is of no consequence...
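As an illustration of that cruder test, something along these lines would do. The worst-case field length and the function's shape are assumptions, not Falcon's actual encoding:

```cpp
#include <cassert>

// Assumed worst case for one encoded field (hypothetical value).
const int MAX_ENCODED_FIELD = 9;

// Exact check: walk every field value and compute its encoded length (expensive).
// Crude check: charge every field at its maximum length (one multiply and an add).
// The crude check may reject a record that would actually have fit, wasting a
// few bytes of the window -- which, as noted above, is of no consequence.
bool crudeFits(int fieldCount, int blobBytes, int windowFree) {
    return fieldCount * MAX_ENCODED_FIELD + blobBytes <= windowFree;
}
```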

You will probably want to write a SerialLog::putStream analogous to 
SerialLog::putData that doesn't require a contiguous block of data for 
the record.
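A putStream along those lines might look roughly like this sketch. The Segment shape and the flat log buffer are assumptions for illustration; the real Stream class and the window handling differ:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal segment chain standing in for the Stream class (an assumed shape).
struct Segment {
    const uint8_t *data;
    int length;
    const Segment *next;
};

// Hypothetical putStream: appends each segment of a non-contiguous stream
// into the log buffer, so callers need not assemble one flat block first.
void putStream(std::vector<uint8_t> &logBuffer, const Segment *first) {
    for (const Segment *seg = first; seg; seg = seg->next)
        logBuffer.insert(logBuffer.end(), seg->data, seg->data + seg->length);
}
```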

I realize this is a little high level, so if anything isn't clear, feel 
free to ask.  Now that we've got Obama safely inaugurated, I can turn my 
attention back to software.
> Thx, Xuekun
> -----Original Message-----
> From: Jim Starkey [mailto:jstarkey@stripped]
> Sent: Friday, January 16, 2009 4:30 AM
> To: Hu, Xuekun
> Cc: Kevin Lewis; FalconDev
> Subject: Re: Are multiple serial logs feasible or profitable?
> The problem with your prototype is that you're allocating a new
> StreamList object for every record.  The Falcon memory manager has to do
> at least two interlocked instructions to protect the memory manager,
> which is more than enough to blow away any performance gain you can get.
> It would make more sense to build the complete SRL record rather than
> building little ones.  There is an issue that the record might exceed
> the amount of space remaining in the window, but this can be handled,
> when necessary, by forcing a premature flush of the serial log window.
> It would be easier to prototype if you created an SRLStream subclass of
> Stream that knew about the SRL encoding.
> But unless you get the per-record memory allocations and deallocations out
> of the inner loop, it just isn't going to perform.  On the other hand,
> I'm not certain that the "optimization" is going to pan out in the best
> of times since it requires more cycles and only reduces contention by a
> modest degree.
> As for multiple serial logs, I'm highly dubious that it could be made
> to work, since physical changes must be presented to recovery in the
> order presented to the log.  Unless you have some magic bullet to keep
> things straight, it isn't going to work.  But I'm not saying it can't
> work, just that I don't know how to make it work.
> Hu, Xuekun wrote:
>> Hi, Guys
>> I implemented Jim's idea on SRLUpdateRecords::append and
>> SRLUpdateIndex::append. The patches are attached. (Sorry for the rough coding
>> style; this is only a proof of concept. :-) However, performance actually
>> drops ~5% on dbt2 with 10 warehouses and 32 connections.
>> In the SRLUpdateRecords patch, I implemented a struct StreamList, a singly
>> linked list, to hold the streams. First put all chilled records into the
>> StreamList, then get the serial log lock, copy ("logStream") into the serial
>> log, then get out.
>> struct StreamList {
>>         Stream recordHeaderStream;            // holds the record's recordTableSpaceId, recordNumber, priorVersion, etc.
>>         Stream recordStream;                  // the chilled serial log record
>>         int32 sectionId, recordTableSpaceId;  // used to update record info during writing to the serial log
>>         RecordVersion *record;
>>         struct StreamList *next;
>> };
>> However it drops ~5% dbt2 performance. :-(
>> In a previous version, I also used malloc to allocate the buffer instead of a
>> Stream. That version didn't show the performance drop, but no improvement either.
>> In the SRLUpdateIndex patch, one stream is OK: put all the variable-length
>> index bodies into segments, then copy to the serial log. This patch has no big
>> performance impact.
>> Any suggestions/comments are appreciated.
>> Thx, Xuekun
>> -----Original Message-----
>> From: Jim Starkey [mailto:jstarkey@stripped]
>> Sent: Tuesday, December 09, 2008 6:37 AM
>> To: Kevin Lewis
>> Cc: FalconDev
>> Subject: Re: Are multiple serial logs feasible or profitable?
>> Kevin Lewis wrote:
>>> One of our community testers has noticed that changing
>>> innodb_log_files_in_group from 2 to 6 gives dbt2 ~8%
>>> gain with 10 warehouses and 32 connections.
>>> It gives a hint that there is an optimization opportunity
>>> on a single serial log for Falcon which has lots of
>>> syncWrite contention, from SerialLog::flush,
>>> SRLUpdateIndex::append, and SRLUpdateRecords::append.
>>>> Ann Harrison noted:
>>>> One important aspect of the Serial Log is that it is
>>>> *serial* - events are logged in the order they occur.
>>>> It may be possible to maintain that property while
>>>> writing to two different files in parallel, but it
>>>> will certainly complicate managing the log.
>>>> We need to find another way to reduce contention.
>> Here's a starter.
>> Rather than building serial log messages directly, build them in a
>> separate stream, then get the serial log lock, copy the crud in, and get
>> out.  It's less efficient, but should reduce the time holding an
>> exclusive lock on the serial log, increasing concurrency.
>> Prototyping with SRLUpdateRecords and SRLUpdateIndex should be easy
>> and give a good indication of whether the idea is worth pursuing.
>> --
>> Jim Starkey
>> President, NimbusDB, Inc.
>> 978 526-1376
>> --
>> Falcon Storage Engine Mailing List
> --
> Jim Starkey
> President, NimbusDB, Inc.
> 978 526-1376
