List: Falcon Storage Engine
From: Xuekun Hu
Date: January 16 2009 3:52am
Subject: RE: Are multiple serial logs feasible or profitable?
Hi, Jim

Got it; now I understand why performance drops. And yes, you are right: one SRLStream
record should be better, and creating an SRLStream subclass is the better approach.

So now I am rethinking what the real benefit would be. With the malloc version of the
patch, which I wrote first, performance did not change. That means the extra cycles and
the reduced contention cancel each other out. So even with a better implementation,
e.g. an SRLStream subclass, the benefit should still be small.

I re-analyzed the gdb data from dbt2 with 32 connections. Below are two snapshots. Can you
get some information from them, such as what the biggest source of blocking is?

First example: only 8 threads are working on a 16-core system
#Thread, function the thread is running or blocked in
33      Cache::fetchPage wait
32      SRLCommit::append wait
31      SRLUpdateRecords::append wait
30      Transaction::waitForTransaction wait
29      SRLUpdateRecords::append wait
28      StorageTable::compareKey
27      SRLUpdateRecords::append wait
26      SRLUpdateRecords::append wait
25      SRLUpdateRecords::append wait
24      SRLUpdateIndex::append wait
23      Transaction::waitForTransaction wait
22      SRLUpdateRecords::append wait
21      Section::insertStub wait
20      Table::fetch locking
19      SRLUpdateRecords::append wait
18      SerialLog::flush wait
17      SRLUpdateIndex::append wait
16      SRLUpdateRecords::append wait
15      SRLUpdateIndex::append wait
14      Record::getEncodedRecord
13      SRLUpdateIndex::append wait
12      SRLUpdateIndex::append wait
11      Field_longstr::report_if_important_data
10      Transaction::waitForTransaction wait
9       SRLUpdateRecords::append wait
8       RecordLeaf::fetch locking
7       mysql_lock_tables
6       Table::fetch locking
5       SRLUpdateRecords::append wait
4       Transaction::waitForTransaction wait
3       StorageInterface::index_next
2       Transaction::waitForTransaction wait

Second example: only 11 threads are working on a 16-core system
33      EncodedDataStream::setData
32      Transaction::waitForTransaction wait
31      SRLUpdateIndex::append wait
30      MYSQLparse
29      SRLUpdateIndex::append wait
28      SerialLog::flush wait
27      Transaction::waitForTransaction wait
26      Bitmap::set
25      SRLUpdateIndex::append wait
24      SRLUpdateIndex::append wait
23      SRLUpdateIndex::append wait
22      Transaction::waitForTransaction wait
21      SRLUpdateIndex::append wait
20      Transaction::waitForTransaction wait
19      JOIN::join_free
18      Transaction::waitForTransaction wait
17      SerialLog::flush wait
16      StorageInterface::decodeRecord
15      StorageTable::compareKey
14      EncodedDataStream::decode
13      Index::makeKey
12      write()
11      free in filesort
10      SRLUpdateRecords::append wait
9       Transaction::waitForTransaction wait
8       SRLUpdateIndex::append wait
7       Transaction::waitForTransaction wait
6       SerialLog::flush wait
5       SerialLog::flush wait
4       Transaction::waitForTransaction wait
3       SRLUpdateIndex::append wait
2       Record::getEncodedValue
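
As a rough aid for reading the snapshots, the sketch below tallies how many threads are blocked at each site. It assumes only the convention used in the listings, that a blocked entry ends in " wait"; the function names tallyWaits, busiestSite, and firstSnapshot are hypothetical helpers, not part of Falcon.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Count how many threads are blocked at each site. An entry counts as blocked
// when it ends in " wait", the convention used in the snapshots.
std::map<std::string, int> tallyWaits(const std::vector<std::string>& entries) {
    const std::string suffix = " wait";
    std::map<std::string, int> counts;
    for (const std::string& entry : entries) {
        if (entry.size() > suffix.size() &&
            entry.compare(entry.size() - suffix.size(), suffix.size(), suffix) == 0) {
            ++counts[entry.substr(0, entry.size() - suffix.size())];
        }
    }
    return counts;
}

// Return the site with the most blocked threads; counts must be non-empty.
std::string busiestSite(const std::map<std::string, int>& counts) {
    assert(!counts.empty());
    auto best = counts.begin();
    for (auto it = counts.begin(); it != counts.end(); ++it) {
        if (it->second > best->second)
            best = it;
    }
    return best->first;
}

// The first snapshot, threads 33 down to 2, copied from the listing.
std::vector<std::string> firstSnapshot() {
    return {
        "Cache::fetchPage wait", "SRLCommit::append wait",
        "SRLUpdateRecords::append wait", "Transaction::waitForTransaction wait",
        "SRLUpdateRecords::append wait", "StorageTable::compareKey",
        "SRLUpdateRecords::append wait", "SRLUpdateRecords::append wait",
        "SRLUpdateRecords::append wait", "SRLUpdateIndex::append wait",
        "Transaction::waitForTransaction wait", "SRLUpdateRecords::append wait",
        "Section::insertStub wait", "Table::fetch locking",
        "SRLUpdateRecords::append wait", "SerialLog::flush wait",
        "SRLUpdateIndex::append wait", "SRLUpdateRecords::append wait",
        "SRLUpdateIndex::append wait", "Record::getEncodedRecord",
        "SRLUpdateIndex::append wait", "SRLUpdateIndex::append wait",
        "Field_longstr::report_if_important_data",
        "Transaction::waitForTransaction wait", "SRLUpdateRecords::append wait",
        "RecordLeaf::fetch locking", "mysql_lock_tables", "Table::fetch locking",
        "SRLUpdateRecords::append wait", "Transaction::waitForTransaction wait",
        "StorageInterface::index_next", "Transaction::waitForTransaction wait"
    };
}
```

For the first snapshot this gives 10 threads in SRLUpdateRecords::append, 5 in SRLUpdateIndex::append, and 5 in Transaction::waitForTransaction. Together with one thread each in SRLCommit::append and SerialLog::flush, 17 of the 32 threads are waiting on the serial log.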

Thx, Xuekun

-----Original Message-----
From: Jim Starkey [mailto:jstarkey@stripped]
Sent: Friday, January 16, 2009 4:30 AM
To: Hu, Xuekun
Cc: Kevin Lewis; FalconDev
Subject: Re: Are multiple serial logs feasible or profitable?

The problem with your prototype is that you're allocating a new
StreamList object for every record.  The Falcon memory manager has to do
at least two interlocked instructions to protect itself on each
allocation, which more than blows any performance gain you can get.

It would make more sense to build the complete SRL record rather than
building little ones.  There is an issue that the record might exceed
the amount of space remaining in the window, but this can be handled,
when necessary, by forcing a premature flush of the serial log window.

It would be easier to prototype if you created a SRLStream subclass of
Stream that knew about the SRL encoding.

But unless you get the per-record memory allocations and deallocations
out of the inner loop, it just isn't going to perform.  On the other hand,
I'm not certain that the "optimization" is going to pan out in the best
of times since it requires more cycles and only reduces contention by a
modest degree.
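
The points above can be sketched roughly as follows: build the complete record in a staging buffer that is reused from record to record, then take the log lock only for the copy, flushing early if the record does not fit in the space remaining. This is a minimal sketch under stated assumptions, not Falcon code. LogWindow, StagingBuffer, and logRecord are hypothetical names, the record layout is invented for illustration, and the "flush" only resets a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the serial log window (not Falcon's SerialLog).
class LogWindow {
public:
    explicit LogWindow(std::size_t capacity) : window(capacity) {}

    // Copy one complete record in under the exclusive lock, then get out.
    void append(const std::uint8_t* data, std::size_t length) {
        assert(length <= window.size());      // a record must fit an empty window
        if (length == 0)
            return;
        std::lock_guard<std::mutex> guard(lock);
        if (length > window.size() - used)    // not enough room left:
            flushLocked();                    // force a premature flush
        std::memcpy(window.data() + used, data, length);
        used += length;
    }

    // Unsynchronized accessors, intended for single-threaded inspection only.
    std::size_t bytesUsed() const { return used; }
    int flushCount() const { return flushes; }

private:
    void flushLocked() {                      // a real log would write to disk here
        ++flushes;
        used = 0;
    }

    std::mutex lock;
    std::vector<std::uint8_t> window;
    std::size_t used = 0;
    int flushes = 0;
};

// Per-thread staging area. clear() keeps the vector's capacity, so after
// warm-up no memory is allocated or freed per record.
class StagingBuffer {
public:
    void clear() { bytes.clear(); }
    void putInt32(std::int32_t value) {       // native byte order, for brevity
        std::uint8_t raw[sizeof value];
        std::memcpy(raw, &value, sizeof value);
        bytes.insert(bytes.end(), raw, raw + sizeof value);
    }
    void putBytes(const void* data, std::size_t length) {
        const std::uint8_t* p = static_cast<const std::uint8_t*>(data);
        bytes.insert(bytes.end(), p, p + length);
    }
    const std::uint8_t* data() const { return bytes.data(); }
    std::size_t size() const { return bytes.size(); }

private:
    std::vector<std::uint8_t> bytes;
};

// Build the whole record outside the lock, then append it in one copy.
void logRecord(LogWindow& log, StagingBuffer& staging,
               std::int32_t tableSpaceId, std::int32_t recordNumber,
               const void* payload, std::size_t length) {
    staging.clear();
    staging.putInt32(tableSpaceId);
    staging.putInt32(recordNumber);
    staging.putBytes(payload, length);
    log.append(staging.data(), staging.size());
}
```

The extra copy is the cost in cycles mentioned above; whether the shorter lock hold time pays for it is exactly the open question.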

As for multiple serial logs, I'm highly dubious whether it could be made
to work, since physical changes must be presented to recovery in the
order presented to the log.  Unless you have some magic bullet to keep
things straight, it isn't going to work.  But I'm not saying it can't
work, just that I don't know how to make it work.
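
The ordering constraint can be illustrated with a toy example. It assumes a hypothetical physical log record that overwrites one byte of a page; PhysicalChange and replay are invented names, and nothing here reflects Falcon's actual record formats or recovery code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical physical log record: overwrite one byte of a page.
struct PhysicalChange {
    std::size_t offset;
    std::uint8_t value;
};

// Replay changes in the order given, as recovery would.
std::vector<std::uint8_t> replay(std::vector<std::uint8_t> page,
                                 const std::vector<PhysicalChange>& log) {
    for (const PhysicalChange& change : log) {
        assert(change.offset < page.size());
        page[change.offset] = change.value;
    }
    return page;
}
```

Replaying two changes to the same offset in the opposite order leaves a different final value on the page. If those two changes sat in different logs with nothing recording their relative order, recovery could not tell which result is the right one.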

Hu, Xuekun wrote:
> Hi, Guys
> I implemented Jim's idea in SRLUpdateRecords::append and SRLUpdateIndex::append.
> The patches are attached. (Sorry for the wrong coding style; this is only a proof of
> concept. :-) However, performance actually drops ~5% on dbt2 with 10 warehouses and 32
> connections.
> In the SRLUpdateRecords patch, I implemented a struct StreamList, a singly linked list, to
> hold the streams. It first puts all chilled records into the stream list, then gets the
> serial log lock, copies them ("logStream") to the serial log, then gets out.
> struct StreamList {
>         Stream recordHeaderStream;      // holds the record's recordTableSpaceId,
>                                         // record->recordNumber, priorVersion, etc.
>         Stream recordStream;            // the chilled serial log record
>         int32 sectionId, recordTableSpaceId;  // used to update record info while
>                                               // writing to the serial log
>         RecordVersion *record;
>         struct StreamList *next;
> };
> However it drops ~5% dbt2 performance. :-(
> In the previous version, I used malloc to allocate the buffer instead of Stream. That
> version of the patch showed no performance drop, but also no improvement.
> In the SRLUpdateIndex patch, one stream is enough: it puts all the variable-length index
> bodies into a segment, then copies them to the serial log. This patch has no big
> performance impact.
> Any suggestions or comments are appreciated.
> Thx, Xuekun
> -----Original Message-----
> From: Jim Starkey [mailto:jstarkey@stripped]
> Sent: Tuesday, December 09, 2008 6:37 AM
> To: Kevin Lewis
> Cc: FalconDev
> Subject: Re: Are multiple serial logs feasible or profitable?
> Kevin Lewis wrote:
>> One of our community testers has noticed that changing
>> innodb_log_files_in_group from 2 to 6 gives dbt2 ~8%
>> gain with 10 warehouses and 32 connections.
>> It gives a hint that there is an optimization opportunity
>> on a single serial log for Falcon which has lots of
>> syncWrite contention, from SerialLog::flush,
>> SRLUpdateIndex::append, and SRLUpdateRecords::append.
>>> Ann Harrison noted;
>>> One important aspect of the Serial Log is that it is
>>> *serial* - events are logged in the order they occur.
>>> It may be possible to maintain that property while
>>> writing to two different files in parallel, but it
>>> will certainly complicate managing the log.
>>> We need to find another way to reduce contention.
> Here's a starter.
> Rather than building serial log messages directly, build them in a
> separate stream, then get the serial log lock, copy the crud in, and get
> out.  It's less efficient, but should reduce the time holding an
> exclusive lock on the serial log, increasing concurrency.
> Prototyping with SRLUpdateRecords and SRLUpdateIndex should be easy
> and give a good idea of whether the idea is worth pursuing.
> --
> Jim Starkey
> President, NimbusDB, Inc.
> 978 526-1376

Jim Starkey
President, NimbusDB, Inc.
978 526-1376
