List: Replication
From: Kristian Nielsen
Date: September 15 2011 7:46am
Subject: Re: A feature patch for semi-sync
[Thanks, Mark, for pointing me to this thread]

MARK CALLAGHAN writes:

> 2011/9/13 周振兴 <orczhou@stripped>:

>> In MySQL-5.5 the semi-sync code is something like this:
>> original semi-sync
>>     binlog_prepare (do nothing)
>>     innodb_prepare
>>     ...
>>     binlog_commit
>>     innobase_commit
>>     WAITING FOR THE SLAVE
>> Now rpl_semi_sync_master_wait_before_commit=1
>>     binlog_prepare (do nothing)
>>     innodb_prepare
>>     ...
>>     binlog_commit
>>     WAITING FOR THE SLAVE
>>     innobase_commit
>

> performance by much more. The overhead from this will be also much
> larger on versions of MySQL that use group commit for InnoDB+binlog
> (which is in MariaDB, Percona and the Facebook patch).

> you present this at one of the community conferences. But can you make
> the overhead of this low for systems with a fast fsync?  When fsync is
> fast and the workload has few concurrent transactions there isn't much
> that can be done to avoid the overhead. But for workloads with a lot
> of concurrency and on a MySQL variant that supports group commit it
> might be possible to minimize the overhead from this.

For MariaDB 5.3/5.5, which supports binlog group commit, I think it should be
simple to minimize the overhead in this way.

What we have there is basically a linked list of all the transactions that are
waiting to commit in parallel. All of these transactions are written/fsynced
to the binlog together. After this they are made visible (i.e. committed inside
InnoDB) together.

So if rpl_semi_sync_master_wait_before_commit=1, then after syncing the binlog
we just need to wait for the slave to acknowledge reception of the _last_
transaction before we proceed to the next step of making them
visible/committed in innodb:

    foreach (transaction)
      write to binlog
    sync binlog
    WAITING FOR THE SLAVE
    foreach (transaction)
      innobase_commit (first part)
    foreach (transaction)
      innobase_commit (second part)
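
The ordering above can be illustrated with a toy Python sketch. This is not
MariaDB code: the function name commit_group and the step labels are made up
for illustration, and it assumes the slave receives binlog events in order, so
that an acknowledgement for the last transaction covers the earlier ones.

```python
def commit_group(transactions):
    """Return the ordered steps of one toy group commit round.

    Illustrative only: the step labels are hypothetical, not real server
    function names.
    """
    steps = []
    if not transactions:
        return steps  # nothing queued, nothing to do

    # Write every queued transaction to the binlog, then sync once.
    for trx in transactions:
        steps.append(("write_binlog", trx))
    steps.append(("sync_binlog", None))

    # One wait for the whole group: only the last transaction's
    # acknowledgement is needed (assumes in-order delivery to the slave).
    steps.append(("wait_for_slave_ack", transactions[-1]))

    # Only after the acknowledgement are the transactions made visible.
    for trx in transactions:
        steps.append(("innobase_commit_part1", trx))
    for trx in transactions:
        steps.append(("innobase_commit_part2", trx))
    return steps


if __name__ == "__main__":
    for step, trx in commit_group(["t1", "t2", "t3"]):
        print(step, trx or "")
```

The point of the sketch is that the wait appears exactly once per group,
between the binlog sync and the first InnoDB commit step.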

The overhead of waiting for 1 or 10 transactions to be acknowledged from the
slave is about the same (for small transactions). So if many transactions
commit in parallel, the overall overhead is minimized.
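
A back-of-the-envelope model of this amortization, as a Python sketch. The
model is an assumption made here for illustration, not a measurement: each
group commit round is taken to cost one binlog sync plus one slave
acknowledgement, rounds run one after another, and the acknowledgement time
does not grow with group size (small transactions). The function and parameter
names are hypothetical.

```python
def ack_wait_per_transaction_ms(group_size, ack_ms):
    """Slave-ack wait attributed to each transaction when a group waits once."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return ack_ms / group_size


def commits_per_second(group_size, sync_ms, ack_ms):
    """Throughput if every round costs one sync plus one ack wait.

    Assumes rounds are strictly serialized and that neither cost depends
    on the number of transactions in the group.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    round_ms = sync_ms + ack_ms
    return group_size * 1000.0 / round_ms


if __name__ == "__main__":
    for n in (1, 10):
        print(n,
              ack_wait_per_transaction_ms(n, ack_ms=1.0),
              commits_per_second(n, sync_ms=1.0, ack_ms=1.0))
```

Under these assumptions the wait per round stays the same while the number of
transactions sharing it grows, so the cost attributed to each transaction
shrinks in proportion to the group size.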

Of course, there are some additional details to understand, but I think
overall it should be simple.

I have already merged this MariaDB group commit with MySQL 5.5; however, the rest
of the merge is still being polished, so it will be a few weeks before it becomes
ready. If you are interested, let me know and I will then point you to the
relevant trees, files, etc. when we have them ready.

 - Kristian.