From: Kristian Nielsen
Date: September 15 2011 7:46am
Subject: Re: A feature patch for semi-sync
List-Archive: http://lists.mysql.com/replication/2201
Message-Id: <87pqj26yf2.fsf@frigg.knielsen-hq.org>

[Thanks, Mark, for pointing me to this thread]

MARK CALLAGHAN writes:

> 2011/9/13 周振兴:
>> In MySQL-5.5 the semi-sync code is something like this:
>> original semi-sync
>> binlog_prepare (do nothing)
>> innodb_prepare
>> ...
>> binlog_commit
>> innobase_commit
>> WAITING FOR THE SLAVE
>> Now rpl_semi_sync_master_wait_before_commit=1
>> binlog_prepare (do nothing)
>> innodb_prepare
>> ...
>> binlog_commit
>> WAITING FOR THE SLAVE
>> innobase_commit
>
> performance by much more. The overhead from this will be also much
> larger on versions of MySQL that use group commit for InnoDB+binlog
> (which is in MariaDB, Percona and the Facebook patch).
> you present this at one of the community conferences. But can you make
> the overhead of this low for systems with a fast fsync? When fsync is
> fast and the workload has few concurrent transactions there isn't much
> that can be done to avoid the overhead. But for workloads with a lot
> of concurrency and on a MySQL variant that supports group commit it
> might be possible to minimize the overhead from this.

For MariaDB 5.3/5.5, which supports binlog group commit, I think it should
be simple to minimize the overhead in this way.

What we have there is basically a linked list of all the transactions that
are waiting to commit in parallel. All of these transactions are
written/fsynced to the binlog together. After this they are made visible
(i.e. committed inside innodb) together.

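To make the group commit idea concrete, here is a minimal Python sketch of
such a queue. It is only a toy model under my own assumptions, not the
MariaDB code: the names Transaction, GroupCommitQueue, enqueue and
commit_group are hypothetical, a plain list stands in for the linked list,
and the fsync is replaced by a counter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Transaction:
    name: str
    visible: bool = False  # True once committed inside the storage engine


@dataclass
class GroupCommitQueue:
    """Toy model of binlog group commit (hypothetical, not MariaDB's code)."""
    waiting: List[Transaction] = field(default_factory=list)
    binlog: List[str] = field(default_factory=list)
    fsync_count: int = 0  # stands in for real fsync() calls

    def enqueue(self, trx: Transaction) -> None:
        # A transaction that is ready to commit joins the waiting list.
        self.waiting.append(trx)

    def commit_group(self) -> List[Transaction]:
        # Take everything that is currently waiting as one group.
        group, self.waiting = self.waiting, []
        if not group:
            return group
        for trx in group:
            self.binlog.append(trx.name)  # write to binlog
        self.fsync_count += 1             # one sync for the whole group
        for trx in group:
            trx.visible = True            # made visible together
        return group
```

Enqueueing ten transactions and calling commit_group() once writes ten
binlog entries but bumps fsync_count only once, which is where the saving
comes from.
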
So if rpl_semi_sync_master_wait_before_commit=1, then after syncing the
binlog we just need to wait for the slave to acknowledge reception of the
_last_ transaction before we proceed to the next step of making them
visible/committed in innodb:

    foreach (transaction)
        write to binlog
    sync binlog
    WAITING FOR THE SLAVE
    foreach (transaction)
        innobase_commit (first part)
    foreach (transaction)
        innobase_commit (second part)

The overhead of waiting for 1 or 10 transactions to be acknowledged from the
slave is about the same (for small transactions). So if many transactions
commit in parallel, the overall overhead is minimized.

Of course, there are some additional details to understand, but I think
overall it should be simple.

I already merged this MariaDB group commit with MySQL 5.5; however, the rest
of the merge is still being polished, so it will be a few weeks before it
becomes ready. If you are interested, let me know and I will then point you
to the relevant trees, files, etc. when we have them ready.

 - Kristian.
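
P.S. A rough Python sketch of the ordering with a single slave wait per
group, in case it helps. Everything here is hypothetical and for
illustration only: commit_group_semisync and wait_for_slave_ack are made-up
names, and the sketch assumes the slave receives binlog events in order, so
that an acknowledgement for the last transaction covers the earlier ones.

```python
from typing import Callable, List


def commit_group_semisync(
    group: List[str],
    wait_for_slave_ack: Callable[[str], None],
) -> List[str]:
    """Return the ordered steps taken for one commit group (toy model)."""
    steps: List[str] = []
    if not group:
        return steps
    for trx in group:
        steps.append(f"binlog_write {trx}")
    steps.append("binlog_sync")
    # Assumption: the binlog is shipped in order, so waiting for the last
    # transaction of the group is enough. One wait per group, not per trx.
    wait_for_slave_ack(group[-1])
    steps.append(f"slave_ack {group[-1]}")
    for trx in group:
        steps.append(f"innobase_commit_first_part {trx}")
    for trx in group:
        steps.append(f"innobase_commit_second_part {trx}")
    return steps
```

Passing a callback that records its argument shows that a group of three
transactions triggers exactly one wait, and that no innobase_commit step
appears before the acknowledgement.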