List: Replication
From: Mark Callaghan
Date: September 14 2011 3:10am
Subject: Re: A feature patch for semi-sync
I think you should describe this in more detail for others to be
interested. What you have done is very interesting, but also not
trivial for us to understand.

In official MySQL the commit sequence is:
1) InnoDB writes the changes to its transaction log, along with a record
indicating that the transaction is prepared
2) InnoDB syncs the transaction log (this fsync can be shared by several
transactions)
3) InnoDB locks prepare_commit_mutex
4) the binlog is written and synced (this fsync cannot be shared because
prepare_commit_mutex is locked)
5) InnoDB writes the commit record to its transaction log
6) InnoDB unlocks prepare_commit_mutex
7) InnoDB syncs the transaction log (this fsync can be shared)
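
To make the cost of that sequence concrete, here is a minimal Python sketch
that counts fsyncs for a batch of concurrent commits. It is a toy model, not
MySQL code: the function name count_fsyncs and the group_size parameter (how
many transactions happen to share one InnoDB log fsync) are assumptions made
purely for illustration.

```python
import math


def count_fsyncs(n_txns, group_size):
    """Count fsyncs for n_txns commits under the 7-step sequence.

    group_size is a hypothetical knob: how many transactions share a
    single InnoDB log fsync in steps 2 and 7.
    """
    if n_txns < 0 or group_size < 1:
        raise ValueError("n_txns must be >= 0 and group_size >= 1")
    shared = math.ceil(n_txns / group_size)
    return {
        "prepare_log_fsyncs": shared,  # step 2: can be shared
        "binlog_fsyncs": n_txns,       # step 4: one per transaction, mutex held
        "commit_log_fsyncs": shared,   # step 7: can be shared
    }


counts = count_fsyncs(n_txns=100, group_size=10)
print(counts)
```

With these made-up numbers the binlog accounts for 100 of the 120 fsyncs,
which is why step 4 is the part of the sequence that limits throughput.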

I am not familiar with replication in MySQL 5.5 and the official
version of semi-sync. I know the Google version. I will guess that
your change occurs between steps 4 and 5 above. But if a wait is done
there then prepare_commit_mutex is locked and nothing else can commit
until a slave acks. So this might have a significant impact on commit
throughput.
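
A back-of-the-envelope model of that concern, assuming the wait really does
happen while prepare_commit_mutex is held. The function name and the
millisecond figures are invented for illustration and are not measurements;
the model also ignores the time spent in step 5.

```python
def max_commits_per_second(binlog_fsync_ms, ack_wait_ms=0.0):
    """Upper bound on commit rate when every commit holds the mutex for
    the binlog fsync plus (optionally) the wait for a slave ack."""
    hold_ms = binlog_fsync_ms + ack_wait_ms
    if hold_ms <= 0:
        raise ValueError("total hold time must be positive")
    # Only one transaction can hold the mutex at a time, so the
    # commit rate cannot exceed 1 / hold time.
    return 1000.0 / hold_ms


# Hypothetical numbers: 2 ms binlog fsync, 2 ms round trip to the slave.
without_wait = max_commits_per_second(binlog_fsync_ms=2.0)
with_wait = max_commits_per_second(binlog_fsync_ms=2.0, ack_wait_ms=2.0)
print(without_wait, with_wait)
```

In this model any ack latency adds directly to the time the mutex is held, so
a round trip as long as the fsync itself halves the ceiling on commits per
second.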

Eventually official MySQL should have group commit. For now it is in
MariaDB, Percona and the Facebook patch. The group commit
implementations remove prepare_commit_mutex, and with it gone the
performance impact of your change might be smaller, but I still think
something should be done to pipeline or group acks from the slave.
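
One way grouping acks could look, as a rough Python sketch rather than
anything taken from the patch or from MySQL: a single ack for binlog position
P releases every waiting transaction whose events end at or before P. The
AckTracker class and its methods are hypothetical, and a real binlog
coordinate is a file name plus an offset rather than the single integer used
here.

```python
import bisect


class AckTracker:
    """Tracks transactions waiting for a slave ack, keyed by the binlog
    position at which each transaction's events end."""

    def __init__(self):
        self._pending = []  # sorted binlog end positions awaiting an ack

    def register(self, end_pos):
        # Called after a transaction has been written to the binlog.
        bisect.insort(self._pending, end_pos)

    def on_ack(self, acked_pos):
        # One ack covers every transaction ending at or before acked_pos.
        cut = bisect.bisect_right(self._pending, acked_pos)
        released = self._pending[:cut]
        self._pending = self._pending[cut:]
        return released

    def pending(self):
        return list(self._pending)


tracker = AckTracker()
for pos in (120, 260, 410, 590):
    tracker.register(pos)

released = tracker.on_ack(410)
print(released)           # [120, 260, 410]
print(tracker.pending())  # [590]
```

Here three transactions are released by one round trip instead of three,
which is the effect pipelining or grouping would be aiming for.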

2011/9/13 周振兴 <orczhou@stripped>:
> Hi all
>
> The patch for semi-sync has been committed to Launchpad.net:
>
> lp:~orczhou/mysql-server/ESR <https://code.launchpad.net/~orczhou/mysql-server/ESR>
>
> A feature request has been filed in the bug system:
> http://bugs.mysql.com/62174
>
> Semi-sync is cool. But in the semi-sync solution there is a "phantom
> read" problem: when InnoDB has committed a transaction and no slave has yet
> received the binary log, another "new thread" can already read the
> data generated by this transaction. If the database crashes and is gone
> (cannot start up anymore) at exactly this moment, then although we have
> read the data, it does not exist on any slave, and we treat the transaction
> as if it never happened.
>
> In our application architecture, if our "new thread" reads
> "phantom data", it will go on to do work that involves users' critical
> information. So we must not read the "phantom data".
>
> So we added a new variable for semi-sync,
> "rpl_semi_sync_master_wait_before_commit". Once you set this variable
> ON, MySQL (InnoDB) ***WILL NOT COMMIT*** the transaction until at least
> one slave has received the binary log. By default it is OFF and semi-sync
> behaves in the original way.
>
> With this variable ON, semi-sync behaves like this:
>  "Once you ***can read*** the data from the master, at least one slave has
> got the binary log",
> whereas the original semi-sync behaves like this:
>  "Once you ***get a response*** from the master, at least one slave has got
> the binary log".
>
> Anders.song has reviewed the code.
>
> *Some questions:*
>
> How do I move the branch from "Development" to "Experimental" or "Mature"?
>
> When should I propose it for merging?
>
> Is there a "gatekeeper" for mysql-server? How does it work?
>
> *Test case*
>
> I have run all test cases of suite=rpl with
> --mysqld=--loose-rpl_semi_sync_master_wait_before_commit=1.
> All test cases behave exactly as before, except
> rpl_semi_sync.test. What makes rpl_semi_sync.test
> different is:
>    In the patch, any DDL statement (trans_commit_implicit) is an exception
> and will NOT wait for the reply from the slave.
>    The reason is that the patch adds a new HOOK in binlog_commit. When
> binlog_commit returns, we wait for the slave's reply.
> So before every transaction commit, it waits for the slave's reply. But a
> DDL statement will NEVER invoke binlog_commit.
> And we think it is OK that DDL is the only special case.
>
> A new test case is being written to verify that the patch always behaves
> as expected.
>
> You can get all the details from the code:
>
> lp:~orczhou/mysql-server/ESR <https://code.launchpad.net/~orczhou/mysql-server/ESR>
>
>
>
> --
> Best regards
> -----------------------------------------------------
> 周振兴  159-9004-0105 Taobao.com
> http://orczhou.com
>



-- 
Mark Callaghan
mdcallag@stripped