From: Rick James Date: October 3 2012 4:32pm Subject: RE: making a storage engine crash safe on a slave in MySQL 5.6 List-Archive: http://lists.mysql.com/internals/38594 Message-Id: <2E7DD7ADE53B044C8C8BCD9C5829E1EB148CF92191@SP2-EX07VS01.ds.corp.yahoo.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable There are edge cases where MySQL can lose data, but they are rare. You can tune things to minimize the chance of loss, but performance suffers= . There is no automatic failover when a Master fails. Attempts to automate i= t have led to disaster -- see the recent lengthy outage of GIT. The design of MySQL's replication is to send a copy of all writes from the = Master to the Slave, then _asynchronously_ apply the updates on the Slave. Some of the "improvements" move that toward _sychronous_. Old Replication: COMMIT returns to client before Slave necessarily has the= write. 5.6 / Percona Cluster / Galera / etc: COMMIT returns after the Slave says = "yes I have the write (in the "relay log"), but I have not applied it yet." Full synchronous (not implemented) would wait for the write to be committed= on all slave(s) before returning from the COMMIT. Why not do sync? How far apart are Master and Slave? -- this is a lag. Is the Slave busy and cannot get around to doing the write? -- another lag The replication stream is (was) "single-threaded" (for good reason); so it = is the query stuck in this 'queue'? -- another lag All these lags would mean High Availability --> Low Performance. The Parallelism in 5.6 _partially_ fixes the single-threadedness. I don't mean to be knocking Replication as implemented (and improved recent= ly); I am trying to explain why it is a tough problem. Another feature you did not mention is the GTID (Global Transaction ID). T= his has been needed for many years, was implemented at Google, and is now (= 5.6) mainstream. In complex and subtle ways it makes promoting an arbitrar= y Slave to Master easily scriptable. The problem before had to do with the= _other_ slaves -- there was no easy way to have them point to the new Mast= er and know where they "left off". Workbench has such scripts. A guy invented MHA to do the work without the GTID; he provides equivalent = scripts. What does InnoDB do upon recovery? There are many aspects to that. On a single machine, transactions that were started but not committed need = to be rolled back; InnoDB's "log" assists in that. A transaction is not sent to the slaves until you COMMIT. etc. The "doublewrite" buffer protects you (at a cost) from writing part of a 16= KB block then crashing. (InnoDB normally works in units of 16KB, but most = file systems talk in units of 4KB.) This is only a quick, cursory, summary of all the stuff that goes on in Inn= oDB an Replication. MySQL+InnoDB is rather "crash safe", but not perfect. XtraDB is essentially the same as InnoDB in this area of discussion. MyISAM is much less crash safe, in many ways. Aria (in MariaDB) touts being crash safe. > -----Original Message----- > From: Zardosht Kasheff [mailto:zardosht@stripped] > Sent: Wednesday, October 03, 2012 9:01 AM > To: Rick James > Cc: internals@stripped > Subject: Re: making a storage engine crash safe on a slave in MySQL 5.6 >=20 > Thanks Rick. So my understanding is muddled. >=20 > So how are slaves crash safe? What does MySQL and InnoDB do upon > recovery? >=20 > On Wed, Oct 3, 2012 at 11:42 AM, Rick James > wrote: > > As I understand it, it is not quite that... > > > > The transaction must get into the Slave's relay log before returning > > from the COMMIT. (That's not the same as "committing".) > > > > > > > >> -----Original Message----- > >> From: Zardosht Kasheff [mailto:zardosht@stripped] > >> Sent: Wednesday, October 03, 2012 8:16 AM > >> To: internals@stripped > >> Subject: making a storage engine crash safe on a slave in MySQL 5.6 > >> > >> Hello all, > >> > >> I read that InnoDB is now crash safe on slaves in MySQL 5.6. I > >> understand that a way they do this is on committing a transaction on > >> a slave, they store the binary log position in an InnoDB table > (which > >> makes that information transactionally maintained). I also > understand > >> that this position is stored for each database, as databases may > >> apply the replication log in parallel. > >> > >> Upon recovery of a slave, how does InnoDB report to MySQL where the > >> replication log should resume? > >> > >> Can another storage engine similarly make itself crash safe on a > slave? > >> Will there be any issues with multiple storage engines doing so? > >> > >> Thanks > >> -Zardosht > >> > >> -- > >> MySQL Internals Mailing List > >> For list archives: http://lists.mysql.com/internals > >> To unsubscribe: http://lists.mysql.com/internals > >