List:Replication« Previous MessageNext Message »
From:Rick James Date:February 6 2008 7:40pm
Subject:RE: Replication breaks without apparent reason
View as plain text  
USE dba;
UPDATE dbb.tbl ...

won't be replicated if dba is replicate-ignored, regardless of dbb
settings.
(And similar things). 

> -----Original Message-----
> From: hq4ever@stripped [mailto:hq4ever@stripped] On Behalf 
> Of Maxim Veksler
> Sent: Wednesday, February 06, 2008 11:27 AM
> To: Mark Callaghan
> Cc: replication@stripped
> Subject: Re: Replication breaks without apparent reason
> 
> Hello Mark,
> 
> On Jan 31, 2008 5:25 PM, Mark Callaghan <mcallaghan@stripped> wrote:
> > The things to rule out are:
> > * does your app set 'sql_log_bin' to avoid logging some 
> statements to the binlog
> 
> Negative, but nice to know such feature exists.
> 
> > * was bin_log set on the master for the entire test
> 
> Yes, absolutely.
> 
> > * do you have any replication filters in the master or slave my.cnf
> > that would prevent the master from writing or the slave 
> from executing
> > the statement. From what you have listed below, that does not appear
> > to be the case.
> 
> Should not be the case, unless I've miss understood something with
> mysql filtering rules.
> 
> > * did the slave crash during the test
> 
> This I don't know, but assuming it did (A crash a.k.a reboot, as we're
> using linux-ha to monitor the machine) shouldn't the node, after
> coming back up catch with changes from master ?
> 
> > * is the problem table a temp table, as in CREATE TEMPORARY 
> TABLE ...
> >
> 
> No, normal INNODB table as the rest of our tables.
> 
> > The replication state update sequence on a slave is:
> > 1) run all statements for a tranaction
> > 2) update relay-log.info
> > When a slave crashes between 1) and 2), then it will replay the
> > transaction on restart. If the transaction were for DROP TABLE, then
> > on the replay, the table will not exist and the slave will stop
> >
> 
> That's interesting, I will examine this. Thouhg I doubt this is the
> case, because our replication failure has reproduced twice already and
> the machine aren't rebooting for weeks. Surly not during the load of
> data described in my previous email.
> 
> > If you are a stickler for consistency between your master and slave,
> > then make sure that CREATE TEMP TABLE statements are not int the
> > binlog. If you care even more, then only use InnoDB tables 
> (or Falcon
> > and Maria when they are ready) on the slave and master so that
> > rollback is done consistently between master and slave.
> >
> 
> As I've mentioned all our tables are by default InnoDB so there's not
> chance of missing some data as "half written commit". We are OTOH
> seeing this error in the logs, I don't know if it's related to the
> case:
> 
>  "080115 15:31:46 InnoDB: ERROR: the age of the last 
> checkpoint is 9433957,
> InnoDB: which exceeds the log group capacity 9433498.
> InnoDB: If you are using big BLOB or TEXT rows, you must set the
> InnoDB: combined size of log files at least 10 times bigger than the
> InnoDB: largest such row."
> 
> Thanks for your help, Google :)
> 
> Maxim.
> 
> >
> > On Jan 30, 2008 6:08 AM, Maxim Veksler 
> <maxim.veksler@stripped> wrote:
> > > Hello List,
> > >
> > > We're experiencing strange replication failures on some 
> SQL actions,
> > > so far we haven't been able to reproduce this failures in a clear
> > > scenario.
> > >
> > > Our setup is as following:
> > >
> > > We have 2 Nodes (rnd-dev1, qa-dev1), both are running Red 
> Hat 5 el,
> > > kernel version 2.6.18-53.1.4.el5PAE.
> > > MySQL version is 5.0.45. The machines are in the same 
> netmask and no
> > > firewall is configured between them.
> > >
> > > Replication is configured with master-master replication.
> > > my.cnf of both nodes:
> > >
> > > rnd-dev1 :
> > > """
> > > [mysqld]
> > > # binary logs are what makes replication tic with 
> master-slave(s) setup.
> > > log-bin
> > > expire_logs_days=14
> > >
> > > # relay logs are used by slave to store statements before 
> execution.
> > > relay-log=ml-relay-log
> > > relay-log-index=ml-relay-log-index
> > >
> > > # Unique on each node !
> > > server-id=115
> > > auto_increment_increment=2
> > > auto_increment_offset=2
> > >
> > >
> > > # General system logs
> > > log=/var/log/COMPANY/MySQL.log
> > > log-error=/var/log/COMPANY/MySQL_error.log
> > > log-slow-queries=/var/log/COMPANY/MySQL_slow-queries.log
> > > log-warnings=2
> > >
> > >
> > > # mysql
> > > replicate-wild-ignore-table=mysql.%
> > >
> > > # scheme1
> > > replicate-wild-do-table=scheme1.%
> > >
> > > # scheme2
> > > replicate-wild-do-table=scheme2.tbl1
> > > replicate-wild-do-table=scheme2.tbl2
> > > replicate-wild-ignore-table=scheme2.tbl3
> > > replicate-wild-ignore-table=scheme2.tbl4
> > > replicate-wild-ignore-table=scheme2.tbl5
> > > replicate-wild-ignore-table=scheme2.tbl6
> > > replicate-wild-ignore-table=scheme2.tbl7
> > > """
> > >
> > > qa-dev1:
> > > """
> > > [mysqld]
> > > # binary logs are what makes replication tic with 
> master-slave(s) setup.
> > > log-bin
> > > expire_logs_days=14
> > >
> > > # relay logs are used by slave to store statements before 
> execution.
> > > relay-log=ml-relay-log
> > > relay-log-index=ml-relay-log-index
> > >
> > > # Unique on each node !
> > > server-id=114
> > > auto_increment_increment=2
> > > auto_increment_offset=1
> > >
> > >
> > > # General system logs
> > > log=/var/log/COMPANY/MySQL.log
> > > log-error=/var/log/COMPANY/MySQL_error.log
> > > log-slow-queries=/var/log/COMPANY/MySQL_slow-queries.log
> > > log-warnings=2
> > >
> > >
> > > # mysql
> > > replicate-wild-ignore-table=mysql.%
> > >
> > > # scheme1
> > > replicate-wild-do-table=scheme1.%
> > >
> > > # scheme2
> > > replicate-wild-do-table=scheme2.tbl1
> > > replicate-wild-do-table=scheme2.tbl2
> > > replicate-wild-ignore-table=scheme2.tbl3
> > > replicate-wild-ignore-table=scheme2.tbl4
> > > replicate-wild-ignore-table=scheme2.tbl5
> > > replicate-wild-ignore-table=scheme2.tbl6
> > > replicate-wild-ignore-table=scheme2.tbl7
> > > """
> > >
> > >
> > > Under normal conditions we have replication working fine, 
> tested under
> > > load of INSERT, UPDATE and co.
> > >
> > > We suspect that it breaks when we do a procedure that involves
> > > dropping of temporary tables. This process of ours does
> > > 1. DROP tmpTblA
> > > 2. CREATE tmpTblA
> > > 3. INSERT ~380 records per second into tmpTblA during ~60 seconds.
> > > 4. SELECT INTO tmpTblB small fragments from tmpTblA
> > > 5. SELECT FROM tmpTblB INSERT INTO tblA
> > >
> > > The differences between this procedure and normal 
> operation being in
> > > the fact here DROP TABLE and CREATE TABLE commands are issued.
> > > We notice the failure because a DROP TABLE statement 
> replicated from
> > > rnd-dev1 (Where it executed OK) to qa-dev1 where this 
> table was not
> > > created so the DROP TABLE failed and replication stopped.
> > >
> > > Running this procedure again does not reproduce the 
> replication failure.
> > >
> > >
> > > The difference in the bin-log before and after the 
> procedure is ~20mb:
> > > Before:
> > > -rw-rw----  1 mysql mysql  270139448 Jan 30 14:04 
> rnd-dev1-bin.000047
> > > After:
> > > -rw-rw----  1 mysql mysql  291102580 Jan 30 15:26 
> rnd-dev1-bin.000047
> > >
> > >
> > > Trying to debug this further I'm suspecting that for some 
> reason some
> > > commands or "command blocks" are not being replicated, 
> which leads me
> > > to think that we notice this now only because here a DROP 
> is issued
> > > which causes a replication error while in other times command were
> > > just skipped.
> > >
> > >
> > >
> > > --
> > > Cheers,
> > > Maxim Veksler
> > >
> > > "Free as in Freedom" - Do u GNU ?
> > >
> > > --
> > > MySQL Replication Mailing List
> > > For list archives: http://lists.mysql.com/replication
> > > To unsubscribe:    
> http://lists.mysql.com/replication?unsub=1
> > >
> > >
> >
> >
> >
> > --
> > Mark Callaghan
> > mcallaghan@stripped
> >
> 
> 
> 
> -- 
> Cheers,
> Maxim Veksler
> 
> "Free as in Freedom" - Do u GNU ?
> 
> -- 
> MySQL Replication Mailing List
> For list archives: http://lists.mysql.com/replication
> To unsubscribe:    
> http://lists.mysql.com/replication?unsub=1
> 
> 
Thread
Replication breaks without apparent reasonMaxim Veksler30 Jan
  • Re: Replication breaks without apparent reasonEber Duarte31 Jan
    • Re: Replication breaks without apparent reasonMaxim Veksler6 Feb
  • Re: Replication breaks without apparent reasonMark Callaghan31 Jan
    • Re: Replication breaks without apparent reasonMaxim Veksler6 Feb
      • RE: Replication breaks without apparent reasonRick James6 Feb