List:Replication« Previous MessageNext Message »
From:Mark Callaghan Date:January 31 2008 3:25pm
Subject:Re: Replication breaks without apparent reason
View as plain text  
The things to rule out are:
* does your app set 'sql_log_bin' to avoid logging some statements to the binlog
* was bin_log set on the master for the entire test
* do you have any replication filters in the master or slave my.cnf
that would prevent the master from writing or the slave from executing
the statement. From what you have listed below, that does not appear
to be the case.
* did the slave crash during the test
* is the problem table a temp table, as in CREATE TEMPORARY TABLE ...

The replication state update sequence on a slave is:
1) run all statements for a tranaction
2) update relay-log.info
When a slave crashes between 1) and 2), then it will replay the
transaction on restart. If the transaction were for DROP TABLE, then
on the replay, the table will not exist and the slave will stop

If you are a stickler for consistency between your master and slave,
then make sure that CREATE TEMP TABLE statements are not int the
binlog. If you care even more, then only use InnoDB tables (or Falcon
and Maria when they are ready) on the slave and master so that
rollback is done consistently between master and slave.

On Jan 30, 2008 6:08 AM, Maxim Veksler <maxim.veksler@stripped> wrote:
> Hello List,
>
> We're experiencing strange replication failures on some SQL actions,
> so far we haven't been able to reproduce this failures in a clear
> scenario.
>
> Our setup is as following:
>
> We have 2 Nodes (rnd-dev1, qa-dev1), both are running Red Hat 5 el,
> kernel version 2.6.18-53.1.4.el5PAE.
> MySQL version is 5.0.45. The machines are in the same netmask and no
> firewall is configured between them.
>
> Replication is configured with master-master replication.
> my.cnf of both nodes:
>
> rnd-dev1 :
> """
> [mysqld]
> # binary logs are what makes replication tic with master-slave(s) setup.
> log-bin
> expire_logs_days=14
>
> # relay logs are used by slave to store statements before execution.
> relay-log=ml-relay-log
> relay-log-index=ml-relay-log-index
>
> # Unique on each node !
> server-id=115
> auto_increment_increment=2
> auto_increment_offset=2
>
>
> # General system logs
> log=/var/log/COMPANY/MySQL.log
> log-error=/var/log/COMPANY/MySQL_error.log
> log-slow-queries=/var/log/COMPANY/MySQL_slow-queries.log
> log-warnings=2
>
>
> # mysql
> replicate-wild-ignore-table=mysql.%
>
> # scheme1
> replicate-wild-do-table=scheme1.%
>
> # scheme2
> replicate-wild-do-table=scheme2.tbl1
> replicate-wild-do-table=scheme2.tbl2
> replicate-wild-ignore-table=scheme2.tbl3
> replicate-wild-ignore-table=scheme2.tbl4
> replicate-wild-ignore-table=scheme2.tbl5
> replicate-wild-ignore-table=scheme2.tbl6
> replicate-wild-ignore-table=scheme2.tbl7
> """
>
> qa-dev1:
> """
> [mysqld]
> # binary logs are what makes replication tic with master-slave(s) setup.
> log-bin
> expire_logs_days=14
>
> # relay logs are used by slave to store statements before execution.
> relay-log=ml-relay-log
> relay-log-index=ml-relay-log-index
>
> # Unique on each node !
> server-id=114
> auto_increment_increment=2
> auto_increment_offset=1
>
>
> # General system logs
> log=/var/log/COMPANY/MySQL.log
> log-error=/var/log/COMPANY/MySQL_error.log
> log-slow-queries=/var/log/COMPANY/MySQL_slow-queries.log
> log-warnings=2
>
>
> # mysql
> replicate-wild-ignore-table=mysql.%
>
> # scheme1
> replicate-wild-do-table=scheme1.%
>
> # scheme2
> replicate-wild-do-table=scheme2.tbl1
> replicate-wild-do-table=scheme2.tbl2
> replicate-wild-ignore-table=scheme2.tbl3
> replicate-wild-ignore-table=scheme2.tbl4
> replicate-wild-ignore-table=scheme2.tbl5
> replicate-wild-ignore-table=scheme2.tbl6
> replicate-wild-ignore-table=scheme2.tbl7
> """
>
>
> Under normal conditions we have replication working fine, tested under
> load of INSERT, UPDATE and co.
>
> We suspect that it breaks when we do a procedure that involves
> dropping of temporary tables. This process of ours does
> 1. DROP tmpTblA
> 2. CREATE tmpTblA
> 3. INSERT ~380 records per second into tmpTblA during ~60 seconds.
> 4. SELECT INTO tmpTblB small fragments from tmpTblA
> 5. SELECT FROM tmpTblB INSERT INTO tblA
>
> The differences between this procedure and normal operation being in
> the fact here DROP TABLE and CREATE TABLE commands are issued.
> We notice the failure because a DROP TABLE statement replicated from
> rnd-dev1 (Where it executed OK) to qa-dev1 where this table was not
> created so the DROP TABLE failed and replication stopped.
>
> Running this procedure again does not reproduce the replication failure.
>
>
> The difference in the bin-log before and after the procedure is ~20mb:
> Before:
> -rw-rw----  1 mysql mysql  270139448 Jan 30 14:04 rnd-dev1-bin.000047
> After:
> -rw-rw----  1 mysql mysql  291102580 Jan 30 15:26 rnd-dev1-bin.000047
>
>
> Trying to debug this further I'm suspecting that for some reason some
> commands or "command blocks" are not being replicated, which leads me
> to think that we notice this now only because here a DROP is issued
> which causes a replication error while in other times command were
> just skipped.
>
>
>
> --
> Cheers,
> Maxim Veksler
>
> "Free as in Freedom" - Do u GNU ?
>
> --
> MySQL Replication Mailing List
> For list archives: http://lists.mysql.com/replication
> To unsubscribe:    http://lists.mysql.com/replication?unsub=1
>
>



-- 
Mark Callaghan
mcallaghan@stripped
Thread
Replication breaks without apparent reasonMaxim Veksler30 Jan
  • Re: Replication breaks without apparent reasonEber Duarte31 Jan
    • Re: Replication breaks without apparent reasonMaxim Veksler6 Feb
  • Re: Replication breaks without apparent reasonMark Callaghan31 Jan
    • Re: Replication breaks without apparent reasonMaxim Veksler6 Feb
      • RE: Replication breaks without apparent reasonRick James6 Feb