On 01/27/2011 06:33 PM, Wechsler, Steven wrote:
> I've been reading up on circular replication and I think I have a decent
> grasp of the concepts. However, what I haven't been able to figure out
> is how to recover from a node failure without reloading all the other
> nodes from an arbitrary master, since there seems to be no way to keep
> binary logs in sync.
> Example 4 node circle: A->B->C->D->A. Let's assume B fails.
> Since all transactions in the binary logs are timestamped, is MySQL
> smart enough not to re-run any transactions; that is, if, when I alter
> the slave on C to read from A (now that B is no longer available), I
> set the binlog number to some earlier position (say 1 hour earlier),
> will it skip everything up to the point where it lost the connection?
No, sorry. The slave does not use timestamps to detect events it has
already applied; it will re-apply every event from the position you
give it.
> The alternative would be to write a Perl script to read the binlogs on
> A, and the relay logs on C, then determine the correct position in both
> and alter the slave IO thread on C accordingly, but it seems like there
> should already be something in place that can do that automatically. Is
Yes, this is the approach that you would need to use. There is a
procedure in the book we wrote (MySQL High Availability) which
outlines how it can be done, but it basically boils down to finding
the binlog position on A that corresponds to the last event you have
in the relay log on C.
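
To illustrate the matching step, here is a rough Python sketch. It is
not the procedure from the book. It assumes you have already decoded
the events (for example from mysqlbinlog output) into simple records;
the Event type and find_resume_position are made-up names. Since there
is no unique transaction identifier, it matches on originating server
ID, timestamp and statement text, and gives up if the match is missing
or ambiguous.

```python
from typing import NamedTuple, Optional, Sequence


class Event(NamedTuple):
    """Hypothetical record for one decoded binlog or relay log event."""
    server_id: int    # server the event originated on
    timestamp: int    # timestamp stored in the event header
    end_log_pos: int  # end position in the log the event was read from
    statement: str


def find_resume_position(master_events: Sequence[Event],
                         last_applied: Event) -> Optional[int]:
    """Return the position on the new master just after last_applied.

    Log positions differ between servers, so events are compared on
    (server_id, timestamp, statement) only. Returns None when there is
    no match or more than one, because guessing would either skip or
    re-apply events.
    """
    key = (last_applied.server_id, last_applied.timestamp,
           last_applied.statement)
    matches = [e.end_log_pos for e in master_events
               if (e.server_id, e.timestamp, e.statement) == key]
    if len(matches) != 1:
        return None
    return matches[0]
```

If this returns a position, that is the value you would try as
MASTER_LOG_POS when pointing C at A. If it returns None, the last
event on C may not have reached A yet, or it cannot be told apart
from another event, and you need to look at the logs by hand.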
Just note one thing: you have to let C execute everything in its
relay logs before switching over to the new master. It is not easy to
"append" events to the relay log since there is no support for that
yet.
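
One way to check that the relay logs have run to completion is to
compare the read and executed coordinates in SHOW SLAVE STATUS. The
sketch below assumes you have fetched one row of that output into a
dict keyed by column name; relay_log_drained is a made-up helper name,
and how you fetch the row depends on your client library.

```python
def relay_log_drained(status: dict) -> bool:
    """Return True if the SQL thread has executed all events read so far.

    `status` is assumed to be one row of SHOW SLAVE STATUS as a dict.
    The I/O thread's coordinates (Master_Log_File, Read_Master_Log_Pos)
    are compared with the SQL thread's coordinates
    (Relay_Master_Log_File, Exec_Master_Log_Pos).
    """
    same_file = status["Master_Log_File"] == status["Relay_Master_Log_File"]
    # Some client libraries return positions as strings, so normalise.
    same_pos = (int(status["Read_Master_Log_Pos"])
                == int(status["Exec_Master_Log_Pos"]))
    return same_file and same_pos
```

You would poll this on C after B has gone away and only change the
master once it returns True.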
To make it easier to find out where to re-connect, you would need at
least the global transaction IDs that Google have in their patch,
plus some extensions to keep track of transactions from multiple
servers.
Just my few cents,