List: Cluster
From: umapathi b
Date: February 23 2012 12:57pm
Subject: Cluster Failure Handling ..
Hi All,

I have a production cluster running with 2 data nodes, 2 SQL nodes and 1
management node.
I also have a slave of one of the above servers, using the InnoDB plugin,
for data backup purposes,
which is running fine.

One day, while trying to change some parameters related to disk-based tables,
I got an error
and the cluster was not able to restart/recover. In this case, I had to
start the cluster with the --initial
option again and reload/restore the data from the slave. But this took
considerable time (around 2 hours).
I was safe because it was off-peak time, so it did not impact the customers.

How can I handle this kind of complete cluster failure so as to
have no downtime at all,
or at least to recover quickly?

I am sure somebody has faced this kind of issue before.
Any advice/guidance in this regard
would be highly appreciated.

Thank you all in advance.

- Umapathi.
