I am trying to visualise a geographically distributed MySQL cluster and hope
the list can confirm/correct/comment on my reasoning:
Take two data centres (DC1 and DC2), connected together by dark fibre with a
MySQL cluster distributed across them such that DC1 has one storage node
(SN1) and DC2 has another (SN2) with the cluster set to 2 replicas. The
reasoning I hope the list can verify is: in this configuration the
management node (MGM1) cannot be located in either DC1 or DC2.
If I have understood the mechanisms involved, this is because, should MGM1
be located in DC1 and DC1 suffer a power outage, SN2 (which still has power
in DC2) would recognise that it does not hold a majority of the storage
nodes and, being unable to contact an arbitrator, would shut itself down.
Similar logic applies if MGM1 is in DC2 and DC2 suffers a power outage.
This is correct behaviour because of another scenario which would look the
same from SN2's point of view: if the dark fibre between the data centres
had been cut while both DC1 and DC2 still had power, SN2 must not continue
to operate, or else SN1 and SN2 would each accept writes independently and
their data would diverge.
Therefore, if MGM1 is in DC1 then a power cut in DC1 will cause the whole
cluster to fail, despite the fact DC2 still has power.
So, MGM1 must be in a third data centre to which SN1 and SN2 have
connectivity separately from the dark fibre. This way, in the event of any
single failure (DC1 <-> DC2 fibre cut, or power outage at either data
centre) the cluster continues to operate correctly as MGM1 casts the
deciding vote as to who is in the majority.
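
To make the reasoning above concrete, here is a minimal sketch of the rule as
I picture it. It is a toy model under my own assumptions, not the actual
arbitration code in MySQL Cluster: a storage node keeps running if it can see
more than half of the storage nodes, or if it sees exactly half and the
arbitrator sides with its partition; the arbitrator sides with only one
partition. The names DATA_NODES, reachable and survivors, and the site label
"DC3" for the third data centre, are hypothetical.

```python
# Toy model of the majority/arbitrator rule; not MySQL Cluster's real logic.
# Maps each storage node to the data centre that hosts it.
DATA_NODES = {"SN1": "DC1", "SN2": "DC2"}


def reachable(site_a, site_b, down_sites, cut_links):
    """True if both sites have power and the link between them is intact."""
    if site_a in down_sites or site_b in down_sites:
        return False
    if site_a == site_b:
        return True
    return frozenset((site_a, site_b)) not in cut_links


def survivors(mgm_site, down_sites=(), cut_links=()):
    """Return the storage nodes still running after the given failures."""
    down = set(down_sites)
    cuts = {frozenset(link) for link in cut_links}
    running = []
    granted = None  # the partition the arbitrator has sided with, if any
    for node, site in DATA_NODES.items():
        if site in down:
            continue  # no power, so the node is gone
        # Storage nodes this node can still see, itself included.
        visible = frozenset(
            n for n, s in DATA_NODES.items() if reachable(site, s, down, cuts)
        )
        if 2 * len(visible) > len(DATA_NODES):
            running.append(node)  # clear majority, no arbitration needed
        elif 2 * len(visible) == len(DATA_NODES) and reachable(
            site, mgm_site, down, cuts
        ):
            # Exactly half: the arbitrator picks the first partition that asks
            # and refuses every other partition.
            if granted is None:
                granted = visible
            if granted == visible:
                running.append(node)
        # Otherwise the node shuts itself down.
    return running


if __name__ == "__main__":
    fibre = [("DC1", "DC2")]
    for mgm_site in ("DC1", "DC3"):
        print(mgm_site, "DC1 power outage:", survivors(mgm_site, down_sites=["DC1"]))
        print(mgm_site, "DC2 power outage:", survivors(mgm_site, down_sites=["DC2"]))
        print(mgm_site, "fibre cut:", survivors(mgm_site, cut_links=fibre))
```

In this model, placing MGM1 in DC1 leaves no survivor after a DC1 power
outage, while placing it in the hypothetical DC3 leaves one storage node
running after each of the three single failures.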
Is that a correct interpretation of the mechanisms involved in MySQL
clusters? If MGM1 were at a third data centre, can anyone think of a single
event which would cause cluster failure? Can the same reliability be
achieved with only two data centres?
One last question, if the above is accurate: the connection between SN{1,2}
and MGM1 will not be constantly used and doesn't need to be terribly fast by
comparison to the SN1 <-> SN2 link which should be as fast as possible - is
that correct?
Thanks,
Louis Aslett
| Thread |
|---|
| • Geographically Distributed Cluster | Louis Aslett | 30 Nov |