Ken Linder wrote:
> Hi all,
> I have been researching the possibility of clustering four
> geographically separate locations with MySQL and using clustering to
> keep them all in synch. The master will be in Las Vegas or Chicago,
> with the other nodes in Los Angeles, Dallas and New York. There is
> also the possibility of having a fifth node in the UK within two
> One node would be the 'master' that our back-end processes would talk
> to. That data, through clustering, would be available to the other
> three nodes. These other three nodes can also modify data that must
> be available on the other nodes. Now, the questions.
There are actually 3 distinct types of nodes in a MySQL Cluster:
1. Data nodes actually store the data.
2. API nodes provide access to the data stored on the data nodes. (A
MySQL server in a Cluster acts as an API node, and is often referred to
as an SQL node.) It is also possible to write your own API node programs
using the NDB API.
3. Management nodes provide management services - they provide
configuration info to the data nodes, and allow for monitoring of the
A minimal MySQL Cluster that's viable in production consists of 4
computers: 2 of them each running a data node, one running a MySQL
Server (SQL node), and one running a Cluster management server. The
reason for having multiple data nodes is to provide high availability -
if one data node goes down, the other(s) continue to store the data.
(You can have many more than 2 data nodes - currently we support up to
48 data nodes in a single Cluster.)
> 1) As I understand it, clustering is transaction based. So,
> say a program attaches to node A, begins a transaction, updates 100
> records and closes the transaction. Is control returned to the
> requesting program when the transaction is complete on A, or is
> control returned when all nodes are updated?
I'm a little hazy on this at the moment, so I'll invite one of our
developers or consultants to chime in - this would be an excellent
question to add to the FAQs, BTW - thanks!
> 2) If node B starts experiencing high communication latency
> with the other nodes for a period of time, will that effect the
> performance of the other nodes?
You need dedicated, high-speed, low-latency bandwidth between MySQL
Cluster data nodes. Interruptions or slowdowns will likely cause data
nodes to be dropped from the cluster. So geographical separation between
data nodes is pretty much a no-go. (These demands aren't quite so
critical with regard to SQL/API nodes, IIRC.)
However, you could use separate MySQL Clusters in different locations
and replicate between them. We refer to this as MySQL Cluster
Replication or sometimes 'Geo-Replication' - see
> 3) If I choose to have one MySQL server initially, how
> difficult would it be to roll that into a cluster later as the entire
> system matures?
It's not difficult to add more MySQL servers (SQL nodes) to a running
MySQL Cluster. Currently, we support up to 255 nodes total in a MySQL
Cluster, and those which aren't data or management nodes can be SQL or
other API nodes.
Currently, you can't add new data nodes to a MySQL Cluster without
shutting it down completely and restarting it, but we're working on
enabling online adding of data nodes in the 6.4 series, which we hope to
have available in the first half of 2009.
> I am new to MySQL clustering and distributed SQL systems, so a lot of
> the terminology in the White Papers and FAQs is quite new to me.
Have you looked at any of the material in the MySQL Manual? I'd
recommend starting with these pages:
Hopefully you'll find some of this helpful, and we look forward to
seeing more questions from you if you have any.
> TIA, all. -Ken Linder ANINetworks.com
Jon Stephens - jon@stripped
Technical Writer - MySQL Documentation Team
Sun Microsystems AB - Database Technology Group
Liljeholmen, Stockholm, Sweden
(Summer: UTC +02.00 / Winter: UTC +01.00)
Mobile: +46 (0) 736 773 993