Chad Martin wrote:
> This is how I understand it given my reading and attendance at the
> last MySQL User's Conference. If I'm wrong on any point, the experts
> can correct me.
> Phil wrote:
>> 1) When i create a table of type NDBCLUSTER where is that information
>> stored? On all nodes? On one node? Split up between the nodes?
>> Say i had 4 nodes and two api's. Do the APIs talk to every single node?
>> Just the management server? All nodes and the MGM?
> "...that information..." is pretty vague. Let me give you the long
> answer. All nodes talk to the management server, but there's not a
> lot of communication. The management server is mostly for exceptions
> to the norm. Start up, shutdown, node failure resolution, etc. All
> the usual dirty work is done between the API and NDB nodes. The
> tables reside on the NDB node collectively. Any given replica is
> distributed row-wise across all the nodes dealing with that replica.
> When you connect to an API and send an SQL statement, it needs to talk
> to all the NDB nodes to construct a response. Each API node works
> independently from each other API node.
Some complementary info:
The MySQL server will hold an .frm file as usual.
The ndbd nodes will also store that info (so that if another MySQL
server connects it will get the "same" .frm file)
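For example, creating an NDB table looks like any other CREATE TABLE; only the engine clause differs. (The table name and columns below are just an illustration.)

```sql
-- Hypothetical example: issued on any connected mysqld (SQL node).
-- The .frm is written locally and also stored in the ndbd nodes,
-- so other SQL nodes that connect later see the same definition.
CREATE TABLE t1 (
    id  INT NOT NULL PRIMARY KEY,
    val VARCHAR(32)
) ENGINE=NDBCLUSTER;
```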
The MySQL server never talks to ndb_mgmd during a query.
mysqld's (API's) never talk to other mysqld's (except indirectly via the
ndbd nodes, sharing data and .frm files)
During a transaction one ndbd node will be chosen as the transaction
coordinator (TC). All communication from the mysqld goes via that TC for
that transaction. The next transaction may or may not end up on the same
TC. Delivery of the actual data involved in the query to the mysqld may
come from any ndbd node (depending on the query and where the data actually
resides).
>> When i do an insert query on an NDBCluster how does the insert work?
>> Does it put it on one node or split it up and distribute it across all
> The rows inserted are distributed evenly within a replica, and copied
> across replicas.
>> 2) What happens if a node crashes in the middle of a query?
> The management server notices the problem, and moves the work to a
> different replica. It's coded intelligently enough such that the
> query will not fail as long as you have a consistent network and
> there's at least one copy of all the data somewhere in the cluster.
Almost correct. The management server is actually not needed during node
failure, except to act as the arbitrator for network partitioning, which
means that in a 2-node setup it is needed during node failure (cf. the
ArbitrationRank config parameter). It is needed, though, to be able to
start the node again.
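A sketch of the relevant config.ini fragment, assuming a management node named mgm1 (the hostname and value are illustrative):

```ini
# Hypothetical config.ini fragment: with only two data nodes (one
# replica pair), the management server is the arbitrator that decides
# which half of a partitioned cluster survives.
[NDB_MGMD]
HostName=mgm1
ArbitrationRank=1   # 1 = preferred arbitrator; 0 would disable arbitration
```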
If an ndbd node which is not the TC fails, the query/transaction will continue.
If the TC fails during a query/transaction, it depends on how far the
transaction has gotten. Normally the transaction will be aborted; the
mysqld/user is notified and will have to do the transaction over again.
If, however, the crash occurs late enough that the commit has started
but not completed, transaction coordination will be taken over by
another ndbd node (TC takeover), and the transaction is completed.
>> 3) Using ndb_mgm if i do a <id> stop how can i bring that node back up
>> without explicitly logging on to that machine and running the process
> You can't, AFAIK.
You may use <id> restart -n
which will "restart" the node but not actually start it. Then you can do
<id> start once you want it up and running again
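A concrete (hypothetical) ndb_mgm session, with node id 2 standing in for <id>:

```
ndb_mgm> 2 restart -n    # restart node 2 but hold it in the "not started" state
ndb_mgm> 2 start         # later, bring it fully online again
```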
>> 4) Does the management server do all the work? Does it do the
>> queries/handle nodes distribution or is that all done at the node level?
> I think I covered this above. The number of management servers is not
> a scalability issue. Adding another management server will not
> increase performance, but it will allow the cluster to continue
> running when one of them crashes. All the real work is done between
> the API and NDB nodes.
> Chad Martin
> Arete Studios, Inc.