From: jon
Date: June 28 2007 6:11am
Subject: svn commit - mysqldoc@docsrva: r6939 - trunk/ndbapi
List-Archive: http://lists.mysql.com/commits/29802
Message-Id: <200706280611.l5S6BvEQ020567@docsrva.mysql.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: jstephens
Date: 2007-06-28 08:11:55 +0200 (Thu, 28 Jun 2007)
New Revision: 6939

Log:
More work on Start Phases doc.

Modified:
   trunk/ndbapi/ndb-internals-start-phases.xml
   trunk/ndbapi/start-phases-tmp.txt

Modified: trunk/ndbapi/ndb-internals-start-phases.xml
===================================================================
--- trunk/ndbapi/ndb-internals-start-phases.xml  2007-06-27 23:24:02 UTC (rev 6938)
+++ trunk/ndbapi/ndb-internals-start-phases.xml  2007-06-28 06:11:55 UTC (rev 6939)
Changed blocks: 1, Lines Added: 114, Lines Deleted: 1; 6193 bytes

@@ -408,8 +408,121 @@
 <literal>STTOR</literal> Phase 1
-
+
+ This is one of the phases in which most kernel blocks participate
+ (see the table in
+ ). Otherwise,
+ most blocks are involved primarily in the initialization of data;
+ for example, this is all that DBTC
+ does.
+
+
+ Many blocks initialize references to other blocks in Phase 1.
+ DBLQH initializes block references to
+ DBTUP, and DBACC initializes
+ block references to DBTUP and
+ DBLQH. DBTUP initializes
+ references to the blocks DBLQH,
+ TSMAN, and LGMAN.
+
+
+
+ NDBCNTR initializes some variables and sets up
+ block references to DBTUP,
+ DBLQH, DBACC,
+ DBTC, DBDIH, and
+ DBDICT; these are needed in the special
+ startphase handling of these blocks using
+ NDB_STTOR signals, where the bulk of the node
+ startup process actually takes place.
+
+
+
+ If the cluster is configured to lock pages (that is, if the
+ LockPagesInMainMemory configuration parameter
+ has been set), CMVMI handles this locking.
+
+
+
+ The QMGR block calls the
+ initData() method (defined in
+ storage/ndb/src/kernel/blocks/qmgr/QmgrMain.cpp)
+ whose output is handled by all other blocks in the
+ READ_CONFIG_REQ phase (see
+ ).
+ Following these initializations, QMGR sends the
+ DIH_RESTARTREQ signal to
+ DBDIH, which determines whether a proper system
+ file exists; if it does, an initial start is not being performed.
+ After the reception of this signal comes the process of
+ integrating the node among the other data nodes in the cluster,
+ where data nodes enter the cluster one at a time. The first one to
+ enter becomes the master; whenever the master dies, the new master
+ is always the node that has been running for the longest time among
+ those remaining.
+
+
+
+ QMGR sets up timers to ensure that inclusion in
+ the cluster does not take longer than what the cluster's
+ configuration is set to allow (see
+ Controlling
+ Timeouts, Intervals, and Disk Paging for the relevant
+ configuration parameters), after which communication to all other
+ data nodes is established. At this point, a
+ CM_REGREQ signal is sent to all data nodes.
+ Only the president of the cluster responds to this signal; the
+ president allows one node at a time to enter the cluster. If no
+ node responds within 3 seconds then the president becomes the
+ master. If several nodes start up simultaneously, then the node
+ with the lowest node ID becomes president. The president sends
+ CM_REGCONF in response to this signal, but also
+ sends a CM_ADD signal to all nodes that are
+ currently alive.
+
+
+
+ Next, the starting node sends a CM_NODEINFOREQ
+ signal to all currently live data nodes. When these
+ nodes receive that signal, they send a
+ NODE_VERSION_REP signal to all API nodes that
+ have connected to them. Each data node also sends a
+ CM_ACKADD to the president to inform the
+ president that it has heard the CM_NODEINFOREQ
+ signal from the new node. Finally, each of the current data nodes
+ sends the CM_NODEINFOCONF signal in response to
+ the starting node. When the starting node has received all these
+ signals, it also sends the CM_ACKADD signal to
+ the president.
+
+
+
+ When the president has received all of the expected
+ CM_ACKADD signals, it knows that all data nodes
+ (including the newest one to start) have replied to the
+ CM_NODEINFOREQ signal. When the president
+ receives the final CM_ACKADD, it sends a
+ CM_ADD signal to all current data nodes (that
+ is, except for the node that just started). Upon receiving this
+ signal, the existing data nodes enable communication with the new
+ node; they begin sending heartbeats to it and including it in the
+ list of neighbors used by the heartbeat protocol.
+
+
+
+ The start struct is reset, so that it can
+ handle new starting nodes, and then each data node sends a
+ CM_ACKADD to the president, which then sends a
+ CM_ADD to the starting node after all such
+ CM_ACKADD signals have been received. The new
+ node then opens all of its communication channels to the data
+ nodes that were already connected to the cluster; it also sets up
+ its own heartbeat structures and starts sending heartbeats. It
+ also sends a CM_ACKADD message in response to
+ the president.
+
+

Modified: trunk/ndbapi/start-phases-tmp.txt
===================================================================
--- trunk/ndbapi/start-phases-tmp.txt  2007-06-27 23:24:02 UTC (rev 6938)
+++ trunk/ndbapi/start-phases-tmp.txt  2007-06-28 06:11:55 UTC (rev 6939)
Changed blocks: 2, Lines Added: 2, Lines Deleted: 2; 652 bytes

@@ -129,8 +129,6 @@
 PGMAN: Phase 1,3,7 (Phase 7 currently empty)
 RESTORE: Phase 1,3 (Only Phase 1 does any real work)
 
-[***STOP POINT***]
-
 STTOR PHASE 1
 -----------------------
 
@@ -194,6 +192,8 @@
 sending heartbeats. It will also send a response message CM_ACKADD to
 the president.
 
+[***STOP POINT***]
+
 PROTOCOL TO INCLUDE STARTING NODE IN CLUSTER
 
 [CHART GOES HERE when we have one that's legible]
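
Note on the node-inclusion protocol described in the XML hunk: until a legible chart exists, the signal order can be approximated as a flat trace. The sketch below is only an illustration of the prose, not NDB kernel code. The function names elect_president() and include_node() and the (signal, sender, receiver) tuple layout are hypothetical; only the signal names come from the text. It assumes the president is one of the live nodes, and it leaves out the NODE_VERSION_REP signals sent to API nodes as well as all timers.

```python
# Hypothetical sketch of the Phase 1 inclusion handshake; not NDB source code.

def elect_president(starting_node_ids):
    """When several nodes start simultaneously, the lowest node ID wins."""
    return min(starting_node_ids)


def include_node(president, alive, new_node):
    """Return the ordered (signal, sender, receiver) trace for one new node.

    `alive` lists the data nodes already in the cluster and is assumed
    to contain `president`.
    """
    trace = []

    # Registration: the starting node contacts every data node; only the
    # president answers, and it also notifies all nodes currently alive.
    for node in alive:
        trace.append(("CM_REGREQ", new_node, node))
    trace.append(("CM_REGCONF", president, new_node))
    for node in alive:
        trace.append(("CM_ADD", president, node))

    # Node info exchange: each live node acknowledges to the president
    # and confirms to the starting node, which then acknowledges as well.
    for node in alive:
        trace.append(("CM_NODEINFOREQ", new_node, node))
    for node in alive:
        trace.append(("CM_ACKADD", node, president))
        trace.append(("CM_NODEINFOCONF", node, new_node))
    trace.append(("CM_ACKADD", new_node, president))

    # Final round: existing nodes enable communication and heartbeats,
    # acknowledge, and only then is the starting node told to join.
    for node in alive:
        trace.append(("CM_ADD", president, node))
    for node in alive:
        trace.append(("CM_ACKADD", node, president))
    trace.append(("CM_ADD", president, new_node))
    trace.append(("CM_ACKADD", new_node, president))
    return trace
```

Calling include_node(1, [1, 2, 3], 4) yields the trace for node 4 joining a three-node cluster whose president is node 1; printing it one tuple per line gives a rough text stand-in for the missing chart.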