From: Brancaleoni Matteo
Date: June 20 2004 9:25pm
Subject: Re: DB node hang on start
List-Archive: http://lists.mysql.com/cluster/15
Message-Id: <1087766707.15782.11.camel@athlon>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

Hi,

thanks for the fast answer :)
See my comments inline.

On Mon, 2004-06-21 at 00:43, Tomas Ulin wrote:
> first of all, if you download the latest source you don't have to
> specify the "[TCP]" connections at all

OK, done.

> 1) please look where you started ndb_mgmd, you should find a cluster.log
> (look at the end "tail -n100 cluster.log")

OK, got it. Unfortunately there is no trace of db node #3, which is the one on the remote machine.

> 2) please make sure that you don't have any trailing "ndbd" processes on
> the failing machine. (we're working on better detection on clashes), if
> so kill and restart (if a "ndb" process hangs this is often due to that
> there are "multiple" processes trying to connect as the same "id")

OK, no trailing processes.

> 3) make sure you have your [COMPUTER] sections correct in the config file

OK, done.

> 4) make sure that your Ndb.cfg/NDB_CONNECTSTRING points to the actual
> host:port that run the ndb_mgmd

Sure, done. If I deliberately write something wrong there (just for testing), the node doesn't enter the starting phase at all (that should be phase 1, I think). But when it does start, it stays stuck in that state.

> and try again until you get the config right

Hmm... I tried to start 2 db nodes on the same machine (with different file system paths, of course); the 2nd db node starts, but crashes after phase #4. I have a rather long trace file for that.
The error in the ndbd error.log is:

Date/Time: x 20 June 2004 - 23:15:49
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: DbdihMain.cpp
Object of reference: DBDIH (Line: 1080) 0x00000002
ProgramName: NDB Kernel
ProcessID: 10904
TraceFile: NDB_TraceFile_1.trace
***EOM***

The mgm config (for 2 db nodes on the same machine) is:

[COMPUTER]
Id: 1
ByteOrder: Little
HostName: bestia

[COMPUTER]
Id: 2
ByteOrder: Little
HostName: bestia

[MGM]
Id: 1
ExecuteOnComputer: 1
ArbitrationRank: 1

[DB DEFAULT]
NoOfReplicas: 2

[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /root/ndb/ndb_data1

[DB]
Id: 3
ExecuteOnComputer: 2
FileSystemPath: /root/ndb/ndb_data2

[API]
Id: 4
ExecuteOnComputer: 1
ArbitrationRank: 1

Regarding 2 db nodes on different machines, I'm stuck with node #3 not starting (it stops at phase 1, without exiting...). The only difference in the mgm config.ini is the HostName of the COMPUTER with Id 2.

Any clue?

-- 
Brancaleoni Matteo
Espia Srl
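
The "[COMPUTER] sections correct" check from point 3 can be roughed out as a small script. This is only a minimal sketch and not an NDB tool: it assumes the "[SECTION]" plus "Key: Value" layout of the config quoted above, and the names CONFIG, parse_config and check_computers are hypothetical, chosen for illustration. It only cross-checks that node Ids are unique and that every ExecuteOnComputer refers to a defined [COMPUTER] Id; it says nothing about why a node hangs in phase 1.

```python
# Config text copied from the message above (2 db nodes on the same machine).
CONFIG = """\
[COMPUTER]
Id: 1
ByteOrder: Little
HostName: bestia
[COMPUTER]
Id: 2
ByteOrder: Little
HostName: bestia
[MGM]
Id: 1
ExecuteOnComputer: 1
ArbitrationRank: 1
[DB DEFAULT]
NoOfReplicas: 2
[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /root/ndb/ndb_data1
[DB]
Id: 3
ExecuteOnComputer: 2
FileSystemPath: /root/ndb/ndb_data2
[API]
Id: 4
ExecuteOnComputer: 1
ArbitrationRank: 1
"""


def parse_config(text):
    """Return a list of (section_name, fields) pairs.

    A list is used instead of a dict because section names such as
    [COMPUTER] and [DB] repeat.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip(), {}))
        elif ":" in line and sections:
            key, value = line.split(":", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_computers(sections):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    computers = {}
    for name, fields in sections:
        if name == "COMPUTER":
            cid = fields.get("Id")
            if cid in computers:
                problems.append(f"duplicate COMPUTER Id {cid}")
            computers[cid] = fields.get("HostName")

    seen_nodes = set()
    for name, fields in sections:
        # "[DB DEFAULT]" is skipped: it carries defaults, not a node.
        if name in ("MGM", "DB", "API"):
            nid = fields.get("Id")
            if nid in seen_nodes:
                problems.append(f"duplicate node Id {nid}")
            seen_nodes.add(nid)
            ref = fields.get("ExecuteOnComputer")
            if ref not in computers:
                problems.append(
                    f"[{name}] Id {nid} references undefined COMPUTER {ref}"
                )
    return problems


if __name__ == "__main__":
    for problem in check_computers(parse_config(CONFIG)):
        print(problem)
```

Two [COMPUTER] entries with the same HostName are deliberately not flagged, since that is exactly the "2 db nodes on the same machine" setup described in the message.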