List: Cluster
From: Matteo Brancaleoni
Date: June 21 2004 2:31pm
Subject: Re: DB node hang on start
Hi.

I was able to run 2 DB nodes on the same machine.
The problem was in the [COMPUTER]
definitions. Following the demos, I thought that
I needed 2 [COMPUTER] definitions, even though both point
to the same machine, with DB node 1 running
on computer 1 and DB node 2 running on computer 2
(which is the same host).

Simply removing the 2nd computer entry and
letting DB node #2 run on computer 1 (like the first
DB node) works OK.
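
Reconstructed from the config quoted at the bottom of this mail, the working single-machine setup would presumably look like the sketch below. It applies only the two changes just described: the second [COMPUTER] section is dropped, and the DB node with Id 3 gets ExecuteOnComputer: 1. Everything else is copied unchanged, on the assumption that nothing else had to differ. The # line is an annotation for this mail only; it assumes # is accepted as a comment marker, so remove it if the parser complains.

```ini
[COMPUTER]
Id: 1
ByteOrder: Little
HostName: bestia
[MGM]
Id: 1
ExecuteOnComputer: 1
ArbitrationRank: 1
[DB DEFAULT]
NoOfReplicas: 2
[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /root/ndb/ndb_data1
[DB]
Id: 3
# assumption: the second DB node now refers to computer 1 instead of computer 2
ExecuteOnComputer: 1
FileSystemPath: /root/ndb/ndb_data2
[API]
Id: 4
ExecuteOnComputer: 1
ArbitrationRank: 1
```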

So far so good.

But I still have the problem of running the 2nd DB node
on another machine... still no joy.

Matteo

On Mon, 2004-06-21 at 17:34, Tomas Ulin wrote:
> but, I saw the below.  It shows that you did not start cluster empty (-i).
> 
> T
> 
> 2004-06-20 16:49:34 [NDB] INFO     -- Angel pid: 5558 ndb pid: 5560
> 2004-06-20 16:49:34 [NDB] INFO     -- NDB Cluster -- DB node 2
> 2004-06-20 16:49:34 [NDB] INFO     -- Version 3.5.0 (beta) --
> 2004-06-20 16:49:34 [NDB] INFO     -- Start initiated (version 3.5.0)
> Dbdict: name=sys/def/SYSTAB_0,id=0
> Dbdict: name=sys/def/NDB$EVENTS_0,id=2
> Dbdict: name=test/def/matteotabella2,id=4
> Dbdict: name=test/def/4/PRIMARY,id=6
> Dbdict: name=test/def/matteo,id=8
> Dbdict: name=test/def/8/PRIMARY,id=10
> Dbdict: name=test/def/mytabella,id=12
> Dbdict: name=test/def/12/PRIMARY,id=14
> 2004-06-20 16:50:12 [NDB] INFO     -- Started (version 3.5.0)
> 
> 
> 
> Tomas Ulin wrote:
> 
> >
> > when going from 1-node to 2-nodes, did you restart both nodes with -i 
> > flag?
> >
> > T
> >
> > Matteo Brancaleoni wrote:
> >
> >> Hi
> >>
> >> On Mon, 2004-06-21 at 13:45, Tomas Ulin wrote:
> >>  
> >>
> >>> Did you try to start the second node with "ndbd -i"?
> >>>   
> >>
> >>
> >> yes, without success.
> >>
> >>  
> >>
> >>> Brancaleoni Matteo wrote:
> >>>
> >>>   
> >>>
> >>>> Hi, thanks for the fast answer :)
> >>>> see my comments inline.
> >>>>
> >>>> On Mon, 2004-06-21 at 00:43, Tomas Ulin wrote:
> >>>>
> >>>>
> >>>>     
> >>>>
> >>>>> first of all, if you download the latest source you don't have to
> >>>>> specify the "[TCP]" connections at all
> >>>>>  
> >>>>>       
> >>>>
> >>>> Ok, done.
> >>>>
> >>>>
> >>>>
> >>>>     
> >>>>
> >>>>> 1) please look where you started ndb_mgmd, you should find a 
> >>>>> cluster.log (look at the end "tail -n100 cluster.log")
> >>>>>  
> >>>>>       
> >>>>
> >>>> ok, got it. Unfortunately there is no trace of DB node #3, which is
> >>>> the one on the remote machine
> >>>>
> >>>>
> >>>>
> >>>>     
> >>>>
> >>>>> 2) please make sure that you don't have any trailing "ndbd" 
> >>>>> processes on the failing machine. (we're working on better 
> >>>>> detection on clashes), if so kill and restart  (if a "ndb" process
> >>>>> hangs this is often due to that there are "multiple" processes 
> >>>>> trying to connect as the same "id")
> >>>>>  
> >>>>>       
> >>>>
> >>>> ok. no trailing processes.
> >>>>
> >>>>
> >>>>
> >>>>     
> >>>>
> >>>>> 3) make sure you have your [COMPUTER] sections correct in the 
> >>>>> config file
> >>>>>  
> >>>>>       
> >>>>
> >>>> ok, done
> >>>>
> >>>>
> >>>>
> >>>>     
> >>>>
> >>>>> 4) make sure that your Ndb.cfg/NDB_CONNECTSTRING points to the 
> >>>>> actual host:port that run the ndb_mgmd
> >>>>>  
> >>>>>       
> >>>>
> >>>> sure, done.
> >>>> If I write something wrong there (tried it just for testing), the node
> >>>> doesn't enter the starting phase at all (should be phase 1, I think).
> >>>> But when it does start, it gets stuck in that state.
> >>>>
> >>>>
> >>>>
> >>>>     
> >>>>
> >>>>> and try again until you get the config right
> >>>>>  
> >>>>>       
> >>>>
> >>>> mmh... I tried to start 2 DB nodes on the same machine
> >>>> (with different file system paths, of course); the 2nd DB node starts,
> >>>> but crashes after phase #4.
> >>>>
> >>>> I have a rather long trace file for that.
> >>>> The error in the ndbd error.log is:
> >>>>
> >>>> Date/Time: x 20 June 2004 - 23:15:49
> >>>> Type of error: error
> >>>> Message: Internal program error (failed ndbrequire)
> >>>> Fault ID: 2341
> >>>> Problem data: DbdihMain.cpp
> >>>> Object of reference: DBDIH (Line: 1080) 0x00000002
> >>>> ProgramName: NDB Kernel
> >>>> ProcessID: 10904
> >>>> TraceFile: NDB_TraceFile_1.trace
> >>>> ***EOM***
> >>>>
> >>>>
> >>>> The mgm config is (for 2 db nodes on same machine)
> >>>> [COMPUTER]
> >>>> Id: 1
> >>>> ByteOrder: Little
> >>>> HostName: bestia
> >>>> [COMPUTER]
> >>>> Id: 2
> >>>> ByteOrder: Little
> >>>> HostName: bestia
> >>>> [MGM]
> >>>> Id: 1
> >>>> ExecuteOnComputer: 1
> >>>> ArbitrationRank: 1
> >>>> [DB DEFAULT]
> >>>> NoOfReplicas: 2
> >>>> [DB]
> >>>> Id: 2
> >>>> ExecuteOnComputer: 1
> >>>> FileSystemPath: /root/ndb/ndb_data1
> >>>> [DB]
> >>>> Id: 3
> >>>> ExecuteOnComputer: 2
> >>>> FileSystemPath: /root/ndb/ndb_data2
> >>>> [API]
> >>>> Id: 4
> >>>> ExecuteOnComputer: 1
> >>>> ArbitrationRank: 1
> >>>>
> >>>> Regarding 2 DB nodes on different machines, I'm stuck
> >>>> with node #3 not starting (it stops at phase 1, without
> >>>> exiting...)
> >>>> The only difference in the mgm config.ini is the hostname
> >>>> of the COMPUTER with id #2
> >>>>
> >>>> any clue?
> >>>>
> >>>>
> >>>>
> >>>>     
> >>>
> >
> >
-- 
Matteo Brancaleoni <mbrancaleoni@stripped>
Espia - Emmegi Srl
