List: Cluster
From: Tomas Ulin
Date: June 21 2004 3:34pm
Subject: Re: DB node hang on start
But I saw the log below. It shows that you did not start the cluster empty (-i).

T

2004-06-20 16:49:34 [NDB] INFO     -- Angel pid: 5558 ndb pid: 5560
2004-06-20 16:49:34 [NDB] INFO     -- NDB Cluster -- DB node 2
2004-06-20 16:49:34 [NDB] INFO     -- Version 3.5.0 (beta) --
2004-06-20 16:49:34 [NDB] INFO     -- Start initiated (version 3.5.0)
Dbdict: name=sys/def/SYSTAB_0,id=0
Dbdict: name=sys/def/NDB$EVENTS_0,id=2
Dbdict: name=test/def/matteotabella2,id=4
Dbdict: name=test/def/4/PRIMARY,id=6
Dbdict: name=test/def/matteo,id=8
Dbdict: name=test/def/8/PRIMARY,id=10
Dbdict: name=test/def/mytabella,id=12
Dbdict: name=test/def/12/PRIMARY,id=14
2004-06-20 16:50:12 [NDB] INFO     -- Started (version 3.5.0)
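
For anyone who wants to check a node log for the same symptom, here is a minimal sketch in Python. It assumes that "Dbdict: name=...,id=..." lines printed during start list dictionary objects already present on the node, and that names beginning with "sys/" are system objects; both are inferences from the log above, not documented behaviour. The helper name user_objects is hypothetical.

```python
import re

# Matches lines such as "Dbdict: name=test/def/matteo,id=8"
DBDICT_LINE = re.compile(r"^Dbdict: name=(?P<name>[^,]+),id=(?P<id>\d+)$")


def user_objects(log_text):
    """Return (name, id) pairs for Dbdict entries outside the sys/ namespace.

    Assumption: a node that was started empty would list no such entries.
    """
    found = []
    for line in log_text.splitlines():
        match = DBDICT_LINE.match(line.strip())
        if match and not match.group("name").startswith("sys/"):
            found.append((match.group("name"), int(match.group("id"))))
    return found


# Excerpt of the log quoted in this message.
SAMPLE_LOG = """\
2004-06-20 16:49:34 [NDB] INFO     -- Start initiated (version 3.5.0)
Dbdict: name=sys/def/SYSTAB_0,id=0
Dbdict: name=sys/def/NDB$EVENTS_0,id=2
Dbdict: name=test/def/matteotabella2,id=4
Dbdict: name=test/def/4/PRIMARY,id=6
2004-06-20 16:50:12 [NDB] INFO     -- Started (version 3.5.0)
"""

if __name__ == "__main__":
    for name, obj_id in user_objects(SAMPLE_LOG):
        print(f"existing object: {name} (id={obj_id})")
```

If the function returns anything for a node that was supposed to start empty, the old file system was most likely still in place when the node came up.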



Tomas Ulin wrote:

>
> When going from one node to two nodes, did you restart both nodes with
> the -i flag?
>
> T
>
> Matteo Brancaleoni wrote:
>
>> Hi
>>
>> On Mon, 2004-06-21 at 13:45, Tomas Ulin wrote:
>>  
>>
>>> Did you try to start the second node with "ndbd -i"?
>>>   
>>
>>
>> yes, without success.
>>
>>  
>>
>>> Brancaleoni Matteo wrote:
>>>
>>>   
>>>
>>>> Hi, thanks for the fast answer :)
>>>> see my comments inline.
>>>>
>>>> On Mon, 2004-06-21 at 00:43, Tomas Ulin wrote:
>>>>
>>>>
>>>>     
>>>>
>>>>> first of all, if you download the latest source you don't have to 
>>>>> specify the "[TCP]" connections at all
>>>>>  
>>>>>       
>>>>
>>>> Ok, done.
>>>>
>>>>
>>>>
>>>>     
>>>>
>>>>> 1) please look where you started ndb_mgmd, you should find a 
>>>>> cluster.log (look at the end "tail -n100 cluster.log")
>>>>>  
>>>>>       
>>>>
>>>> OK, got it. Unfortunately there is no trace of DB node #3, which is
>>>> the one on the remote machine.
>>>>
>>>>
>>>>
>>>>     
>>>>
>>>>> 2) please make sure that you don't have any trailing "ndbd" 
>>>>> processes on the failing machine (we're working on better 
>>>>> detection of clashes); if so, kill them and restart. If an "ndb" 
>>>>> process hangs, it is often because multiple processes are 
>>>>> trying to connect with the same "id".
>>>>>  
>>>>>       
>>>>
>>>> ok. no trailing processes.
>>>>
>>>>
>>>>
>>>>     
>>>>
>>>>> 3) make sure you have your [COMPUTER] sections correct in the 
>>>>> config file
>>>>>  
>>>>>       
>>>>
>>>> ok, done
>>>>
>>>>
>>>>
>>>>     
>>>>
>>>>> 4) make sure that your Ndb.cfg/NDB_CONNECTSTRING points to the 
>>>>> actual host:port that runs ndb_mgmd
>>>>>  
>>>>>       
>>>>
>>>> Sure, done.
>>>> If I enter something wrong (done just for testing), the node
>>>> doesn't enter the start phase at all (should be phase 1, I think).
>>>> But when it does start, it gets stuck in that state.
>>>>
>>>>
>>>>
>>>>     
>>>>
>>>>> and try again until you get the config right
>>>>>  
>>>>>       
>>>>
>>>> Hmm... I tried to start 2 DB nodes on the same machine
>>>> (with different file systems, of course). The 2nd DB node starts,
>>>> but crashes after phase #4.
>>>>
>>>> I have a rather long trace file for that.
>>>> The error in the ndbd error.log is:
>>>>
>>>> Date/Time: x 20 June 2004 - 23:15:49
>>>> Type of error: error
>>>> Message: Internal program error (failed ndbrequire)
>>>> Fault ID: 2341
>>>> Problem data: DbdihMain.cpp
>>>> Object of reference: DBDIH (Line: 1080) 0x00000002
>>>> ProgramName: NDB Kernel
>>>> ProcessID: 10904
>>>> TraceFile: NDB_TraceFile_1.trace
>>>> ***EOM***
>>>>
>>>>
>>>> The mgm config is (for 2 db nodes on same machine)
>>>> [COMPUTER]
>>>> Id: 1
>>>> ByteOrder: Little
>>>> HostName: bestia
>>>> [COMPUTER]
>>>> Id: 2
>>>> ByteOrder: Little
>>>> HostName: bestia
>>>> [MGM]
>>>> Id: 1
>>>> ExecuteOnComputer: 1
>>>> ArbitrationRank: 1
>>>> [DB DEFAULT]
>>>> NoOfReplicas: 2
>>>> [DB]
>>>> Id: 2
>>>> ExecuteOnComputer: 1
>>>> FileSystemPath: /root/ndb/ndb_data1
>>>> [DB]
>>>> Id: 3
>>>> ExecuteOnComputer: 2
>>>> FileSystemPath: /root/ndb/ndb_data2
>>>> [API]
>>>> Id: 4
>>>> ExecuteOnComputer: 1
>>>> ArbitrationRank: 1
>>>>
>>>> Regarding 2 DB nodes on different machines, I'm stuck
>>>> with node #3 not starting (it stops at phase 1 without
>>>> exiting...).
>>>> The only difference in the mgm config.ini is the hostname
>>>> of the COMPUTER with id #2.
>>>>
>>>> any clue?
>>>>
>>>>
>>>>
>>>>     
>>>
>
>
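
The "make sure you have your [COMPUTER] sections correct" step quoted above can be partly mechanised. Below is a rough Python sketch that reads a config in the "[SECTION]" / "Key: Value" layout quoted in this thread and reports a few inconsistencies. The three checks (unique node Ids, ExecuteOnComputer pointing at a defined [COMPUTER], distinct FileSystemPath per [DB]) are my own assumptions about what "correct" means here, not the validation that ndb_mgmd performs, and the function names are hypothetical.

```python
def parse_sections(text):
    """Parse '[SECTION]' headers and 'Key: Value' lines into (name, dict) pairs.

    Section names may repeat (several [COMPUTER] or [DB] blocks), so the
    result is a list rather than a dict keyed by section name.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif ":" in line and sections:
            key, value = line.split(":", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_config(text):
    """Return a list of problem descriptions; an empty list means none found."""
    sections = parse_sections(text)
    computer_ids = {vals["Id"] for name, vals in sections
                    if name == "COMPUTER" and "Id" in vals}
    problems, node_ids, db_paths = [], set(), set()
    for name, vals in sections:
        if name == "COMPUTER" or name.endswith("DEFAULT"):
            continue  # only node sections ([MGM], [DB], [API]) are checked
        node_id = vals.get("Id")
        if node_id in node_ids:
            problems.append(f"[{name}] Id {node_id} is used by more than one node")
        node_ids.add(node_id)
        computer = vals.get("ExecuteOnComputer")
        if computer not in computer_ids:
            problems.append(f"[{name}] Id {node_id} refers to undefined computer {computer}")
        if name == "DB":
            path = vals.get("FileSystemPath")
            if path in db_paths:
                problems.append(f"[DB] Id {node_id} shares FileSystemPath {path}")
            db_paths.add(path)
    return problems


# The two-nodes-on-one-machine config quoted in this thread.
SAMPLE_CONFIG = """\
[COMPUTER]
Id: 1
ByteOrder: Little
HostName: bestia
[COMPUTER]
Id: 2
ByteOrder: Little
HostName: bestia
[MGM]
Id: 1
ExecuteOnComputer: 1
ArbitrationRank: 1
[DB DEFAULT]
NoOfReplicas: 2
[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /root/ndb/ndb_data1
[DB]
Id: 3
ExecuteOnComputer: 2
FileSystemPath: /root/ndb/ndb_data2
[API]
Id: 4
ExecuteOnComputer: 1
ArbitrationRank: 1
"""

if __name__ == "__main__":
    for problem in check_config(SAMPLE_CONFIG):
        print(problem)
```

The quoted config passes these particular checks, which fits the picture that the hang is not caused by a simple cross-reference mistake in config.ini.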
