List: Cluster
From: Mike van Lammeren
Date: October 25 2010 7:39pm
Subject: Re: Error starting new cluster: CM_REGREF Election without selecting new candidate
Hi Johan!

Awesome advice and awesome blog entry!

There were quite a few entries in my /etc/hosts file. I commented them out
except the line for 'localhost', as suggested in your blog, and now I can
start the ndb nodes.
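
For anyone hitting the same error, the manual check described above can be sketched roughly as follows. This is only an illustration, not something taken from Johan's blog or the Configurator: the function name extra_hosts_entries is made up here, and it assumes the usual /etc/hosts layout (an address followed by one or more names, with "#" starting a comment). It lists every active entry other than the 'localhost' line so it can be reviewed and, if necessary, commented out.

```python
def extra_hosts_entries(text):
    """Return (address, names) for each active hosts line except the localhost entry.

    Hypothetical helper for illustration; it only reports entries and
    never modifies the file.
    """
    entries = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or a line that is entirely a comment
        address, *names = line.split()
        if "localhost" in names:
            continue  # the one line that was left in place
        entries.append((address, names))
    return entries
```

Running it on each data node with the contents of /etc/hosts, for example extra_hosts_entries(open("/etc/hosts").read()), gives the list of lines to look at. Whether a given entry is actually harmful depends on the setup, so treat the output as a list of candidates rather than a verdict.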

Thank you very much for your advice, and thanks again for your amazing
Configurator!

On Mon, Oct 25, 2010 at 12:57 PM, Johan Andersson <johan@stripped> wrote:

> Hi,
>
> What do you have in your /etc/hosts?
> Here is a summary of things to check:
>
>
> http://johanandersson.blogspot.com/2009/05/cluster-fails-to-start-self-diagnosis.html
>
> BR
> johan
>
>
> On 25 Oct 2010, at 17:45, Mike van Lammeren wrote:
>
> > Hello!
> >
> > Thanks in advance for any help anyone can give!
> >
> > There are two ndb servers, called ndb1 and ndb2, and two mgm servers,
> > called lb1 and lb2. After attempting to start the cluster, the mgm
> > servers start OK, but the ndb nodes never do.
> >
> > The lb1 and lb2 logs are identical, so I'll just include the logs for
> > lb1. Here is /data/mysqlcluster/ndb_1_out.log for lb1:
> >
> > MySQL Cluster Management Server mysql-5.1.47 ndb-7.1.8
> > ==INITIAL==
> > ==CONFIRMED==
> >
> >
> > Here is /data/mysqlcluster/ndb_1_cluster.log for lb1:
> >
> > 2010-10-25 11:28:26 [MgmtSrvr] INFO     -- Got initial configuration from '/etc/mysql/config.ini', will try to set it when all ndb_mgmd(s) started
> > 2010-10-25 11:28:26 [MgmtSrvr] INFO     -- Mgmt server state: nodeid 1 reserved for ip 192.168.0.225, m_reserved_nodes 1.
> > 2010-10-25 11:28:26 [MgmtSrvr] INFO     -- Id: 1, Command port: *:1186
> > 2010-10-25 11:28:26 [MgmtSrvr] INFO     -- MySQL Cluster Management Server mysql-5.1.47 ndb-7.1.8 started
> > 2010-10-25 11:28:30 [MgmtSrvr] INFO     -- Node 1: Node 2 Connected
> > 2010-10-25 11:28:30 [MgmtSrvr] INFO     -- Node 2 connected
> > 2010-10-25 11:28:30 [MgmtSrvr] INFO     -- Starting initial configuration change
> > 2010-10-25 11:28:30 [MgmtSrvr] INFO     -- Configuration 1 commited
> > 2010-10-25 11:28:30 [MgmtSrvr] INFO     -- Config change completed! New generation: 1
> > 2010-10-25 11:28:42 [MgmtSrvr] INFO     -- Mgmt server state: nodeid 3 reserved for ip 192.168.0.192, m_reserved_nodes 1 and 3.
> > 2010-10-25 11:28:48 [MgmtSrvr] INFO     -- Mgmt server state: nodeid 4 reserved for ip 192.168.0.193, m_reserved_nodes 1, 3 and 4.
> > 2010-10-25 11:28:48 [MgmtSrvr] INFO     -- Node 1: Node 3 Connected
> > 2010-10-25 11:28:52 [MgmtSrvr] INFO     -- Node 3: Initial start, waiting for 4 to connect,  nodes [ all: 3 and 4 connected: 3 no-wait:  ]
> > 2010-10-25 11:28:54 [MgmtSrvr] INFO     -- Node 1: Node 4 Connected
> > 2010-10-25 11:28:55 [MgmtSrvr] INFO     -- Node 3: Initial start, waiting for 4 to connect,  nodes [ all: 3 and 4 connected: 3 no-wait:  ]
> > 2010-10-25 11:28:57 [MgmtSrvr] INFO     -- Node 4: Initial start, waiting for 3 to connect,  nodes [ all: 3 and 4 connected: 4 no-wait:  ]
> >
> >
> > Here is /data/mysqlcluster/ndb_3_out.log for ndb1:
> >
> > 2010-10-25 11:28:43 [ndbd] INFO     -- Angel pid: 4460 started child: 4461
> > 2010-10-25 11:28:43 [ndbd] INFO     -- Configuration fetched from '192.168.0.225:1186', generation: 1
> > NDBMT: non-mt
> > 2010-10-25 11:28:43 [ndbd] INFO     -- NDB Cluster -- DB node 3
> > 2010-10-25 11:28:43 [ndbd] INFO     -- mysql-5.1.47 ndb-7.1.8 --
> > 2010-10-25 11:28:43 [ndbd] INFO     -- WatchDog timer is set to 6000 ms
> > 2010-10-25 11:28:43 [ndbd] INFO     -- Ndbd_mem_manager::init(1) min: 708Mb initial: 964Mb
> > Adding 965Mb to ZONE_LO (1,30855)
> > 2010-10-25 11:28:48 [ndbd] INFO     -- Start initiated (mysql-5.1.47 ndb-7.1.8)
> > 2010-10-25 11:28:48 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Job Handling elapsed=100
> > 2010-10-25 11:28:48 [ndbd] INFO     -- Watchdog: User time: 16  System time: 407
> > 2010-10-25 11:28:48 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Job Handling elapsed=200
> > 2010-10-25 11:28:48 [ndbd] INFO     -- Watchdog: User time: 18  System time: 416
> > NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
> > NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
> > NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
> > NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
> > NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
> > 2010-10-25 11:28:48 [ndbd] INFO     -- timerHandlingLab now: 325335749 sent: 325335408 diff: 341
> > WOPool::init(61, 9)
> > RWPool::init(22, 14)
> > 2010-10-25 11:28:49 [ndbd] INFO     -- timerHandlingLab now: 325336222 sent: 325335817 diff: 405
> > RWPool::init(42, 18)
> > RWPool::init(62, 13)
> > 2010-10-25 11:28:49 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Job Handling elapsed=100
> > 2010-10-25 11:28:49 [ndbd] INFO     -- Watchdog: User time: 36  System time: 478
> > 2010-10-25 11:28:49 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Job Handling elapsed=200
> > 2010-10-25 11:28:49 [ndbd] INFO     -- Watchdog: User time: 36  System time: 488
> > 2010-10-25 11:28:49 [ndbd] WARNING  -- Ndb kernel thread 0 is stuck in: Job Handling elapsed=300
> > 2010-10-25 11:28:49 [ndbd] INFO     -- Watchdog: User time: 36  System time: 498
> > 2010-10-25 11:28:50 [ndbd] INFO     -- timerHandlingLab now: 325336779 sent: 325336392 diff: 387
> > Using 1 fragments per node
> > 2010-10-25 11:28:50 [ndbd] INFO     -- timerHandlingLab now: 325336979 sent: 325336820 diff: 159
> > RWPool::init(c2, 18)
> > RWPool::init(e2, 16)
> > WOPool::init(41, 8)
> > RWPool::init(82, 12)
> > RWPool::init(a2, 54)
> > WOPool::init(21, 10)
> > 2010-10-25 11:28:50 [ndbd] INFO     -- Start phase 0 completed
> > 2010-10-25 11:28:50 [ndbd] INFO     -- Start phase 0 completed
> > 2010-10-25 11:28:50 [ndbd] INFO     -- CM_REGREF from Node 3 to our Node 3. Cause = Election without selecting new candidate
> > 2010-10-25 11:28:50 [ndbd] INFO     -- Initial start, waiting for 4 to connect,  nodes [ all: 3 and 4 connected: 3 no-wait:  ]
> >
> >
> >
> >
> > On Fri, Oct 22, 2010 at 2:18 PM, Johan Andersson <johan@stripped> wrote:
> > Hi,
> > Hard to say. Send the complete logs (ndb_error logs, trace files,
> > config.ini, and the output of 'free -m') and I can have a look.
> >
> > Br
> > Johan
> >
> > On 22 Oct 2010, at 18:38, Mike van Lammeren <mike@stripped> wrote:
> >
> > > Hello!
> > >
> > > I've been following this excellent mailing list for over a year now,
> > > and I just know someone can help me with my problem!
> > >
> > > I have a new cluster of VMs running Ubuntu 10.04, with MySQL Cluster
> > > 7.1 installed. I used the configurator from severalnines.com to build
> > > and install the cluster. These VMs are strictly for development
> > > purposes, but we will be going to real hardware in a few weeks. I have
> > > been successfully running a very similar set-up with MySQL Cluster 7.0
> > > on Ubuntu 8.04 VMs for about a year.
> > >
> > > When I try to start my new cluster, the mgmd nodes seem to start OK.
> > > But when I try to start the ndbd nodes, I see this error in the log:
> > >
> > > 2010-10-22 12:25:40 [ndbd] INFO     -- Start phase 0 completed
> > > 2010-10-22 12:25:40 [ndbd] INFO     -- Start phase 0 completed
> > > 2010-10-22 12:25:40 [ndbd] INFO     -- CM_REGREF from Node 3 to our Node 3. Cause = Election without selecting new candidate
> > >
> > >
> > > Here are the pertinent lines from my config.ini
> > >
> > > [NDB_MGMD DEFAULT]
> > > PortNumber=1186
> > > Datadir=/data/mysqlcluster/
> > >
> > > [NDB_MGMD]
> > > NodeId=1
> > > Hostname=192.168.0.225
> > > LogDestination=FILE:filename=ndb_1_cluster.log,maxsize=10000000,maxfiles=6
> > > ArbitrationRank=1
> > >
> > > [NDB_MGMD]
> > > NodeId=2
> > > Hostname=192.168.0.226
> > > LogDestination=FILE:filename=ndb_2_cluster.log,maxsize=10000000,maxfiles=6
> > > ArbitrationRank=1
> > >
> > > [NDBD DEFAULT]
> > > NoOfReplicas=2
> > > Datadir=/data/mysqlcluster/
> > >
> > >
> > > I've been stuck on this problem for a day now. Any help or advice
> > > would be appreciated.
> > >
> > > Thanks!
> > >
> > > Mike van Lammeren
> >
>
>
