List: Cluster
From: Janos Lehnhardt
Date: October 19 2012 2:31pm
Subject: Re: Installation of SQL-Node - connection established & heartbeat not working
Hey,

Thanks for the fast reply.
I've invested another few hours trying to get the thing running, but I'm 
unfortunately still stuck on the same error.

The servers are installed on one VMware ESX server as VMs, using the 
same network device in the same subnet.
They are definitely not routed through any firewall or anything similar.

It seems like the nodes cannot "heartbeat" with my SQL node, although they 
reach each other directly on the first hop.
As far as I know, I've disabled every software firewall on the machines.

I'd appreciate any ideas, since I'm running out of them. :(

These are my logs.


error-log of sql-node:
121019 16:27:12 mysqld_safe Number of processes running now: 0
121019 16:27:12 mysqld_safe mysqld restarted
121019 16:27:12 [Note] Plugin 'FEDERATED' is disabled.
121019 16:27:12 [Note] NDB: NodeID is 4, management server '10.X.X.57:1186'
121019 16:27:30 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
121019 16:27:30 [Note] Plugin 'FEDERATED' is disabled.
121019 16:27:30 [Note] NDB: NodeID is 4, management server '10.X.X.57:1186'
121019 16:28:00 [Note] NDB[0]: NodeID: 4, no storage nodes connected (timed out)
121019 16:28:00 [Warning] NDB: server id set to zero - changes logged to bin log with server id zero will be logged with another server id by slave mysqlds
121019 16:28:00 [Note] Starting Cluster Binlog Thread
121019 16:28:00 InnoDB: The InnoDB memory heap is disabled
121019 16:28:00 InnoDB: Mutexes and rw_locks use GCC atomic builtins
121019 16:28:00 InnoDB: Compressed tables use zlib 1.2.3
121019 16:28:00 InnoDB: Using Linux native AIO
121019 16:28:00 InnoDB: Initializing buffer pool, size = 128.0M
121019 16:28:00 InnoDB: Completed initialization of buffer pool
121019 16:28:00 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
121019 16:28:00  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
121019 16:28:00  InnoDB: Waiting for the background threads to start
121019 16:28:01 InnoDB: 1.1.8 started; log sequence number 1595685
121019 16:28:01 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
121019 16:28:01 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
121019 16:28:01 [Note] Server socket created on IP: '0.0.0.0'.
121019 16:28:01 [Note] Event Scheduler: Loaded 0 events
121019 16:28:01 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.27-ndb-7.2.8-cluster-gpl'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Cluster Community Server (GPL)

(The InnoDB crash recovery is a result of my killing the mysqld process from the console.)

error-log of management-node:
2012-10-19 16:23:46 [MgmSrvr] INFO     -- Node 2: Node 4: API version 7.2.8
2012-10-19 16:23:46 [MgmSrvr] INFO     -- Node 3: Node 4 Connected
2012-10-19 16:23:46 [MgmSrvr] INFO     -- Node 3: Node 4: API version 7.2.8
2012-10-19 16:23:50 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 2
2012-10-19 16:23:51 [MgmSrvr] WARNING  -- Node 3: Node 4 missed heartbeat 2
2012-10-19 16:23:51 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 3
2012-10-19 16:23:52 [MgmSrvr] WARNING  -- Node 3: Node 4 missed heartbeat 3
2012-10-19 16:23:53 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 4
2012-10-19 16:23:53 [MgmSrvr] ALERT    -- Node 2: Node 4 declared dead due to missed heartbeat
2012-10-19 16:23:53 [MgmSrvr] INFO     -- Node 2: Communication to Node 4 closed
2012-10-19 16:23:53 [MgmSrvr] INFO     -- Node 3: Communication to Node 4 closed
2012-10-19 16:23:53 [MgmSrvr] ALERT    -- Node 2: Node 4 Disconnected
2012-10-19 16:23:53 [MgmSrvr] ALERT    -- Node 3: Node 4 Disconnected
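To make the timing easier to see, the "missed heartbeat" warnings above can be grouped by reporting node. This is only a rough sketch: the function name and the regular expression are my own, the pattern is fitted to the lines pasted in this mail and is not an official description of the log format, and the sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Pattern fitted to the warnings pasted above, e.g.
# 2012-10-19 16:23:50 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 2
# (an assumption based on these lines only, not a format specification)
MISSED = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[MgmSrvr\] WARNING\s+-- "
    r"Node (\d+): Node (\d+) missed heartbeat (\d+)"
)


def summarize_missed_heartbeats(lines):
    """Group missed-heartbeat warnings by (reporting node, silent node).

    Returns a dict mapping (reporter, silent) to a list of
    (timestamp, heartbeat_count) tuples in log order.
    """
    summary = defaultdict(list)
    for line in lines:
        match = MISSED.match(line.strip())
        if match:
            timestamp, reporter, silent, count = match.groups()
            summary[(int(reporter), int(silent))].append((timestamp, int(count)))
    return dict(summary)


# Sample lines copied from the management-node log above.
SAMPLE = """\
2012-10-19 16:23:46 [MgmSrvr] INFO     -- Node 3: Node 4 Connected
2012-10-19 16:23:50 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 2
2012-10-19 16:23:51 [MgmSrvr] WARNING  -- Node 3: Node 4 missed heartbeat 2
2012-10-19 16:23:51 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 3
2012-10-19 16:23:52 [MgmSrvr] WARNING  -- Node 3: Node 4 missed heartbeat 3
2012-10-19 16:23:53 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 4
""".splitlines()

if __name__ == "__main__":
    for (reporter, silent), events in sorted(summarize_missed_heartbeats(SAMPLE).items()):
        first, last = events[0][0], events[-1][0]
        print(f"node {reporter} -> node {silent}: "
              f"{len(events)} warnings between {first} and {last}")
```

Run against the full log, this shows whether both nodes 2 and 3 start missing heartbeats from node 4 at about the same moment after it connects, which is what the excerpt suggests.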


Regards,
Janos

On 18.10.2012 17:31, Johan De Meersman wrote:
> ----- Original Message -----
>> From: "Janos Lehnhardt" <janos.lehnhardt@stripped>
>>
>> 2012-10-18 15:32:46 [MgmSrvr] WARNING  -- Node 2: Node 4 missed heartbeat 4
>> 2012-10-18 15:32:46 [MgmSrvr] ALERT    -- Node 2: Node 4 declared dead due to missed heartbeat
>> 2012-10-18 15:32:46 [MgmSrvr] INFO     -- Node 2: Communication to Node 4 closed
> Well, this is obviously why it keeps being disconnected. Now the question is, why is
> it missing heartbeats? You say there's no firewalls. I assume all nodes are connected to
> the same switch, then? Plug node 4 into another port, change the cable, et cetera to rule
> out network issues. Use a different network card, if you can. Also check the general
> system logs for unusual messages.
>
> Is your system not overloaded, cpu-wise? Is it swapping badly? Run ping to one of the
> other nodes (or better still, MTR) to see how the network behaves right before the
> disconnect. Grab a tcpdump and see whether there's something interesting to be seen there.
>
> Hmm. Also, check whether it's not as simple as another device claiming the same IP.
> Seen that one happen before, too :-)
>
>
>
>

