Quoting Yannis Tsakiris <gtsakiris@stripped>:
> > your MySQL Server has for some reason lost the connection to the NDB
> > Cluster storage engine. That is why you get the error
> > "Can't lock file(errno: 4009)".
> > Why does this happen? To diagnose the problem, you should look in the
> > cluster log file. It can be found in the directory of your MGM server
> > and will be called ndb_<nodeid>_cluster.log. In that file, look for
> > messages like "Node <nodeid> disconnected" and "Node <nodeid> missed
> > heartbeat 2", etc.
> > You can also send that and your config.ini file to the list. If the
> > computers are heavily loaded, the most likely problem is that the mysqld
> > is excluded from the cluster because it misses too many heartbeats; in
> > that case you should increase the two parameters below in config.ini.
>
> The computers are not heavily loaded. They are a pair of PIII machines at
> 1 GHz, each with 2 CPUs and 1 GB. The average load is about 0.25.
> Anyway, I'll try to increase the interval between the heartbeats and see
> what happens.
>
> > Also, instead of restarting the NDB nodes, have you tried restarting the
> > MySQL Server that gets this message? Does that resolve the problem?
>
> I'll try this alternative as soon as it happens again...
>
> I am also sending you my current config.ini
> (2 Computers, 2 MGMs, 2 DBs, 2 APIs, 1 node group, 2 replicas):
>
> [COMPUTER]
> Id: 1
> Hostname: 10.3.90.1
>
> [COMPUTER]
> Id: 2
> HostName: 10.3.90.2
>
>
> [MGM DEFAULT]
> PortNumber: 2200
>
> [MGM]
> Id: 1
> ExecuteOnComputer: 1
> ArbitrationRank: 1
>
> [MGM]
> Id: 2
> ExecuteOnComputer: 2
> ArbitrationRank: 2
>
>
> [DB DEFAULT]
> NoOfReplicas: 2
> StopOnError: N
>
> [DB]
> Id: 3
> ExecuteOnComputer: 1
> FileSystemPath: /usr/local/mysql/cluster/db/
>
> [DB]
> Id: 4
> ExecuteOnComputer: 2
> FileSystemPath: /usr/local/mysql/cluster/db/
>
>
> [API]
> Id: 5
> ExecuteOnComputer: 1
>
> [API]
> Id: 6
> ExecuteOnComputer: 2
>
>
> [TCP DEFAULT]
> PortNumber: 28002
>
> [TCP]
> NodeId1: 1
> NodeId2: 3
>
> [TCP]
> NodeId1: 1
> NodeId2: 4
>
> [TCP]
> NodeId1: 2
> NodeId2: 3
>
> [TCP]
> NodeId1: 2
> NodeId2: 4
>
> [TCP]
> NodeId1: 3
> NodeId2: 5
>
> [TCP]
> NodeId1: 3
> NodeId2: 6
>
> [TCP]
> NodeId1: 4
> NodeId2: 5
>
> [TCP]
> NodeId1: 4
> NodeId2: 6
>
> [TCP]
> NodeId1: 3
> NodeId2: 4
>
> Do you see something wrong?
> Regards,
> Yannis Tsakiris
>
You can remove all [TCP] sections in your config.ini file, since the
connections will be calculated automatically by the MGM server when it reads
the config file (the rule is simple: all nodes must have connections between
them). Your problem could be caused by a missing connection! The easiest way
is to remove them all.
:)
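
If you want to see which connections are missing before removing the sections,
here is a minimal Python sketch. It takes the rule above literally (every pair
of MGM, DB and API nodes needs a [TCP] section), which may be stricter than
what the cluster really requires. The function names and the SAMPLE string are
made up for illustration; SAMPLE only repeats the node ids and [TCP] pairs from
the config.ini quoted above.

```python
from itertools import combinations


def parse_sections(text):
    """Return a list of (SECTION_NAME, {key: value}) in file order.

    Hand-rolled because the file repeats section names such as [TCP].
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip().upper(), {}))
        elif ":" in line and sections:
            key, value = line.split(":", 1)
            sections[-1][1][key.strip().lower()] = value.strip()
    return sections


def missing_links(text):
    """List node-id pairs that have no [TCP] section (assumed rule: all pairs)."""
    sections = parse_sections(text)
    node_ids = sorted(
        int(values["id"])
        for name, values in sections
        if name in ("MGM", "DB", "API") and "id" in values
    )
    declared = {
        frozenset((int(values["nodeid1"]), int(values["nodeid2"])))
        for name, values in sections
        if name == "TCP" and "nodeid1" in values and "nodeid2" in values
    }
    return [p for p in combinations(node_ids, 2) if frozenset(p) not in declared]


# Node ids and [TCP] pairs copied from the config.ini quoted above.
SAMPLE = (
    "[MGM]\nId: 1\n[MGM]\nId: 2\n"
    "[DB]\nId: 3\n[DB]\nId: 4\n"
    "[API]\nId: 5\n[API]\nId: 6\n"
) + "".join(
    f"[TCP]\nNodeId1: {a}\nNodeId2: {b}\n"
    for a, b in [(1, 3), (1, 4), (2, 3), (2, 4), (3, 5),
                 (3, 6), (4, 5), (4, 6), (3, 4)]
)

print(missing_links(SAMPLE))
```

Under that literal reading, the sample has no [TCP] section for the pairs 1-2,
1-5, 1-6, 2-5, 2-6 and 5-6. Whether each of those is really needed is not
something this sketch can tell you, which is one more reason to let the MGM
server generate the connections itself.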
Have you looked in the ndb_1_cluster.log?
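
To pull the interesting lines out of that log, something like the following
Python sketch could help. It assumes the log wording matches the two example
messages quoted earlier in this thread ("Node <nodeid> disconnected" and
"Node <nodeid> missed heartbeat 2"); the real format may differ, so adjust the
pattern as needed. The function name and the sample lines are made up for
illustration and are not real log output.

```python
import re

# Pattern built only from the two example messages quoted in this thread.
PATTERN = re.compile(
    r"Node (\d+) (disconnected|missed heartbeat \d+)", re.IGNORECASE
)


def scan_cluster_log(lines):
    """Yield (line_number, node_id, event) for every matching line."""
    for number, line in enumerate(lines, start=1):
        match = PATTERN.search(line)
        if match:
            yield number, int(match.group(1)), match.group(2)


# Hypothetical input lines, used only to exercise the function.
sample_lines = [
    "Node 5 disconnected",
    "an unrelated line",
    "Node 4 missed heartbeat 2",
]

for hit in scan_cluster_log(sample_lines):
    print(hit)
```

To run it on the real file, pass an open file object instead of the sample
list, for example the result of open("ndb_1_cluster.log").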
If the error occurs again, you could run the SHOW command in ndb_mgm and
send the output to the list.
Best regards
Magnus
--
Magnus Svensson, Software Engineer
MySQL AB, www.mysql.com
Office: +46 (0)709 164 491