Here's what we'd like to do. We have a cluster of /x/ machines (right
now, x = 2), and we'd like to have these machines answer mysql requests
on their primary network interfaces (100 Mb/s full-duplex), but to
relegate all ndb traffic to a private network on their secondary
network interfaces (gigabit ethernet).
We've got a dedicated vlan on our switch for the ndb traffic, and the
secondary interfaces are up and able to see / communicate with each
other, but we're having problems when we try to bring the cluster up.
We're using 4.1.3 beta, and we're able to run the cluster on the
primary NICs.
The documentation suggests that you can use IP addresses in the
config.ini file, instead of hostnames.
Our config.ini file looks like...
[COMPUTER]
Id: 1
HostName: 10.0.0.1
[COMPUTER]
Id: 2
HostName: 10.0.0.2
[MGM]
Id: 1
ExecuteOnComputer: 1
LogDestination: SYSLOG:facility=local0
PortNumber: 28000
[DB DEFAULT]
NoOfReplicas: 2
[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /var/mysql/cluster/data
[DB]
Id: 3
ExecuteOnComputer: 2
FileSystemPath: /var/mysql/cluster/data
[API DEFAULT]
ArbitrationRank: 1
[API]
Id: 4
ExecuteOnComputer: 1
[API]
Id: 5
ExecuteOnComputer: 2
[TCP DEFAULT]
PortNumber: 28002
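As a sanity check that every node in a file like this is pinned to a private address, here is a minimal Python sketch. The function names `parse_config` and `node_hosts` are our own, not part of MySQL Cluster, and the parser only understands the `[SECTION]` / `Key: value` layout shown above.

```python
def parse_config(text):
    """Parse config.ini-style text into a list of (section, {key: value}) pairs.

    Sections may repeat (e.g. several [DB] blocks), so a list is used
    instead of a dict keyed by section name.
    """
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif ":" in line and sections:
            # Split on the first colon only, so values such as
            # "SYSLOG:facility=local0" stay intact.
            key, value = line.split(":", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def node_hosts(sections):
    """Map each node Id to the HostName of the computer it executes on."""
    computers = {
        params["Id"]: params["HostName"]
        for name, params in sections
        if name == "COMPUTER"
    }
    return {
        params["Id"]: computers.get(params["ExecuteOnComputer"])
        for name, params in sections
        if "ExecuteOnComputer" in params
    }
```

Running `node_hosts(parse_config(open("config.ini").read()))` against the file above should map nodes 1, 2 and 4 to 10.0.0.1 and nodes 3 and 5 to 10.0.0.2.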
And the data nodes' Ndb.cfg files look like...
nodeid=2
host=10.0.0.1:28000
and
nodeid=3
host=10.0.0.1:28000
When we try to start ndbd, we get..
sanskrit-root# /usr/libexec/ndbd
Error handler shutting down system
Error handler shutdown completed - exiting
and the error log says...
Current byte-offset of file-pointer is: 468
Date/Time: Tuesday 20 July 2004 - 16:16:33
Type of error: error
Message: Invalid Configuration fetched from Management Server
Fault ID: 2350
Problem data: Could not fetch configuration/invalid configuration
Object of reference: Local hostname(sanskrit.test.umich.edu) and
config hostname(10.0.0.1) dont match
ProgramName: NDB Kernel
ProcessID: 3758
TraceFile: NDB_TraceFile_1.trace
***EOM***
So... we try adding entries in /etc/hosts....
10.0.0.1 linus
10.0.0.2 lucy
and changing the entries in config.ini, and we get the same thing...
Problem data: Could not fetch configuration/invalid configuration
Object of reference: Local hostname(croatian.test.umich.edu) and
config hostname(lucy) dont match
I *think* I was able to get it to work (it came up, seemed to work, but
I've no network traffic analysis to prove it's going through the
private network), but I had to alter /etc/hosts so the machines' own
entries pointed to the IP addresses on the private network, and also
change /etc/nsswitch.conf to read from files before dns.
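To see whether a machine's own hostname resolves to its private address after the /etc/hosts and nsswitch.conf changes, a small check like the one below can help. This is only a sketch: `resolves_to` is a name we made up, and we are assuming (not confirming) that ndbd's check amounts to comparing the local hostname against the configured host. It uses the system resolver via Python's `socket.gethostbyname`.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Return True if hostname resolves to expected_ip.

    resolver defaults to the system resolver; it is a parameter only so
    the function can be exercised without touching real DNS or /etc/hosts.
    """
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # socket.gaierror (lookup failure) is a subclass of OSError.
        return False


if __name__ == "__main__":
    local = socket.gethostname()
    # 10.0.0.1 is the private address from config.ini; adjust per machine.
    print(local, "resolves to 10.0.0.1:", resolves_to(local, "10.0.0.1"))
```

This says nothing about which interface the ndb traffic actually uses; it only confirms what the local name lookup returns.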
This seems like a lot of work to do something that I would have thought
would be pretty easy according to the docs. Why must there be parity
between the machine's hostname and the IP address it connects through? Are
we stupid for even wanting / trying this?
Liam Hoekenga
University of Michigan Webmaster Team