List: Cluster
From: Liam Hoekenga Date: July 20 2004 8:54pm
Subject: using a private network for clustering
Here's what we'd like to do.  We have a cluster of /x/ machines (right 
now, x = 2), and we'd like to have these machines answer mysql requests 
on their primary network interfaces (100 Mb/s full-duplex), but to 
relegate all ndb traffic to a private network on their secondary 
network interfaces (gigabit ethernet).

We've got a dedicated vlan on our switch for the ndb traffic, and the 
secondary interfaces are up and able to see / communicate with each 
other, but we're having problems when we try to bring the cluster up.  
We're using 4.1.3 beta, and we're able to run the cluster on the 
primary NICs.

The documentation suggests that you can use IP addresses in the 
config.ini file, instead of hostnames.

Our config.ini file looks like...

     [COMPUTER]
     Id: 1
     HostName: 10.0.0.1

     [COMPUTER]
     Id: 2
     HostName: 10.0.0.2

     [MGM]
     Id: 1
     ExecuteOnComputer: 1
     LogDestination: SYSLOG:facility=local0
     PortNumber: 28000

     [DB DEFAULT]
     NoOfReplicas: 2

     [DB]
     Id: 2
     ExecuteOnComputer: 1
     FileSystemPath: /var/mysql/cluster/data

     [DB]
     Id: 3
     ExecuteOnComputer: 2
     FileSystemPath: /var/mysql/cluster/data

     [API DEFAULT]
     ArbitrationRank: 1

     [API]
     Id: 4
     ExecuteOnComputer: 1

     [API]
     Id: 5
     ExecuteOnComputer: 2

     [TCP DEFAULT]
     PortNumber: 28002
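
To make the indirection in this file explicit, here is a minimal Python sketch that follows each node's ExecuteOnComputer back to the HostName of its [COMPUTER] section. It is a hand-rolled reader for the "Key: value" layout shown above, not the parser the management server uses; the function names and the shortened sample are made up for illustration.

```python
def parse_sections(text):
    """Split config.ini-style text into a list of (section_name, fields)."""
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = (line[1:-1].strip(), {})
            sections.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only, so values such as
            # "SYSLOG:facility=local0" stay intact.
            key, _, value = line.partition(":")
            current[1][key.strip()] = value.strip()
    return sections


def node_hosts(text):
    """Map (section, node Id) to the HostName of the computer it runs on."""
    sections = parse_sections(text)
    computers = {f["Id"]: f["HostName"] for name, f in sections if name == "COMPUTER"}
    hosts = {}
    for name, fields in sections:
        if name in ("MGM", "DB", "API") and "ExecuteOnComputer" in fields:
            hosts[(name, fields["Id"])] = computers[fields["ExecuteOnComputer"]]
    return hosts


# Shortened copy of the config.ini above.
SAMPLE_CONFIG = """
[COMPUTER]
Id: 1
HostName: 10.0.0.1

[COMPUTER]
Id: 2
HostName: 10.0.0.2

[MGM]
Id: 1
ExecuteOnComputer: 1
LogDestination: SYSLOG:facility=local0

[DB]
Id: 2
ExecuteOnComputer: 1

[DB]
Id: 3
ExecuteOnComputer: 2
"""

for node, host in sorted(node_hosts(SAMPLE_CONFIG).items()):
    print(node, "->", host)
```

With the file above, every node ends up associated with a 10.0.0.x address, which is the value the error log later reports as the "config hostname".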


And the data nodes' Ndb.cfg files look like...

     nodeid=2
     host=10.0.0.1:28000

and

     nodeid=3
     host=10.0.0.1:28000


When we try to start ndbd, we get...

     sanskrit-root# /usr/libexec/ndbd
     Error handler shutting down system
     Error handler shutdown completed - exiting

and the error log says...

     Current byte-offset of file-pointer is: 468

     Date/Time: Tuesday 20 July 2004 - 16:16:33
     Type of error: error
     Message: Invalid Configuration fetched from Management Server
     Fault ID: 2350
     Problem data: Could not fetch configuration/invalid configuration
     Object of reference: Local hostname(sanskrit.test.umich.edu) and config hostname(10.0.0.1) dont match
     ProgramName: NDB Kernel
     ProcessID: 3758
     TraceFile: NDB_TraceFile_1.trace
     ***EOM***
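
I don't know how ndbd actually performs this check, so the following is only a guess at its shape: a Python sketch that resolves the local hostname and the configured hostname and asks whether they share any address. The function names, the lookup table, and the 192.0.2.10 placeholder address are all hypothetical; only the two hostnames come from the log above.

```python
import socket


def addresses_for(name, resolve=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses that `name` resolves to."""
    try:
        _, _, addrs = resolve(name)
    except OSError:
        # Covers socket.gaierror and socket.herror (unresolvable name).
        return set()
    return set(addrs)


def names_share_address(local_name, config_name, resolve=socket.gethostbyname_ex):
    """True if both names resolve to at least one common address."""
    return bool(addresses_for(local_name, resolve) & addresses_for(config_name, resolve))


# Hypothetical lookup table standing in for DNS and /etc/hosts.
FAKE_HOSTS = {
    "sanskrit.test.umich.edu": ["192.0.2.10"],  # placeholder public address
    "10.0.0.1": ["10.0.0.1"],
}


def fake_resolve(name):
    if name not in FAKE_HOSTS:
        raise OSError(name)
    return (name, [], FAKE_HOSTS[name])


# The machine's own name resolves to its public address only: no overlap.
print(names_share_address("sanskrit.test.umich.edu", "10.0.0.1", fake_resolve))  # False

# Pointing the machine's own name at the private address makes them overlap.
FAKE_HOSTS["sanskrit.test.umich.edu"] = ["10.0.0.1"]
print(names_share_address("sanskrit.test.umich.edu", "10.0.0.1", fake_resolve))  # True
```

If the real check works anything like this, it would explain why the names only "match" once the machine's own hostname resolves to the private address.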

So... we try adding entries in /etc/hosts....

10.0.0.1                linus
10.0.0.2                lucy

and changing the entries in config.ini, and we get the same thing...

     Problem data: Could not fetch configuration/invalid configuration
     Object of reference: Local hostname(croatian.test.umich.edu) and config hostname(lucy) dont match

I *think* I was able to get it to work (it came up, seemed to work, but 
I've no network traffic analysis to prove it's going through the 
private network), but I had to alter /etc/hosts so the machines' own 
entries pointed to the IP addresses on the private network, and also 
change /etc/nsswitch.conf to read from files before dns.
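
One way to get the missing proof would be to look at the established TCP connections on the ndb port and check that both ends sit on the private network. Below is a minimal Python sketch of that check. It assumes Linux-style `netstat -tn` output (other systems format addresses differently), assumes the private network is 10.0.0.0/24 (only 10.0.0.1 and 10.0.0.2 are given above), and takes 28002 from the [TCP DEFAULT] section; the sample lines and the 192.0.2.x addresses are invented.

```python
import ipaddress


def classify_connections(netstat_text, port, private_net):
    """Split ESTABLISHED connections on `port` into private and other pairs."""
    net = ipaddress.ip_network(private_net)
    on_private, elsewhere = [], []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local remote state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "ESTABLISHED":
            continue
        try:
            (laddr, lport), (raddr, rport) = (f.rsplit(":", 1) for f in fields[3:5])
            local = ipaddress.ip_address(laddr)
            remote = ipaddress.ip_address(raddr)
        except ValueError:
            continue  # line does not look like address:port pairs
        if str(port) not in (lport, rport):
            continue
        pair = (str(local), str(remote))
        if local in net and remote in net:
            on_private.append(pair)
        else:
            elsewhere.append(pair)
    return on_private, elsewhere


# Invented sample in Linux `netstat -tn` layout.
SAMPLE_NETSTAT = """\
tcp        0      0 10.0.0.1:28002          10.0.0.2:40112          ESTABLISHED
tcp        0      0 192.0.2.10:3306         192.0.2.77:51544        ESTABLISHED
tcp        0      0 0.0.0.0:28002           0.0.0.0:*               LISTEN
"""

private, other = classify_connections(SAMPLE_NETSTAT, 28002, "10.0.0.0/24")
print("on private network:", private)
print("elsewhere:", other)
```

An empty "elsewhere" list while the cluster is running would suggest the ndb connections are staying on the private interfaces, though a packet capture on each interface would be stronger evidence.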

This seems like a lot of work to do something that I would have thought 
would be pretty easy according to the docs.  Why must there be parity 
between the machine's hostname and the IP address it connects through?  
Are we stupid for even wanting / trying this?

Liam Hoekenga
University of Michigan Webmaster Team
