From: Joseph E. Sacco, Ph.D.
Date: July 29 2004 3:34pm
Subject: Building a cluster with a slow and a fast machine...
List-Archive: http://lists.mysql.com/cluster/213
Message-Id: <1091115248.1786.77.camel@plantain.jesacco.com>

Systems:
  * PowerMac with dual G4 533MHz CPUs, 1GB RAM, SCSI drives
  * PowerMac with single G3 240MHz CPU, 264MB RAM, SCSI drives
  * Yellow Dog Linux-3.0.1 [Red Hat clone]
  * mysql-4.1 from BK tree 28July04

====================================================================
Question:
What can be done to improve cluster startup when clustering two
machines with drastically different resource/performance
characteristics?
====================================================================

I am running a two-replica, four-node cluster on two PowerMacs. One
machine is considerably faster than the other:

[Slow G3 machine]
# cat /proc/cpuinfo
processor       : 0
cpu             : 740/750
revision        : 2.2 (pvr 0008 0202)
bogomips        : 478.41
machine         : PowerMac,NuBus
motherboard     : PDM MacRISC
detected as     : 0 ()
pmac flags      : 00000000
memory          : 264MB
pmac-generation : NuBus

[Fast G4 machine]
# cat /proc/cpuinfo
processor       : 0
cpu             : 7410, altivec supported
temperature     : 58-60 C (uncalibrated)
clock           : 533MHz
revision        : 17.3 (pvr 800c 1103)
bogomips        : 1064.96

processor       : 1
cpu             : 7410, altivec supported
temperature     : 34-36 C (uncalibrated)
clock           : 533MHz
revision        : 17.3 (pvr 800c 1103)
bogomips        : 1064.96

total bogomips  : 2129.92
machine         : PowerMac3,4
motherboard     : PowerMac3,4 MacRISC2 MacRISC Power Macintosh
board revision  : 00000000
detected as     : 69 (PowerMac G4 Silver)
pmac flags      : 00000000
L2 cache        : 1024K unified
memory          : 1024MB
pmac-generation : NewWorld

It is often difficult to get the DB nodes on the slower machine to
start; it typically takes several attempts to get both of them up and
running. Once the cluster is established, it is stable.

One of the things I have noticed is that the DB node startup process on
the slow machine temporarily spawns a *large* number of ndbd processes.
During startup it is not uncommon to see 20 to 30 ndbd processes
started, which effectively overloads the slow machine. Once all the
nodes in the cluster are up and running, the number of ndbd processes
drops to two per DB node.

What's going on? Are things timing out, so that the "Angel" processes
continually fire up other ndbd processes? If so, what can be done?

-Joseph

-- 
Joseph E. Sacco, Ph.D.
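
To put numbers on the "20 to 30 ndbd processes" observation, the process
count could be sampled once per second while a DB node starts. The
following is only a minimal sketch, not part of the original report: the
helper names count_named and sample_counts are hypothetical, and it
assumes a Linux procps-style ps that accepts "-e -o comm=".

```python
import subprocess
import time


def count_named(ps_output: str, name: str = "ndbd") -> int:
    """Count lines of `ps -e -o comm=` output whose command name is exactly `name`."""
    return sum(1 for line in ps_output.splitlines() if line.strip() == name)


def sample_counts(name: str = "ndbd", samples: int = 60, interval: float = 1.0):
    """Yield (elapsed_seconds, process_count) once per `interval` seconds."""
    start = time.time()
    for _ in range(samples):
        # "comm=" prints only the command name and suppresses the header line.
        out = subprocess.run(
            ["ps", "-e", "-o", "comm="],
            capture_output=True, text=True, check=True,
        ).stdout
        yield time.time() - start, count_named(out, name)
        time.sleep(interval)
```

Iterating over sample_counts() on the slow machine while a DB node is
starting, and printing each (elapsed, count) pair, would give a timeline
of the spike that can be posted alongside the cluster log.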