From: Jim Hoadley
Date: June 30 2004 5:30pm
Subject: Re: NDB Cluster on multiple computers
List-Archive: http://lists.mysql.com/cluster/50
Message-Id: <20040630173032.92309.qmail@web41908.mail.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Here's an update. Getting close to a 2-node cluster that can survive either
node crashing.

> - although the cluster can survive shutting down the node on the "second
> computer," the MySQL daemon dies when shutting down the node on computer 1
> (on which the manager and API are running).

I turned off iptables and now MySQL does not crash when shutting down the node
on computer 1. Does anyone know which ports/protocols should be allowed in
iptables other than 2200, 2201, 2202, ... and 3306? In production I'd like to
enable iptables.

> - leaving that running, both nodes would eventually crash after 6+ hours,
> and bringing it all back up with the data intact was not possible.

This no longer happens.

Previously I was unable to compile a version on an AMD-based PC. The code
snapshot from 6/28 did compile, though, and now it's in my test cluster.

I'm able to get the API running on multiple computers too.

Those are good things, and I feel I'm making progress.

Here's a rundown of what works:

MySQL Cluster with MGM, 2 DBs, API all on computer 1 works.
MySQL Cluster with MGM, 2 DBs, API all on computer 2 works.
MySQL Cluster with MGM, 2 DBs, API all on computer 3 works.
MySQL Cluster with MGM, DB and API on computer 1, DB on computer 2:
- runs fine uninterrupted
- when I stop DB on computer 2, all is OK
- when I bring the DB on computer 2 back into cluster, all is OK
- when I stop DB on computer 1, all is OK

Here's my problem: I can't bring the DB on computer 1 back into the cluster.

Could someone else try this test and see if they have a similar problem?
Any ideas? Thanks.

-- Jim

Jim Hoadley
Sr Software Eng
Dealer Fusion, Inc

--- Jim Hoadley wrote:
> Tomas --
>
> Thanks for your reply.
> The information in Matteo's post solved my problem too:
>
> > Solved!
>
> > simply before I had the hostname of each machine listed
> > in the 127.0.0.1 entry in /etc/hosts.
> > Moving that entry on both machines to the eth0 ipaddr
> > instead of 127.0.0.1 made everything work....
>
> Now I've got another hang-up.
>
> I've got MySQL Cluster running with the MGM, DB1 and API on a host named
> edsel, and DB2 on a host named cooler.
>
> When I stop DB2 all is good. MySQL clients can still use the database.
> However, when I stop DB1, mysqld crashes (tries to restart and can't). At
> this point I can restart DB1, then restart mysqld and everything is fine
> again.
>
> Tomas, any ideas? Matteo, have you tried this sort of test?
>
> The errors I see in the mysql client (running on a third computer) are:
>
> mysql> SELECT * FROM woohoo;
> ERROR 1015: Can't lock file (errno: 4009)
> mysql> SELECT * FROM woohoo;
> ERROR 2013: Lost connection to MySQL server during query
>
> The error log var/edsel.err says:
>
> 040623 17:20:18 mysqld started
> 040623 17:20:18 InnoDB: Started; log sequence number 0 43634
> /usr/local/mysql/libexec/mysqld: ready for connections.
> Version: '4.1.3-beta-nightly-20040618' socket: '/tmp/mysql.sock' port: 3306
> mysqld got signal 11;
> This could be because you hit a bug. It is also possible that this binary
> or one of the libraries it was linked against is corrupt, improperly built,
> or misconfigured. This error can also be caused by malfunctioning hardware.
> We will try our best to scrape up some info that will hopefully help diagnose
> the problem, but since we have already crashed, something is definitely wrong
> and this may fail.
>
> key_buffer_size=8388600
> read_buffer_size=131072
> max_used_connections=1
> max_connections=100
> threads_connected=1
> It is possible that mysqld could use up to
> key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections =
> 225791 K
> bytes of memory
> Hope that's ok; if not, decrease some variables in the equation.
>
> thd=0x87185e0
> Attempting backtrace. You can use the following information to find out
> where mysqld died. If you see no messages after this, something went
> terribly wrong...
> Cannot determine thread, fp=0xb05c1c6c, backtrace may not be correct.
> Stack range sanity check OK, backtrace follows:
> 0x816ebc0
> 0xb749de48
> 0x81b8224
> 0x81ee1b0
> 0x81b8224
> 0x81b7478
> 0x81b13b7
> 0x81a91d1
> 0x81a9c05
> 0x81a618a
> 0x8183b3e
> 0x818974c
> 0x8182588
> 0x8182136
> 0x81818da
> 0xb7497dac
> 0xb73c3a8a
> New value of fp=(nil) failed sanity check, terminating stack trace!
> Please read http://www.mysql.com/doc/en/Using_stack_trace.html and follow
> instructions on how to resolve the stack trace. Resolved
> stack trace is much more helpful in diagnosing the problem, so please do
> resolve it
> Trying to get some variables.
> Some pointers may be invalid and cause the dump to abort...
> thd->query at 0x8731348 = SELECT * FROM woohoo
> thd->thread_id=1
> The manual page at http://www.mysql.com/doc/en/Crashing.html contains
> information that should help you find out what is causing the crash.
>
> Number of processes running now: 0
> 040623 17:25:05 mysqld restarted
> 040623 17:25:05 InnoDB: Started; log sequence number 0 43634
> 040623 17:26:11 Can't init databases
> 040623 17:26:11 Aborting
>
> 040623 17:26:11 InnoDB: Starting shutdown...
> 040623 17:26:13 InnoDB: Shutdown completed; log sequence number 0 43634
> 040623 17:26:13 /usr/local/mysql/libexec/mysqld: Shutdown complete
>
> 040623 17:26:15 mysqld ended
>
> Thanks for any help you can provide.
>
> -- Jim
>
> --- Tomas Ulin wrote:
> > seems there is a similar problem for mbrancaleoni@stripped (other thread)
> >
> > as I stated there you might have to specify ip-addresses sometimes. If
> > this helps, please let me know, so we can try to fix this in the source.
> >
> > BR,
> >
> > T
> >
> > Jim Hoadley wrote:
> >
> > >I'm trying to set up MySQL Cluster with 2 DB nodes, where each DB node
> > >is on a separate computer.
> > >
> > >Management server is on computer 1.
> > >DB node 2 is on computer 2.
> > >DB node 3 is on computer 3.
> > >Computer 1 and 3 are the same machine.
> > >
> > >I've edited the config.ini on computer 1 and changed the definition of
> > >COMPUTER 2 to the hostname of computer 2. I've edited the Ndb.cfg on
> > >computer 2 to reference the management server on computer 1.
> > >
> > >Step 2.10 on page 20 of the MySQL Cluster Administrator Guide seems to
> > >say that's all that's needed.
> > >
> > >What I've seen:
> > >
> > >Management server and DB node 3 start up and communicate, and apparently
> > >DB node 2 sees the management server (looking at output from ndbd), but
> > >the management server doesn't see DB node 2 (looking at NDB> 2 status).
> > >
> > >I am able to get MySQL Cluster running on both computer 1 and computer 2
> > >independently. In this case I am using ndb/ndbcluster.sh --small &
> > >
> > >Some other pertinent info:
> > >
> > >As a diagnostic, I tried to telnet from computer 1 to computer 2 on port
> > >2202. Connection refused. But a telnet to localhost on port 2202 on
> > >computer 2 is accepted. I have ports 2200, 2201, 2202, 2203, 2204 open
> > >for TCP in iptables on both computers.
> > >
> > >Any help would be appreciated. Thanks.
=== message truncated ===
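
For anyone repeating the two diagnostics described in this thread (the /etc/hosts check from Matteo's fix and the telnet test on port 2202), here is a minimal Python sketch. The function names are hypothetical, and the sketch assumes only what the thread reports: that a machine's hostname listed on the 127.0.0.1 entry caused trouble, and that the DB node port was reachable from localhost but refused from the other computer. It is not part of MySQL Cluster and does not explain the mysqld crash.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname: str) -> bool:
    """Return True if hostname resolves (IPv4) to a loopback address.

    Hypothetical helper: on a typical Linux setup the lookup consults
    /etc/hosts first, so a hostname listed on the 127.0.0.1 entry is
    expected to show up here as loopback. Raises socket.gaierror if the
    name cannot be resolved at all.
    """
    address = socket.gethostbyname(hostname)
    return ipaddress.ip_address(address).is_loopback


def port_accepts_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper that mirrors the telnet test from the thread:
    a refused, filtered, or timed-out connection returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

One possible use: on each machine, call resolves_to_loopback() with that machine's own hostname (edsel or cooler in the thread), then from the other machine call port_accepts_tcp() against the port tested in the thread (2202). A True from the first check, or a False from the second while the same probe against localhost succeeds, would match the symptoms described above.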