List: Cluster
From: Jim Hoadley
Date: June 30 2004 5:30pm
Subject: Re: NDB Cluster on multiple computers
Here's an update. I'm getting close to a 2-node cluster that can survive either
node crashing.
                                                                               
                                             
>  - although the cluster can survive shutting down the node on the "second
> computer," the MySQL daemon dies when shutting down the node on computer 1
> (on which the manager and API are running).
                                                                               
                                             
I turned off iptables, and now MySQL does not crash when I shut down the
node on computer 1. Does anyone know which ports/protocols should be allowed in
iptables other than 2200, 2201, 2202, ... and 3306? In production I'd like to
have iptables enabled.
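
Until someone can confirm the full list, here is a minimal sketch for checking which ports are actually reachable from the other node once iptables is back on. It assumes Python is available on the nodes. The port list only repeats the ports named in this thread and is not a complete set, and the helper names probe and probe_all are made up for illustration.

```python
import socket

# Ports named in this thread: 2200-2204 for the cluster, 3306 for mysqld.
# Assumption: this list is illustrative only, not the full set the cluster needs.
PORTS = [2200, 2201, 2202, 2203, 2204, 3306]


def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_all(host, ports=PORTS, timeout=2.0):
    """Map each port to True (accepted) or False (refused, filtered, or timed out)."""
    return {port: probe(host, port, timeout) for port in ports}
```

Running probe_all("cooler") on edsel, and the reverse on cooler, with iptables on and then off should show which ports the firewall is blocking. It only tests TCP, so it says nothing about other protocols.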
                                                                               
                                             
>  - leaving that running, both nodes would eventually crash after 6+ hours,
> and bringing it all back up with the data intact was not possible.
                                                                               
                                             
This no longer happens.
                                                                               
                                             
Previously I was unable to compile a version on an AMD-based PC. The
code snapshot from 6/28 did compile, though, and now it's in my test cluster.
                                                                               
                                             
I'm now able to get the API running on multiple computers too.

Those are good things, and I feel I'm making progress. Here's a rundown of what
works:
                                                                               
                                             
MySQL Cluster with MGM, 2 DBs, API all on computer 1 works.
MySQL Cluster with MGM, 2 DBs, API all on computer 2 works.
MySQL Cluster with MGM, 2 DBs, API all on computer 3 works.
MySQL Cluster with MGM, DB and API on computer 1, DB on computer 2:
  - runs fine uninterrupted
  - when I stop DB on computer 2, all is OK
  - when I bring the DB on computer 2 back into cluster, all is OK
  - when I stop DB on computer 1, all is OK

Here's my problem: I can't bring the DB on computer 1 back into the cluster.
Could someone else try this test and see if they have a similar problem? Any
ideas? Thanks.

-- Jim 

   Jim Hoadley
   Sr Software Eng
   Dealer Fusion, Inc

--- Jim Hoadley <j_hoadley@stripped> wrote:
> Tomas --
> 
> Thanks for your reply. The information in Matteo's post solved my problem
> too:
> 
>   > Solved!
> 
>   > simply before I had the hostname of each machine listed
>   > in the 127.0.0.1 entry in /etc/hosts.
>   > Moving that entry on both machines to the eth0 ipaddr
>   > instead of 127.0.0.1 made everything work....
> 
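
The /etc/hosts problem quoted above can be checked for directly. This is a minimal sketch, assuming Python is available on each node. The function names are made up for illustration, and it only reports what the local resolver returns for the machine's own hostname.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """Return True if hostname resolves to a loopback address such as 127.0.0.1."""
    addr = socket.gethostbyname(hostname)
    return ipaddress.ip_address(addr).is_loopback


def check_local_hostname():
    """Return (hostname, resolved address, is_loopback) for this machine."""
    name = socket.gethostname()
    addr = socket.gethostbyname(name)
    return name, addr, ipaddress.ip_address(addr).is_loopback
```

If check_local_hostname() reports a loopback address on a node, that node's hostname is still attached to the 127.0.0.1 entry in /etc/hosts, which is the situation Matteo describes.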
> Now I've got another hang up.
> 
> I've got MySQL Cluster running with the MGM, DB1 and API on a host named edsel,
> and DB2 on a host named cooler.
> 
> When I stop DB2 all is good. MySQL clients can still use the database. However,
> when I stop DB1, mysqld crashes (tries to restart and can't). At this point I
> can restart DB1, then restart mysqld and everything is fine again.
> 
> Tomas, any ideas? Matteo have you tried this sort of test?
> 
> The errors I see in the mysql client (running on a third computer) are:
> 
> mysql> SELECT * FROM woohoo;
> ERROR 1015: Can't lock file (errno: 4009)
> mysql> SELECT * FROM woohoo;
> ERROR 2013: Lost connection to MySQL server during query
> 
> The error log var/edsel.err says:
> 
> 040623 17:20:18  mysqld started
> 040623 17:20:18  InnoDB: Started; log sequence number 0 43634
> /usr/local/mysql/libexec/mysqld: ready for connections.
> Version: '4.1.3-beta-nightly-20040618'  socket: '/tmp/mysql.sock'  port: 3306
> mysqld got signal 11;
> This could be because you hit a bug. It is also possible that this binary
> or one of the libraries it was linked against is corrupt, improperly built,
> or misconfigured. This error can also be caused by malfunctioning hardware.
> We will try our best to scrape up some info that will hopefully help diagnose
> the problem, but since we have already crashed, something is definitely wrong
> and this may fail.
>
> key_buffer_size=8388600
> read_buffer_size=131072
> max_used_connections=1
> max_connections=100
> threads_connected=1
> It is possible that mysqld could use up to
> key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 225791 K
> bytes of memory
> Hope that's ok; if not, decrease some variables in the equation.
>
> thd=0x87185e0
> Attempting backtrace. You can use the following information to find out
> where mysqld died. If you see no messages after this, something went
> terribly wrong...
> Cannot determine thread, fp=0xb05c1c6c, backtrace may not be correct.
> Stack range sanity check OK, backtrace follows:
> 0x816ebc0
> 0xb749de48
> 0x81b8224
> 0x81ee1b0
> 0x81b8224
> 0x81b7478
> 0x81b13b7
> 0x81a91d1
> 0x81a9c05
> 0x81a618a
> 0x8183b3e
> 0x818974c
> 0x8182588
> 0x8182136
> 0x81818da
> 0xb7497dac
> 0xb73c3a8a
> New value of fp=(nil) failed sanity check, terminating stack trace!
> Please read http://www.mysql.com/doc/en/Using_stack_trace.html and follow
> instructions on how to resolve the stack trace. Resolved
> stack trace is much more helpful in diagnosing the problem, so please do
> resolve it
> Trying to get some variables.
> Some pointers may be invalid and cause the dump to abort...
> thd->query at 0x8731348 = SELECT * FROM woohoo
> thd->thread_id=1
> The manual page at http://www.mysql.com/doc/en/Crashing.html contains
> information that should help you find out what is causing the crash.
>
> Number of processes running now: 0
> 040623 17:25:05  mysqld restarted
> 040623 17:25:05  InnoDB: Started; log sequence number 0 43634
> 040623 17:26:11  Can't init databases
> 040623 17:26:11  Aborting
>
> 040623 17:26:11  InnoDB: Starting shutdown...
> 040623 17:26:13  InnoDB: Shutdown completed; log sequence number 0 43634
> 040623 17:26:13  /usr/local/mysql/libexec/mysqld: Shutdown complete
>
> 040623 17:26:15  mysqld ended
> 
> Thanks for any help you can provide.
> 
> -- Jim
> 
>
> --- Tomas Ulin <tomas@stripped> wrote:
> > seems there is a similar problem for mbrancaleoni@stripped (other thread)
> > 
> > as I stated there you might have to specify ip-addresses sometimes.  If 
> > this helps, please let me know, so we can try to fix this in the source.
> > 
> > BR,
> > 
> > T
> > 
> > Jim Hoadley wrote:
> > 
> > >I'm trying to set up MySQL Cluster with 2 DB nodes, where each DB node is on a
> > >separate computer.
> > >
> > >Management server is on computer 1. 
> > >DB node 2 is on computer 2. 
> > >DB node 3 is on computer 3.
> > >Computer 1 and 3 are the same machine.
> > >
> > >I've edited the config.ini on computer 1 and changed the definition of COMPUTER
> > >2 to the hostname of computer 2. I've edited the Ndb.cfg on computer 2 to
> > >reference the management server on computer 1.
> > >
> > >Step 2.10 on page 20 of the MySQL Cluster Administrator Guide seems to say
> > >that's all that's needed.
> > >
> > >What I've seen:
> > >
> > >Management server and DB node 3 start up and communicate, and apparently DB
> > >node 2 sees the management server (looking at output from ndbd), but the
> > >management server doesn't see DB node 2 (looking at NDB> 2 status).
> > >
> > >I am able to get MySQL Cluster running on both computer 1 and computer 2
> > >independently. In this case I am using ndb/ndbcluster.sh --small &
> > >
> > >Some other pertinent info:
> > >
> > >As a diagnostic, I tried to telnet from computer 1 to computer 2 on port 2202.
> > >Connection refused. But a telnet to localhost on port 2202 on computer 2 is
> > >accepted. I have ports 2200, 2201, 2202, 2203, 2204 open for TCP in iptables on
> > >both computers.
> > >
> > >Any help would be appreciated. Thanks.
> > >
> > 
> > 
> > -- 
> > MySQL Cluster Mailing List
> > For list archives: http://lists.mysql.com/cluster
> > To unsubscribe:    http://lists.mysql.com/cluster?unsub=1
> > 
> > 
> 
=== message truncated ===



		