List: Cluster
From: Antonio Modesto
Date: April 23 2012 6:56pm
Subject: RE: Cluster crash after datanode shutdown
Hi Andrew,

	I noticed something strange here. After I posted about the problem, I
tried to repeat the tests so I could send you the log entries, but it simply
worked! I tried twice with each node and the cluster kept running
nicely. Anyway, I will increase the arbitration timeout as you suggested
and test again. Thank you very much.
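
For reference, the change under discussion is a one-line edit in the [NDBD DEFAULT] section of config.ini. This is only a minimal sketch: it assumes the 7500 ms default that Andrew cites further down the thread is a sensible starting value, which may not hold for every network.

```ini
[NDBD DEFAULT]
# Was 10 (milliseconds) in the config.ini quoted later in this thread.
# 7500 is the default cited in the thread; treat it as a starting point,
# not a tuned value.
ArbitrationTimeout: 7500
```

Typically the management server has to reread config.ini and the data nodes have to be restarted before a change like this takes effect.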


On Mon, 2012-04-23 at 10:41 -0700, Andrew Morgan wrote:
> Hi Antonio,
> 
>  For now, just retry with the increased timeout for arbitration and then check the
> logs if it still crashes.
> 
> Andrew.
> 
> > -----Original Message-----
> > From: Antonio Modesto [mailto:modesto@stripped]
> > Sent: 23 April 2012 18:12
> > To: Andrew Morgan
> > Cc: MySQL-Cluster Lists
> > Subject: RE: Cluster crash after datanode shutdown
> > 
> > Do you want me to enable some extra debug and/or info?
> > 
> > On Mon, 2012-04-23 at 10:05 -0700, Andrew Morgan wrote:
> > > Hi Antonio,
> > >
> > >
> > >
> > > The next thing I’d check is the log/error files on the management
> > > node and the data node that you weren’t trying to shut down.
> > >
> > >
> > >
> > > I notice that you have ArbitrationTimeout set to 10 milliseconds (the
> > > default is 7500 – I’d try increasing it).
> > >
> > >
> > >
> > > Regards, Andrew.
> > >
> > >
> > >
> > > From: Antonio Modesto [mailto:modesto@stripped]
> > > Sent: 23 April 2012 17:13
> > > To: Andrew Morgan
> > > Cc: MySQL-Cluster Lists
> > > Subject: RE: Cluster crash after datanode shutdown
> > >
> > >
> > >
> > >
> > > Hi, here is my config.ini:
> > >
> > > [NDBD DEFAULT]
> > > NoOfReplicas: 2
> > > DataDir: /usr/local/mysql/data
> > > DataMemory: 6000M
> > > IndexMemory: 2000M
> > > StringMemory: 5
> > > MaxNoOfConcurrentTransactions: 4096
> > > MaxNoOfConcurrentOperations: 100000
> > > MaxNoOfLocalOperations: 100000
> > > MaxNoOfConcurrentIndexOperations: 8192
> > > MaxNoOfFiredTriggers: 4000
> > > TransactionBufferMemory: 1M
> > > MaxNoOfConcurrentScans: 300
> > > MaxNoOfLocalScans: 32
> > > BatchSizePerLocalScan: 64
> > > LongMessageBuffer: 1M
> > > NoOfFragmentLogFiles: 300
> > > FragmentLogFileSize: 16M
> > > MaxNoOfOpenFiles: 40
> > > InitialNoOfOpenFiles: 27
> > > MaxNoOfSavedMessages: 25
> > > MaxNoOfAttributes: 1500
> > > MaxNoOfTables: 400
> > > MaxNoOfOrderedIndexes: 200
> > > MaxNoOfUniqueHashIndexes: 200
> > > MaxNoOfTriggers: 770
> > > LockPagesInMainMemory: 0
> > > StopOnError: 1
> > > Diskless: 0
> > > ODirect: 0
> > > TimeBetweenWatchDogCheck: 6000
> > > TimeBetweenWatchDogCheckInitial: 6000
> > > StartPartialTimeout: 30000
> > > StartPartitionedTimeout: 60000
> > > StartFailureTimeout: 1000000
> > > HeartbeatIntervalDbDb: 2000
> > > HeartbeatIntervalDbApi: 3000
> > > TimeBetweenLocalCheckpoints: 20
> > > TimeBetweenGlobalCheckpoints: 2000
> > > TransactionInactiveTimeout: 0
> > > TransactionDeadlockDetectionTimeout: 1200
> > > DiskSyncSize: 4M
> > > DiskCheckpointSpeed: 10M
> > > DiskCheckpointSpeedInRestart: 100M
> > > ArbitrationTimeout: 10
> > > UndoIndexBuffer: 2M
> > > UndoDataBuffer: 1M
> > > RedoBuffer: 32M
> > > LogLevelStartup: 15
> > > LogLevelShutdown: 3
> > > LogLevelStatistic: 0
> > > LogLevelCheckpoint: 0
> > > LogLevelNodeRestart: 0
> > > LogLevelConnection: 0
> > > LogLevelError: 15
> > > LogLevelCongestion: 0
> > > LogLevelInfo: 3
> > > MemReportFrequency: 0
> > > BackupDataBufferSize: 2M
> > > BackupLogBufferSize: 2M
> > > BackupMemory: 64M
> > > BackupWriteSize: 32K
> > > BackupMaxWriteSize: 256K
> > > [MGM DEFAULT]
> > > PortNumber: 1186
> > > DataDir: /usr/local/mysql/mysql-cluster
> > > [TCP DEFAULT]
> > > SendBufferMemory: 2M
> > > [NDB_MGMD]
> > > NodeId: 1
> > > HostName: 192.168.0.7
> > > [NDBD]
> > > NodeId: 2
> > > HostName: 192.168.0.30
> > > [NDBD]
> > > NodeId: 3
> > > HostName: 192.168.0.31
> > > [API]
> > > NodeId: 4
> > > HostName: 192.168.0.30
> > > [API]
> > > NodeId: 5
> > > HostName: 192.168.0.31
> > >
> > > On Mon, 2012-04-23 at 08:59 -0700, Andrew Morgan wrote:
> > >
> > >
> > > Hi Antonio,
> > >
> > > Could you please share your config.ini file? I'd like to check that your
> > > ndb_mgmd process is not running on the same host as one of your data
> > > nodes.
> > >
> > > Setting StopOnError to FALSE will tell the data node's agent to restart an
> > > ndbd process if it is killed.
> > >
> > > Regards, Andrew.
> > >
> > > -----Original Message-----
> > > From: Antonio Modesto [mailto:modesto@stripped]
> > > Sent: 23 April 2012 15:44
> > > To: MySQL-Cluster Lists
> > > Subject: Cluster crash after datanode shutdown
> > >
> > > Hi,
> > >
> > > I am setting up a MySQL cluster with 2 data nodes and 1 management node. I
> > > was testing its reliability by shutting down one data node and running some
> > > queries against the surviving one. With a small database it worked well:
> > > whichever data node I turned off, the cluster kept running. The problem
> > > appears after I import my RADIUS database (about 1.5GB): if I turn one of
> > > the data nodes off, the cluster stops and I receive this message:
> > >
> > > Node 3: Forced node shutdown completed. Caused by error 2305: 'Node
> > > lost connection to other nodes and can not form a unpartitioned cluster,
> > > please investigate if there are error(s) on other node(s)(Arbitration error).
> > > Temporary error, restart node'.
> > >
> > >
> > > I've seen people on some lists recommend enabling the
> > > StopOnError attribute, but I don't know its side effects.
> > >
> > > Thanks.
> > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > Antônio Modesto
> > > IT Manager
> > >
> > > Praça Getúlio Vargas, 77 – Sala 308 – Centro
> > > Santo Antônio do Monte – MG – CEP: 35560-000
> > > Tel: (37) 3281-2800
> > >
> > > Contact: isimples@stripped
> > > http://www.isimples.com.br
> > 
> > 
> > 
> > 
> 

