List: Cluster
From: franco riggi  Date: April 25 2012 8:28am
Subject:Re: Cluster crash after datanode shutdown
Hi Antonio,

It could also have been a network problem that caused the "lost
connection" error; maybe your network was overloaded when you got the error.

Regards,
Francesco




2012/4/23 Antonio Modesto <modesto@stripped>

> Hi Andrew,
>
>        I noticed something strange here: after I posted about the problem, I
> tried to run the tests again to send you the log entries, but it simply
> worked! I tried twice with each node and the cluster kept running
> nicely. Anyway, I will change the arbitration timeout as you said
> and test again. Thank you very much.
>
>
> On Mon, 2012-04-23 at 10:41 -0700, Andrew Morgan wrote:
> > Hi Antonio,
> >
> >  For now, just retry with the increased arbitration timeout and then
> > check the logs if it still crashes.
> >
> > Andrew.
> >
> > > -----Original Message-----
> > > From: Antonio Modesto [mailto:modesto@stripped]
> > > Sent: 23 April 2012 18:12
> > > To: Andrew Morgan
> > > Cc: MySQL-Cluster Lists
> > > Subject: RE: Cluster crash after datanode shutdown
> > >
> > > > Do you want me to enable some extra debug and/or info?
> > >
> > > On Mon, 2012-04-23 at 10:05 -0700, Andrew Morgan wrote:
> > > > Hi Antonio,
> > > >
> > > >
> > > >
> > > > > The next thing I’d check is the log/error files on the management
> > > > > node and the data node that you weren’t trying to shut down.
> > > >
> > > >
> > > >
> > > > > I notice that you have ArbitrationTimeout set to 10 milliseconds (the
> > > > > default is 7500 – I’d try increasing it).
> > > >
> > > >
> > > >
> > > > Regards, Andrew.
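
The check Andrew describes can be sketched in a few lines of Python. This is only an illustration, not an NDB tool: `read_section` and `check_arbitration_timeout` are hypothetical helper names, the 7500 ms default is the figure quoted in this thread (it may differ in other versions), and the parser assumes the simple "Key: value" layout of the config.ini posted here.

```python
import re

# Default quoted in this thread; treat it as an assumption for other NDB versions.
DEFAULT_ARBITRATION_TIMEOUT_MS = 7500

# Matches "Key: value" or "Key=value", the layout used by the config.ini in this thread.
PARAM = re.compile(r"^([^:=\s]+)\s*[:=]\s*(.*)$")


def read_section(config_text, section):
    """Collect parameters from every section with the given name (keys lower-cased)."""
    wanted = section.upper()
    current = None
    values = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().upper()
            continue
        match = PARAM.match(line)
        if match and current == wanted:
            values[match.group(1).lower()] = match.group(2).strip()
    return values


def check_arbitration_timeout(config_text):
    """Report whether ArbitrationTimeout in [NDBD DEFAULT] is below the assumed default."""
    raw = read_section(config_text, "NDBD DEFAULT").get("arbitrationtimeout")
    if raw is None:
        return "ArbitrationTimeout not set; the default applies"
    value = int(raw)  # the parameter is a plain number of milliseconds
    if value < DEFAULT_ARBITRATION_TIMEOUT_MS:
        return (f"ArbitrationTimeout={value} ms is below the default of "
                f"{DEFAULT_ARBITRATION_TIMEOUT_MS} ms")
    return f"ArbitrationTimeout={value} ms"


# Excerpt of the config.ini from this thread.
sample = """
[NDBD DEFAULT]
NoOfReplicas: 2
ArbitrationTimeout: 10
[MGM DEFAULT]
PortNumber: 1186
"""
print(check_arbitration_timeout(sample))
```

Run against the excerpt, the helper flags the 10 ms setting; the same function can be pointed at the full file contents read from disk.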
> > > >
> > > >
> > > >
> > > > From: Antonio Modesto [mailto:modesto@stripped]
> > > > Sent: 23 April 2012 17:13
> > > > To: Andrew Morgan
> > > > Cc: MySQL-Cluster Lists
> > > > Subject: RE: Cluster crash after datanode shutdown
> > > >
> > > >
> > > >
> > > >
> > > > Hi, here is my config.ini:
> > > >
> > > > [NDBD DEFAULT]
> > > > NoOfReplicas: 2
> > > > DataDir: /usr/local/mysql/data
> > > > DataMemory: 6000M
> > > > IndexMemory: 2000M
> > > > StringMemory: 5
> > > > MaxNoOfConcurrentTransactions: 4096
> > > > MaxNoOfConcurrentOperations: 100000
> > > > MaxNoOfLocalOperations: 100000
> > > > MaxNoOfConcurrentIndexOperations: 8192
> > > > MaxNoOfFiredTriggers: 4000
> > > > TransactionBufferMemory: 1M
> > > > MaxNoOfConcurrentScans: 300
> > > > MaxNoOfLocalScans: 32
> > > > BatchSizePerLocalScan: 64
> > > > LongMessageBuffer: 1M
> > > > NoOfFragmentLogFiles: 300
> > > > FragmentLogFileSize: 16M
> > > > MaxNoOfOpenFiles: 40
> > > > InitialNoOfOpenFiles: 27
> > > > MaxNoOfSavedMessages: 25
> > > > MaxNoOfAttributes: 1500
> > > > MaxNoOfTables: 400
> > > > MaxNoOfOrderedIndexes: 200
> > > > MaxNoOfUniqueHashIndexes: 200
> > > > MaxNoOfTriggers: 770
> > > > LockPagesInMainMemory: 0
> > > > StopOnError: 1
> > > > Diskless: 0
> > > > ODirect: 0
> > > > TimeBetweenWatchDogCheck: 6000
> > > > TimeBetweenWatchDogCheckInitial: 6000
> > > > StartPartialTimeout: 30000
> > > > StartPartitionedTimeout: 60000
> > > > StartFailureTimeout: 1000000
> > > > HeartbeatIntervalDbDb: 2000
> > > > HeartbeatIntervalDbApi: 3000
> > > > TimeBetweenLocalCheckpoints: 20
> > > > TimeBetweenGlobalCheckpoints: 2000
> > > > TransactionInactiveTimeout: 0
> > > > TransactionDeadlockDetectionTimeout: 1200
> > > > DiskSyncSize: 4M
> > > > DiskCheckpointSpeed: 10M
> > > > DiskCheckpointSpeedInRestart: 100M
> > > > ArbitrationTimeout: 10
> > > > UndoIndexBuffer: 2M
> > > > UndoDataBuffer: 1M
> > > > RedoBuffer: 32M
> > > > LogLevelStartup: 15
> > > > LogLevelShutdown: 3
> > > > LogLevelStatistic: 0
> > > > LogLevelCheckpoint: 0
> > > > LogLevelNodeRestart: 0
> > > > LogLevelConnection: 0
> > > > LogLevelError: 15
> > > > LogLevelCongestion: 0
> > > > LogLevelInfo: 3
> > > > MemReportFrequency: 0
> > > > BackupDataBufferSize: 2M
> > > > BackupLogBufferSize: 2M
> > > > BackupMemory: 64M
> > > > BackupWriteSize: 32K
> > > > BackupMaxWriteSize: 256K
> > > > [MGM DEFAULT]
> > > > PortNumber: 1186
> > > > > DataDir: /usr/local/mysql/mysql-cluster
> > > > > [TCP DEFAULT]
> > > > SendBufferMemory: 2M
> > > > [NDB_MGMD]
> > > > NodeId: 1
> > > > HostName: 192.168.0.7
> > > > [NDBD]
> > > > NodeId: 2
> > > > HostName: 192.168.0.30
> > > > [NDBD]
> > > > NodeId: 3
> > > > HostName: 192.168.0.31
> > > > [API]
> > > > NodeId: 4
> > > > HostName: 192.168.0.30
> > > > [API]
> > > > NodeId: 5
> > > > HostName: 192.168.0.31
> > > >
> > > > On Mon, 2012-04-23 at 08:59 -0700, Andrew Morgan wrote:
> > > >
> > > >
> > > > Hi Antonio,
> > > >
> > > > > Could you please share your config.ini file? I'd like to check that
> > > > > your ndb_mgmd process is not running on the same host as one of your
> > > > > data nodes.
> > > > >
> > > > > Setting StopOnError to FALSE will tell the data node's angel process
> > > > > to restart an ndbd process if it is killed.
> > > >
> > > > Regards, Andrew.
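
The co-location check Andrew asks about can be sketched as follows. The motivation is that if the management node (which normally acts as arbitrator) shares a host with a data node, losing that host removes both at once. The node list below is transcribed by hand from the config.ini posted in this thread; `shared_hosts` is a hypothetical helper name, not part of any MySQL tool.

```python
# Node list transcribed from the config.ini in this thread: (section, NodeId, HostName).
nodes = [
    ("NDB_MGMD", 1, "192.168.0.7"),
    ("NDBD", 2, "192.168.0.30"),
    ("NDBD", 3, "192.168.0.31"),
    ("API", 4, "192.168.0.30"),
    ("API", 5, "192.168.0.31"),
]


def shared_hosts(nodes, kind_a, kind_b):
    """Return the hosts that run at least one node of each of the two kinds."""
    hosts_a = {host for kind, _, host in nodes if kind == kind_a}
    hosts_b = {host for kind, _, host in nodes if kind == kind_b}
    return sorted(hosts_a & hosts_b)


# Management node vs. data nodes: an empty list means they are on separate hosts.
print(shared_hosts(nodes, "NDB_MGMD", "NDBD"))
# API nodes vs. data nodes: lists the hosts they share.
print(shared_hosts(nodes, "API", "NDBD"))
```

For this configuration the first call returns an empty list, so the management node is not co-located with a data node; the API nodes do share hosts with the data nodes.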
> > > >
> > > > -----Original Message-----
> > > > From: Antonio Modesto [mailto:modesto@stripped]
> > > > Sent: 23 April 2012 15:44
> > > > To: MySQL-Cluster Lists
> > > > Subject: Cluster crash after datanode shutdown
> > > >
> > > > Hi,
> > > >
> > > > > I am setting up a MySQL Cluster with 2 data nodes and 1 management
> > > > > node. I was testing its reliability by shutting down one node and
> > > > > running some queries against the surviving one. When I tested with a
> > > > > small database, it worked well: regardless of which data node I turned
> > > > > off, the cluster kept running. The problem appears when I import my
> > > > > radius database into it (about 1.5GB): if I turn one of the data nodes
> > > > > off, the cluster stops and I receive this message:
> > > >
> > > > > Node 3: Forced node shutdown completed. Caused by error 2305: 'Node
> > > > > lost connection to other nodes and can not form a unpartitioned
> > > > > cluster, please investigate if there are error(s) on other
> > > > > node(s)(Arbitration error). Temporary error, restart node'.
> > > >
> > > >
> > > > > I've seen people on some lists recommending enabling the
> > > > > StopOnError attribute, but I don't know its side effects.
> > > >
> > > > Thanks.
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > >
> > > > > Regards,
> > > > >
> > > > > Antônio Modesto
> > > > >
> > > > > IT Manager
> > > >
> > > >
> > > >
> > > > > Praça Getúlio Vargas, 77 – Sala 308 – Centro
> > > > >
> > > > > Santo Antônio do Monte – MG – CEP: 35560-000
> > > > > Tel:(37) 3281-2800
> > > > >
> > > > > Contact: isimples@stripped
> > > > http://www.isimples.com.br
> > > >
> > > >
> > > > Aviso:Esta mensagem e quaisquer arquivos em anexo podem conter
> > > > informações confidenciais e/ou
> > > >
> > > > privilegiadas. Se você não for o destinatário ou a
> pessoa autorizada
> a
> > > > receber esta mensagem, por favor, não
> > > >
> > > > leia, copie, repasse, imprima, guarde, nem tome qualquer
> ação baseada
> > > > nessas informações. Notifique o
> > > >
> > > > remetente imediatamente por e-mail e apague a mensagem
> > > > permanentemente. Atenção: embora a Isimples
> > > >
> > > > Telecom, tome seus cuidados para garantir a ausência de
> vírus neste
> > > > e-mail, a empresa não se responsabiliza
> > > >
> > > > por quaisquer perdas ou danos decorrentes do uso da mensagem e seus
> > > > anexos. A segurança e ausência de
> > > >
> > > > erros na transmissão do e-mail não podem ser garantidas,
> já que as
> > > > informações podem ser interceptadas,
> > > >
> > > > corrompidas, perdidas, destruídas, atrasadas, chegarem
> incompletas,
> > > > ou, ainda, conter vírus. Recomendamos
> > > >
> > > > checar se o e-mail e seus anexos contém vírus, uma vez
> que nem a
> > > > Isimples Telecom ou o remetente se
> > > >
> > > > responsabilizam pela transmissão destes.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > MySQL Cluster Mailing List
> > > For list archives: http://lists.mysql.com/cluster
> > > To unsubscribe:    http://lists.mysql.com/cluster
> > >
> >
>
>
>
> --
> MySQL Cluster Mailing List
> For list archives: http://lists.mysql.com/cluster
> To unsubscribe:    http://lists.mysql.com/cluster
>
>

Thread
Cluster crash after datanode shutdown - Antonio Modesto, 23 Apr
  • RE: Cluster crash after datanode shutdown - Andrew Morgan, 23 Apr
    • RE: Cluster crash after datanode shutdown - Antonio Modesto, 23 Apr
      • RE: Cluster crash after datanode shutdown - Andrew Morgan, 23 Apr
RE: Cluster crash after datanode shutdown - Antonio Modesto, 23 Apr
  • RE: Cluster crash after datanode shutdown - Andrew Morgan, 23 Apr
    • RE: Cluster crash after datanode shutdown - Antonio Modesto, 23 Apr
      • Re: Cluster crash after datanode shutdown - franco riggi, 25 Apr