On Fri, 2009-10-02 at 10:17 -0700, Tarandeep Singh wrote:
> Hi,
>
> I am using Mysql Cluster version 7.0.7.
>
> My conf is as follows-
>
> 4 data nodes.
> 1 mgmt nodes
> 1 sql node (running on the same machine as 1 of the data nodes)
>
> this morning, I noticed 2 of the 4 data nodes crashed. They were from
> different node groups, so I assumed the cluster would have kept running.
> However my application logs show SQL Exception-
>
> java.sql.SQLException: got temporary error 286 'Node failure caused abort of
> transaction' from NDBCLUSTER
> and
> java.sql.SQLException: got temporary error 4010 'Node failure caused abort
> of transaction' from NDBCLUSTER
>
> On checking the failed nodes log files, I only got this-
>
> ndb_2_error.log:
> ------------------
> Error: 2341
> Error data: pgman.cpp
> Error object: PGMAN (Line: 1473) 0x0000000e
> Program: ./bin/ndbmtd
> Pid: 29261
> So my questions are-
>
> 1) if the 2 data nodes were in different node groups, why the inserts failed
Although the cluster remains operational and continues to accept new
queries, any transaction that was in progress on either of those nodes
when it failed is aborted and rolled back, and must be retried.
Your application should handle these "temporary error" codes as it would
a deadlock in InnoDB or any other transactional engine.
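
As a rough illustration of that retry handling, here is a minimal sketch in
Java. NdbRetry, Transaction, isTemporary and withRetry are hypothetical names
made up for this example, not part of Connector/J. The check on error code
1297 (ER_GET_TEMPORARY_ERRMSG) is an assumption about how the server reports
these errors; the message match is based on the exceptions quoted above.

```java
import java.sql.SQLException;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Connector/J or MySQL Cluster.
final class NdbRetry {

    /** One complete transaction: begin, statements, commit (rollback on failure). */
    interface Transaction<T> {
        T run() throws SQLException;
    }

    // Assumption: the server reports NDB temporary errors with MySQL error
    // code 1297 (ER_GET_TEMPORARY_ERRMSG).
    static final int ER_GET_TEMPORARY_ERRMSG = 1297;

    /** True if the exception looks like the "temporary error" quoted above. */
    static boolean isTemporary(SQLException e) {
        String msg = e.getMessage();
        return e.getErrorCode() == ER_GET_TEMPORARY_ERRMSG
                || (msg != null
                    && msg.toLowerCase(Locale.ROOT).contains("temporary error"));
    }

    /** Runs txn, retrying up to maxAttempts times on a temporary error. */
    static <T> T withRetry(Transaction<T> txn, int maxAttempts, long backoffMillis)
            throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return txn.run();
            } catch (SQLException e) {
                // Give up on permanent errors or once the attempts are used up.
                if (!isTemporary(e) || attempt >= maxAttempts) {
                    throw e;
                }
                try {
                    // Linear backoff: wait a little longer before each retry.
                    Thread.sleep(backoffMillis * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

The Transaction passed in should re-run the whole unit of work (begin,
statements, commit) on every attempt, since the aborted transaction has
already been rolled back by the cluster.
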
> 2) How reliable Mysql cluster is? Is there any
> writeup/document/article/paragraph that talks about the cluster reliability?
See:
http://www.mysql.com/products/database/cluster/ and
http://www.mysql.com/products/database/cluster/mysql-cluster-datasheet.pdf
Also, you can configure the data nodes to restart immediately, re-sync
and resume cluster operations after a crash by setting StopOnError=0 in
their [ndbd] or [ndbd default] sections. This reduces, though does not
eliminate, the risk of total cluster failure.
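
For illustration, the setting might look like the fragment below in the
cluster's global configuration file. The file name config.ini is an
assumption (it is the conventional name, but yours may differ), and the
change normally takes effect only once the data nodes have been restarted
with the updated configuration.

```ini
# Cluster global configuration file (often named config.ini; name assumed here)
[ndbd default]
# 0 = restart the data node process after an error instead of halting it
StopOnError=0
```

Placing it under [ndbd default] applies it to every data node; use an
individual [ndbd] section instead to set it for a single node.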
These crashes are likely related to Bug #44195 and Bug #46507, which are
still being investigated:
http://bugs.mysql.com/bug.php?id=44195
http://bugs.mysql.com/bug.php?id=46507
--
Matthew Montgomery
MySQL Senior Support Engineer
Sun Microsystems Inc., Database Group
San Antonio, Texas, USA