I'm trying to convert a traditional MySQL database to Cluster. I have
the SQL dump with about 2 million records and changed ENGINE=ndbcluster.
I have a 2 data node MySQL cluster on somewhat resource limited machines
(mainly the hard drives are slower). When I try to import the SQL dump,
sometimes the process aborts due to node failure and I see in the logs
that several heartbeats are missed, eventually leading to a data node
failure. During the import, when an LCP starts, disk I/O causes a heavy
load on the data node machines causing them to sometimes be slow to
respond. I think that is why the heartbeats are missed. This doesn't
ALWAYs happen, but has happened more than half the time.
The only way I can consistently get the SQL dump to import without
1) start up the cluster completely
2) shutdown one of the two data nodes
3) import the SQL dump
4) start up the data node that I shut down in #2
5) wait for the data nodes to sync up
In the above scenario with only one of the two data nodes, even if the
single data node is under heavy load due to disk I/O during LCP, there's
no node failure and hence the SQL import can continue without aborting.
If i leave both data nodes alive, there's a chance that one of the nodes
will fail during the import. And it seems a node failure will cause the
SQL import to abort.
Is my approach to the problem correct? Or, is there some configuration
problem that I should be addressing?
This concerns me because if the application needs to write an additional
1 million records at a time during production, does this mean I might
see node failures, resulting in an abort when the disk I/O load gets too
heavy? I know, the other solution is to get faster disks, but if that
wasn't an option, what can i do?