Hi Guys,
So I finally got the cluster working the way I wanted it to, after a
long day and night Thursday into Friday. Thanks to all the people who
made suggestions. In the end I wiped out my data, reset the whole
cluster with some new config.ini settings, and started fresh. Why did I
wipe out my data? Well, the delete process removes rows from my table at
a rate of about 1,000,000 rows an hour. I wanted to delete about
8,000,000 rows (the table contained roughly 14,000,000 rows), so the
delete alone would have taken about 8 hours. The total time to delete,
back up the remaining rows, reset the cluster and restore would have
been beyond the allowable downtime. As the data is not critical, losing
it was actually an option, so I took it.
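A quick sanity check of the delete time from the figures above, as a small Python sketch. The function name is just for illustration, and the rate is the rough one I observed, not a measured benchmark:

```python
def estimate_delete_hours(rows_to_delete, rows_per_hour):
    """Estimate how long a delete takes at a steady rows-per-hour rate."""
    if rows_per_hour <= 0:
        raise ValueError("rows_per_hour must be positive")
    return rows_to_delete / rows_per_hour


# Approximate figures from this post: ~8,000,000 rows at ~1,000,000 rows/hour.
delete_hours = estimate_delete_hours(8_000_000, 1_000_000)
print(f"Delete alone: about {delete_hours:.0f} hours")
```

This covers only the delete step; the backup, cluster reset and restore come on top of it.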
When I came in this morning, I saw that one of my data nodes had crashed
with this error:
2009-11-08 14:01:04 [ndbd] INFO -- dblqh/DblqhMain.cpp
2009-11-08 14:01:04 [ndbd] INFO -- DBLQH (Line: 7661) 0x0000000e
2009-11-08 14:01:04 [ndbd] INFO -- Error handler shutting down system
2009-11-08 14:01:04 [ndbd] INFO -- Error handler shutdown completed - exiting
2009-11-08 14:01:04 [ndbd] ALERT -- Node 4: Forced node shutdown completed. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2009-11-09 10:09:39 [ndbd] INFO -- Angel pid: 28874 ndb pid: 28875
NDBMT: non-mt
I tried to restart the node, but the restart failed (after quite a long
time) with the following message:
Node 4: Forced node shutdown completed. Occurred during startphase 5. Caused by error 2341: 'Internal program error (failed ndbrequire)(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
Now I am back to a single data node, which is not good. Can anyone
tell me what I can do to get this node back up, or whether I need to
file a bug report?
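In case it helps anyone summarise these shutdown messages (for a bug report, say), here is a rough Python sketch that pulls out the node ID, error code and start phase. The regular expression and the function name are my own and are based only on the two messages quoted above; they are not part of any NDB tool, and other message variants may not match.

```python
import re

# Hypothetical pattern, derived only from the two shutdown messages in this post.
SHUTDOWN_RE = re.compile(
    r"Node (?P<node>\d+): Forced node shutdown completed\."
    r"(?: Occurred during startphase (?P<phase>\d+)\.)?"
    r" Caused by error (?P<error>\d+):"
)


def parse_forced_shutdown(message):
    """Return node id, error code and start phase (None if absent), or None if no match."""
    # Collapse line wraps so a message split across lines still matches.
    flat = " ".join(message.split())
    match = SHUTDOWN_RE.search(flat)
    if match is None:
        return None
    phase = match.group("phase")
    return {
        "node": int(match.group("node")),
        "error": int(match.group("error")),
        "start_phase": int(phase) if phase is not None else None,
    }


restart_message = """Node 4: Forced node shutdown completed. Occurred during startphase 5.
Caused by error 2341: 'Internal program error (failed
ndbrequire)(Internal error, programming error or missing error message,
please report a bug). Temporary error, restart node'."""

print(parse_forced_shutdown(restart_message))
```

The start phase is optional in the pattern because the original crash message has no "Occurred during startphase" part, while the failed restart does.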
Richard
--
Richard McCluskey
Senior Engineer
go2Media, Inc.
rmccluskey@stripped
(617) 671-0057
http://go2.com - The best entertainment guide on mobile.