Jonas Oreland wrote:
> Lewis Bergman wrote:
>
>> Back to the books. It looks like the stale session problem keeps an ndbd
>> node from coming back immediately. How do all you gurus who know how to
>> do this handle this situation without intervention?
>>
>> If I use the --no-nodeid-checks can I then use a connect string that
>> includes the nodeid=<nodeid> parameter and avoid the stale session
>> problem?
>
>
> There is an ndb_mgm command: "PURGE STALE SESSIONS"
>
I have seen that and used it to get the nodes back up and running.
My question is more like this:
What do you do to ensure that a cluster node disappearing in the middle of
the night does not necessitate someone's manual intervention?
I may have different goals for this than most of you. I have no one
babysitting a large cluster 24/7. I would have to be alerted somehow and
then wake up, log in, start mgm, PURGE STALE SESSIONS, log out. The main
reason for me to have the cluster is to avoid such problems to the greatest
extent.
I want the thing to come back on and get back in the cluster so I can
figure it out tomorrow instead of at two in the morning.
That does bring up a good point though. Is there anyone who has any scripts
or (dare I say it) SNMP capability that the mgmd can interact with? I guess I
could write one to watch the cluster log or something. That doesn't look
very friendly though. At any rate, if anyone has thoughts on how you deal
with this please let me know.
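
To make the "watch the cluster log" idea concrete, here is a rough Python
sketch of what I have in mind. It is untested against a real cluster and rests
on several assumptions: that the installed ndb_mgm accepts -e to execute a
single command, that a bare ndb_mgm finds the management server through its
default connect string, and that the pattern "stale" matches whatever the
cluster log actually prints (I have not checked the real wording, so treat
STALE_PATTERN as a placeholder). The function names are my own invention.

```python
import re
import subprocess
import time

# Assumption: placeholder pattern, NOT the confirmed log wording. Replace it
# with the text your cluster log shows when a node is refused its node id.
STALE_PATTERN = re.compile(r"stale", re.IGNORECASE)


def needs_purge(line, pattern=STALE_PATTERN):
    """Return True if a cluster log line matches the stale-session pattern."""
    return bool(pattern.search(line))


def purge_command():
    """Build the ndb_mgm invocation that issues PURGE STALE SESSIONS once.

    Assumes ndb_mgm is on PATH and supports -e (execute one command and exit).
    """
    return ["ndb_mgm", "-e", "PURGE STALE SESSIONS"]


def follow(path, poll_seconds=1.0):
    """Yield lines appended to the file at `path`, roughly like `tail -f`."""
    with open(path, "r", errors="replace") as log:
        log.seek(0, 2)  # jump to the end so old entries are ignored
        while True:
            line = log.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


def watch(path, min_interval=60.0):
    """Run the purge when a matching line appears, at most once per interval."""
    last_purge = None
    for line in follow(path):
        if not needs_purge(line):
            continue
        now = time.monotonic()
        if last_purge is None or now - last_purge >= min_interval:
            # check=False: a failed purge should not kill the watcher itself.
            subprocess.run(purge_command(), check=False)
            last_purge = now
```

Calling watch() with the path to the management node's cluster log (for
example something like /var/lib/mysql-cluster/ndb_1_cluster.log, which is only
a guess at a typical location) from an init script or cron @reboot entry would
keep it running. The min_interval throttle is there so a burst of matching log
lines does not fire the purge over and over.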
--
Lewis Bergman
Texas Communications
4309 Maple St.
Abilene, TX 79602-8044
325-691-3301
800-299-6962