> In Bug #45845 "Falcon crashes while running falcon_bug_36294-big test"
> (http://bugs.mysql.com/bug.php?id=45845) Ann brings up the issue that
> in Thread::thread() we catch an exception and then rethrow it
> immediately, with no other code catching it, which leads to a process
> crash. I agree that this is a problem.
Jim, Kevin, and I talked about it yesterday. The problem is not this
double throw, but the fact that the thread (probably a gopher) is
throwing the "record memory exhausted" error and not catching it.
Jim and Kevin agreed that uncaught exceptions ought to crash the
server (not their words). I disagree. Some MySQL engines corrupt
data when they crash, so crashing the server is not socially
acceptable, even if it does make debugging easier.
> 1. The main issue is that we have a Falcon internal thread that has
> thrown an exception that is likely due to something "very serious" since
> it is not handled anywhere (or it is caused by a coding bug).
I suspect that the gopher never expected to run out of record memory -
and I'm not really sure how it can, since, at least in my simplistic
diagram, it moves records from the serial log into the page cache
and shouldn't be mucking with the record cache at all.
> So at this
> point in the code we are basically handling an exception that the
> thread's own code did not manage to handle. Unfortunately, as the code
> is now it leads to a process crash. What is the alternative? It is
> fairly easy to avoid the process crash by just catching the exception -
> but what should we do with the thread (or rather the lack of this
> thread)? Should we just restart the same thread? Continue without this
> thread? In many cases the cause for the exception is so serious that
> Falcon will not be able to continue.
If it were me, I'd put a try/catch all on every call from the server
to avoid crashes. In debugging mode, the handler could set the machine
on fire to call attention to the problem.
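A minimal sketch of the "try/catch all on every call from the server" idea. The names `guardedCall`, `EngineStatus`, and `ENGINE_INTERNAL_ERROR` are made up for illustration and are not Falcon or MySQL handler names; the real entry points and error codes would differ.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>

// Hypothetical status codes; the real handler interface defines its own.
enum EngineStatus { ENGINE_OK = 0, ENGINE_INTERNAL_ERROR = 1 };

// Run one engine entry point and turn any escaping exception into an
// error code, so that nothing propagates up into the server.
EngineStatus guardedCall(const std::function<void()>& body)
{
    try {
        body();
        return ENGINE_OK;
    } catch (const std::exception& e) {
        std::cerr << "engine call failed: " << e.what() << '\n';
    } catch (...) {
        // Exceptions that do not derive from std::exception end up here.
        std::cerr << "engine call failed: unknown exception\n";
    }
    return ENGINE_INTERNAL_ERROR;
}
```

Each call from the server would go through a wrapper like this, e.g. `guardedCall([&] { doTheRealWork(); })`, where `doTheRealWork` stands for whatever the entry point does. A debug build could abort (or worse) in the catch blocks instead of returning.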
> The best we can do is probably to try to restart the crashed thread -
> and hope the issue was temporary? The second best is probably to let
> the thread die and just continue as if nothing had happened...
I think we need to return a severe error and ignore further handler
calls. Falcon is dead, but something else might survive.
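Roughly what "return a severe error and ignore further calls" could look like, as a sketch only. `EngineGate`, `GateStatus`, and `GATE_FATAL` are hypothetical names, and the assumption here is that any uncaught exception is treated as fatal for the engine.

```cpp
#include <atomic>
#include <functional>

// Hypothetical status codes for this sketch.
enum GateStatus { GATE_OK = 0, GATE_FATAL = 2 };

// Once any call fails with an uncaught exception, the engine is marked
// dead and every later call is refused without running its body.
class EngineGate {
public:
    GateStatus call(const std::function<void()>& body)
    {
        if (dead.load())
            return GATE_FATAL;      // engine already failed: do nothing
        try {
            body();
            return GATE_OK;
        } catch (...) {
            dead.store(true);       // remember the failure for later calls
            return GATE_FATAL;
        }
    }

    bool isDead() const { return dead.load(); }

private:
    std::atomic<bool> dead{false};  // safe to read from several threads
};
```

The server keeps running and other engines are untouched; only calls into the failed engine get the severe error.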
> 2. Another issue is that if we continue to let this lead to process
> crashes - the catch/rethrow is mostly annoying (at least for developers)
> since it leads to a call stack that just shows where the rethrow took
> place and not where the initial problem was. This makes it much harder
> to identify what was the real cause for the crash. I did a commit last
> week where I changed the logging code so that we at least write to the
> log what kind of exception this was - earlier this catch/rethrow was
> done "silently" (if there were no debug flags specified).
That's a start...
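For reference, a sketch of logging what kind of exception was caught before rethrowing it. This is not the code from the commit mentioned above; `describeCurrentException` and `runLogged` are hypothetical names, and the exception types Falcon actually throws may not derive from `std::exception`.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Must be called from inside a catch block: rethrows the in-flight
// exception to find out what it is, and returns a description of it.
std::string describeCurrentException()
{
    try {
        throw;
    } catch (const std::exception& e) {
        return std::string("std::exception: ") + e.what();
    } catch (const char* text) {
        return std::string("text: ") + text;
    } catch (...) {
        return "unknown exception type";
    }
}

// Hypothetical wrapper around a thread body: record what was thrown,
// then let the same exception continue to propagate.
template <typename Body, typename Log>
void runLogged(Body body, Log log)
{
    try {
        body();
    } catch (...) {
        log(describeCurrentException());
        throw;  // rethrows the original exception object, not a copy
    }
}
```

The log line says what was thrown, but by the time the catch block runs the stack has already unwound, so it cannot say where. That part of the complaint in point 2 still stands.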