List: Falcon Storage Engine
From: Jim Starkey
Date: July 2 2009 3:38pm
Subject: Re: Catch and rethrow of exceptions in Thread::thread()

Ann W. Harrison wrote:
> Olav,
>>
>> In Bug #45845 "Falcon crashes while running falcon_bug_36294-big 
>> test" (http://bugs.mysql.com/bug.php?id=45845) Ann brings up the 
>> issue that in Thread::thread() we catch an exception and then 
>> rethrow it immediately without any other code catching it, leading 
>> to a process crash. I agree that this is a problem.
>
>
> Jim, Kevin, and I talked about it yesterday.  The problem is not this
> double throw, but the fact that the thread (probably a gopher) is
> throwing the "record memory exhausted" error and not catching it.
> Jim and Kevin agreed that uncaught exceptions ought to crash the
> server (not their words).  I disagree.  Some MySQL engines corrupt
> data when they crash, so crashing the server is not socially
> acceptable, even if it does make debugging easier.
Ann, neither Kevin nor I thought that it was acceptable either, but the 
place to catch it is in worker threads.  The internal threading system 
is not the place to implement server-friendly semantics, however.
>>
>>
>> 1. The main issue is that we have a Falcon internal thread that has 
>> thrown an exception that is likely due to something "very serious" 
>> since it is not handled anywhere (or it is caused by a coding bug). 
>
> I suspect that the gopher never expected to run out of record memory -
> and I'm not really sure how it can, since, at least in my simplistic
> diagram, it moves records from the serial log into the page cache
> and shouldn't be mucking with the record cache at all.
All it takes is a single memory allocation to find that another thread 
has run the pool out of memory.

I hate to say it, but a gopher discovering a memory shortage should 
notify a responsible adult (probably SerialLog) that it is exiting, then 
exit.  If the server recovers, SerialLog can restart gophers after the 
crisis has passed.
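
A minimal sketch of that notify-then-exit shape, not Falcon code.  The
names GopherSupervisor, gopherExiting, takePendingRestarts and runGopher
are hypothetical, the supervisor only stands in for the role suggested
for SerialLog, and std::bad_alloc stands in for whatever exception the
engine really throws when record memory is exhausted.

```cpp
#include <functional>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical stand-in for the supervising role suggested for SerialLog.
class GopherSupervisor {
public:
    // Called by a gopher just before it exits because of a memory shortage.
    void gopherExiting(int gopherId) {
        std::lock_guard<std::mutex> lock(mutex_);
        pendingRestart_.push_back(gopherId);
    }

    // Called once the shortage is over: returns the ids of the gophers
    // that should be restarted and clears the pending list.
    std::vector<int> takePendingRestarts() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<int> ids;
        ids.swap(pendingRestart_);
        return ids;
    }

private:
    std::mutex mutex_;
    std::vector<int> pendingRestart_;
};

// Body of one gopher thread.  Returns true if the work finished, false if
// the gopher gave up after reporting the shortage to its supervisor.
// std::bad_alloc is an assumption standing in for the engine's own error.
bool runGopher(int gopherId, GopherSupervisor& supervisor,
               const std::function<void()>& work) {
    try {
        work();
        return true;
    } catch (const std::bad_alloc&) {
        supervisor.gopherExiting(gopherId);
        return false;
    }
}
```

The exception is caught in the worker's own body, so the generic
threading layer never sees it and stays free of server-specific policy.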

>
>> So at this point in the code we are basically handling an exception 
>> that the thread's own code did not manage to handle. Unfortunately, 
>> as the code is now it leads to a process crash. What is the 
>> alternative? It is fairly easy to avoid the process crash by just 
>> catching the exception - but what should we do with the thread (or 
>> rather the lack of this thread)? Should we just restart the same 
>> thread? Continue without this thread? In many cases the cause for the 
>> exception is so serious that Falcon will not be able to continue.
>
> If it were me, I'd put a try/catch all on every call from the server
> to avoid crashes.  In debugging mode, the handler could set the machine
> on fire to call attention to the problem.
That's appropriate.
>>
>>    The best we can do is probably to try to restart the crashed 
>> thread - and hope the issue was temporary? The second best is 
>> probably to let the thread die and just continue as if nothing had 
>> happened...
>
> I think we need to return a severe error and ignore further handler
> calls.  Falcon is dead, but something else might survive.
>>
>> 2. Another issue is that if we continue to let this lead to process 
>> crashes - the catch/rethrow is mostly annoying (at least for 
>> developers) since it leads to a call stack that just shows where the 
>> rethrow took place and not where the initial problem was. This makes 
>> it much harder to identify what was the real cause for the crash. I 
>> did a commit last week where I changed the logging code so that we at 
>> least write to the log what kind of exception this was - earlier this 
>> catch/rethrow was done "silently" (if there were no debug flags 
>> specified).
>>
> That's a start...
>
>
> Cheers,
>
> Ann


-- 
Jim Starkey
President, NimbusDB, Inc.
978 526-1376
