Hi, Guilhem!
On Jan 05, Guilhem Bichot wrote:
> Hello Serg,
>
> Sergei Golubchik a écrit, Le 12/22/2008 07:04 PM:
>> #At file:///usr/home/serg/Abk/mysql/6.0-maria/
>
> I guess this will be ported back to 5.1-maria.
Yes.
>> 2707 Sergei Golubchik 2008-12-22
>> Bug#40990 Maria: failure of maria.test & maria_notemebedded in
>> deadlock detection detect a case when a blocker has removed
>> itself and signalled after the condition timed out but before
>> it (cond_wait) acquired the mutex back
>> === modified file 'include/waiting_threads.h'
>> --- a/include/waiting_threads.h 2008-11-04 13:09:32 +0000
>> +++ b/include/waiting_threads.h 2008-12-22 18:04:24 +0000
>> @@ -182,6 +182,7 @@ typedef struct st_wt_thd {
>> } WT_THD;
>> #define WT_TIMEOUT ETIMEDOUT
>> +#define WT_FREE_TO_GO 1
>
> Can we be sure that on all platforms, ETIMEDOUT != 1 (is there some
> POSIX rule somewhere) ?
> Why not pick a negative value for our "custom errors" like is already
> the case for WT_DEADLOCK, WT_DEPTH_EXCEEDED? We would use -3 for
> WT_FREE_TO_GO.
okay.
>> #define WT_OK 0
>> #define WT_DEADLOCK -1
>> #define WT_DEPTH_EXCEEDED -2
>> === modified file 'mysys/waiting_threads.c'
>> --- a/mysys/waiting_threads.c 2008-11-10 19:11:27 +0000
>> +++ b/mysys/waiting_threads.c 2008-12-22 18:04:24 +0000
>> @@ -613,7 +613,16 @@ static int deadlock(WT_THD *thd, WT_THD if (ret ==
>> WT_DEADLOCK && depth)
>> change_victim(blocker, &arg);
>> if (arg.rc)
>> + {
>> + /* special return code if there's nobody to wait for */
>> + if (depth == 0 && ret == WT_OK && arg.rc->owners.elements
> == 0)
>
> I guess the added code has to be in deadlock() and not in
> wt_thd_timedwait() (which could have been more natural), to avoid adding
> locking/unlocking of the rc in wt_thd_timedwait().
>
> So, looking at the new if(), if one of these four is false:
> arg.rc
> depth == 0
> ret == WT_OK
> arg.rc->owners.elements == 0
>
> then it is considered ok to have missed the broadcast and go into the long
> timeout wait... Let's see:
>
> - if "arg.rc->owners.elements == 0" is false, then there is some other
> thread to wait for, we have to do the long timeout
> - if "ret == WT_OK" is false: it could be that ret==WT_DEADLOCK, imagine
> then that we go into "/* notify the victim, if appropriate */" below: the
> if() there sets 'ret' to WT_OK, so it looks like we will do the long
> timeout wait; but the killed victim thread will call wt_thd_release_self()
> and so wake us up, ok
> - if "depth == 0" is false: then it's not wt_thd_cond_timedwait() which is
> calling us, so ok; is there another more fundamental reason for the
> "depth==0"?
If depth==1 (which is the only possible value of depth besides 0),
we don't check all blockers of thd, but only the one of them.
And this one is guaranteed to exist, so WT_FREE_TO_GO can never happen.
Furthermore, arg.rc won't be thd->waiting_for (because we don't start
the search from thd), so checking arg.rc->owners.elements won't make any
sense.
> - if "arg.rc" is false: assuming "depth" is 0, this is possible only if
> deadlock_search(), at first call of the recursion, exceeded search depth
> (which is impossible as it is the first call) or found no resource to wait
> on ("exit: OK (early)"); that last case should be impossible, as
> wt_thd_cond_timedwait() logically assumes thd->waiting_for!=NULL.
>
> Given this, the patch looks ok to me, but is there a more straightforward
> explanation of these added lines than mine?
> I suggest adding a comment before this added if(). To me
okay
> "depth == 0 && ret == WT_OK" is the non-obvious part, I translate "nobody
> to wait for" to only "arg.rc->owners.elements == 0".
Actually, "ret == WT_OK && arg.rc->owners.elements == 0" means "nobody
to wait for". If ret != WT_OK then there's certainly somebody to wait
for. And if ret != WT_OK arg.rc may be even not thd->waiting_for, so
whether it's 0 or not is largely irrelevant for *this* thd.
>> + {
>> + DBUG_ASSERT(thd == blocker);
>> + DBUG_ASSERT(arg.rc == thd->waiting_for);
>> + ret= WT_FREE_TO_GO;
>> + }
>> rc_unlock(arg.rc);
>> + }
>> /* notify the victim, if appropriate */
Regards / Mit vielen Grüßen,
Sergei
--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg@stripped>
/ /|_/ / // /\ \/ /_/ / /__ Principal Software Engineer/Server Architect
/_/ /_/\_, /___/\___\_\___/ Sun Microsystems GmbH, HRB München 161028
<___/ Sonnenallee 1, 85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Häring