>>>>> "Guilhem" == Guilhem Bichot <guilhem@stripped> writes:
Guilhem> Michael Widenius wrote, on 11/07/2008 01:58 PM:
>>>>>>> "Guilhem" == Guilhem Bichot <guilhem@stripped> writes:
>>>> For the close, the above *should* be safe as the share should not be
>>>> used by anyone, but somehow we got a hang here.
Guilhem> Thanks for the clear explanation.
Guilhem> It does sound like a bug that maria_close() (which, as it calls
Guilhem> _ma_remove_not_visible_states(), is the last close, so table is not used
Guilhem> by anybody, as you wrote) can happen at the same time as a commit on the
Guilhem> same table, which means that someone else is using the table. I think it
Guilhem> is dangerous to leave this situation unexplored, even though the
Guilhem> deadlock is now gone; this situation is so fishy that it could cause
Guilhem> other bugs.
>> Agree and I will plan to look into this.
>> There are however asserts in maria_close() that should find cases like
Guilhem> But they didn't find it.
There are new asserts since that time.
(I am adding new asserts all the time).
>> Still, removing the mutex should be a good idea.
>> The one case that could be the reason for the problem is that
>> checkpoint may keep the share in use even when the table should be
Guilhem> But you said that the problem which you have seen is between
Guilhem> maria_close() and trnman_commit(): checkpoint has nothing to do with
Guilhem> this, has it?
The problem was lock order. Checkpoint combined with maria_close() is
the likely reason for the crash.
(I didn't investigate the problem in detail, as there was a clear
problem that needed to be solved.)
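
For readers following along, here is a minimal sketch of what a lock-order problem of this kind looks like in general and how a fixed acquisition order avoids it. It is not the Maria code: demo_share_lock, demo_trn_lock, demo_commit(), demo_close() and demo_run() are hypothetical stand-ins, and the rule "transaction lock before share lock" is an assumption made only for this illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; these are NOT the real Maria structures. */
static pthread_mutex_t demo_share_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t demo_trn_lock   = PTHREAD_MUTEX_INITIALIZER;
static int demo_open_count = 1;   /* protected by demo_share_lock */
static int demo_commits    = 0;   /* protected by demo_trn_lock */

/*
  Rule assumed for this sketch: demo_trn_lock is always taken
  before demo_share_lock, on every code path.
*/
static void *demo_commit(void *arg)
{
  (void) arg;
  pthread_mutex_lock(&demo_trn_lock);
  pthread_mutex_lock(&demo_share_lock);
  demo_commits++;
  pthread_mutex_unlock(&demo_share_lock);
  pthread_mutex_unlock(&demo_trn_lock);
  return NULL;
}

static void *demo_close(void *arg)
{
  (void) arg;
  /*
    Same order as demo_commit(). If this path took demo_share_lock
    first, each thread could end up holding one lock while waiting
    for the other, which is the classic lock-order deadlock.
  */
  pthread_mutex_lock(&demo_trn_lock);
  pthread_mutex_lock(&demo_share_lock);
  demo_open_count--;
  pthread_mutex_unlock(&demo_share_lock);
  pthread_mutex_unlock(&demo_trn_lock);
  return NULL;
}

/* Runs both paths concurrently; returns 0 when both have finished. */
static int demo_run(void)
{
  pthread_t committer, closer;
  if (pthread_create(&committer, NULL, demo_commit, NULL))
    return -1;
  if (pthread_create(&closer, NULL, demo_close, NULL))
  {
    pthread_join(committer, NULL);
    return -1;
  }
  pthread_join(committer, NULL);
  pthread_join(closer, NULL);
  return 0;
}
```

Calling demo_run() from a test program starts one committing thread and one closing thread; because both take the locks in the same order, neither can block the other forever.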
>> If we could release the share lock in maria_close() BEFORE we call
>> _ma_remove_not_visible_states() things would be much better!
>> This is the one thing that I plan to explore further.
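
The idea of dropping the share lock before the state cleanup can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual maria_close(): sketch_share, sketch_trn_lock, sketch_remove_states() and sketch_close() are hypothetical names, and the sketch assumes that once the last close has been decided, no other thread can reach the share any more.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a table share; NOT the real Maria struct. */
typedef struct
{
  pthread_mutex_t lock;        /* stands in for the share lock */
  int open_count;              /* number of open handles */
  int visible_states;          /* stands in for the state history */
} sketch_share;

/* Stands in for a lock owned by the transaction manager. */
static pthread_mutex_t sketch_trn_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the state cleanup; it needs only the transaction lock. */
static void sketch_remove_states(sketch_share *share)
{
  pthread_mutex_lock(&sketch_trn_lock);
  share->visible_states = 0;
  pthread_mutex_unlock(&sketch_trn_lock);
}

/* Returns 1 if this call was the last close of the share, else 0. */
static int sketch_close(sketch_share *share)
{
  int last_close;

  pthread_mutex_lock(&share->lock);
  last_close = (--share->open_count == 0);
  /* Released BEFORE the cleanup, so the two locks are never held together. */
  pthread_mutex_unlock(&share->lock);

  /*
    Assumption of this sketch: after the last close nobody else can
    find the share, so touching it without the share lock is safe.
  */
  if (last_close)
    sketch_remove_states(share);
  return last_close;
}
```

With a share opened twice, the first sketch_close() only decrements the counter, and the second one performs the cleanup while holding nothing but the transaction lock, so no lock order between the two mutexes has to be defined at all.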