Rafal Somla wrote:
> Hi Mats,
>
> I think we agree about the main idea how to deal with this bug. Adding
> description to the function and improving its interface is secondary
> and optional. Please use your judgement here. Below is my proposition
> for the documentation, updated after your comments.
Hi Rafal!
I will add some documentation, since that will help us maintain the
thingie, but I am reluctant to change the interfaces too much. The
change you are proposing is something I have been considering myself, so
I will introduce it at this time.
My reasoning is that that the precondition is to strong for the
function: it requires that term_lock is either NULL or equivalent to
cond_lock. It is better to have the weaker condition that only requires
term_lock to be non-NULL, but places no restrictions on the skip_lock value.
Since I've made the linkage for terminate_slave_thread() internal, we
can change the parameters without breaking any external interfaces.
Just my few cents,
Mats Kindahl
>
> Mats Kindahl wrote:
>> Hi Rafal!
>>
>> Thanks for the review. Comments on your comments below.
>>
>> Just my few cents,
>> Mats Kindahl
>>
>> Rafal Somla wrote:
> <cut>
>>>
>>> I removed two separate mutex pointers from function's signature as
>>> the code
>>> can not work correctly if two different mutexes are used or no mutex
>>> is locked.
>>
>> The precondition for the term_lock and cond_lock are
>>
>> term_lock == NULL || term_lock == cond_lock
>>
>> In other words, if term_lock is passed as NULL, no lock is acquired
>> but the cond_lock is needed to execute pthread_cond_timedwait(). See
>> how skip_lock is used in terminate_slave_threads() and see where
>> terminate_slave_threads() are called with skip_lock == true. In other
>> words, term_lock should not necessarily be locked inside the function.
>>
>
> OK. The intention is, I guess, that a caller can already own the mutex
> and then terminate_slave_thread(s) should not try to acquire it. In
> that case I would introduce boolean parameter to the function instead
> of a confusing extra mutex pointer.
>
> /**
> Wait for slave I/O or SQL thread to terminate.
>
> This function should be called after requesting the thread to
> terminate (by
> setting abort_slave member of RLI or MI structure to 1). The fact
> that thread
> has terminated is detected by checking flag pointed by @c
> slave_running. This
> flag should be guarded by @c term_lock to avoid race conditions and
> its change
> schould be signaled by broadcasting @c term_cond.
>
> Function will acquire @c term_lock before waiting on the condition
> unless @c
> skip_lock is true in which case the mutex should be owned by the
> caller of
> this function and will remain acquired after return from the function.
>
> In case of I/O thread:
> @code
> term_lock = mi->run_lock,
> term_cond = mi->stop_cond,
> slave_running = &mi->slave_runnig;
> @endcode
> in case of SQL thread:
> @code
> term_lock = mi->rli.run_lock,
> term_cond = mi->rli.stop_cond,
> slave_running = &mi->rli.slave_runnig;
> @endcode
> where @c mi is the master info structure.
>
> Function periodically wakes-up slave threads to ensure that they make
> progress. This is done using thd->awake() method (in KICK_SLAVE macro).
>
> @returns ...
> */
>
> int terminate_slave_thread(THD* thd,
> pthread_mutex_t *const term_lock,
> pthread_cond_t *const term_cond,
> volatile bool *const slave_running,
> const bool skip_lock)
>
>
> Rafal
>
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com