Hi Andrei!
Patch approved.
/matz
Andrei Elkin wrote:
> Below is the list of changes that have just been committed into a local
> 5.0 repository of aelkin. When aelkin does a push these changes
> will be propagated to the main repository and, within 24 hours after the
> push, to the public repository.
> For information on how to access the public repository
> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>
> ChangeSet@stripped, 2008-02-05 17:36:26+02:00, aelkin@stripped +2 -0
> Bug #34305 show slave status handling segfaults when slave io is about
> to leave
>
> The artifact was caused by
> a flaw in concurrent accessing the slave's io thd by
> the io itself and a handling show slave status thread.
> Namely, show_master_info did not acquire mi->run_lock mutex that is
> specified for mi->io_thd member.
>
> Fixed with deploying the mutex locking and unlocking. The mutex is kept
> short time and without interleaving with mi->data_lock mutex.
>
> Todo: to report and fix an issue with
> sys_var_slave_skip_counter::{methods}
> seem to acquire incorrectly
> active_mi->rli.run_lock
> instead of the specified
> active_mi->rli.data_lock
>
> A test case is difficult to compose, so rpl_packet should continue serving
> as the indicator.
>
> sql/slave.cc@stripped, 2008-02-05 17:36:17+02:00, aelkin@stripped +6 -5
> implementing a TODO left at 4.1 time:
> mending access to mi->io_thd with the specified mutex;
>
> sql/slave.h@stripped, 2008-02-05 17:36:17+02:00, aelkin@stripped +2 -2
> adding a member name to the list of that run_lock guards.
>
> diff -Nrup a/sql/slave.cc b/sql/slave.cc
> --- a/sql/slave.cc 2007-12-20 17:07:52 +02:00
> +++ b/sql/slave.cc 2008-02-05 17:36:17 +02:00
> @@ -2447,14 +2447,15 @@ bool show_master_info(THD* thd, MASTER_I
> protocol->prepare_for_resend();
>
> /*
> - TODO: we read slave_running without run_lock, whereas these variables
> - are updated under run_lock and not data_lock. In 5.0 we should lock
> - run_lock on top of data_lock (with good order).
> + slave_running can be accessed without run_lock but not other
> + non-volotile members like mi->io_thd, which is guarded by the mutex.
> */
> + pthread_mutex_lock(&mi->run_lock);
> + protocol->store(mi->io_thd ? mi->io_thd->proc_info : "",
> &my_charset_bin);
> + pthread_mutex_unlock(&mi->run_lock);
> +
> pthread_mutex_lock(&mi->data_lock);
> pthread_mutex_lock(&mi->rli.data_lock);
> -
> - protocol->store(mi->io_thd ? mi->io_thd->proc_info : "",
> &my_charset_bin);
> protocol->store(mi->host, &my_charset_bin);
> protocol->store(mi->user, &my_charset_bin);
> protocol->store((uint32) mi->port);
> diff -Nrup a/sql/slave.h b/sql/slave.h
> --- a/sql/slave.h 2007-02-08 16:53:12 +02:00
> +++ b/sql/slave.h 2008-02-05 17:36:17 +02:00
> @@ -65,8 +65,8 @@
> mi->rli does not either.
>
> In MASTER_INFO: run_lock, data_lock
> - run_lock protects all information about the run state: slave_running, and the
> - existence of the I/O thread (to stop/start it, you need this mutex).
> + run_lock protects all information about the run state: slave_running, thd
> + and the existence of the I/O thread to stop/start it, you need this mutex).
> data_lock protects some moving members of the struct: counters (log name,
> position) and relay log (MYSQL_LOG object).
>
>
>
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com