From: Date: February 5 2008 4:36pm Subject: bk commit into 5.0 tree (aelkin:1.2576) BUG#34305 List-Archive: http://lists.mysql.com/commits/41718 X-Bug: 34305 Message-Id: <200802051536.m15FabJ1002563@koti.dsl.inet.fi> Below is the list of changes that have just been committed into a local 5.0 repository of aelkin. When aelkin does a push these changes will be propagated to the main repository and, within 24 hours after the push, to the public repository. For information on how to access the public repository see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html ChangeSet@stripped, 2008-02-05 17:36:26+02:00, aelkin@stripped +2 -0 Bug #34305 show slave status handling segfaults when slave io is about to leave The artifact was caused by a flaw in concurrent accessing the slave's io thd by the io itself and a handling show slave status thread. Namely, show_master_info did not acquire mi->run_lock mutex that is specified for mi->io_thd member. Fixed with deploying the mutex locking and unlocking. The mutex is kept short time and without interleaving with mi->data_lock mutex. Todo: to report and fix an issue with sys_var_slave_skip_counter::{methods} seem to acquire incorrectly active_mi->rli.run_lock instead of the specified active_mi->rli.data_lock A test case is difficult to compose, so rpl_packet should continue serving as the indicator. sql/slave.cc@stripped, 2008-02-05 17:36:17+02:00, aelkin@stripped +6 -5 implementing a TODO left at 4.1 time: mending access to mi->io_thd with the specified mutex; sql/slave.h@stripped, 2008-02-05 17:36:17+02:00, aelkin@stripped +2 -2 adding a member name to the list of that run_lock guards. diff -Nrup a/sql/slave.cc b/sql/slave.cc --- a/sql/slave.cc 2007-12-20 17:07:52 +02:00 +++ b/sql/slave.cc 2008-02-05 17:36:17 +02:00 @@ -2447,14 +2447,15 @@ bool show_master_info(THD* thd, MASTER_I protocol->prepare_for_resend(); /* - TODO: we read slave_running without run_lock, whereas these variables - are updated under run_lock and not data_lock. In 5.0 we should lock - run_lock on top of data_lock (with good order). + slave_running can be accessed without run_lock but not other + non-volotile members like mi->io_thd, which is guarded by the mutex. */ + pthread_mutex_lock(&mi->run_lock); + protocol->store(mi->io_thd ? mi->io_thd->proc_info : "", &my_charset_bin); + pthread_mutex_unlock(&mi->run_lock); + pthread_mutex_lock(&mi->data_lock); pthread_mutex_lock(&mi->rli.data_lock); - - protocol->store(mi->io_thd ? mi->io_thd->proc_info : "", &my_charset_bin); protocol->store(mi->host, &my_charset_bin); protocol->store(mi->user, &my_charset_bin); protocol->store((uint32) mi->port); diff -Nrup a/sql/slave.h b/sql/slave.h --- a/sql/slave.h 2007-02-08 16:53:12 +02:00 +++ b/sql/slave.h 2008-02-05 17:36:17 +02:00 @@ -65,8 +65,8 @@ mi->rli does not either. In MASTER_INFO: run_lock, data_lock - run_lock protects all information about the run state: slave_running, and the - existence of the I/O thread (to stop/start it, you need this mutex). + run_lock protects all information about the run state: slave_running, thd + and the existence of the I/O thread to stop/start it, you need this mutex). data_lock protects some moving members of the struct: counters (log name, position) and relay log (MYSQL_LOG object).