Below is the list of changes that have just been committed into a local
5.0 repository of aelkin. When aelkin does a push these changes
will be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
ChangeSet@stripped, 2008-02-05 17:36:26+02:00, aelkin@stripped +2 -0
Bug #34305 show slave status handling segfaults when slave io is about
to leave
The artifact was caused by
a flaw in concurrent accessing the slave's io thd by
the io itself and a handling show slave status thread.
Namely, show_master_info did not acquire mi->run_lock mutex that is
specified for mi->io_thd member.
Fixed with deploying the mutex locking and unlocking. The mutex is kept
short time and without interleaving with mi->data_lock mutex.
Todo: to report and fix an issue with
sys_var_slave_skip_counter::{methods}
seem to acquire incorrectly
active_mi->rli.run_lock
instead of the specified
active_mi->rli.data_lock
A test case is difficult to compose, so rpl_packet should continue serving
as the indicator.
sql/slave.cc@stripped, 2008-02-05 17:36:17+02:00, aelkin@stripped +6 -5
implementing a TODO left at 4.1 time:
mending access to mi->io_thd with the specified mutex;
sql/slave.h@stripped, 2008-02-05 17:36:17+02:00, aelkin@stripped +2 -2
adding a member name to the list of that run_lock guards.
diff -Nrup a/sql/slave.cc b/sql/slave.cc
--- a/sql/slave.cc 2007-12-20 17:07:52 +02:00
+++ b/sql/slave.cc 2008-02-05 17:36:17 +02:00
@@ -2447,14 +2447,15 @@ bool show_master_info(THD* thd, MASTER_I
protocol->prepare_for_resend();
/*
- TODO: we read slave_running without run_lock, whereas these variables
- are updated under run_lock and not data_lock. In 5.0 we should lock
- run_lock on top of data_lock (with good order).
+ slave_running can be accessed without run_lock but not other
+ non-volotile members like mi->io_thd, which is guarded by the mutex.
*/
+ pthread_mutex_lock(&mi->run_lock);
+ protocol->store(mi->io_thd ? mi->io_thd->proc_info : "",
&my_charset_bin);
+ pthread_mutex_unlock(&mi->run_lock);
+
pthread_mutex_lock(&mi->data_lock);
pthread_mutex_lock(&mi->rli.data_lock);
-
- protocol->store(mi->io_thd ? mi->io_thd->proc_info : "",
&my_charset_bin);
protocol->store(mi->host, &my_charset_bin);
protocol->store(mi->user, &my_charset_bin);
protocol->store((uint32) mi->port);
diff -Nrup a/sql/slave.h b/sql/slave.h
--- a/sql/slave.h 2007-02-08 16:53:12 +02:00
+++ b/sql/slave.h 2008-02-05 17:36:17 +02:00
@@ -65,8 +65,8 @@
mi->rli does not either.
In MASTER_INFO: run_lock, data_lock
- run_lock protects all information about the run state: slave_running, and the
- existence of the I/O thread (to stop/start it, you need this mutex).
+ run_lock protects all information about the run state: slave_running, thd
+ and the existence of the I/O thread to stop/start it, you need this mutex).
data_lock protects some moving members of the struct: counters (log name,
position) and relay log (MYSQL_LOG object).