From: Andrei Elkin
Date: March 28 2012 4:17pm
Subject: bzr push into mysql-trunk branch (andrei.elkin:3866 to 3867) Bug#12995174 Bug#13840948
List-Archive: http://lists.mysql.com/commits/143349
X-Bug: 12995174,13840948
Message-Id: <201203281617.q2SGHIju005201@mysql1000.dsl.inet.fi>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

3867 Andrei Elkin 2012-03-28
    Bug#12995174 - MTS: UNEXPECTED RECOVERY ATTEMPT ENDS WITH ER_MASTER_INFO OR ASSERTION FAILURE
    Bug#13840948 - CHANGE MASTER MODIFIES RELAY_LOG_PURGE OPTION VALUE.

    The patch covers a bunch of issues.

    0. The reported problem itself turned out to be an unnecessary
       recalculation of the rli->group_master_log_* coordinates.
       When relay-log purging is not done for Change-Master, the
       coordinates may be safely reused. Otherwise, as was the case before
       this patch, the new values may not correspond to the actual
       execution state, because the executed coordinates were set
       optimistically to the latest position that the IO thread had read.

       Fixed by skipping the recalculation and reusing the existing
       rli->group_master_log_* coordinates in the mentioned no-purging
       branches of Change-Master.

    1. In the presence of MTS recovery gaps, CHANGE MASTER now errors out
       with a new error code.

    2. Similar intolerance to MTS gaps is implemented for
       --relay-log-recovery handling, as the option is logically
       equivalent to RESET SLAVE plus CHANGE MASTER.

    3. Bug#13840948: In some of its branches change_master() overrides an
       existing startup-time option value.
       Fixed by restoring the temporarily modified value of
       --relay-log-purge.

    4. @@global.relay_log_recovery was not read-only though it should
       have been. It is now read-only.

    5. Fixed the signature of a few MTS functions to pass an indication
       that a mutex is locked.

    @ mysql-test/suite/rpl/r/rpl_parallel_change_master.result
      new results file is added.
    @ mysql-test/suite/rpl/t/rpl_parallel_change_master.test
      regression test for p.1-3 is added.
    @ mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result
      results updated.
    @ mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test
      @@global.relay_log_recovery is converted into read-only.
    @ sql/log_event.cc
      the rli->reset_notified_checkpoint() call is updated for the new
      signature.
    @ sql/rpl_rli.cc
      Relay_log_info::reset_notified_checkpoint() should not lock
      data_lock if the caller has acquired it.
    @ sql/rpl_rli.h
      reset_notified_checkpoint() signature is changed.
    @ sql/rpl_slave.cc
      change_master() does not tolerate MTS gaps and errors out;
      Workers info is removed in change_master() if there are no gaps;
      the MTS gap count is checked similarly in --relay-log-recovery
      handling, which errors out when the count is non-zero;
      the temporarily modified --relay-log-purge value is restored in
      change_master().
    @ sql/share/errmsg-utf8.txt
      New MTS-specific error to prevent Change-Master when a condition
      is met.
    @ sql/sys_vars.cc
      relay_log_recovery is officially read-only.

    added:
      mysql-test/suite/rpl/r/rpl_parallel_change_master.result
      mysql-test/suite/rpl/t/rpl_parallel_change_master.test
    modified:
      mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result
      mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test
      sql/log_event.cc
      sql/rpl_rli.cc
      sql/rpl_rli.h
      sql/rpl_slave.cc
      sql/share/errmsg-utf8.txt
      sql/sys_vars.cc

3866 Guilhem Bichot 2012-03-28
    Optimizer-trace-only changes:
    Fix for BUG#13799348 MAKE OPTIMIZER TRACE LESS/MORE VERBOSE REGARDING DATABASE / PLAN SEARCH

    1) when greedy search is adding a table to the plan, print the plan
       prefix so far (i.e. previous tables); this is to make it easier
       for the reader to figure out what the current plan is (it can get
       difficult otherwise, as explained in the commit comments of
       BUG 13685026)
    2) for a table, instead of printing two properties in some opt trace
       objects (like in "best_access_path"):
         "database": "db", "table": "tbl",
       print one: either as
         "table": "db.tbl"
       or as
         "db.tbl"
       when it's clear that it's a table (as in the "plan prefix");
       this allows removing some objects, thus decreasing the nesting
       level.
    3) don't print a table's database if this database is the same as the
       connection's database (which is, in practice, often the case when
       we debug); this should provide more compact and readable traces.

    @ mysql-test/suite/opt_trace/include/general2.inc
      verify that if table's db is not the default one, it's printed in
      the trace
    @ mysql-test/suite/opt_trace/r/bugs_no_prot_all.result
      Notice the plan_prefix:
      +   "plan_prefix": [
      +     "`t1` `table1`",
      +     "`t1` `table2`"
      +   ] /* plan_prefix */,
      The printout of both real name and alias is a happy side-effect of
      using TABLE_LIST::print(). This way one knows what table the
      optimizer is talking about.
    @ sql/item.cc
      don't print db of an Item if the flag asks so
    @ sql/opt_trace.cc
      Ask for no default db, if printing Item.
      Don't print db of a TABLE if db is the default one; for that, reuse
      the TABLE_LIST::print() logic by calling this function; as a
      consequence of this reuse, we have the little "problem" that, if
      the table is a tmp table materialized from a derived table,
      printing it now prints this long form:
        "(SELECT ...) alias_of_derived_table"
      which is deemed too verbose for printing the table in
      "best_access_path" for example, thus we add a flag
      QT_DERIVED_TABLE_ONLY_ALIAS to express that we only want to print
      the alias.
    @ sql/opt_trace2server.cc
      ask for no default db, if printing expanded query
    @ sql/sql_lex.cc
      When printing a TABLE_LIST, if flags ask, then:
      - don't print db if it's the default db
      - don't print derived table's definition (only print the alias)
    @ sql/sql_optimizer.cc
      ask for no default db, when we print items of ORDER BY one by one
      (this happens when we trace the transformation of each item, not
      when we print the clause).
    @ sql/sql_planner.cc
      deep in greedy search, print the current join prefix
      ("plan_prefix")

    modified:
      mysql-test/suite/opt_trace/include/general2.inc
      mysql-test/suite/opt_trace/r/bugs_no_prot_all.result
      mysql-test/suite/opt_trace/r/bugs_no_prot_none.result
      mysql-test/suite/opt_trace/r/bugs_ps_prot_all.result
      mysql-test/suite/opt_trace/r/bugs_ps_prot_none.result
      mysql-test/suite/opt_trace/r/charset.result
      mysql-test/suite/opt_trace/r/eq_range_statistics.result
      mysql-test/suite/opt_trace/r/filesort_pq.result
      mysql-test/suite/opt_trace/r/general2_no_prot.result
      mysql-test/suite/opt_trace/r/general2_ps_prot.result
      mysql-test/suite/opt_trace/r/general_no_prot_all.result
      mysql-test/suite/opt_trace/r/general_no_prot_none.result
      mysql-test/suite/opt_trace/r/general_ps_prot_all.result
      mysql-test/suite/opt_trace/r/general_ps_prot_none.result
      mysql-test/suite/opt_trace/r/range_no_prot.result
      mysql-test/suite/opt_trace/r/range_ps_prot.result
      mysql-test/suite/opt_trace/r/security_no_prot.result
      mysql-test/suite/opt_trace/r/security_ps_prot.result
      mysql-test/suite/opt_trace/r/subquery_no_prot.result
      mysql-test/suite/opt_trace/r/subquery_ps_prot.result
      mysql-test/suite/opt_trace/r/temp_table.result
      sql/item.cc
      sql/mysqld.h
      sql/opt_trace.cc
      sql/opt_trace2server.cc
      sql/sql_lex.cc
      sql/sql_lex.h
      sql/sql_optimizer.cc
      sql/sql_planner.cc

=== added file 'mysql-test/suite/rpl/r/rpl_parallel_change_master.result'
--- a/mysql-test/suite/rpl/r/rpl_parallel_change_master.result 1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/rpl/r/rpl_parallel_change_master.result 2012-03-28 15:24:17 +0000
@@ -0,0 +1,57 @@
+include/master-slave.inc
+Warnings:
+Note 1756 Sending passwords in plain text without SSL/TLS is extremely insecure.
+Note 1757 Storing MySQL user name or password information in the master.info repository is not secure and is therefore not recommended. Please see the MySQL Manual for more about this issue and possible alternatives.
+[connection master]
+call mtr.add_suppression("Slave SQL: Could not execute Write_rows event on table d1.t1; Duplicate entry '13' for key 'a'");
+call mtr.add_suppression("Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state.");
+include/stop_slave.inc
+SET @save.slave_parallel_workers=@@global.slave_parallel_workers;
+SET @@global.slave_parallel_workers=2;
+include/start_slave.inc
+CREATE DATABASE d1;
+CREATE DATABASE d2;
+CREATE TABLE d1.t1 (a int unique) ENGINE=INNODB;
+CREATE TABLE d2.t1 (a int unique) ENGINE=INNODB;
+INSERT INTO d1.t1 VALUES (1);
+FLUSH LOGS;
+include/stop_slave.inc
+CHANGE MASTER TO MASTER_DELAY=5;
+include/start_slave.inc
+INSERT INTO d1.t1 VALUES (3);
+INSERT INTO d1.t1 VALUES (5);
+FLUSH LOGS;
+include/stop_slave.inc
+CHANGE MASTER TO RELAY_LOG_FILE=FILE, RELAY_LOG_POS= POS;
+include/start_slave.inc
+include/stop_slave.inc
+CHANGE MASTER TO RELAY_LOG_FILE=FILE, RELAY_LOG_POS= POS, MASTER_DELAY=0;
+include/start_slave.inc
+BEGIN;
+INSERT INTO d1.t1 VALUES (13);
+INSERT INTO d1.t1 VALUES (6);
+INSERT INTO d2.t1 VALUES (7);
+INSERT INTO d1.t1 VALUES (13);
+INSERT INTO d2.t1 VALUES (8);
+INSERT INTO d2.t1 VALUES (9);
+COMMIT;
+include/wait_for_slave_sql_error.inc [errno=1062]
+include/stop_slave_io.inc
+CHANGE MASTER TO RELAY_LOG_FILE=FILE, RELAY_LOG_POS= POS;
+ERROR HY000: CHANGE MASTER cannot be executed when the slave was stopped with an error or killed in MTS mode. Consider using RESET SLAVE or START SLAVE UNTIL.
+SET @@global.slave_parallel_workers= @save.slave_parallel_workers;
+include/rpl_restart_server.inc [server_number=2 parameters: --relay-log-recovery --skip-slave-start]
+SELECT @@global.relay_log_recovery as 'must be ON';
+must be ON
+1
+call mtr.add_suppression("--relay-log-recovery cannot be executed when the slave was stopped with an error or killed in MTS mode; consider using RESET SLAVE or restart the server with --relay-log-recovery = 0");
+call mtr.add_suppression("Failed to initialize the master info structure");
+include/rpl_restart_server.inc [server_number=2 parameters: --skip-slave-start]
+SELECT @@global.relay_log_recovery as 'must be OFF';
+must be OFF
+0
+DELETE FROM d1.t1 WHERE a = 13;
+include/start_slave.inc
+DROP DATABASE d1;
+DROP DATABASE d2;
+include/rpl_end.inc

=== added file 'mysql-test/suite/rpl/t/rpl_parallel_change_master.test'
--- a/mysql-test/suite/rpl/t/rpl_parallel_change_master.test 1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/rpl/t/rpl_parallel_change_master.test 2012-03-28 15:24:17 +0000
@@ -0,0 +1,119 @@
+#
+# Test verifies MTS behaviour with regard to
+# Change-Master.
+#
+# Related bugs:
+# Bug 12995174 - MTS: UNEXPECTED RECOVERY ATTEMPT ENDS WITH ER_MASTER_INFO OR ASSERTION
+
+--source include/master-slave.inc
+# The test for bug#12995174 is not format-specific but uses sleep
+# so it made to be run in ROW format that is the way the bug is reported.
+--source include/have_binlog_format_row.inc
+--source include/have_innodb.inc
+
+--connection slave
+call mtr.add_suppression("Slave SQL: Could not execute Write_rows event on table d1.t1; Duplicate entry '13' for key 'a'");
+call mtr.add_suppression("Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state.");
+--source include/stop_slave.inc
+SET @save.slave_parallel_workers=@@global.slave_parallel_workers;
+SET @@global.slave_parallel_workers=2;
+--source include/start_slave.inc
+
+--connection master
+
+CREATE DATABASE d1;
+CREATE DATABASE d2;
+CREATE TABLE d1.t1 (a int unique) ENGINE=INNODB;
+CREATE TABLE d2.t1 (a int unique) ENGINE=INNODB;
+
+INSERT INTO d1.t1 VALUES (1);
+FLUSH LOGS;
+
+--sync_slave_with_master
+
+--source include/stop_slave.inc
+CHANGE MASTER TO MASTER_DELAY=5;
+--source include/start_slave.inc
+
+--connection master
+INSERT INTO d1.t1 VALUES (3);
+--sleep 3
+INSERT INTO d1.t1 VALUES (5);
+FLUSH LOGS;
+
+--connection slave
+--source include/stop_slave.inc
+
+let $relay_file = query_get_value( SHOW SLAVE STATUS, Relay_Log_File, 1 );
+let $relay_pos = query_get_value( SHOW SLAVE STATUS, Relay_Log_Pos, 1 );
+
+--replace_regex /RELAY_LOG_FILE=[^,]+/RELAY_LOG_FILE=FILE/ /RELAY_LOG_POS=[0-9]+/ RELAY_LOG_POS= POS/
+eval CHANGE MASTER TO RELAY_LOG_FILE='$relay_file', RELAY_LOG_POS=$relay_pos;
+
+--source include/start_slave.inc
+--sleep 5
+--source include/stop_slave.inc
+
+let $relay_file = query_get_value( SHOW SLAVE STATUS, Relay_Log_File, 1 );
+let $relay_pos = query_get_value( SHOW SLAVE STATUS, Relay_Log_Pos, 1 );
+--replace_regex /RELAY_LOG_FILE=[^,]+/RELAY_LOG_FILE=FILE/ /RELAY_LOG_POS=[0-9]+/ RELAY_LOG_POS= POS/
+eval CHANGE MASTER TO RELAY_LOG_FILE='$relay_file', RELAY_LOG_POS=$relay_pos, MASTER_DELAY=0;
+
+--source include/start_slave.inc
+BEGIN;
+INSERT INTO d1.t1 VALUES (13); # to cause the dup key error
+# change-master with gaps
+--connection master
+
+INSERT INTO d1.t1 VALUES (6);
+INSERT INTO d2.t1 VALUES (7);
+INSERT INTO d1.t1 VALUES (13);
+INSERT INTO d2.t1 VALUES (8);
+INSERT INTO d2.t1 VALUES (9);
+
+--connection slave
+COMMIT; # worker executing (13) errors out
+
+--let $slave_sql_errno= 1062
+--source include/wait_for_slave_sql_error.inc
+
+--source include/stop_slave_io.inc
+
+let $relay_file = query_get_value( SHOW SLAVE STATUS, Relay_Log_File, 1 );
+let $relay_pos = query_get_value( SHOW SLAVE STATUS, Relay_Log_Pos, 1 );
+--replace_regex /RELAY_LOG_FILE=[^,]+/RELAY_LOG_FILE=FILE/ /RELAY_LOG_POS=[0-9]+/ RELAY_LOG_POS= POS/
+--error ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS
+eval CHANGE MASTER TO RELAY_LOG_FILE='$relay_file', RELAY_LOG_POS=$relay_pos;
+
+SET @@global.slave_parallel_workers= @save.slave_parallel_workers; # cleanup
+
+#
+# --relay-log-recovery= 1 and MTS gaps is handled similarly to Change-Master
+#
+--let $rpl_server_number= 2
+--let $rpl_server_parameters= --relay-log-recovery --skip-slave-start
+--source include/rpl_restart_server.inc
+
+--connection slave
+SELECT @@global.relay_log_recovery as 'must be ON';
+call mtr.add_suppression("--relay-log-recovery cannot be executed when the slave was stopped with an error or killed in MTS mode; consider using RESET SLAVE or restart the server with --relay-log-recovery = 0");
+call mtr.add_suppression("Failed to initialize the master info structure");
+
+--let $rpl_server_number= 2
+--let $rpl_server_parameters= --skip-slave-start
+--source include/rpl_restart_server.inc
+
+SELECT @@global.relay_log_recovery as 'must be OFF';
+--connection slave
+DELETE FROM d1.t1 WHERE a = 13;
+--source include/start_slave.inc
+
+#
+# cleanup
+#
+--connection master
+DROP DATABASE d1;
+DROP DATABASE d2;
+--sync_slave_with_master
+
+--source include/rpl_end.inc

=== modified file 'mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result'
--- a/mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result 2010-01-29 06:33:00 +0000
+++ b/mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result 2012-03-28 15:24:17 +0000
@@ -1,7 +1,3 @@
-SET @start_global_value = @@global.relay_log_recovery;
-SELECT @start_global_value;
-@start_global_value
-0
 select @@global.relay_log_recovery;
 @@global.relay_log_recovery
 0
@@ -20,16 +16,7 @@ select * from information_schema.session
 VARIABLE_NAME VARIABLE_VALUE
 RELAY_LOG_RECOVERY OFF
 set global relay_log_recovery=1;
-select @@global.relay_log_recovery;
-@@global.relay_log_recovery
-1
-select * from information_schema.global_variables where variable_name='relay_log_recovery';
-VARIABLE_NAME VARIABLE_VALUE
-RELAY_LOG_RECOVERY ON
-select * from information_schema.session_variables where variable_name='relay_log_recovery';
-VARIABLE_NAME VARIABLE_VALUE
-RELAY_LOG_RECOVERY ON
-set global relay_log_recovery=OFF;
+ERROR HY000: Variable 'relay_log_recovery' is a read only variable
 select @@global.relay_log_recovery;
 @@global.relay_log_recovery
 0
@@ -39,15 +26,3 @@ RELAY_LOG_RECOVERY OFF
 select * from information_schema.session_variables where variable_name='relay_log_recovery';
 VARIABLE_NAME VARIABLE_VALUE
 RELAY_LOG_RECOVERY OFF
-set session relay_log_recovery=1;
-ERROR HY000: Variable 'relay_log_recovery' is a GLOBAL variable and should be set with SET GLOBAL
-set global relay_log_recovery=1.1;
-ERROR 42000: Incorrect argument type to variable 'relay_log_recovery'
-set global relay_log_recovery=1e1;
-ERROR 42000: Incorrect argument type to variable 'relay_log_recovery'
-set global relay_log_recovery="foo";
-ERROR 42000: Variable 'relay_log_recovery' can't be set to the value of 'foo'
-SET @@global.relay_log_recovery = @start_global_value;
-SELECT @@global.relay_log_recovery;
-@@global.relay_log_recovery
-0

=== modified file 'mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test'
--- a/mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test 2010-01-29 06:33:00 +0000
+++ b/mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test 2012-03-28 15:24:17 +0000
@@ -6,9 +6,6 @@
 --source include/not_embedded.inc
-SET @start_global_value = @@global.relay_log_recovery;
-SELECT @start_global_value;
-
 #
 # exists as global only
 #
@@ -21,29 +18,12 @@ select * from information_schema.global_
 select * from information_schema.session_variables where variable_name='relay_log_recovery';
 #
-# show that it's writable
+# show that it's read-only
 #
+--error ER_INCORRECT_GLOBAL_LOCAL_VAR
 set global relay_log_recovery=1;
 select @@global.relay_log_recovery;
 select * from information_schema.global_variables where variable_name='relay_log_recovery';
 select * from information_schema.session_variables where variable_name='relay_log_recovery';
-set global relay_log_recovery=OFF;
-select @@global.relay_log_recovery;
-select * from information_schema.global_variables where variable_name='relay_log_recovery';
-select * from information_schema.session_variables where variable_name='relay_log_recovery';
---error ER_GLOBAL_VARIABLE
-set session relay_log_recovery=1;
-
-#
-# incorrect types
-#
---error ER_WRONG_TYPE_FOR_VAR
-set global relay_log_recovery=1.1;
---error ER_WRONG_TYPE_FOR_VAR
-set global relay_log_recovery=1e1;
---error ER_WRONG_VALUE_FOR_VAR
-set global relay_log_recovery="foo";
-SET @@global.relay_log_recovery = @start_global_value;
-SELECT @@global.relay_log_recovery;

=== modified file 'sql/log_event.cc'
--- a/sql/log_event.cc 2012-03-27 00:14:24 +0000
+++ b/sql/log_event.cc 2012-03-28 15:24:17 +0000
@@ -6529,7 +6529,7 @@ int Rotate_log_event::do_update_pos(Rela
                         (ulong) rli->get_group_master_log_pos()));
     mysql_mutex_unlock(&rli->data_lock);
     if (rli->is_parallel_exec())
-      rli->reset_notified_checkpoint(0, when.tv_sec + (time_t) exec_time);
+      rli->reset_notified_checkpoint(0, when.tv_sec + (time_t) exec_time, false);
     /*
       Reset thd->variables.option_bits and sql_mode etc, because
       this could be the signal of

=== modified file 'sql/rpl_rli.cc'
--- a/sql/rpl_rli.cc 2012-03-27 08:43:25 +0000
+++ b/sql/rpl_rli.cc 2012-03-28 15:24:17 +0000
@@ -197,8 +197,10 @@ void Relay_log_info::reset_notified_rela
    checkpoint change.
    @param new_ts new seconds_behind_master timestamp value
    unless zero. Zero could be due to FD event.
+   @param locked true if caller has locked @c data_lock
 */
-void Relay_log_info::reset_notified_checkpoint(ulong shift, time_t new_ts= 0)
+void Relay_log_info::reset_notified_checkpoint(ulong shift, time_t new_ts,
+                                               bool locked)
 {
   /*
     If this is not a parallel execution we return immediately.
@@ -238,9 +240,11 @@ void Relay_log_info::reset_notified_chec
   if (new_ts)
   {
-    mysql_mutex_lock(&data_lock);
+    if (!locked)
+      mysql_mutex_lock(&data_lock);
     last_master_timestamp= new_ts;
-    mysql_mutex_unlock(&data_lock);
+    if (!locked)
+      mysql_mutex_unlock(&data_lock);
   }
 }
@@ -1745,7 +1749,8 @@ a file name for --relay-log-index option
 err:
   inited= 0;
-  sql_print_error("%s.", msg);
+  if (msg)
+    sql_print_error("%s.", msg);
   relay_log.close(LOG_CLOSE_INDEX | LOG_CLOSE_STOP_EVENT);
   DBUG_RETURN(error);
 }

=== modified file 'sql/rpl_rli.h'
--- a/sql/rpl_rli.h 2012-03-27 08:43:25 +0000
+++ b/sql/rpl_rli.h 2012-03-28 15:24:17 +0000
@@ -621,7 +621,7 @@ public:
     Coordinator notifies Workers about this event. Coordinator and Workers
     maintain a bitmap of executed group that is reset with a new checkpoint.
   */
-  void reset_notified_checkpoint(ulong, time_t);
+  void reset_notified_checkpoint(ulong, time_t, bool);
   /*
    * End of MTS section ******************************************************/

=== modified file 'sql/rpl_slave.cc'
--- a/sql/rpl_slave.cc 2012-03-27 08:43:25 +0000
+++ b/sql/rpl_slave.cc 2012-03-28 15:24:17 +0000
@@ -399,6 +399,15 @@ int init_recovery(Master_info* mi, const
       We need to improve this. /Alfranio.
     */
     error= mts_recovery_groups(rli, &rli->recovery_groups);
+    if (rli->mts_recovery_group_cnt)
+    {
+      error= 1;
+      sql_print_error("--relay-log-recovery cannot be executed when the slave "
+                      "was stopped with an error or killed in MTS mode; "
+                      "consider using RESET SLAVE or restart the server "
+                      "with --relay-log-recovery = 0 followed by "
+                      "START SLAVE");
+    }
   }
   group_master_log_name= const_cast<char *>(rli->get_group_master_log_name());
@@ -4737,7 +4746,7 @@ bool mts_checkpoint_routine(Relay_log_in
     cnt is zero. This value means that the checkpoint information
     will be completely reset.
   */
-  rli->reset_notified_checkpoint(cnt, rli->gaq->lwm.ts);
+  rli->reset_notified_checkpoint(cnt, rli->gaq->lwm.ts, locked);
   /* end-of "Coordinator::"commit_positions" */
@@ -7673,7 +7682,8 @@ err:
 }
 /**
-  Execute a CHANGE MASTER statement.
+  Execute a CHANGE MASTER statement. MTS workers info tables data are removed
+  in the successful branch (i.e. there are no gaps in the execution history).
   @param thd Pointer to THD object for the client thread executing the
   statement.
@@ -7690,11 +7700,13 @@ bool change_master(THD* thd, Master_info
   const char* errmsg= 0;
   bool need_relay_log_purge= 1;
   char *var_master_log_name= NULL, *var_group_master_log_name= NULL;
-  bool ret= FALSE;
+  bool ret= false;
   char saved_host[HOSTNAME_LENGTH + 1], saved_bind_addr[HOSTNAME_LENGTH + 1];
   uint saved_port= 0;
   char saved_log_name[FN_REFLEN];
   my_off_t saved_log_pos= 0;
+  my_bool save_relay_log_purge= relay_log_purge;
+  bool mts_remove_workers= false;
   DBUG_ENTER("change_master");
@@ -7729,7 +7741,28 @@ bool change_master(THD* thd, Master_info
       ret= true;
       goto err;
     }
+    if (mi->rli->mts_recovery_group_cnt)
+    {
+      /*
+        Change-Master can't be done if there is a mts group gap.
+        That requires mts-recovery which START SLAVE provides.
+      */
+      DBUG_ASSERT(mi->rli->recovery_parallel_workers);
+      my_message(ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS,
+                 ER(ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS), MYF(0));
+      ret= true;
+      goto err;
+    }
+    else
+    {
+      /*
+        Lack of mts group gaps makes Workers info stale
+        regardless of need_relay_log_purge computation.
+      */
+      if (mi->rli->recovery_parallel_workers)
+        mts_remove_workers= true;
+    }
     /*
       We cannot specify auto position and set either the coordinates
       on master or slave. If we try to do so, an error message is
@@ -7982,8 +8015,7 @@ bool change_master(THD* thd, Master_info
     THD_STAGE_INFO(thd, stage_purging_old_relay_logs);
     if (mi->rli->purge_relay_logs(thd,
                                   0 /* not only reset, but also reinit */,
-                                  &errmsg) ||
-        remove_workers(mi->rli))
+                                  &errmsg))
     {
       my_error(ER_RELAY_LOG_FAIL, MYF(0), errmsg);
       ret= TRUE;
@@ -8005,6 +8037,7 @@ bool change_master(THD* thd, Master_info
       goto err;
     }
   }
+  relay_log_purge= save_relay_log_purge;
   /*
     Coordinates in rli were spoilt by the 'if (need_relay_log_purge)' block,
@@ -8016,10 +8049,12 @@ bool change_master(THD* thd, Master_info
     ''/0: we have lost all copies of the original good coordinates.
     That's why we always save good coords in rli.
   */
-  mi->rli->set_group_master_log_pos(mi->get_master_log_pos());
-  DBUG_PRINT("info", ("master_log_pos: %lu", (ulong) mi->get_master_log_pos()));
-  mi->rli->set_group_master_log_name(mi->get_master_log_name());
-
+  if (need_relay_log_purge)
+  {
+    mi->rli->set_group_master_log_pos(mi->get_master_log_pos());
+    DBUG_PRINT("info", ("master_log_pos: %lu", (ulong) mi->get_master_log_pos()));
+    mi->rli->set_group_master_log_name(mi->get_master_log_name());
+  }
   var_group_master_log_name= const_cast<char *>(mi->rli->get_group_master_log_name());
   if (!var_group_master_log_name[0]) // uninitialized case
     mi->rli->set_group_master_log_pos(0);
@@ -8057,7 +8092,11 @@ bool change_master(THD* thd, Master_info
 err:
   unlock_slave_threads(mi);
   if (ret == FALSE)
+  {
+    if (mts_remove_workers)
+      remove_workers(mi->rli);
     my_ok(thd);
+  }
   DBUG_RETURN(ret);
 }
 /**

=== modified file 'sql/share/errmsg-utf8.txt'
--- a/sql/share/errmsg-utf8.txt 2012-03-21 08:42:00 +0000
+++ b/sql/share/errmsg-utf8.txt 2012-03-28 15:24:17 +0000
@@ -6740,6 +6740,8 @@ ER_UNKNOWN_ALTER_ALGORITHM
 ER_UNKNOWN_ALTER_LOCK
   eng "Unknown LOCK type '%s'"
+ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS
+  eng "CHANGE MASTER cannot be executed when the slave was stopped with an error or killed in MTS mode. Consider using RESET SLAVE or START SLAVE UNTIL."
 #
 # End of 5.6 error messages.
 #

=== modified file 'sql/sys_vars.cc'
--- a/sql/sys_vars.cc 2012-03-21 21:19:11 +0000
+++ b/sql/sys_vars.cc 2012-03-28 15:24:17 +0000
@@ -3650,7 +3650,7 @@ static Sys_var_mybool Sys_relay_log_reco
        "right after the database startup, which means that the IO Thread "
        "starts re-fetching from the master right after the last transaction "
        "processed",
-       GLOBAL_VAR(relay_log_recovery), CMD_LINE(OPT_ARG), DEFAULT(FALSE));
+       READ_ONLY GLOBAL_VAR(relay_log_recovery), CMD_LINE(OPT_ARG), DEFAULT(FALSE));
 static Sys_var_mybool Sys_slave_allow_batching(
        "slave_allow_batching", "Allow slave to batch requests",

No bundle (reason: useless for push emails).
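
Note on the reset_notified_checkpoint() change: the new boolean argument tells the function whether the caller already holds data_lock, so that the mutex is not acquired a second time. The following is only a minimal standalone sketch of that pattern, not the server code. RelayLogInfoSketch and update_under_lock are hypothetical names, std::mutex stands in for mysql_mutex_t, and the "shift" argument is left out.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>

// Hypothetical stand-in for Relay_log_info; only the fields needed here.
struct RelayLogInfoSketch {
  std::mutex data_lock;                   // plays the role of rli->data_lock
  std::time_t last_master_timestamp = 0;

  // Follows the patched idea: `locked` is true when the caller already
  // holds data_lock, so the function must not lock it again.
  void reset_notified_checkpoint(std::time_t new_ts, bool locked) {
    if (new_ts == 0)
      return;                             // zero leaves the timestamp untouched
    if (!locked)
      data_lock.lock();
    last_master_timestamp = new_ts;
    if (!locked)
      data_lock.unlock();
  }
};

// A caller that already owns the lock (assumed here for illustration)
// passes locked=true to avoid locking the same mutex twice.
inline void update_under_lock(RelayLogInfoSketch &rli, std::time_t ts) {
  std::lock_guard<std::mutex> guard(rli.data_lock);
  rli.reset_notified_checkpoint(ts, /*locked=*/true);
  assert(rli.last_master_timestamp == ts);
}
```

A caller that does not hold the lock passes false, as Rotate_log_event::do_update_pos() does in the patch after it has released data_lock. Passing the flag keeps one function body for both call sites, at the cost of trusting the caller to report its lock state correctly.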