List: Commits
From: Andrei Elkin
Date: March 28 2012 4:17pm
Subject: bzr push into mysql-trunk branch (andrei.elkin:3866 to 3867) Bug#12995174
Bug#13840948
 3867 Andrei Elkin	2012-03-28
      Bug#12995174 - MTS: UNEXPECTED RECOVERY ATTEMPT ENDS WITH ER_MASTER_INFO OR ASSERTION FAILURE
      Bug#13840948 - CHANGE MASTER MODIFIES RELAY_LOG_PURGE OPTION VALUE.
      
      The patch covers several issues.
      
      0. The reported problem itself turned out to be an unnecessary recalculation
         of the rli->group_master_log_* coordinates.
         When Change-Master does not purge the relay logs, the coordinates
         may be safely reused.
         If they are recalculated instead, as was done before this patch, the new
         values may not correspond to the actual execution state, because the
         executed coordinates were set optimistically to the latest position
         that the IO thread had read.
      
         Fixed by skipping the recalculation and reusing the existing
         rli->group_master_log_* coordinates in the no-purge branches of Change-Master.
      
      1. In the presence of MTS recovery gaps, CHANGE MASTER now errors out with
         a new error code.
      
      2. Similar intolerance to MTS gaps is implemented for --relay-log-recovery
         handling, as the option is logically equivalent to RESET SLAVE plus
         CHANGE MASTER.
         
      3. Bug#13840948: In some of its branches change_master() overrides the value
         of an existing startup-time option.
      
         Fixed by restoring the temporarily modified --relay-log-purge value.
      
      4. @@global.relay_log_recovery was not read-only, though it should have been.
      
         Fixed by making the variable read-only.
      
      5. Fixed the signature of a few MTS functions to pass an indication that
         a mutex is already locked.
     @ mysql-test/suite/rpl/r/rpl_parallel_change_master.result
        new results file is added.
     @ mysql-test/suite/rpl/t/rpl_parallel_change_master.test
        regression test for p.1-3 is added.
     @ mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result
        results updated.
     @ mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test
        @@global.relay_log_recovery is converted into read-only.
     @ sql/log_event.cc
        rli->reset_notified_checkpoint() call is updated for the new signature.
     @ sql/rpl_rli.cc
        Relay_log_info::reset_notified_checkpoint() should not lock data_lock
        if the caller has already acquired it.
     @ sql/rpl_rli.h
        reset_notified_checkpoint() signature is changed.
     @ sql/rpl_slave.cc
        change_master() does not tolerate MTS gaps and errors out;
        Workers info is removed in change_master() if there are no gaps;
        MTS gaps are checked similarly to change_master(), so that
        --relay-log-recovery handling errors out when the gap count is non-zero;
        the temporarily modified --relay-log-purge value is restored in change_master().
     @ sql/share/errmsg-utf8.txt
        New MTS specific error to prevent Change-Master when a condition is met.
     @ sql/sys_vars.cc
        relay_log_recovery is officially read-only.
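
For readers who want the control flow of points 0, 1 and 3 in one place, here is a minimal sketch. It is not the server code: `Coordinates`, `SlaveState`, `ChangeResult` and `change_master_sketch` are hypothetical stand-ins invented for illustration, and the real change_master() does considerably more.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins for the server structures.
struct Coordinates {
  std::string log_name;
  unsigned long long log_pos;
};

struct SlaveState {
  Coordinates io_read;                 // latest position the IO thread has read
  Coordinates group_executed;          // last group known to be executed
  unsigned mts_gap_count;              // non-zero when MTS left execution gaps
  unsigned recovery_parallel_workers;  // workers recorded by the previous run
  bool relay_log_purge;                // startup-time option value
  bool workers_info_present;           // stale Workers info still stored
};

enum class ChangeResult { Ok, RejectedMtsGaps };

ChangeResult change_master_sketch(SlaveState &s, bool need_relay_log_purge) {
  // Point 1: refuse to proceed while execution gaps exist.
  if (s.mts_gap_count != 0) {
    assert(s.recovery_parallel_workers != 0);  // gaps imply a parallel run
    return ChangeResult::RejectedMtsGaps;
  }
  // Without gaps, any stored Workers info is stale.
  const bool remove_workers = s.recovery_parallel_workers != 0;

  // Point 3: remember the option so a temporary override does not leak out.
  const bool saved_purge = s.relay_log_purge;
  s.relay_log_purge = need_relay_log_purge;

  // Point 0: recompute the executed coordinates only when logs are purged;
  // otherwise keep the existing ones.
  if (need_relay_log_purge) s.group_executed = s.io_read;

  s.relay_log_purge = saved_purge;
  if (remove_workers) s.workers_info_present = false;
  return ChangeResult::Ok;
}
```

Calling `change_master_sketch(s, false)` leaves `s.group_executed` untouched, which is the behaviour point 0 asks for; with `s.mts_gap_count` non-zero the call returns `ChangeResult::RejectedMtsGaps` and changes nothing.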

    added:
      mysql-test/suite/rpl/r/rpl_parallel_change_master.result
      mysql-test/suite/rpl/t/rpl_parallel_change_master.test
    modified:
      mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result
      mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test
      sql/log_event.cc
      sql/rpl_rli.cc
      sql/rpl_rli.h
      sql/rpl_slave.cc
      sql/share/errmsg-utf8.txt
      sql/sys_vars.cc
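
The "mutex is locked" indication from point 5 can be illustrated in isolation. The sketch below uses std::mutex rather than the server's mysql_mutex_t, and `RelayInfoSketch` is a hypothetical class, so treat it as an illustration of the pattern under those assumptions and not as the patched function.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>

// Hypothetical, simplified stand-in for Relay_log_info.
class RelayInfoSketch {
 public:
  std::mutex data_lock;

  // `locked` is true when the caller already holds data_lock; the callee
  // must then not lock it again, because std::mutex is not recursive.
  void reset_notified_checkpoint(std::time_t new_ts, bool locked) {
    if (new_ts == 0) return;  // zero means "keep the current timestamp"
    std::unique_lock<std::mutex> guard(data_lock, std::defer_lock);
    if (!locked) guard.lock();
    assert(locked || guard.owns_lock());
    last_master_timestamp_ = new_ts;
    // guard unlocks on scope exit only if this function took the lock.
  }

  std::time_t last_master_timestamp() const { return last_master_timestamp_; }

 private:
  std::time_t last_master_timestamp_ = 0;
};
```

A caller that already holds `data_lock` passes `true`; every other caller passes `false` and lets the function take the lock itself. The flag is a promise made by the caller, so a wrong value either deadlocks or leaves the write unprotected.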
 3866 Guilhem Bichot	2012-03-28
      Optimizer-trace-only changes:
      Fix for BUG#13799348 MAKE OPTIMIZER TRACE LESS/MORE VERBOSE REGARDING DATABASE / PLAN SEARCH
      
      1) when greedy search is adding a table to the plan,
      print the plan prefix so far (i.e. previous tables);
      this is to make it easier for the reader to figure out
      what the current plan is (it can get difficult
      otherwise, as explained in the commit comments of
      BUG 13685026)
      2) for a table, instead of printing two properties in some opt trace objects
      (like in "best_access_path"):
        "database": "db",
        "table": "tbl",
      print one: either as
        "table": "db.tbl"
      or as "db.tbl" when it's clear that it's a table (as in the "plan prefix")
      this allows removing some objects, which decreases the nesting level.
      3) don't print a table's database if this database is the same as
      the connection's database (which is, in practice, often the case when
      we debug); this should provide more compact and readable traces.
     @ mysql-test/suite/opt_trace/include/general2.inc
        verify that if table's db is not the default one, it's printed in the trace
     @ mysql-test/suite/opt_trace/r/bugs_no_prot_all.result
        Notice the plan_prefix:
        +                        "plan_prefix": [
        +                          "`t1` `table1`",
        +                          "`t1` `table2`"
        +                        ] /* plan_prefix */,
        The printout of both real name and alias is a happy side-effect
        of using TABLE_LIST::print(). This way one knows what table
        the optimizer is talking about.
     @ sql/item.cc
        don't print db of an Item if the flag asks for it
     @ sql/opt_trace.cc
        Ask for no default db, if printing Item.
        Don't print db of a TABLE if db is the default one;
        for that, reuse the TABLE_LIST::print() logic
        by calling this function; as a consequence of this reuse,
        we have the little "problem" that, if the table
        is a tmp table materialized from a derived table, printing
        it now prints this long form:
           "(SELECT ...) alias_of_derived_table"
        which is deemed too verbose for printing the table
        in "best_access_path" for example, thus we add a flag
        QT_DERIVED_TABLE_ONLY_ALIAS to express that we only
        want to print the alias.
     @ sql/opt_trace2server.cc
        ask for no default db, if printing expanded query
     @ sql/sql_lex.cc
        When printing a TABLE_LIST, if flags ask, then:
        - don't print db if it's the default db
        - don't print derived table's definition (only print the alias)
     @ sql/sql_optimizer.cc
        ask for no default db, when we print items of ORDER BY one by one
        (this happens when we trace the transformation of each item,
        not when we print the clause).
     @ sql/sql_planner.cc
        deep in greedy search, print the current join prefix ("plan_prefix")
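
The table-printing rules from points 2 and 3 (omit the default database, optionally print only the alias of a derived table) can be sketched as follows. `PrintFlags`, `TableRef`, `quote` and `print_table` are hypothetical names made up for this sketch; the real logic lives in TABLE_LIST::print() and uses the QT_* flags. Backtick quoting of the derived-table alias and the omission of identifier escaping are assumptions of the sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical flags; only loosely modelled on the flags named above.
enum PrintFlags : unsigned {
  kNoDefaultDb = 1u << 0,           // omit db when it is the session default
  kDerivedTableOnlyAlias = 1u << 1  // print only the alias of a derived table
};

struct TableRef {
  std::string db;
  std::string name;
  std::string alias;
  std::string derived_definition;  // non-empty for a derived table
};

// Quotes an identifier; escaping of embedded backticks is left out.
std::string quote(const std::string &id) { return "`" + id + "`"; }

std::string print_table(const TableRef &t, const std::string &default_db,
                        unsigned flags) {
  assert(!t.name.empty() || !t.alias.empty());
  if (!t.derived_definition.empty()) {
    if (flags & kDerivedTableOnlyAlias) return quote(t.alias);
    return "(" + t.derived_definition + ") " + quote(t.alias);  // long form
  }
  std::string out;
  const bool skip_db = (flags & kNoDefaultDb) && t.db == default_db;
  if (!skip_db) out += quote(t.db) + ".";
  out += quote(t.name);
  // Print both real name and alias when they differ.
  if (!t.alias.empty() && t.alias != t.name) out += " " + quote(t.alias);
  return out;
}
```

With the default database equal to the table's database and `kNoDefaultDb` set, a table `t1` aliased as `table1` prints in the same shape as the "plan_prefix" entries quoted above.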

    modified:
      mysql-test/suite/opt_trace/include/general2.inc
      mysql-test/suite/opt_trace/r/bugs_no_prot_all.result
      mysql-test/suite/opt_trace/r/bugs_no_prot_none.result
      mysql-test/suite/opt_trace/r/bugs_ps_prot_all.result
      mysql-test/suite/opt_trace/r/bugs_ps_prot_none.result
      mysql-test/suite/opt_trace/r/charset.result
      mysql-test/suite/opt_trace/r/eq_range_statistics.result
      mysql-test/suite/opt_trace/r/filesort_pq.result
      mysql-test/suite/opt_trace/r/general2_no_prot.result
      mysql-test/suite/opt_trace/r/general2_ps_prot.result
      mysql-test/suite/opt_trace/r/general_no_prot_all.result
      mysql-test/suite/opt_trace/r/general_no_prot_none.result
      mysql-test/suite/opt_trace/r/general_ps_prot_all.result
      mysql-test/suite/opt_trace/r/general_ps_prot_none.result
      mysql-test/suite/opt_trace/r/range_no_prot.result
      mysql-test/suite/opt_trace/r/range_ps_prot.result
      mysql-test/suite/opt_trace/r/security_no_prot.result
      mysql-test/suite/opt_trace/r/security_ps_prot.result
      mysql-test/suite/opt_trace/r/subquery_no_prot.result
      mysql-test/suite/opt_trace/r/subquery_ps_prot.result
      mysql-test/suite/opt_trace/r/temp_table.result
      sql/item.cc
      sql/mysqld.h
      sql/opt_trace.cc
      sql/opt_trace2server.cc
      sql/sql_lex.cc
      sql/sql_lex.h
      sql/sql_optimizer.cc
      sql/sql_planner.cc
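
The "plan_prefix" idea from point 1 is easy to model outside the optimizer. The sketch below only records, for each table added, the tables already in the plan; it does no cost-based choice at all, and `TraceStep` and `trace_plan_prefixes` are hypothetical names, not optimizer-trace API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One trace entry: the table being added and the plan built before it.
struct TraceStep {
  std::vector<std::string> plan_prefix;  // tables already in the plan
  std::string table;                     // table being added at this step
};

// Walks a fixed join order and records the prefix seen at every step.
// A real greedy search would pick the next table by cost; that part is
// deliberately left out of this sketch.
std::vector<TraceStep> trace_plan_prefixes(
    const std::vector<std::string> &join_order) {
  std::vector<TraceStep> trace;
  std::vector<std::string> prefix;
  for (const std::string &table : join_order) {
    trace.push_back(TraceStep{prefix, table});
    prefix.push_back(table);
    assert(trace.back().plan_prefix.size() + 1 == prefix.size());
  }
  return trace;
}
```

Each step carries its own copy of the prefix, so a reader of the trace does not have to reconstruct the current plan from the enclosing steps, which is the readability problem the commit message describes.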
=== added file 'mysql-test/suite/rpl/r/rpl_parallel_change_master.result'
--- a/mysql-test/suite/rpl/r/rpl_parallel_change_master.result	1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/rpl/r/rpl_parallel_change_master.result	2012-03-28 15:24:17 +0000
@@ -0,0 +1,57 @@
+include/master-slave.inc
+Warnings:
+Note	1756	Sending passwords in plain text without SSL/TLS is extremely insecure.
+Note	1757	Storing MySQL user name or password information in the master.info repository is not secure and is therefore not recommended. Please see the MySQL Manual for more about this issue and possible alternatives.
+[connection master]
+call mtr.add_suppression("Slave SQL: Could not execute Write_rows event on table d1.t1; Duplicate entry '13' for key 'a'");
+call mtr.add_suppression("Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state.");
+include/stop_slave.inc
+SET @save.slave_parallel_workers=@@global.slave_parallel_workers;
+SET @@global.slave_parallel_workers=2;
+include/start_slave.inc
+CREATE DATABASE d1;
+CREATE DATABASE d2;
+CREATE TABLE d1.t1 (a int unique) ENGINE=INNODB;
+CREATE TABLE d2.t1 (a int unique) ENGINE=INNODB;
+INSERT INTO d1.t1 VALUES (1);
+FLUSH LOGS;
+include/stop_slave.inc
+CHANGE MASTER TO MASTER_DELAY=5;
+include/start_slave.inc
+INSERT INTO d1.t1 VALUES (3);
+INSERT INTO d1.t1 VALUES (5);
+FLUSH LOGS;
+include/stop_slave.inc
+CHANGE MASTER TO RELAY_LOG_FILE=FILE,  RELAY_LOG_POS= POS;
+include/start_slave.inc
+include/stop_slave.inc
+CHANGE MASTER TO RELAY_LOG_FILE=FILE,  RELAY_LOG_POS= POS, MASTER_DELAY=0;
+include/start_slave.inc
+BEGIN;
+INSERT INTO d1.t1 VALUES (13);
+INSERT INTO d1.t1 VALUES (6);
+INSERT INTO d2.t1 VALUES (7);
+INSERT INTO d1.t1 VALUES (13);
+INSERT INTO d2.t1 VALUES (8);
+INSERT INTO d2.t1 VALUES (9);
+COMMIT;
+include/wait_for_slave_sql_error.inc [errno=1062]
+include/stop_slave_io.inc
+CHANGE MASTER TO RELAY_LOG_FILE=FILE,  RELAY_LOG_POS= POS;
+ERROR HY000: CHANGE MASTER cannot be executed when the slave was stopped with an error or killed in MTS mode. Consider using RESET SLAVE or START SLAVE UNTIL.
+SET @@global.slave_parallel_workers= @save.slave_parallel_workers;
+include/rpl_restart_server.inc [server_number=2 parameters: --relay-log-recovery --skip-slave-start]
+SELECT @@global.relay_log_recovery as 'must be ON';
+must be ON
+1
+call mtr.add_suppression("--relay-log-recovery cannot be executed when the slave was stopped with an error or killed in MTS mode; consider using RESET SLAVE or restart the server with --relay-log-recovery = 0");
+call mtr.add_suppression("Failed to initialize the master info structure");
+include/rpl_restart_server.inc [server_number=2 parameters: --skip-slave-start]
+SELECT @@global.relay_log_recovery as 'must be OFF';
+must be OFF
+0
+DELETE FROM d1.t1 WHERE a = 13;
+include/start_slave.inc
+DROP DATABASE d1;
+DROP DATABASE d2;
+include/rpl_end.inc

=== added file 'mysql-test/suite/rpl/t/rpl_parallel_change_master.test'
--- a/mysql-test/suite/rpl/t/rpl_parallel_change_master.test	1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/rpl/t/rpl_parallel_change_master.test	2012-03-28 15:24:17 +0000
@@ -0,0 +1,119 @@
+#
+# Test verifies MTS behaviour with regard to 
+# Change-Master.
+# 
+# Related bugs:
+# Bug 12995174 - MTS: UNEXPECTED RECOVERY ATTEMPT ENDS WITH ER_MASTER_INFO OR ASSERTION 
+
+--source include/master-slave.inc
+# The test for bug#12995174 is not format-specific but uses sleep
+# so it made to be run in ROW format that is the way the bug is reported.
+--source include/have_binlog_format_row.inc
+--source include/have_innodb.inc
+
+--connection slave
+call mtr.add_suppression("Slave SQL: Could not execute Write_rows event on table d1.t1; Duplicate entry '13' for key 'a'");
+call mtr.add_suppression("Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state.");
+--source include/stop_slave.inc
+SET @save.slave_parallel_workers=@@global.slave_parallel_workers;
+SET @@global.slave_parallel_workers=2;
+--source include/start_slave.inc
+
+--connection master
+
+CREATE DATABASE d1;
+CREATE DATABASE d2;
+CREATE TABLE d1.t1 (a int unique) ENGINE=INNODB;
+CREATE TABLE d2.t1 (a int unique) ENGINE=INNODB;
+
+INSERT INTO d1.t1 VALUES (1);
+FLUSH LOGS;
+
+--sync_slave_with_master
+
+--source include/stop_slave.inc
+CHANGE MASTER TO MASTER_DELAY=5;
+--source include/start_slave.inc
+
+--connection master
+INSERT INTO d1.t1 VALUES (3);
+--sleep 3
+INSERT INTO d1.t1 VALUES (5);
+FLUSH LOGS;
+
+--connection slave
+--source include/stop_slave.inc
+
+let $relay_file = query_get_value( SHOW SLAVE STATUS, Relay_Log_File, 1 );
+let $relay_pos = query_get_value( SHOW SLAVE STATUS, Relay_Log_Pos, 1 );
+
+--replace_regex /RELAY_LOG_FILE=[^,]+/RELAY_LOG_FILE=FILE/ /RELAY_LOG_POS=[0-9]+/ RELAY_LOG_POS= POS/
+eval CHANGE MASTER TO RELAY_LOG_FILE='$relay_file', RELAY_LOG_POS=$relay_pos;
+
+--source include/start_slave.inc
+--sleep 5
+--source include/stop_slave.inc
+
+let $relay_file = query_get_value( SHOW SLAVE STATUS, Relay_Log_File, 1 );
+let $relay_pos = query_get_value( SHOW SLAVE STATUS, Relay_Log_Pos, 1 );
+--replace_regex /RELAY_LOG_FILE=[^,]+/RELAY_LOG_FILE=FILE/ /RELAY_LOG_POS=[0-9]+/ RELAY_LOG_POS= POS/
+eval CHANGE MASTER TO RELAY_LOG_FILE='$relay_file', RELAY_LOG_POS=$relay_pos, MASTER_DELAY=0;
+
+--source include/start_slave.inc
+BEGIN;
+INSERT INTO d1.t1 VALUES (13); # to cause the dup key error
+# change-master with gaps
+--connection master
+
+INSERT INTO d1.t1 VALUES (6);
+INSERT INTO d2.t1 VALUES (7);
+INSERT INTO d1.t1 VALUES (13);
+INSERT INTO d2.t1 VALUES (8);
+INSERT INTO d2.t1 VALUES (9);
+
+--connection slave
+COMMIT; # worker executing (13) errors out
+
+--let $slave_sql_errno= 1062
+--source include/wait_for_slave_sql_error.inc
+
+--source include/stop_slave_io.inc
+
+let $relay_file = query_get_value( SHOW SLAVE STATUS, Relay_Log_File, 1 );
+let $relay_pos = query_get_value( SHOW SLAVE STATUS, Relay_Log_Pos, 1 );
+--replace_regex /RELAY_LOG_FILE=[^,]+/RELAY_LOG_FILE=FILE/ /RELAY_LOG_POS=[0-9]+/ RELAY_LOG_POS= POS/
+--error ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS
+eval CHANGE MASTER TO RELAY_LOG_FILE='$relay_file', RELAY_LOG_POS=$relay_pos;
+
+SET @@global.slave_parallel_workers= @save.slave_parallel_workers; # cleanup
+
+#
+# --relay-log-recovery= 1 and MTS gaps is handled similarly to Change-Master
+#
+--let $rpl_server_number= 2
+--let $rpl_server_parameters= --relay-log-recovery --skip-slave-start
+--source include/rpl_restart_server.inc
+
+--connection slave
+SELECT @@global.relay_log_recovery as 'must be ON';
+call mtr.add_suppression("--relay-log-recovery cannot be executed when the slave was stopped with an error or killed in MTS mode; consider using RESET SLAVE or restart the server with --relay-log-recovery = 0");
+call mtr.add_suppression("Failed to initialize the master info structure");
+
+--let $rpl_server_number= 2
+--let $rpl_server_parameters= --skip-slave-start
+--source include/rpl_restart_server.inc
+
+SELECT @@global.relay_log_recovery as 'must be OFF';
+--connection slave
+DELETE FROM d1.t1 WHERE a = 13;
+--source include/start_slave.inc
+
+#
+# cleanup
+#
+--connection master
+DROP DATABASE d1;
+DROP DATABASE d2;
+--sync_slave_with_master
+
+--source include/rpl_end.inc

=== modified file 'mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result'
--- a/mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result	2010-01-29 06:33:00 +0000
+++ b/mysql-test/suite/sys_vars/r/relay_log_recovery_basic.result	2012-03-28 15:24:17 +0000
@@ -1,7 +1,3 @@
-SET @start_global_value = @@global.relay_log_recovery;
-SELECT @start_global_value;
-@start_global_value
-0
 select @@global.relay_log_recovery;
 @@global.relay_log_recovery
 0
@@ -20,16 +16,7 @@ select * from information_schema.session
 VARIABLE_NAME	VARIABLE_VALUE
 RELAY_LOG_RECOVERY	OFF
 set global relay_log_recovery=1;
-select @@global.relay_log_recovery;
-@@global.relay_log_recovery
-1
-select * from information_schema.global_variables where variable_name='relay_log_recovery';
-VARIABLE_NAME	VARIABLE_VALUE
-RELAY_LOG_RECOVERY	ON
-select * from information_schema.session_variables where variable_name='relay_log_recovery';
-VARIABLE_NAME	VARIABLE_VALUE
-RELAY_LOG_RECOVERY	ON
-set global relay_log_recovery=OFF;
+ERROR HY000: Variable 'relay_log_recovery' is a read only variable
 select @@global.relay_log_recovery;
 @@global.relay_log_recovery
 0
@@ -39,15 +26,3 @@ RELAY_LOG_RECOVERY	OFF
 select * from information_schema.session_variables where variable_name='relay_log_recovery';
 VARIABLE_NAME	VARIABLE_VALUE
 RELAY_LOG_RECOVERY	OFF
-set session relay_log_recovery=1;
-ERROR HY000: Variable 'relay_log_recovery' is a GLOBAL variable and should be set with SET GLOBAL
-set global relay_log_recovery=1.1;
-ERROR 42000: Incorrect argument type to variable 'relay_log_recovery'
-set global relay_log_recovery=1e1;
-ERROR 42000: Incorrect argument type to variable 'relay_log_recovery'
-set global relay_log_recovery="foo";
-ERROR 42000: Variable 'relay_log_recovery' can't be set to the value of 'foo'
-SET @@global.relay_log_recovery = @start_global_value;
-SELECT @@global.relay_log_recovery;
-@@global.relay_log_recovery
-0

=== modified file 'mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test'
--- a/mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test	2010-01-29 06:33:00 +0000
+++ b/mysql-test/suite/sys_vars/t/relay_log_recovery_basic.test	2012-03-28 15:24:17 +0000
@@ -6,9 +6,6 @@
 
 --source include/not_embedded.inc
 
-SET @start_global_value = @@global.relay_log_recovery;
-SELECT @start_global_value;
-
 #
 # exists as global only
 #
@@ -21,29 +18,12 @@ select * from information_schema.global_
 select * from information_schema.session_variables where variable_name='relay_log_recovery';
 
 #
-# show that it's writable
+# show that it's read-only
 #
+--error ER_INCORRECT_GLOBAL_LOCAL_VAR
 set global relay_log_recovery=1;
 select @@global.relay_log_recovery;
 select * from information_schema.global_variables where variable_name='relay_log_recovery';
 select * from information_schema.session_variables where variable_name='relay_log_recovery';
-set global relay_log_recovery=OFF;
-select @@global.relay_log_recovery;
-select * from information_schema.global_variables where variable_name='relay_log_recovery';
-select * from information_schema.session_variables where variable_name='relay_log_recovery';
---error ER_GLOBAL_VARIABLE
-set session relay_log_recovery=1;
-
-#
-# incorrect types
-#
---error ER_WRONG_TYPE_FOR_VAR
-set global relay_log_recovery=1.1;
---error ER_WRONG_TYPE_FOR_VAR
-set global relay_log_recovery=1e1;
---error ER_WRONG_VALUE_FOR_VAR
-set global relay_log_recovery="foo";
 
-SET @@global.relay_log_recovery = @start_global_value;
-SELECT @@global.relay_log_recovery;
 

=== modified file 'sql/log_event.cc'
--- a/sql/log_event.cc	2012-03-27 00:14:24 +0000
+++ b/sql/log_event.cc	2012-03-28 15:24:17 +0000
@@ -6529,7 +6529,7 @@ int Rotate_log_event::do_update_pos(Rela
                         (ulong) rli->get_group_master_log_pos()));
     mysql_mutex_unlock(&rli->data_lock);
     if (rli->is_parallel_exec())
-      rli->reset_notified_checkpoint(0, when.tv_sec + (time_t) exec_time);
+      rli->reset_notified_checkpoint(0, when.tv_sec + (time_t) exec_time, false);
 
     /*
       Reset thd->variables.option_bits and sql_mode etc, because this could be the signal of

=== modified file 'sql/rpl_rli.cc'
--- a/sql/rpl_rli.cc	2012-03-27 08:43:25 +0000
+++ b/sql/rpl_rli.cc	2012-03-28 15:24:17 +0000
@@ -197,8 +197,10 @@ void Relay_log_info::reset_notified_rela
                  checkpoint change.
    @param new_ts new seconds_behind_master timestamp value unless zero.
                  Zero could be due to FD event.
+   @param locked true if caller has locked @c data_lock
 */
-void Relay_log_info::reset_notified_checkpoint(ulong shift, time_t new_ts= 0)
+void Relay_log_info::reset_notified_checkpoint(ulong shift, time_t new_ts,
+                                               bool locked)
 {
   /*
     If this is not a parallel execution we return immediately.
@@ -238,9 +240,11 @@ void Relay_log_info::reset_notified_chec
 
   if (new_ts)
   {
-    mysql_mutex_lock(&data_lock);
+    if (!locked)
+      mysql_mutex_lock(&data_lock);
     last_master_timestamp= new_ts;
-    mysql_mutex_unlock(&data_lock);
+    if (!locked)
+      mysql_mutex_unlock(&data_lock);
   }
 }
 
@@ -1745,7 +1749,8 @@ a file name for --relay-log-index option
 
 err:
   inited= 0;
-  sql_print_error("%s.", msg);
+  if (msg)
+    sql_print_error("%s.", msg);
   relay_log.close(LOG_CLOSE_INDEX | LOG_CLOSE_STOP_EVENT);
   DBUG_RETURN(error);
 }

=== modified file 'sql/rpl_rli.h'
--- a/sql/rpl_rli.h	2012-03-27 08:43:25 +0000
+++ b/sql/rpl_rli.h	2012-03-28 15:24:17 +0000
@@ -621,7 +621,7 @@ public:
      Coordinator notifies Workers about this event. Coordinator and Workers
      maintain a bitmap of executed group that is reset with a new checkpoint. 
   */
-  void reset_notified_checkpoint(ulong, time_t);
+  void reset_notified_checkpoint(ulong, time_t, bool);
 
   /*
    * End of MTS section ******************************************************/

=== modified file 'sql/rpl_slave.cc'
--- a/sql/rpl_slave.cc	2012-03-27 08:43:25 +0000
+++ b/sql/rpl_slave.cc	2012-03-28 15:24:17 +0000
@@ -399,6 +399,15 @@ int init_recovery(Master_info* mi, const
       We need to improve this. /Alfranio.
     */
     error= mts_recovery_groups(rli, &rli->recovery_groups);
+    if (rli->mts_recovery_group_cnt)
+    {
+      error= 1;
+      sql_print_error("--relay-log-recovery cannot be executed when the slave "
+                        "was stopped with an error or killed in MTS mode; "
+                        "consider using RESET SLAVE or restart the server "
+                        "with --relay-log-recovery = 0 followed by "
+                        "START SLAVE");
+    }
   }
 
   group_master_log_name= const_cast<char *>(rli->get_group_master_log_name());
@@ -4737,7 +4746,7 @@ bool mts_checkpoint_routine(Relay_log_in
     cnt is zero. This value means that the checkpoint information
     will be completely reset.
   */
-  rli->reset_notified_checkpoint(cnt, rli->gaq->lwm.ts);
+  rli->reset_notified_checkpoint(cnt, rli->gaq->lwm.ts, locked);
 
   /* end-of "Coordinator::"commit_positions" */
 
@@ -7673,7 +7682,8 @@ err:
 }
 
 /**
-  Execute a CHANGE MASTER statement.
+  Execute a CHANGE MASTER statement. MTS workers info tables data are removed
+  in the successful branch (i.e. there are no gaps in the execution history).
 
   @param thd Pointer to THD object for the client thread executing the
   statement.
@@ -7690,11 +7700,13 @@ bool change_master(THD* thd, Master_info
   const char* errmsg= 0;
   bool need_relay_log_purge= 1;
   char *var_master_log_name= NULL, *var_group_master_log_name= NULL;
-  bool ret= FALSE;
+  bool ret= false;
   char saved_host[HOSTNAME_LENGTH + 1], saved_bind_addr[HOSTNAME_LENGTH + 1];
   uint saved_port= 0;
   char saved_log_name[FN_REFLEN];
   my_off_t saved_log_pos= 0;
+  my_bool save_relay_log_purge= relay_log_purge;
+  bool mts_remove_workers= false;
 
   DBUG_ENTER("change_master");
 
@@ -7729,7 +7741,28 @@ bool change_master(THD* thd, Master_info
     ret= true;
     goto err;
   }
+  if (mi->rli->mts_recovery_group_cnt)
+  {
+    /*
+      Change-Master can't be done if there is a mts group gap.
+      That requires mts-recovery which START SLAVE provides.
+    */
+    DBUG_ASSERT(mi->rli->recovery_parallel_workers);
 
+    my_message(ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS,
+               ER(ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS), MYF(0));
+    ret= true;
+    goto err;
+  }
+  else
+  {
+    /*
+      Lack of mts group gaps makes Workers info stale
+      regardless of need_relay_log_purge computation.
+    */
+    if (mi->rli->recovery_parallel_workers)
+      mts_remove_workers= true;
+  }
   /*
     We cannot specify auto position and set either the coordinates
     on master or slave. If we try to do so, an error message is
@@ -7982,8 +8015,7 @@ bool change_master(THD* thd, Master_info
     THD_STAGE_INFO(thd, stage_purging_old_relay_logs);
     if (mi->rli->purge_relay_logs(thd,
                                   0 /* not only reset, but also reinit */,
-                                  &errmsg) ||
-        remove_workers(mi->rli))
+                                  &errmsg))
     {
       my_error(ER_RELAY_LOG_FAIL, MYF(0), errmsg);
       ret= TRUE;
@@ -8005,6 +8037,7 @@ bool change_master(THD* thd, Master_info
       goto err;
     }
   }
+  relay_log_purge= save_relay_log_purge;
 
   /*
     Coordinates in rli were spoilt by the 'if (need_relay_log_purge)' block,
@@ -8016,10 +8049,12 @@ bool change_master(THD* thd, Master_info
     ''/0: we have lost all copies of the original good coordinates.
     That's why we always save good coords in rli.
   */
-  mi->rli->set_group_master_log_pos(mi->get_master_log_pos());
-  DBUG_PRINT("info", ("master_log_pos: %lu", (ulong) mi->get_master_log_pos()));
-  mi->rli->set_group_master_log_name(mi->get_master_log_name());
-
+  if (need_relay_log_purge)
+  {
+    mi->rli->set_group_master_log_pos(mi->get_master_log_pos());
+    DBUG_PRINT("info", ("master_log_pos: %lu", (ulong) mi->get_master_log_pos()));
+    mi->rli->set_group_master_log_name(mi->get_master_log_name());
+  }
   var_group_master_log_name=  const_cast<char *>(mi->rli->get_group_master_log_name());
   if (!var_group_master_log_name[0]) // uninitialized case
     mi->rli->set_group_master_log_pos(0);
@@ -8057,7 +8092,11 @@ bool change_master(THD* thd, Master_info
 err:
   unlock_slave_threads(mi);
   if (ret == FALSE)
+  {
+    if (mts_remove_workers)
+      remove_workers(mi->rli);
     my_ok(thd);
+  }
   DBUG_RETURN(ret);
 }
 /**

=== modified file 'sql/share/errmsg-utf8.txt'
--- a/sql/share/errmsg-utf8.txt	2012-03-21 08:42:00 +0000
+++ b/sql/share/errmsg-utf8.txt	2012-03-28 15:24:17 +0000
@@ -6740,6 +6740,8 @@ ER_UNKNOWN_ALTER_ALGORITHM
 ER_UNKNOWN_ALTER_LOCK
   eng "Unknown LOCK type '%s'"
 
+ER_MTS_CHANGE_MASTER_CANT_RUN_WITH_GAPS
+  eng "CHANGE MASTER cannot be executed when the slave was stopped with an error or killed in MTS mode. Consider using RESET SLAVE or START SLAVE UNTIL."
 #
 #  End of 5.6 error messages.
 #

=== modified file 'sql/sys_vars.cc'
--- a/sql/sys_vars.cc	2012-03-21 21:19:11 +0000
+++ b/sql/sys_vars.cc	2012-03-28 15:24:17 +0000
@@ -3650,7 +3650,7 @@ static Sys_var_mybool Sys_relay_log_reco
        "right after the database startup, which means that the IO Thread "
        "starts re-fetching from the master right after the last transaction "
        "processed",
-       GLOBAL_VAR(relay_log_recovery), CMD_LINE(OPT_ARG), DEFAULT(FALSE));
+        READ_ONLY GLOBAL_VAR(relay_log_recovery), CMD_LINE(OPT_ARG), DEFAULT(FALSE));
 
 static Sys_var_mybool Sys_slave_allow_batching(
        "slave_allow_batching", "Allow slave to batch requests",

No bundle (reason: useless for push emails).