List: Commits
From: Andrei Elkin
Date: June 20 2011 1:33pm
Subject: bzr push into mysql-next-mr-wl5569 branch (andrei.elkin:3310 to 3312)
WL#5569 WL#5599
 3312 Andrei Elkin	2011-06-20 [merge]
      wl#5569 MTS
      wl#5599 MTS recovery
      
      fixing valgrind warnings.

    modified:
      sql/rpl_rli_pdb.h
      sql/rpl_slave.cc
 3311 Andrei Elkin	2011-06-20
      wl#5569 MTS
      
      1. Add mtr.add_suppression to all remaining tests that generate
         an error while the SQL thread or an MTS worker applies events.
         A worker error is followed by a coordinator warning, so the warning is suppressed.
      2. A kill of the Coordinator is now handled immediately, without waiting for any
         ongoing group scheduling to complete. This can leave data in an
         inconsistent state, so it is reported with an error.
      3. Two tests now expect one of two errors, depending on single- or multi-threaded
         mode.
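Item 3 can be illustrated with a minimal Python sketch. It is not the mysqltest implementation: `errno_matches` is a hypothetical helper, and it assumes the expected codes arrive as a comma-separated string such as `0,1740` (the value these tests assign to `$slave_sql_errno`) and that whitespace around each code is ignored.

```python
def errno_matches(expected: str, actual: int) -> bool:
    """Return True if `actual` is one of the comma-separated codes in `expected`.

    Hypothetical helper for illustration only; tolerating stray whitespace
    (as in "0,1740 ") is an assumption, not documented mysqltest behavior.
    """
    # int() ignores surrounding whitespace, so "1740 " parses as 1740.
    codes = {int(part) for part in expected.split(",") if part.strip()}
    return actual in codes
```

Under these assumptions a single-threaded slave stopping with code 0 and a multi-threaded slave stopping with code 1740 both satisfy the same expectation string.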
     @ mysql-test/extra/rpl_tests/rpl_conflicts.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/extra/rpl_tests/rpl_loaddata.test
        MTS reports an error (1740) where STS stops with error code 0, as BUG#57287 notes.
     @ mysql-test/extra/rpl_tests/rpl_parallel_load.test
        Cleanup.
     @ mysql-test/extra/rpl_tests/rpl_row_empty_imgs.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/extra/rpl_tests/rpl_stm_EE_err2.test
        MTS reports an error (1740) where STS stops with error code 0, as BUG#57287 notes.
     @ mysql-test/suite/rpl/r/rpl_circular_for_4_hosts.result
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/r/rpl_heartbeat_basic.result
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/r/rpl_loaddata.result
        Results updated.
     @ mysql-test/suite/rpl/t/rpl_circular_for_4_hosts.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_heartbeat_basic.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_loaddata_fatal.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_parallel.test
        Cleanup.
     @ mysql-test/suite/rpl/t/rpl_parallel_multi_db-master.opt
        Keep the post-execution check for warnings from failing.
     @ mysql-test/suite/rpl/t/rpl_parallel_start_stop.test
        A kill of the Coordinator is handled immediately, which can leave data in an
        inconsistent state. That is reported with an error.
     @ mysql-test/suite/rpl/t/rpl_row_img_sanity.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_row_inexist_tbl.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_show_errors.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_slave_load_remove_tmpfile.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_slave_start.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ mysql-test/suite/rpl/t/rpl_stm_000001.test
        A worker error is followed by a coordinator warning, so the warning is suppressed.
     @ sql/rpl_reporting.h
        A new method that helps determine whether an error has already been
        reported.
     @ sql/rpl_slave.cc
        Refine the conditions under which the Coordinator reports a warning or an error
        when (a) it is killed, or (b) a worker errored out and reported it.
        In case (a) an error-level message is issued; in case (b), a warning-level one.
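The reporting rule described for sql/rpl_slave.cc can be summarized in a short Python sketch. It illustrates the rule only and is not the server's C++ code; the function and parameter names are hypothetical, and giving the kill case precedence when both conditions hold is an assumption the commit message does not confirm.

```python
from typing import Optional


def coordinator_report_level(killed: bool, worker_failed: bool) -> Optional[str]:
    """Pick the message level the Coordinator reports with (illustrative only)."""
    # Case (a): the Coordinator itself was killed -> error-level message.
    # Assumption: this case wins if a worker has also failed.
    if killed:
        return "error"
    # Case (b): a worker errored out and already reported its own error,
    # so the Coordinator only adds a warning-level message.
    if worker_failed:
        return "warning"
    # Neither condition holds: nothing to report.
    return None
```

Case (b) is the situation the tests cover with `mtr.add_suppression`; case (a) is what `rpl_parallel_start_stop.test` waits for as error 1740.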

    modified:
      mysql-test/extra/rpl_tests/rpl_conflicts.test
      mysql-test/extra/rpl_tests/rpl_loaddata.test
      mysql-test/extra/rpl_tests/rpl_parallel_load.test
      mysql-test/extra/rpl_tests/rpl_row_empty_imgs.test
      mysql-test/extra/rpl_tests/rpl_stm_EE_err2.test
      mysql-test/suite/rpl/r/rpl_circular_for_4_hosts.result
      mysql-test/suite/rpl/r/rpl_heartbeat_basic.result
      mysql-test/suite/rpl/r/rpl_loaddata.result
      mysql-test/suite/rpl/r/rpl_loaddata_fatal.result
      mysql-test/suite/rpl/r/rpl_parallel.result
      mysql-test/suite/rpl/r/rpl_parallel_start_stop.result
      mysql-test/suite/rpl/r/rpl_row_conflicts.result
      mysql-test/suite/rpl/r/rpl_row_img_eng_full.result
      mysql-test/suite/rpl/r/rpl_row_img_sanity.result
      mysql-test/suite/rpl/r/rpl_row_inexist_tbl.result
      mysql-test/suite/rpl/r/rpl_sequential.result
      mysql-test/suite/rpl/r/rpl_show_errors.result
      mysql-test/suite/rpl/r/rpl_slave_load_remove_tmpfile.result
      mysql-test/suite/rpl/r/rpl_slave_start.result
      mysql-test/suite/rpl/r/rpl_stm_000001.result
      mysql-test/suite/rpl/r/rpl_stm_EE_err2.result
      mysql-test/suite/rpl/r/rpl_stm_conflicts.result
      mysql-test/suite/rpl/r/rpl_stm_loaddata_concurrent.result
      mysql-test/suite/rpl/t/rpl_circular_for_4_hosts.test
      mysql-test/suite/rpl/t/rpl_heartbeat_basic.test
      mysql-test/suite/rpl/t/rpl_loaddata_fatal.test
      mysql-test/suite/rpl/t/rpl_parallel.test
      mysql-test/suite/rpl/t/rpl_parallel_multi_db-master.opt
      mysql-test/suite/rpl/t/rpl_parallel_multi_db-slave.opt
      mysql-test/suite/rpl/t/rpl_parallel_start_stop.test
      mysql-test/suite/rpl/t/rpl_row_img_sanity.test
      mysql-test/suite/rpl/t/rpl_row_inexist_tbl.test
      mysql-test/suite/rpl/t/rpl_show_errors.test
      mysql-test/suite/rpl/t/rpl_slave_load_remove_tmpfile.test
      mysql-test/suite/rpl/t/rpl_slave_start.test
      mysql-test/suite/rpl/t/rpl_stm_000001.test
      sql/rpl_reporting.h
      sql/rpl_slave.cc
 3310 Andrei Elkin	2011-06-20
      wl#5569 MTS
      
      fixing tests.
      Adding suppressions to a few tests that produce SQL thread errors. Experimenting with rpl_relayrotate to decrease the number of rotations, and thus to make it clear whether the timeout is due to computational load or not.

    modified:
      mysql-test/extra/rpl_tests/rpl_extra_col_slave.test
      mysql-test/extra/rpl_tests/rpl_relayrotate.test
      mysql-test/extra/rpl_tests/rpl_row_tabledefs.test
      mysql-test/suite/rpl/r/rpl_extra_col_slave_innodb.result
      mysql-test/suite/rpl/r/rpl_extra_col_slave_myisam.result
      mysql-test/suite/rpl/r/rpl_row_colSize.result
      mysql-test/suite/rpl/r/rpl_row_tabledefs_2myisam.result
      mysql-test/suite/rpl/r/rpl_row_tabledefs_3innodb.result
      mysql-test/suite/rpl/t/rpl_row_colSize.test
=== modified file 'mysql-test/extra/rpl_tests/rpl_conflicts.test'
--- a/mysql-test/extra/rpl_tests/rpl_conflicts.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/extra/rpl_tests/rpl_conflicts.test	2011-06-20 13:26:35 +0000
@@ -98,6 +98,7 @@ if (`SELECT @@global.binlog_format != 'R
   --eval SELECT "$err" as 'Last_SQL_Error (expected "duplicate key" error)'
   --enable_query_log
   call mtr.add_suppression("Slave SQL.*Duplicate entry .1. for key .PRIMARY.* Error_code: 1062");
+  call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 
   SELECT * FROM t1;
 

=== modified file 'mysql-test/extra/rpl_tests/rpl_loaddata.test'
--- a/mysql-test/extra/rpl_tests/rpl_loaddata.test	2011-06-17 18:01:58 +0000
+++ b/mysql-test/extra/rpl_tests/rpl_loaddata.test	2011-06-20 13:26:35 +0000
@@ -158,10 +158,11 @@ if (`SELECT @@global.binlog_format != 'R
 {
   # Query causes error on master but not on slave. This causes the slave to
   # stop with error code 0 (which is wrong: see BUG#57287)
-  --let $slave_sql_errno= 0
+  --let $slave_sql_errno= 0,1740 
   --source include/wait_for_slave_sql_error.inc
   drop table t1, t2;
 }
+
 connection master;
 drop table t1, t2;
 

=== modified file 'mysql-test/extra/rpl_tests/rpl_parallel_load.test'
--- a/mysql-test/extra/rpl_tests/rpl_parallel_load.test	2011-06-15 17:12:11 +0000
+++ b/mysql-test/extra/rpl_tests/rpl_parallel_load.test	2011-06-20 13:26:35 +0000
@@ -252,15 +252,24 @@ let $wait_timeout= 600;
 let $wait_condition= SELECT count(*)+sleep(1) = 5 FROM test0.benchmark;
 source include/wait_condition.inc;
 
+# cleanup for files that could not be removed in the end of previous invocation.
+let $MYSQLD_DATADIR= `select @@datadir`;
+--remove_files_wildcard $MYSQLD_DATADIR *.out
+
 use test;
-select * from test0.benchmark into outfile 'benchmark.out';
+let $benchmark_file= `select replace(concat("benchmark_",uuid(),".out"),"-","_")`;
+--replace_regex /benchmark_.*.out/benchmark.out/
+eval select * from test0.benchmark into outfile '$benchmark_file';
 select ts from test0.benchmark where state like 'master started load' into @m_0;
 select ts from test0.benchmark where state like 'master ends load' into @m_1;
 select ts from test0.benchmark where state like 'slave takes on load' into @s_0;
 select ts from test0.benchmark where state like 'slave ends load' into @s_1;
-select time_to_sec(@m_1) - time_to_sec(@m_0) as 'delta_m', 
-       time_to_sec(@s_1) - time_to_sec(@s_0) as 'delta_s' into outfile 'delta.out';
 
+let $delta_file= `select replace(concat("delta_",uuid(),".out"),"-","_")`;
+--replace_regex /delta_.*.out/delta.out/
+eval select time_to_sec(@m_1) - time_to_sec(@m_0) as 'delta_m', 
+       time_to_sec(@s_1) - time_to_sec(@s_0) as 'delta_s',
+       time_to_sec(@s_m1) - time_to_sec(@s_m0) as 'delta_sm' into outfile '$delta_file';
 
 let $i = $databases + 1;
 while($i)

=== modified file 'mysql-test/extra/rpl_tests/rpl_row_empty_imgs.test'
--- a/mysql-test/extra/rpl_tests/rpl_row_empty_imgs.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/extra/rpl_tests/rpl_row_empty_imgs.test	2011-06-20 13:26:35 +0000
@@ -206,6 +206,8 @@ if ($lower_engine == ndb)
 SET SQL_LOG_BIN=0;
 call mtr.add_suppression("Slave: Can\'t find record in \'t1\' Error_code: 1032");
 call mtr.add_suppression("Slave SQL: Could not execute Update_rows event on table test.t1; Can.t find record in .t1.* Error_code: 1032");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 SET SQL_LOG_BIN=1;
 
 # NOTE: Because of BUG#52473, when using NDB this will make the test

=== modified file 'mysql-test/extra/rpl_tests/rpl_stm_EE_err2.test'
--- a/mysql-test/extra/rpl_tests/rpl_stm_EE_err2.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/extra/rpl_tests/rpl_stm_EE_err2.test	2011-06-20 13:26:35 +0000
@@ -25,9 +25,13 @@ drop table t1;
 
 connection slave;
 call mtr.add_suppression("Slave SQL.*Query caused different errors on master and slave.*Error on master:.* error code=1062.*Error on slave:.* Error_code: 0");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 --echo (expect different error codes on master and slave)
---let $slave_sql_errno= 0
---let $show_slave_sql_error= 1
+--let $slave_sql_errno= 0,1740
+# can't print error text. MTS reports a separate error in this case.
+# Todo: to fix single-threaded-slave BUG#57287.
+--let $show_slave_sql_error= 0
 --source include/wait_for_slave_sql_error.inc
 drop table t1;
 --source include/stop_slave.inc

=== modified file 'mysql-test/suite/rpl/r/rpl_circular_for_4_hosts.result'
--- a/mysql-test/suite/rpl/r/rpl_circular_for_4_hosts.result	2011-03-16 16:38:30 +0000
+++ b/mysql-test/suite/rpl/r/rpl_circular_for_4_hosts.result	2011-06-20 13:26:35 +0000
@@ -6,6 +6,7 @@ CREATE TABLE t1 (a INT NOT NULL AUTO_INC
 CREATE TABLE t2 (a INT NOT NULL AUTO_INCREMENT, b VARCHAR(100), c INT NOT NULL, PRIMARY KEY(a)) ENGINE=InnoDB;
 include/rpl_sync.inc
 call mtr.add_suppression("Slave SQL.*Request to stop slave SQL Thread received while applying a group that has non-transactional changes; waiting for completion of the group");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 
 *** Testing schema A->B->C->D->A ***
 

=== modified file 'mysql-test/suite/rpl/r/rpl_heartbeat_basic.result'
--- a/mysql-test/suite/rpl/r/rpl_heartbeat_basic.result	2011-03-17 13:20:36 +0000
+++ b/mysql-test/suite/rpl/r/rpl_heartbeat_basic.result	2011-06-20 13:26:35 +0000
@@ -1,6 +1,7 @@
 include/master-slave.inc
 [connection master]
 call mtr.add_suppression("Slave I/O: The slave I/O thread stops because a fatal error is encountered when it tried to SET @master_binlog_checksum");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 
 *** Preparing ***
 include/stop_slave.inc

=== modified file 'mysql-test/suite/rpl/r/rpl_loaddata.result'
--- a/mysql-test/suite/rpl/r/rpl_loaddata.result	2011-06-17 18:01:58 +0000
+++ b/mysql-test/suite/rpl/r/rpl_loaddata.result	2011-06-20 13:26:35 +0000
@@ -73,7 +73,7 @@ load data infile '../../std_data/rpl_loa
 terminated by ',' optionally enclosed by '%' escaped by '@' lines terminated by
 '\n##\n' starting by '>' ignore 1 lines;
 ERROR 23000: Duplicate entry '2003-03-22' for key 'day'
-include/wait_for_slave_sql_error.inc [errno=0]
+include/wait_for_slave_sql_error.inc [errno=0,1740 ]
 drop table t1, t2;
 drop table t1, t2;
 CREATE TABLE t1 (word CHAR(20) NOT NULL PRIMARY KEY) ENGINE=INNODB;

=== modified file 'mysql-test/suite/rpl/r/rpl_loaddata_fatal.result'
--- a/mysql-test/suite/rpl/r/rpl_loaddata_fatal.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_loaddata_fatal.result	2011-06-20 13:26:35 +0000
@@ -4,6 +4,7 @@ CREATE TABLE t1 (a INT, b INT);
 INSERT INTO t1 VALUES (1,10);
 LOAD DATA INFILE '../../std_data/rpl_loaddata.dat' INTO TABLE t1;
 call mtr.add_suppression("Slave SQL.*Fatal error: Not enough memory, Error_code: 1593");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 include/wait_for_slave_sql_error_and_skip.inc [errno=1593]
 Last_SQL_Error = 'Fatal error: Not enough memory'
 DROP TABLE t1;

=== modified file 'mysql-test/suite/rpl/r/rpl_parallel.result'
--- a/mysql-test/suite/rpl/r/rpl_parallel.result	2011-06-15 23:27:20 +0000
+++ b/mysql-test/suite/rpl/r/rpl_parallel.result	2011-06-20 13:26:35 +0000
@@ -10,8 +10,7 @@ select ts from test0.benchmark where sta
 select ts from test0.benchmark where state like 'master ends load' into @m_1;
 select ts from test0.benchmark where state like 'slave takes on load' into @s_0;
 select ts from test0.benchmark where state like 'slave ends load' into @s_1;
-select time_to_sec(@m_1) - time_to_sec(@m_0) as 'delta_m', 
-time_to_sec(@s_1) - time_to_sec(@s_0) as 'delta_s' into outfile 'delta.out';
+select time_to_sec(@m_1) - time_to_sec(@m_0) as 'delta.out';
 include/diff_tables.inc [master:test15.v_tm_nk, slave:test15.v_tm_nk]
 include/diff_tables.inc [master:test15.v_ti_nk, slave:test15.v_ti_nk]
 include/diff_tables.inc [master:test15.v_tm_wk, slave:test15.v_tm_wk]

=== modified file 'mysql-test/suite/rpl/r/rpl_parallel_start_stop.result'
--- a/mysql-test/suite/rpl/r/rpl_parallel_start_stop.result	2011-06-15 17:12:11 +0000
+++ b/mysql-test/suite/rpl/r/rpl_parallel_start_stop.result	2011-06-20 13:26:35 +0000
@@ -2,6 +2,7 @@ include/master-slave.inc
 [connection master]
 call mtr.add_suppression('Slave SQL: Could not execute Write_rows event on table test.t1');
 call mtr.add_suppression('Slave SQL: Could not execute Update_rows event on table test.t1; Deadlock found when trying to get lock');
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 create view worker_proc_list as SELECT id  from Information_Schema.processlist
 where state like 'Waiting for an event from sql thread%';
 create view coord_proc_list  as SELECT id from Information_Schema.processlist where state like 'Slave has read all relay log%';
@@ -15,7 +16,7 @@ include/wait_for_slave_sql_to_stop.inc
 include/start_slave.inc
 select id from coord_proc_list into @c_id;
 kill query @c_id;
-include/wait_for_slave_sql_to_stop.inc
+include/wait_for_slave_sql_error.inc [errno=1740]
 include/start_slave.inc
 CREATE TABLE t1 (a int primary key) engine=innodb;
 insert into t1 values (1),(2);

=== modified file 'mysql-test/suite/rpl/r/rpl_row_conflicts.result'
--- a/mysql-test/suite/rpl/r/rpl_row_conflicts.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_row_conflicts.result	2011-06-20 13:26:35 +0000
@@ -24,6 +24,7 @@ include/wait_for_slave_sql_error.inc [er
 Last_SQL_Error (expected "duplicate key" error)
 Could not execute Write_rows event on table test.t1; Duplicate entry '1' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log master-bin.000001, end_log_pos END_LOG_POS
 call mtr.add_suppression("Slave SQL.*Duplicate entry .1. for key .PRIMARY.* Error_code: 1062");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 SELECT * FROM t1;
 a
 1

=== modified file 'mysql-test/suite/rpl/r/rpl_row_img_eng_full.result'
--- a/mysql-test/suite/rpl/r/rpl_row_img_eng_full.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_row_img_eng_full.result	2011-06-20 13:26:35 +0000
@@ -3642,6 +3642,7 @@ c1	c2
 SET SQL_LOG_BIN=0;
 call mtr.add_suppression("Slave: Can\'t find record in \'t1\' Error_code: 1032");
 call mtr.add_suppression("Slave SQL: Could not execute Update_rows event on table test.t1; Can.t find record in .t1.* Error_code: 1032");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 SET SQL_LOG_BIN=1;
 include/wait_for_slave_sql_error_and_skip.inc [errno=1032]
 DROP TABLE t1;
@@ -3825,6 +3826,7 @@ c1	c2
 SET SQL_LOG_BIN=0;
 call mtr.add_suppression("Slave: Can\'t find record in \'t1\' Error_code: 1032");
 call mtr.add_suppression("Slave SQL: Could not execute Update_rows event on table test.t1; Can.t find record in .t1.* Error_code: 1032");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 SET SQL_LOG_BIN=1;
 include/wait_for_slave_sql_error_and_skip.inc [errno=1032]
 DROP TABLE t1;

=== modified file 'mysql-test/suite/rpl/r/rpl_row_img_sanity.result'
--- a/mysql-test/suite/rpl/r/rpl_row_img_sanity.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_row_img_sanity.result	2011-06-20 13:26:35 +0000
@@ -2,6 +2,7 @@ include/master-slave.inc
 [connection master]
 call mtr.add_suppression("Slave: Can\'t find record in \'t\' Error_code: 1032");
 call mtr.add_suppression("Slave SQL: Could not execute Update_rows event on table test.t; Can.t find record in .t.* Error_code: 1032");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 SHOW VARIABLES LIKE 'binlog_row_image';
 Variable_name	Value
 binlog_row_image	FULL

=== modified file 'mysql-test/suite/rpl/r/rpl_row_inexist_tbl.result'
--- a/mysql-test/suite/rpl/r/rpl_row_inexist_tbl.result	2011-02-23 11:54:58 +0000
+++ b/mysql-test/suite/rpl/r/rpl_row_inexist_tbl.result	2011-06-20 13:26:35 +0000
@@ -11,6 +11,7 @@ INSERT INTO t1 VALUES (1);
 ==== Verify error on slave ====
 [on slave]
 call mtr.add_suppression("Slave SQL.*Error executing row event: .Table .test.t1. doesn.t exist., Error_code: 1146");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 include/wait_for_slave_sql_error.inc [errno=1146]
 ==== Clean up ====
 include/stop_slave_io.inc

=== modified file 'mysql-test/suite/rpl/r/rpl_sequential.result'
--- a/mysql-test/suite/rpl/r/rpl_sequential.result	2011-06-15 17:41:33 +0000
+++ b/mysql-test/suite/rpl/r/rpl_sequential.result	2011-06-20 13:26:35 +0000
@@ -14,8 +14,7 @@ select ts from test0.benchmark where sta
 select ts from test0.benchmark where state like 'master ends load' into @m_1;
 select ts from test0.benchmark where state like 'slave takes on load' into @s_0;
 select ts from test0.benchmark where state like 'slave ends load' into @s_1;
-select time_to_sec(@m_1) - time_to_sec(@m_0) as 'delta_m', 
-time_to_sec(@s_1) - time_to_sec(@s_0) as 'delta_s' into outfile 'delta.out';
+select time_to_sec(@m_1) - time_to_sec(@m_0) as 'delta.out';
 include/diff_tables.inc [master:test15.v_tm_nk, slave:test15.v_tm_nk]
 include/diff_tables.inc [master:test15.v_ti_nk, slave:test15.v_ti_nk]
 include/diff_tables.inc [master:test15.v_tm_wk, slave:test15.v_tm_wk]

=== modified file 'mysql-test/suite/rpl/r/rpl_show_errors.result'
--- a/mysql-test/suite/rpl/r/rpl_show_errors.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_show_errors.result	2011-06-20 13:26:35 +0000
@@ -4,6 +4,7 @@ CREATE TABLE t1 (a INT, b blob, PRIMARY 
 DROP TABLE t1;
 DROP TABLE t1;
 call mtr.add_suppression("Slave SQL: Error .Unknown table .test.t1.. on query.* Error_code: 1051");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 include/wait_for_slave_sql_error.inc [errno=1051]
 include/assert.inc [Last_SQL_Error_Timestamp is not null and matches the expected format]
 include/stop_slave.inc

=== modified file 'mysql-test/suite/rpl/r/rpl_slave_load_remove_tmpfile.result'
--- a/mysql-test/suite/rpl/r/rpl_slave_load_remove_tmpfile.result	2011-03-15 15:16:34 +0000
+++ b/mysql-test/suite/rpl/r/rpl_slave_load_remove_tmpfile.result	2011-06-20 13:26:35 +0000
@@ -17,5 +17,6 @@ call mtr.add_suppression("Slave: Can't g
 call mtr.add_suppression("Slave SQL: Error .Can.t get stat of.* Error_code: 13");
 call mtr.add_suppression("Slave: File.* not found.*");
 call mtr.add_suppression("Slave SQL: Error .File.* not found.* Error_code: 29");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 SET @@GLOBAL.DEBUG = '';
 include/rpl_end.inc

=== modified file 'mysql-test/suite/rpl/r/rpl_slave_start.result'
--- a/mysql-test/suite/rpl/r/rpl_slave_start.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_slave_start.result	2011-06-20 13:26:35 +0000
@@ -7,6 +7,7 @@ include/master-slave.inc
 [on slave]
 CALL mtr.add_suppression("Slave: Table 't1' already exists Error_code: 1050");
 CALL mtr.add_suppression("Slave SQL: Error .Table .t1. already exists. on query.* Error_code: 1050");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 # The statement makes SQL thread to fail.
 CREATE TABLE t1(c1 INT);
 [on master]

=== modified file 'mysql-test/suite/rpl/r/rpl_stm_000001.result'
--- a/mysql-test/suite/rpl/r/rpl_stm_000001.result	2011-06-19 13:11:25 +0000
+++ b/mysql-test/suite/rpl/r/rpl_stm_000001.result	2011-06-20 13:26:35 +0000
@@ -1,6 +1,7 @@
 include/master-slave.inc
 [connection master]
 CALL mtr.add_suppression("Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 create table t1 (word char(20) not null);
 load data infile '../../std_data/words.dat' into table t1;
 load data local infile 'MYSQL_TEST_DIR/std_data/words.dat' into table t1;

=== modified file 'mysql-test/suite/rpl/r/rpl_stm_EE_err2.result'
--- a/mysql-test/suite/rpl/r/rpl_stm_EE_err2.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_stm_EE_err2.result	2011-06-20 13:26:35 +0000
@@ -8,9 +8,9 @@ insert into t1 values(1),(2);
 ERROR 23000: Duplicate entry '2' for key 'a'
 drop table t1;
 call mtr.add_suppression("Slave SQL.*Query caused different errors on master and slave.*Error on master:.* error code=1062.*Error on slave:.* Error_code: 0");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 (expect different error codes on master and slave)
-include/wait_for_slave_sql_error.inc [errno=0]
-Last_SQL_Error = 'Query caused different errors on master and slave.     Error on master: message (format)='Duplicate entry '%-.192s' for key %d' error code=1062 ; Error on slave: actual message='no error', error code=0. Default database: 'test'. Query: 'insert into t1 values(1),(2)''
+include/wait_for_slave_sql_error.inc [errno=0,1740]
 drop table t1;
 include/stop_slave.inc
 RESET SLAVE;

=== modified file 'mysql-test/suite/rpl/r/rpl_stm_conflicts.result'
--- a/mysql-test/suite/rpl/r/rpl_stm_conflicts.result	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/r/rpl_stm_conflicts.result	2011-06-20 13:26:35 +0000
@@ -19,6 +19,7 @@ include/wait_for_slave_sql_error.inc [er
 Last_SQL_Error (expected "duplicate key" error)
 Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'INSERT INTO t1 VALUES (1)'
 call mtr.add_suppression("Slave SQL.*Duplicate entry .1. for key .PRIMARY.* Error_code: 1062");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 SELECT * FROM t1;
 a
 1

=== modified file 'mysql-test/suite/rpl/r/rpl_stm_loaddata_concurrent.result'
--- a/mysql-test/suite/rpl/r/rpl_stm_loaddata_concurrent.result	2011-06-17 18:01:58 +0000
+++ b/mysql-test/suite/rpl/r/rpl_stm_loaddata_concurrent.result	2011-06-20 13:26:35 +0000
@@ -89,7 +89,7 @@ load data CONCURRENT infile '../../std_d
 terminated by ',' optionally enclosed by '%' escaped by '@' lines terminated by
 '\n##\n' starting by '>' ignore 1 lines;
 ERROR 23000: Duplicate entry '2003-03-22' for key 'day'
-include/wait_for_slave_sql_error.inc [errno=0]
+include/wait_for_slave_sql_error.inc [errno=0,1740 ]
 drop table t1, t2;
 drop table t1, t2;
 CREATE TABLE t1 (word CHAR(20) NOT NULL PRIMARY KEY) ENGINE=INNODB;

=== modified file 'mysql-test/suite/rpl/t/rpl_circular_for_4_hosts.test'
--- a/mysql-test/suite/rpl/t/rpl_circular_for_4_hosts.test	2011-03-17 13:20:36 +0000
+++ b/mysql-test/suite/rpl/t/rpl_circular_for_4_hosts.test	2011-06-20 13:26:35 +0000
@@ -31,6 +31,8 @@ CREATE TABLE t2 (a INT NOT NULL AUTO_INC
 --source include/rpl_sync.inc
 --connection server_4
 call mtr.add_suppression("Slave SQL.*Request to stop slave SQL Thread received while applying a group that has non-transactional changes; waiting for completion of the group");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 --echo
 
 #

=== modified file 'mysql-test/suite/rpl/t/rpl_heartbeat_basic.test'
--- a/mysql-test/suite/rpl/t/rpl_heartbeat_basic.test	2011-03-17 13:20:36 +0000
+++ b/mysql-test/suite/rpl/t/rpl_heartbeat_basic.test	2011-06-20 13:26:35 +0000
@@ -18,6 +18,7 @@
 --source include/have_binlog_format_mixed.inc
 
 call mtr.add_suppression("Slave I/O: The slave I/O thread stops because a fatal error is encountered when it tried to SET @master_binlog_checksum");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 
 --echo
 

=== modified file 'mysql-test/suite/rpl/t/rpl_loaddata_fatal.test'
--- a/mysql-test/suite/rpl/t/rpl_loaddata_fatal.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/t/rpl_loaddata_fatal.test	2011-06-20 13:26:35 +0000
@@ -16,6 +16,8 @@ LOAD DATA INFILE '../../std_data/rpl_loa
 
 connection slave;
 call mtr.add_suppression("Slave SQL.*Fatal error: Not enough memory, Error_code: 1593");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 let $slave_sql_errno= 1593;
 let $show_slave_sql_error= 1;
 source include/wait_for_slave_sql_error_and_skip.inc;

=== modified file 'mysql-test/suite/rpl/t/rpl_parallel.test'
--- a/mysql-test/suite/rpl/t/rpl_parallel.test	2011-06-15 17:12:11 +0000
+++ b/mysql-test/suite/rpl/t/rpl_parallel.test	2011-06-20 13:26:35 +0000
@@ -27,7 +27,6 @@
 # In the end there will be mysql-test/delta.{parallel,sequential}.log files.
 #
 
-let $rpl_skip_reset_master_and_slave= 1;
 --source include/master-slave.inc
 
 connection master;

=== modified file 'mysql-test/suite/rpl/t/rpl_parallel_multi_db-master.opt'
--- a/mysql-test/suite/rpl/t/rpl_parallel_multi_db-master.opt	2011-02-27 17:35:25 +0000
+++ b/mysql-test/suite/rpl/t/rpl_parallel_multi_db-master.opt	2011-06-20 13:26:35 +0000
@@ -1 +1 @@
---thread_stack=512K
+--thread_stack=512K --log-warnings=0

=== modified file 'mysql-test/suite/rpl/t/rpl_parallel_multi_db-slave.opt'
--- a/mysql-test/suite/rpl/t/rpl_parallel_multi_db-slave.opt	2011-05-30 10:05:07 +0000
+++ b/mysql-test/suite/rpl/t/rpl_parallel_multi_db-slave.opt	2011-06-20 13:26:35 +0000
@@ -1,2 +1,2 @@
---thread_stack=512K --slave-transaction-retries=0
+--thread_stack=512K --slave-transaction-retries=0 --log-warnings=0
 

=== modified file 'mysql-test/suite/rpl/t/rpl_parallel_start_stop.test'
--- a/mysql-test/suite/rpl/t/rpl_parallel_start_stop.test	2011-06-15 17:12:11 +0000
+++ b/mysql-test/suite/rpl/t/rpl_parallel_start_stop.test	2011-06-20 13:26:35 +0000
@@ -15,6 +15,7 @@ connection slave;
 
 call mtr.add_suppression('Slave SQL: Could not execute Write_rows event on table test.t1');
 call mtr.add_suppression('Slave SQL: Could not execute Update_rows event on table test.t1; Deadlock found when trying to get lock');
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
 
 create view worker_proc_list as SELECT id  from Information_Schema.processlist
        where state like 'Waiting for an event from sql thread%';
@@ -66,7 +67,8 @@ let $count= 0;
 let $table= worker_proc_list;
 source include/wait_until_rows_count.inc;
 
-source include/wait_for_slave_sql_to_stop.inc;
+let $slave_sql_errno= 1740; # ER_MTS_PARALLEL_INCONSISTENT_DATA
+source include/wait_for_slave_sql_error.inc;
 
 source include/start_slave.inc;
 

=== modified file 'mysql-test/suite/rpl/t/rpl_row_img_sanity.test'
--- a/mysql-test/suite/rpl/t/rpl_row_img_sanity.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/t/rpl_row_img_sanity.test	2011-06-20 13:26:35 +0000
@@ -25,6 +25,8 @@
 -- connection slave
 call mtr.add_suppression("Slave: Can\'t find record in \'t\' Error_code: 1032");
 call mtr.add_suppression("Slave SQL: Could not execute Update_rows event on table test.t; Can.t find record in .t.* Error_code: 1032");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 -- connection master
 
 ## assertion: check that default value for binlog-row-image == 'FULL'

=== modified file 'mysql-test/suite/rpl/t/rpl_row_inexist_tbl.test'
--- a/mysql-test/suite/rpl/t/rpl_row_inexist_tbl.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/t/rpl_row_inexist_tbl.test	2011-06-20 13:26:35 +0000
@@ -31,6 +31,8 @@ connection slave;
 # slave should have stopped because can't find table t1
 # 1146 = ER_NO_SUCH_TABLE
 call mtr.add_suppression("Slave SQL.*Error executing row event: .Table .test.t1. doesn.t exist., Error_code: 1146");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 --let $slave_sql_errno= 1146
 --source include/wait_for_slave_sql_error.inc
 

=== modified file 'mysql-test/suite/rpl/t/rpl_show_errors.test'
--- a/mysql-test/suite/rpl/t/rpl_show_errors.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/t/rpl_show_errors.test	2011-06-20 13:26:35 +0000
@@ -29,6 +29,8 @@ DROP TABLE t1; 
 #         remove a table that does not exist
 let $slave_sql_errno=1051; # ER_BAD_TABLE_ERROR
 call mtr.add_suppression("Slave SQL: Error .Unknown table .test.t1.. on query.* Error_code: 1051");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 -- source include/wait_for_slave_sql_error.inc
 
 --let $errts0= query_get_value("SHOW SLAVE STATUS", $field, 1)

=== modified file 'mysql-test/suite/rpl/t/rpl_slave_load_remove_tmpfile.test'
--- a/mysql-test/suite/rpl/t/rpl_slave_load_remove_tmpfile.test	2011-03-15 15:16:34 +0000
+++ b/mysql-test/suite/rpl/t/rpl_slave_load_remove_tmpfile.test	2011-06-20 13:26:35 +0000
@@ -72,6 +72,8 @@ call mtr.add_suppression("Slave: Can't g
 call mtr.add_suppression("Slave SQL: Error .Can.t get stat of.* Error_code: 13");
 call mtr.add_suppression("Slave: File.* not found.*");
 call mtr.add_suppression("Slave SQL: Error .File.* not found.* Error_code: 29");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 --let $rpl_only_running_threads= 1
 
 eval SET @@GLOBAL.DEBUG = '$old_debug';

=== modified file 'mysql-test/suite/rpl/t/rpl_slave_start.test'
--- a/mysql-test/suite/rpl/t/rpl_slave_start.test	2011-02-23 20:01:27 +0000
+++ b/mysql-test/suite/rpl/t/rpl_slave_start.test	2011-06-20 13:26:35 +0000
@@ -10,6 +10,8 @@ connection slave;
 
 CALL mtr.add_suppression("Slave: Table 't1' already exists Error_code: 1050");
 CALL mtr.add_suppression("Slave SQL: Error .Table .t1. already exists. on query.* Error_code: 1050");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 --echo # The statement makes SQL thread to fail.
 CREATE TABLE t1(c1 INT);
 

=== modified file 'mysql-test/suite/rpl/t/rpl_stm_000001.test'
--- a/mysql-test/suite/rpl/t/rpl_stm_000001.test	2011-06-19 13:11:25 +0000
+++ b/mysql-test/suite/rpl/t/rpl_stm_000001.test	2011-06-20 13:26:35 +0000
@@ -4,6 +4,8 @@
 -- source include/master-slave.inc
 
 CALL mtr.add_suppression("Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT");
+call mtr.add_suppression("The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state");
+
 --let $engine_type= myisam
 
 # Load some data into t1

=== modified file 'sql/rpl_reporting.h'
--- a/sql/rpl_reporting.h	2011-06-09 15:27:47 +0000
+++ b/sql/rpl_reporting.h	2011-06-20 13:26:35 +0000
@@ -127,6 +127,7 @@ public:
   };
 
   Error const& last_error() const { return m_last_error; }
+  bool is_error() const { return last_error().number != 0; }
 
   virtual ~Slave_reporting_capability()= 0;
 private:

=== modified file 'sql/rpl_rli_pdb.h'
--- a/sql/rpl_rli_pdb.h	2011-06-18 18:58:21 +0000
+++ b/sql/rpl_rli_pdb.h	2011-06-20 10:52:44 +0000
@@ -119,9 +119,7 @@ public:
 typedef struct st_slave_job_group
 {
   char *group_master_log_name; // (actually redundant)
-  my_off_t master_log_pos;       // B-event log_pos
   my_off_t group_master_log_pos; // T-event lop_pos filled by W for CheckPoint
-  my_off_t group_relay_log_pos;  // filled by W
 
   /* 
      When RL name changes C allocates and fill in a new name of RL,
@@ -133,10 +131,13 @@ typedef struct st_slave_job_group
      Freeing unless NULL is left to C at CheckPoint.
   */
   char     *group_relay_log_name; // The value is last seen relay-log 
+  my_off_t group_relay_log_pos;  // filled by W
+
   ulong worker_id;
   Slave_worker *worker;
   ulonglong total_seqno;
 
+  my_off_t master_log_pos;       // B-event log_pos
   /* checkpoint coord are reset by CP and rotate:s */
   uint  checkpoint_seqno;
   my_off_t checkpoint_log_pos; // T-event lop_pos filled by W for CheckPoint

=== modified file 'sql/rpl_slave.cc'
--- a/sql/rpl_slave.cc	2011-06-19 08:04:19 +0000
+++ b/sql/rpl_slave.cc	2011-06-20 13:32:33 +0000
@@ -192,6 +192,8 @@ static int terminate_slave_thread(THD *t
                                   bool skip_lock);
 static bool check_io_slave_killed(THD *thd, Master_info *mi, const char *info);
 int slave_worker_exec_job(Slave_worker * w, Relay_log_info *rli);
+static int mts_event_coord_cmp(LOG_POS_COORD *id1, LOG_POS_COORD *id2);
+static int mts_event_job_cmp(Slave_job_group *id1, Slave_job_group *id2);
 
 /*
   Find out which replications threads are running
@@ -1073,7 +1075,7 @@ static bool io_slave_killed(THD* thd, Ma
 static bool sql_slave_killed(THD* thd, Relay_log_info* rli)
 {
   bool ret= FALSE;
-  bool is_parallel_group= FALSE;
+  bool is_parallel_warn= FALSE;
 
   DBUG_ENTER("sql_slave_killed");
 
@@ -1085,8 +1087,11 @@ static bool sql_slave_killed(THD* thd, R
       Slave can execute stop being in one of two MTS or Single-Threaded mode.
       The modes define different criteria to accept the stop.
       In particular that relates to the concept of groupping.
+      A killed Coordinator thread assumes the worst, so it warns about a
+      possible consistency issue.
     */
-    if ((is_parallel_group= rli->is_mts_in_group())
+    if ((is_parallel_warn= (rli->is_parallel_exec() && 
+                            (rli->is_mts_in_group() || thd->killed)))
         ||
         (!rli->is_parallel_exec() &&
          thd->transaction.all.cannot_safely_rollback() && rli->is_in_group()))
@@ -1136,7 +1141,7 @@ static bool sql_slave_killed(THD* thd, R
         if (!ret)
         {
           rli->report(WARNING_LEVEL, 0,
-                      !is_parallel_group ?
+                      !is_parallel_warn ?
                       "Request to stop slave SQL Thread received while "
                       "applying a group that has non-transactional "
                       "changes; waiting for completion of the group ... "
@@ -1148,8 +1153,9 @@ static bool sql_slave_killed(THD* thd, R
       }
       if (ret)
       {
-        if (is_parallel_group)
-          rli->report(WARNING_LEVEL,
+        if (is_parallel_warn)
+          rli->report(!rli->is_error() ? ERROR_LEVEL :
+                      WARNING_LEVEL,    // an error was reported by Worker
                       ER_MTS_PARALLEL_INCONSISTENT_DATA,
                       ER(ER_MTS_PARALLEL_INCONSISTENT_DATA),
                       msg_stopped_mts);
@@ -3837,16 +3843,23 @@ int mts_event_coord_cmp(LOG_POS_COORD *i
          (poscmp  < 0  ? -1 : (poscmp  > 0  ?  1 : 0))));
 }
 
+int mts_event_job_cmp(Slave_job_group *id1, Slave_job_group *id2)
+{
+  longlong filecmp= strcmp(id1->checkpoint_log_name, id2->checkpoint_log_name);
+  longlong poscmp= id1->checkpoint_log_pos - id2->checkpoint_log_pos;
+  return (filecmp < 0  ? -1 : (filecmp > 0  ?  1 :
+         (poscmp  < 0  ? -1 : (poscmp  > 0  ?  1 : 0))));
+}
+
 bool mts_recovery_groups(Relay_log_info *rli, MY_BITMAP *groups)
 { 
-  Log_event *ev= NULL; // , *desc= NULL;
+  Log_event *ev= NULL;
   const char *log_name= NULL;
   const char *errmsg= NULL;
   bool error= FALSE;
-  DYNAMIC_ARRAY above_lwm_jobs;
   bool curr_group_seen_begin= FALSE;
+  DYNAMIC_ARRAY above_lwm_jobs;
   Slave_job_group job_worker;
-  Slave_job_group job_file;
   IO_CACHE log;
   File file;
 
@@ -3871,14 +3884,19 @@ bool mts_recovery_groups(Relay_log_info 
     Slave_worker *worker=
       Rpl_info_factory::create_worker(opt_worker_repository_id, id, rli);
     worker->init_info();
-    retrieve_job(worker, job_file);
     LOG_POS_COORD w_last= {worker->group_master_log_name, worker->group_master_log_pos};
     if (mts_event_coord_cmp(&w_last, &cp) > 0)
     {
       /*
         Inserts information into a dynamic array for further processing.
+        The jobs/workers are ordered by the last checkpoint positions
+        workers have seen.
       */
-      insert_dynamic(&above_lwm_jobs, (uchar*) &job_file);
+      job_worker.worker= worker;
+      job_worker.checkpoint_log_pos= worker->checkpoint_master_log_pos;
+      job_worker.checkpoint_log_name= worker->checkpoint_master_log_name;
+
+      insert_dynamic(&above_lwm_jobs, (uchar*) &job_worker);
     }
     else
     {
@@ -3889,9 +3907,13 @@ bool mts_recovery_groups(Relay_log_info 
       worker->end_info();
       delete worker;
     }
-  };
+  }
 
-  sort_dynamic(&above_lwm_jobs, (qsort_cmp) mts_event_coord_cmp);
+  /*
+    This sorts the array by the last checkpoint positions workers have seen
+    and is required in the next recovery phase to compute shift bits.
+  */
+  sort_dynamic(&above_lwm_jobs, (qsort_cmp) mts_event_job_cmp);
   /*
     In what follows, the group Recovery Bitmap is constructed.
 
@@ -4393,7 +4415,7 @@ void slave_stop_workers(Relay_log_info *
 
   for (i= rli->workers.elements - 1; i >= 0; i--)
   {
-    Slave_worker *w;
+    Slave_worker *w= NULL;
     get_dynamic((DYNAMIC_ARRAY*)&rli->workers, (uchar*) &w, i);
 
     mysql_mutex_lock(&w->jobs_lock);
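
Note on the ordering used by mts_event_job_cmp in sql/rpl_slave.cc: the comparator orders jobs by checkpoint log name first and by checkpoint position second. The following is a minimal standalone sketch of that ordering, not the server code. JobCoord, job_coord_cmp and sort_jobs are hypothetical names that stand in for the checkpoint fields of Slave_job_group and for the sort_dynamic call. The sketch compares positions directly instead of subtracting them, which is an illustrative choice and not what the patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the checkpoint fields of Slave_job_group.
struct JobCoord {
  const char *log_name;        // checkpoint log file name
  unsigned long long log_pos;  // checkpoint position within that file
  int worker_id;               // illustrative payload only
};

// Three-way comparison: file name first, then position.
// Positions are compared directly rather than subtracted, so the sign
// of the result never depends on unsigned wrap-around.
int job_coord_cmp(const JobCoord &a, const JobCoord &b) {
  assert(a.log_name != nullptr && b.log_name != nullptr);
  int filecmp = std::strcmp(a.log_name, b.log_name);
  if (filecmp != 0) return filecmp < 0 ? -1 : 1;
  if (a.log_pos != b.log_pos) return a.log_pos < b.log_pos ? -1 : 1;
  return 0;
}

// Sort jobs in ascending checkpoint order (assumed analogue of the
// sort_dynamic call over above_lwm_jobs).
void sort_jobs(std::vector<JobCoord> &jobs) {
  std::sort(jobs.begin(), jobs.end(),
            [](const JobCoord &a, const JobCoord &b) {
              return job_coord_cmp(a, b) < 0;
            });
}
```

The file-name comparison is lexical, so it yields chronological order only when log names share a prefix and a zero-padded numeric suffix.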


Attachment: [text/bzr-bundle] bzr/andrei.elkin@oracle.com-20110620133233-tubrgs8z2ijf8fn7.bundle