Hi Libing,
Nice work. Please see some comments below. Thanks!
Best Reagards,
Daogang
Li-Bing.Song@stripped wrote:
> #At file:///home/anders/work/bzrwork/bug45516/mysql-5.1-bugteam/ based on
> revid:v.narayanan@stripped
>
> 3162 Li-Bing.Song@stripped 2009-10-19
> Bug#21437 server_errno=29 error message flood mysqld error log
>
> Sometimes, Slaves requests a binlog file which does not exists on master.
>
s/does not exists/ is not exist
> On this occasion, master will send an error to slave. The error number is 29
>
On this occasion, master will send 'EE_FILENOTFOUND' error to slave. The 'EE_FILENOTFOUND'
error's number is 29.
> Which means file does not find. This error message will lead a flood.
>
The 'EE_FILENOTFOUND' is a low level error, so the slave I/O thread will
still retry to request,
so that lots 'EE_FILENOTFOUND' errors lead the flood.
>
> Instead of error 29, error
> COM_BINLOG_DUMP has a group of its own errors, The errors which have happend
> will be
> sent to slave immediately.
> Before the COM_BINLOG_DUMP error is reported by this function, an error
> sometimes has been set into thd->main_da by a low level function such as
> my_register_filename, which is called during COM_BINLOG_DUMP is
> executing. Thus, instead of the COM_BINLOG_DUMP error, the low level
> error is sent to slave.
>
> The low level error should never be sent to slave, it only be reported
> to master.
>
To fix the problem, the ER_MASTER_FATAL_ERROR_READING_BINLOG instead of
'EE_FILENOTFOUND' error should be send to slave for notifying the slave
I/O thread
to stop immediately.
> added:
> mysql-test/suite/rpl/r/rpl_master_fatal_error.result
> mysql-test/suite/rpl/t/rpl_master_fatal_error.test
> modified:
> sql/sql_repl.cc
> === added file 'mysql-test/suite/rpl/r/rpl_master_fatal_error.result'
> --- a/mysql-test/suite/rpl/r/rpl_master_fatal_error.result 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/r/rpl_master_fatal_error.result 2009-10-19 08:23:22 +0000
> @@ -0,0 +1,20 @@
> +stop slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +reset master;
> +reset slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +start slave;
> +call mtr.add_suppression("Got fatal error 1236 from master when reading data from
> binary log:");
> +CREATE TABLE t1(c1 int);
> +DROP TABLE t1;
> +FLUSH LOGS;
> +CREATE TABLE t1(c1 int);
> +DROP TABLE t1;
> +STOP SLAVE;
> +CHANGE MASTER TO master_host='127.0.0.1', master_log_file='master-bin.000001';
> +START SLAVE;
> +Last_IO_Error
> +echo Got fatal error 1236 from master when reading data from binary log: 'Could not
> open log file'
> +STOP SLAVE SQL_THREAD;
> +CHANGE MASTER TO master_host='127.0.0.1', master_log_file='master-bin.000002';
> +START SLAVE;
>
> === added file 'mysql-test/suite/rpl/t/rpl_master_fatal_error.test'
> --- a/mysql-test/suite/rpl/t/rpl_master_fatal_error.test 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/t/rpl_master_fatal_error.test 2009-10-19 08:23:22 +0000
> @@ -0,0 +1,47 @@
> +###############################################################################
> +# Bug #21437 server_errno=29 error message flood mysqld error log
> +#
> +# server_errno 29 is EE_FILENOTFOUND. It should never be reported to slave.
> +# Instead of EE_FILENOTFOUND, ER_MASTER_FATAL_ERROR_READING_BINLOG and the
> +# specific reason "Could not open log file" are sent to slave.
>
This test verifies if the 'ER_MASTER_FATAL_ERROR_READING_BINLOG' instead
of 'EE_FILENOTFOUND'
error is send to salve, so that the slave I/O thread stops immediately.
> +################################################################################
> +
> +source include/master-slave.inc;
> +#Add supression for the expected error in slave error log
> +call mtr.add_suppression("Got fatal error 1236 from master when reading data from
> binary log:");
> +
> +CREATE TABLE t1(c1 int);
> +DROP TABLE t1;
> +
> +FLUSH LOGS;
> +
> +CREATE TABLE t1(c1 int);
> +DROP TABLE t1;
> +sync_slave_with_master;
> +
> +STOP SLAVE;
> +source include/wait_for_slave_to_stop.inc;
> +
> +#This operation will result in error EE_FILENOTFOUND which happens when slave IO
> thread is connecting to master.
> +move_file $MYSQL_TEST_DIR/var/mysqld.1/data/master-bin.000001
> $MYSQL_TEST_DIR/var/mysqld.1/data/master-bin.bak;
> +
> +connection slave;
> +CHANGE MASTER TO master_host='127.0.0.1', master_log_file='master-bin.000001';
> +START SLAVE;
> +source include/wait_for_slave_io_to_stop.inc;
> +
> +# Instead of EE_FILENOTFOUND, ER_MASTER_FATAL_ERROR_READING_BINLOG and the
> +# specific information are sent to slave.
> +let $last_io_error= query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1);
> +echo Last_IO_Error
> +echo $last_io_error;
> +
> +move_file $MYSQL_TEST_DIR/var/mysqld.1/data/master-bin.bak
> $MYSQL_TEST_DIR/var/mysqld.1/data/master-bin.000001;
> +
> +STOP SLAVE SQL_THREAD;
> +source include/wait_for_slave_sql_to_stop.inc;
> +CHANGE MASTER TO master_host='127.0.0.1', master_log_file='master-bin.000002';
> +START SLAVE;
> +source include/wait_for_slave_to_start.inc;
> +
> +source include/master-slave-end.inc;
>
> === modified file 'sql/sql_repl.cc'
> --- a/sql/sql_repl.cc 2009-09-18 08:20:29 +0000
> +++ b/sql/sql_repl.cc 2009-10-19 08:23:22 +0000
> @@ -777,6 +777,20 @@ err:
> if (file >= 0)
> (void) my_close(file, MYF(MY_WME));
>
> + /*
> + Bug #21437 server_errno=29 error message flood mysqld error log
> +
> + COM_BINLOG_DUMP has a group of its own errors, The errors which have happend
> will be
> + sent to slave immediately.
> + Before the COM_BINLOG_DUMP error is reported by this function, an error
> + sometimes has been set into thd->main_da by a low level function such as
> + my_register_filename, which is called during COM_BINLOG_DUMP is
> + executing. Thus, instead of the COM_BINLOG_DUMP error, the low level
> + error is sent to slave.
> + The low level error should never be sent to slave, it only be reported
> + to master.
> + */
>
See above.
> + thd->main_da.reset_diagnostics_area();
> my_message(my_errno, errmsg, MYF(0));
> DBUG_VOID_RETURN;
> }
>
>
> ------------------------------------------------------------------------
>
>