On Wed, 2009-12-23 at 14:30 +0000, Luís Soares wrote:
> Hi Libing,
>
> Furthermore, I think you need to clarify what happens. For
> instance, taking your test as an example, if I stop the
> slave thread before master changing the index and then
> restart it after the FLUSH LOGS, the test just hangs... Is
> this expected behavior? Should this be documented?
>
> The portion of the test that I changed:
>
>
> STOP SLAVE IO_THREAD;
> source include/wait_for_slave_io_to_stop.inc;
We can not do that.
Manually changing index file is not affect a new staring slave.
> # Now, all events has been replicate to slave. As current dump file
> # (master-bin.000002) is the last binlog file, so master is waiting for
> new
> # events.
>
> connection master;
> # Delete './master-bin.000001' from index file.
> let $datadir= `SELECT @@DATADIR`;
> remove_file $datadir/master-bin.index;
> write_file $datadir/master-bin.index;
> ./master-bin.000002
> EOF
> # Now, only 1 entry in index file.
> # ./master-bin.000002
>
>
> # Generate master-bin.000003, but it is in the second line.
> FLUSH LOGS;
> # Now, 2 entries in index file.
> # ./master-bin.000002
> # ./master-bin.000003
>
> -- connection slave
> START SLAVE IO_THREAD;
> source include/wait_for_slave_io_to_start.inc;
> -- connection master
>
>
> REQUESTS
> --------
> n/a
>
> SUGGESTIONS
> -----------
> n/a
>
> DETAILS
> -------
>
> On Thu, 2009-12-10 at 12:10 +0000, Li-Bing.Song@stripped wrote:
> > #At file:///home/anders/work/bzrwork/worktree2/mysql-5.1-bugteam/ based on
> revid:li-bing.song@stripped
> >
> > 3215 Li-Bing.Song@stripped 2009-12-10
> > BUG #28421 Infinite loop on slave relay logs
> >
> > That, manually deleteing one or more entries from 'master-bin.index',
> will
> > cause master infinitely loop to send one binlog file.
> >
>
> suggestion: Maybe loose the 'That' ^, and just make:
> "Manually deleting one or more ..."
>
> > When starting a dump session, master opens index file and search the
> binlog file
> > which is being requested by the slave. The position of the binlog file in
> the
> > index file is recorded. it will be used to find the next binlog file when
> current
> > binlog file has been dumped completely. As only the position is used, it
> will can
> > not get the correct file if some entries has been removed manually from
> the index file.
>
> suggestion: It may not get the correct file
>
> > the master will reopen the current binlog file which has been dump
> completely
>
> suggestion: s/has been dump/has dumped
>
> > and redump it if it can not get the next binlog file's name from index
> file.
> > It obviously is a logical error.
> >
> >
> > Manually changing index file is a illegal action, so after this patch,
> master
> > sends a fatal error to slave and close the dump session if a new binlog
> file
> > has been generated and master can not get it from the index file.
>
> See RC3.
>
> > added:
> > mysql-test/suite/rpl/r/rpl_manual_change_index_file.result
> > mysql-test/suite/rpl/t/rpl_manual_change_index_file.test
> > modified:
> > sql/sql_repl.cc
> > === added file 'mysql-test/suite/rpl/r/rpl_manual_change_index_file.result'
> > --- a/mysql-test/suite/rpl/r/rpl_manual_change_index_file.result 1970-01-01
> 00:00:00 +0000
> > +++ b/mysql-test/suite/rpl/r/rpl_manual_change_index_file.result 2009-12-10
> 12:10:33 +0000
> > @@ -0,0 +1,25 @@
> > +stop slave;
> > +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> > +reset master;
> > +reset slave;
> > +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> > +start slave;
> > +FLUSH LOGS;
> > +CREATE TABLE t1(c1 INT);
> > +FLUSH LOGS;
> > +call mtr.add_suppression('Got fatal error 1236 from master when reading data
> from binary log: .*could not find next log');
> > +Last_IO_Error
> > +Got fatal error 1236 from master when reading data from binary log: 'could not
> find next log'
> > +CREATE TABLE t2(c1 INT);
> > +FLUSH LOGS;
> > +CREATE TABLE t3(c1 INT);
> > +FLUSH LOGS;
> > +CREATE TABLE t4(c1 INT);
> > +START SLAVE IO_THREAD;
> > +SHOW TABLES;
> > +Tables_in_test
> > +t1
> > +t2
> > +t3
> > +t4
> > +DROP TABLE t1, t2, t3, t4;
> >
> > === added file 'mysql-test/suite/rpl/t/rpl_manual_change_index_file.test'
> > --- a/mysql-test/suite/rpl/t/rpl_manual_change_index_file.test 1970-01-01
> 00:00:00 +0000
> > +++ b/mysql-test/suite/rpl/t/rpl_manual_change_index_file.test 2009-12-10
> 12:10:33 +0000
> > @@ -0,0 +1,69 @@
> > +source include/master-slave.inc;
> > +
> > +#
> > +# BUG#28421 Infinite loop on slave relay logs
> > +#
> > +# That, manually deleteing one or more entries from 'master-bin.index', will
> > +# cause master infinitely loop to send one binlog file.
> > +#
> > +# Manually changing index file is a illegal action, so when this happen, we
> > +# send a fatal error to slave and close the dump session.
> > +
> > +FLUSH LOGS;
> > +# Now, 2 entries in index file.
> > +# ./master-bin.000001
> > +# ./master-bin.000002
> > +
> > +CREATE TABLE t1(c1 INT);
> > +# Now, the current dump file(master-bin.000002) is the second line of index
> > +# file
> > +sync_slave_with_master;
> > +# Now, all events has been replicate to slave. As current dump file
> > +# (master-bin.000002) is the last binlog file, so master is waiting for new
> > +# events.
> > +
> > +connection master;
> > +# Delete './master-bin.000001' from index file.
> > +let $datadir= `SELECT @@DATADIR`;
> > +remove_file $datadir/master-bin.index;
> > +write_file $datadir/master-bin.index;
> > +./master-bin.000002
> > +EOF
>
> RC1 and RC2 ^
>
> > +# Now, only 1 entry in index file.
> > +# ./master-bin.000002
> > +
> > +# Generate master-bin.000003, but it is in the second line.
> > +FLUSH LOGS;
> > +# Now, 2 entries in index file.
> > +# ./master-bin.000002
> > +# ./master-bin.000003
> > +
> > +# Now, master know that new binlog file(master-bin.000003) has been generated.
> > +# It expects that the new binlog file is in third line of index file, but
> > +# there is no third line in index file. It is so strange that master sends an
> > +# error to slave.
> > +call mtr.add_suppression('Got fatal error 1236 from master when reading data
> from binary log: .*could not find next log');
> > +connection slave;
> > +source include/wait_for_slave_io_to_stop.inc;
> > +let $last_error= query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1);
> > +echo Last_IO_Error;
> > +echo $last_error;
> > +
> > +connection master;
> > +CREATE TABLE t2(c1 INT);
> > +FLUSH LOGS;
> > +CREATE TABLE t3(c1 INT);
> > +FLUSH LOGS;
> > +CREATE TABLE t4(c1 INT);
> > +
> > +connection slave;
> > +START SLAVE IO_THREAD;
> > +source include/wait_for_slave_io_to_start.inc;
> > +
> > +connection master;
> > +sync_slave_with_master;
> > +SHOW TABLES;
> > +
> > +connection master;
> > +DROP TABLE t1, t2, t3, t4;
> > +source include/master-slave-end.inc;
> >
> > === modified file 'sql/sql_repl.cc'
> > --- a/sql/sql_repl.cc 2009-09-18 08:20:29 +0000
> > +++ b/sql/sql_repl.cc 2009-12-10 12:10:33 +0000
> > @@ -711,11 +711,14 @@ impossible position";
> >
> > thd_proc_info(thd, "Finished reading one binlog; switching to next
> binlog");
> > switch (mysql_bin_log.find_next_log(&linfo, 1)) {
> > - case LOG_INFO_EOF:
> > - loop_breaker = (flags & BINLOG_DUMP_NON_BLOCK);
> > - break;
> > case 0:
> > break;
> > + case LOG_INFO_EOF:
> > + if (mysql_bin_log.is_active(log_file_name))
> > + {
> > + loop_breaker = (flags & BINLOG_DUMP_NON_BLOCK);
> > + break;
> > + }
> > default:
> > errmsg = "could not find next log";
> > my_errno= ER_MASTER_FATAL_ERROR_READING_BINLOG;
> >
>
>
>
--
Your Sincerely,
Libing Song
==================================
MySQL Replication Team
Software Engineer
Certified (ISC)2 CISSP
Email : Li-Bing.Song@stripped
Skype : libing.song
MSN : slb_database@stripped
Phone : +86 010-6505-4020 ext. 319
Mobile: +86 138-1144-2038
==================================