List:Commits« Previous MessageNext Message »
From:Luís Soares Date:December 23 2009 2:30pm
Subject:Re: bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215)
Bug#28421
View as plain text  
Hi Libing,

  Nice Work. Please find my review comments below.

STATUS
------
  Not approved.

REQUIRED CHANGES
----------------
 
  RC1. The test case needs some work, as I believe it is not
       platform agnostic. For example, take this change:

       +write_file $datadir/master-bin.index;
       +./master-bin.000002
       +EOF

       In windows the path separator is not './' but instead
       '.\'. This needs to be fixed.

  RC2. I think there is another windows issue with this part of
       the test, related to removing and creating a new file,
       while MySQL has it opened. The specific part of the test I
       am wondering is:

         +remove_file $datadir/master-bin.index;
         +write_file $datadir/master-bin.index;
         +./master-bin.000002
         +EOF

       You need to try this test on windows before pushing. If
       you find problems with this, I suggest to try something
       like (basically, we truncate the file and then append the
       required lines, so that we dont delete and create it
       again):

         let INDEX_FILE=$index;
         perl;
         $file= $ENV{'INDEX_FILE'};
         open(FILE, ">$file") || die "Unable to open $file.";
         truncate(FILE,0);
         close ($file);
         EOF
       
         -- append_file $index 
         master-bin.000002
         EOF

  RC3. I also don't think that this is entirely correct:
   
       "Manually changing index file is a illegal action, ..."

       We allow the user to change the index file (even more
       after BUG 34582). What happens is that once he does that,
       it's his own responsibility.  It's like playing with fire,
       so the user should expect some issues and this seems
       definitely like one of them.

       Would you mind clarifying that in the commit message?

       Furthermore, I think you need to clarify what happens. For
       instance, taking your test as an example, if I stop the
       slave thread before master changing the index and then
       restart it after the FLUSH LOGS, the test just hangs... Is
       this expected behavior? Should this be documented? 

       The portion of the test that I changed:


STOP SLAVE IO_THREAD;
source include/wait_for_slave_io_to_stop.inc;
# Now, all events has been replicate to slave. As current dump file
# (master-bin.000002) is the last binlog file, so master is waiting for
new
# events.

connection master;
# Delete './master-bin.000001' from index file.
let $datadir= `SELECT @@DATADIR`;
remove_file $datadir/master-bin.index;
write_file $datadir/master-bin.index;
./master-bin.000002
EOF
# Now, only 1 entry in index file.
# ./master-bin.000002


# Generate master-bin.000003, but it is in the second line.
FLUSH LOGS;
# Now, 2 entries in index file.
# ./master-bin.000002
# ./master-bin.000003

-- connection slave
START SLAVE IO_THREAD;
source include/wait_for_slave_io_to_start.inc;
-- connection master


REQUESTS
--------
 n/a

SUGGESTIONS
-----------
 n/a

DETAILS 
-------

On Thu, 2009-12-10 at 12:10 +0000, Li-Bing.Song@stripped wrote:
> #At file:///home/anders/work/bzrwork/worktree2/mysql-5.1-bugteam/ based on
> revid:li-bing.song@stripped
> 
>  3215 Li-Bing.Song@stripped	2009-12-10
>       BUG #28421 Infinite loop on slave relay logs
>       
>       That, manually deleteing one or more entries from 'master-bin.index', will
>       cause master infinitely loop to send one binlog file. 
>       

suggestion: Maybe loose the 'That' ^, and just make:
            "Manually deleting one or more ..."

>       When starting a dump session, master opens index file and search the binlog
> file
>       which is being requested by the slave. The position of the binlog file in the
>       index file is recorded. it will be used to find the next binlog file when
> current
>       binlog file has been dumped completely. As only the position is used, it will
> can
>       not get the correct file if some entries has been removed manually from the
> index file.

suggestion: It may not get the correct file

>       the master will reopen the current binlog file which has been dump completely

suggestion: s/has been dump/has dumped 

>       and redump it if it can not get the next binlog file's name from index file.
>       It obviously is a logical error.
>       
>       
>       Manually changing index file is a illegal action, so after this patch, master
>       sends a fatal error to slave and close the dump session if a new binlog file
>       has been generated and master can not get it from the index file.

See RC3.

>     added:
>       mysql-test/suite/rpl/r/rpl_manual_change_index_file.result
>       mysql-test/suite/rpl/t/rpl_manual_change_index_file.test
>     modified:
>       sql/sql_repl.cc
> === added file 'mysql-test/suite/rpl/r/rpl_manual_change_index_file.result'
> --- a/mysql-test/suite/rpl/r/rpl_manual_change_index_file.result	1970-01-01 00:00:00
> +0000
> +++ b/mysql-test/suite/rpl/r/rpl_manual_change_index_file.result	2009-12-10 12:10:33
> +0000
> @@ -0,0 +1,25 @@
> +stop slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +reset master;
> +reset slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +start slave;
> +FLUSH LOGS;
> +CREATE TABLE t1(c1 INT);
> +FLUSH LOGS;
> +call mtr.add_suppression('Got fatal error 1236 from master when reading data from
> binary log: .*could not find next log');
> +Last_IO_Error
> +Got fatal error 1236 from master when reading data from binary log: 'could not find
> next log'
> +CREATE TABLE t2(c1 INT);
> +FLUSH LOGS;
> +CREATE TABLE t3(c1 INT);
> +FLUSH LOGS;
> +CREATE TABLE t4(c1 INT);
> +START SLAVE IO_THREAD;
> +SHOW TABLES;
> +Tables_in_test
> +t1
> +t2
> +t3
> +t4
> +DROP TABLE t1, t2, t3, t4;
> 
> === added file 'mysql-test/suite/rpl/t/rpl_manual_change_index_file.test'
> --- a/mysql-test/suite/rpl/t/rpl_manual_change_index_file.test	1970-01-01 00:00:00
> +0000
> +++ b/mysql-test/suite/rpl/t/rpl_manual_change_index_file.test	2009-12-10 12:10:33
> +0000
> @@ -0,0 +1,69 @@
> +source include/master-slave.inc;
> +
> +#
> +# BUG#28421 Infinite loop on slave relay logs
> +# 
> +# That, manually deleteing one or more entries from 'master-bin.index', will
> +# cause master infinitely loop to send one binlog file.
> +# 
> +# Manually changing index file is a illegal action, so when this happen, we
> +# send a fatal error to slave and close the dump session.
> +
> +FLUSH LOGS;
> +# Now, 2 entries in index file.
> +# ./master-bin.000001
> +# ./master-bin.000002
> +
> +CREATE TABLE t1(c1 INT);
> +# Now, the current dump file(master-bin.000002) is the second line of index
> +# file
> +sync_slave_with_master;
> +# Now, all events has been replicate to slave. As current dump file
> +# (master-bin.000002) is the last binlog file, so master is waiting for new
> +# events.
> +
> +connection master;
> +# Delete './master-bin.000001' from index file.
> +let $datadir= `SELECT @@DATADIR`;
> +remove_file $datadir/master-bin.index;
> +write_file $datadir/master-bin.index;
> +./master-bin.000002
> +EOF

RC1 and RC2 ^

> +# Now, only 1 entry in index file.
> +# ./master-bin.000002
> +
> +# Generate master-bin.000003, but it is in the second line.
> +FLUSH LOGS;
> +# Now, 2 entries in index file.
> +# ./master-bin.000002
> +# ./master-bin.000003
> +
> +# Now, master know that new binlog file(master-bin.000003) has been generated.
> +# It expects that the new binlog file is in third line of index file, but
> +# there is no third line in index file. It is so strange that master sends an
> +# error to slave. 
> +call mtr.add_suppression('Got fatal error 1236 from master when reading data from
> binary log: .*could not find next log');
> +connection slave;
> +source include/wait_for_slave_io_to_stop.inc;
> +let $last_error= query_get_value(SHOW SLAVE STATUS, Last_IO_Error, 1);
> +echo Last_IO_Error;
> +echo $last_error;
> +
> +connection master;
> +CREATE TABLE t2(c1 INT);
> +FLUSH LOGS;
> +CREATE TABLE t3(c1 INT);
> +FLUSH LOGS;
> +CREATE TABLE t4(c1 INT);
> +
> +connection slave;
> +START SLAVE IO_THREAD;
> +source include/wait_for_slave_io_to_start.inc;
> +
> +connection master;
> +sync_slave_with_master;
> +SHOW TABLES;
> +
> +connection master;
> +DROP TABLE t1, t2, t3, t4;
> +source include/master-slave-end.inc;
> 
> === modified file 'sql/sql_repl.cc'
> --- a/sql/sql_repl.cc	2009-09-18 08:20:29 +0000
> +++ b/sql/sql_repl.cc	2009-12-10 12:10:33 +0000
> @@ -711,11 +711,14 @@ impossible position";
>  
>        thd_proc_info(thd, "Finished reading one binlog; switching to next binlog");
>        switch (mysql_bin_log.find_next_log(&linfo, 1)) {
> -      case LOG_INFO_EOF:
> -	loop_breaker = (flags & BINLOG_DUMP_NON_BLOCK);
> -	break;
>        case 0:
>  	break;
> +      case LOG_INFO_EOF:
> +        if (mysql_bin_log.is_active(log_file_name))
> +        {
> +          loop_breaker = (flags & BINLOG_DUMP_NON_BLOCK);
> +          break;
> +        }
>        default:
>  	errmsg = "could not find next log";
>  	my_errno= ER_MASTER_FATAL_ERROR_READING_BINLOG;
> 


Thread
bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215) Bug#28421Li-Bing.Song10 Dec
  • Re: bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215)Bug#28421He Zhenxing11 Dec
    • Re: bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215)Bug#28421He Zhenxing15 Dec
  • Re: bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215)Bug#28421Luís Soares23 Dec
    • Re: bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215)Bug#28421Libing Song5 Jan
      • Re: bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215)Bug#28421Luís Soares5 Jan
        • Re: bzr commit into mysql-5.1-bugteam branch (Li-Bing.Song:3215)Bug#28421Libing Song6 Jan