List:Commits« Previous MessageNext Message »
From:Sven Sandberg Date:October 8 2008 2:51pm
Subject:Re: bzr commit into mysql-5.1 branch (luis.soares:2670) Bug#39325
View as plain text  
Hi Luís,

Good work with the test case. I have some suggestions below. Mainly, I'd
like to simplify some things in the test. Note: I'll review it once
again after you have committed the bugfix, so that I can run it on a
server where it succeeds.


Luís Soares wrote:
> #At file:///stuff/workspace/mysql-server/5.1-rpl-39325/
> 
>  2670 Luís Soares	2008-10-07
>       BUG#39325: In MYSQL_LOG::purge_first_log the order of operations is to first
> call

We usually start the changeset comment with:

BUG#<n>: <bug title>

So in this case it would be

BUG#39325: Server crash inside MYSQL_LOG::purge_first_log halts replication

>       MYSQL_LOG::purge_log(), which deletes the relay log and then updates the relay
> index, and
>       then to call flush_relay_log_info(), If the slave dies before the call to
>       flush_relay_log_info(), on restart of the server, replication stops with an
> error.
>       
>       This commits the test case to reproduce the deffective behavior.

speling error: defective

> added:
>   mysql-test/suite/rpl/t/rpl_relay_info_crash_slave.test
> modified:
>   sql/log.cc
> 
> === added file 'mysql-test/suite/rpl/t/rpl_relay_info_crash_slave.test'
> --- a/mysql-test/suite/rpl/t/rpl_relay_info_crash_slave.test	1970-01-01 00:00:00
> +0000
> +++ b/mysql-test/suite/rpl/t/rpl_relay_info_crash_slave.test	2008-10-07 10:03:06
> +0000
> @@ -0,0 +1,108 @@
> +########################################################################
> +#                                                                      #
> +# Testing seamless slave restarting after crashing before flushing     #
> +# the relay log info.                                                  #
> +#                                                                      #
> +# Date:      2009-10-03                                                #
> +# Developer: lsoares (Luis.Soares@stripped)                             #

Please avoid dates and names in files: it's difficult to maintain, and
everything is available from 'bzr annotate' anyways.

> +# Reason:    BUG#39325                                                 #
> +#                                                                      #
> +# Note:      This test borrows code from tests:                        # 
> +#             * t/crash_commit_before.test                             #
> +#             * extra/rpl_tests/rpl_max_relay_size.test                #

I think this is unnecessary, we steal all the time from tests, no need
to keep track of that :-)

> +########################################################################
> +
> +
> +# This test REQUIRES:
> +#  1. master slave replication setup
> +#  2. server compiled with debug

I think it's unnecessary to state this, it doesn't say more than the two
lines below...

> +--source include/master-slave.inc
> +--source include/have_debug.inc

You probably need to source include/not_valgrind.inc as well, to avoid
warnings for the memory leak when crashing the server (like
crash_commit_before).

> +
> +##### Option, for debugging support #####
> +let $debug= 1;

I think the printouts guarded by this flag are good, no need for a
special flag guarding them. I suggest removing $debug and make it print
the connection name unconditionally.

> +
> +##### SETUP ######
> +
> +# We have to sync with master, to ensure slave had time to start properly
> +# # before we stop it. If not, we get errors about UNIX_TIMESTAMP() in the
> +# log.
> +sync_slave_with_master;

I think this can be removed, it's probably not needed.

> +connection slave;
> +stop slave;
> +connection master;
> +
> +if ( $debug ) {
> +  --echo *** On Master ***
> +}
> +
> +--disable_warnings
> +DROP TABLE IF EXISTS t1;
> +--enable_warnings

Not needed: include/master-slave.inc does that already.

> +
> +CREATE TABLE t1 (id INT);
> +
> +##### Test ######
> +# Outline:
> +#   1. insert on master
> +#   2. start slave
> +#   3. crash slave before flushing relay logs
> +#   4. restart slave
> +#   5. sync it with master
> +
> +# on the master insert some rows
> +let $1=10;
> +disable_query_log;
> +begin;
> +while ($1)
> +{
> +  eval INSERT INTO t1 VALUES( $1 );
> +  dec $1;
> +}
> +enable_query_log;

I don't think this test needs any rows in the table, so this can
probably be removed.

> +DROP TABLE t1;
> +save_master_pos;
> +
> +# on the slave:
> +#  1. reset it and start it
> +#  2. set the crash flag
> +#  3. set the restart flag
> +#  4. flush logs
> +#  5. sync it with master
> +
> +if ( $debug ) {
> +  --echo *** On Slave ***
> +}
> +
> +connection slave;
> +reset slave;
> +start slave;

I think there is a race here: the slave may either finish replicating
before the crash, or may have events still not replicated before the
crash. I think you can make the test deterministic by adding
sync_slave_with_master here...

> +
> +SET GLOBAL debug= "+d,crash_before_flush_relay_log_info";
> +
> +# Write file to make mysql-test-run.pl expect crash and restart
> +--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.2.expect
> +
> +if ( $debug ) {
> +  --echo   * On Slave: Flushing logs.
> +}
> +
> +# expect one of this errors (lost connection)
> +--error 2006,2013
> +flush logs;
> +
> +# enable slave restart
> +--enable_reconnect
> +
> +# wait until the connection is up again
> +--source include/wait_until_connected_again.inc
> +--source include/wait_for_slave_to_start.inc

... and if it is important to have events that are still unreplicated at
crash time, you can do something on master here, and then
sync_slave_with_master.

> +
> +#### TEAR DOWN ####
> +
> +#
> +#   As the master already does a DROP TABLE t1
> +#   there is no need to redo it here.
> +
> +# sync with master to clean everything up
> +sync_with_master 0;
> 
> === modified file 'sql/log.cc'
> --- a/sql/log.cc	2008-10-06 08:27:36 +0000
> +++ b/sql/log.cc	2008-10-07 10:03:06 +0000
> @@ -3013,6 +3013,9 @@ int MYSQL_BIN_LOG::purge_first_log(Relay
>      rli->notify_group_relay_log_name_update();
>    }
>  
> +  /* fault-injection required for bug#39325 (test: rpl.rpl_relay_info_crash_slave)
> */
> +  DBUG_EXECUTE_IF("crash_before_flush_relay_log_info", abort(););
> +
>    /* Store where we are in the new file for the execution thread */
>    flush_relay_log_info(rli);
>  
> 


-- 
Sven Sandberg, Software Engineer
MySQL AB, www.mysql.com
Thread
bzr commit into mysql-5.1 branch (luis.soares:2670) Bug#39325Luís Soares7 Oct
  • Re: bzr commit into mysql-5.1 branch (luis.soares:2670) Bug#39325Sven Sandberg8 Oct