List:Commits« Previous MessageNext Message »
From:Luís Soares Date:April 22 2010 10:26am
Subject:Re: bzr commit into mysql-5.1-rep+2 branch (luis.soares:3183) Bug#52166
View as plain text  
Hi Andrei,

On Thu, 2010-04-22 at 11:20 +0300, Andrei Elkin wrote:
> Luis, hello.
> 
> 
> The patch is great!
> Although my heart was crying to see how ruthlessly you had exterminated my awesome 
> simulations :-).

:)

> But the gain simplicity heals it.
> 
> I have few little findings.
> 
> >       Solution: We fix this by not saving and restoring the
> >       last_master_timestamp when the SQL thread goes to sleep. Instead,
> >       we set the rli->last_master_timestamp to zero, when the SQL
> >       thread starts waiting, and keep it to zero, 
> 
> >       when  the SQL thread awakes. 
> 
> Would it be better to say
> 
>         until the SQL thread has been awakened (by a new event) ?
> 
> or as you're saying later

Ok.

> 
> >       A complementary change in this patch, is the one that
> >       sets rli->last_master_timestamp before and not after event
> >       execution. Both of these changes make the spike go away, and keep
> >       showing zero, 
> 
> >       until the event starts being processed,
> 
> in here ^.
> 
> >       as opposed
> >       to showing zero until the event finishes its execution.
> >      @ mysql-test/suite/manual/t/rpl_replication_delay.test
> >         Removed this unmaintained test case.
> >      @ mysql-test/suite/rpl/t/seconds_behind_master.test
> >         Added some new tests. Collected remains from the removed 
> >         rpl_replication_delay, refactored and added in this file as well.
> >      @ sql/log_event.cc
> >         Reworked the fault injection to meet the new test case needs.
> >         Refactored the stmt_done interface (does not need 'when' anymore.
> >      @ sql/log_event_old.cc
> >         Accomodating changes in the stmt_done interface.
> >      @ sql/rpl_rli.cc
> >         Updated stmt_done implementation.
> >      @ sql/rpl_rli.h
> >         Changed stmt_done interface, removing the not needed 
> >         event_creation_time.
> >      @ sql/slave.cc
> >         Two changes: 
> >           (i) removed the save and restore of rli->last_master_timestamp, 
> >               when the SQL thread goes to sleeps. 
> >          (ii) Updating the rli->last_master_timestamp at the beginning of
> >               the event execution.
> >
> >     removed:
> >       mysql-test/suite/manual/
> >       mysql-test/suite/manual/r/
> >       mysql-test/suite/manual/r/rpl_replication_delay.result
> >       mysql-test/suite/manual/t/
> >       mysql-test/suite/manual/t/rpl_replication_delay-slave.opt
> >       mysql-test/suite/manual/t/rpl_replication_delay.test
> >     added:
> >       mysql-test/suite/rpl/r/seconds_behind_master.result
> >       mysql-test/suite/rpl/t/seconds_behind_master.test
> >     modified:
> >       sql/log_event.cc
> >       sql/log_event_old.cc
> >       sql/rpl_rli.cc
> >       sql/rpl_rli.h
> >       sql/slave.cc
> 
> > === added file 'mysql-test/suite/rpl/t/seconds_behind_master.test'
> > --- a/mysql-test/suite/rpl/t/seconds_behind_master.test	1970-01-01 00:00:00
> +0000
> > +++ b/mysql-test/suite/rpl/t/seconds_behind_master.test	2010-04-15 16:39:31
> +0000
> > @@ -0,0 +1,152 @@
> > +-- source include/master-slave.inc
> > +-- source include/have_debug.inc
> > +
> > +# make the test only run once (STMT is actually needed because of the SLEEPs
> below)
> > +-- source include/have_binlog_format_statement.inc 
> > +
> > +SET @old_debug= @@global.debug;
> > +call mtr.add_suppression('Unsafe statement binlogged in statement format since
> BINLOG_FORMAT = STATEMENT');
> > +
> > +#
> > +#  Based on BUG#29309 test case
> > +#
> > +
> > +connection master;
> > +create table t1 (f1 int);
> > +sync_slave_with_master;
> > +source include/stop_slave.inc;
> > +connection master;
> > +insert into t1 values (1);
> > +
> > +sleep 3;
> > +
> > +#connection slave;
> > +connection slave;
> > +lock table t1 write;
> > +source include/start_slave.inc;
> > +
> > +# ok, now wait for the SQL thread to start the insert
> > +let $wait_timeout= 60; # Override default of 30 seconds with 60.
> > +let $show_statement= SHOW PROCESSLIST;
> > +let $field= Info;
> > +let $condition= = 'insert into t1 values (1)';
> > +source include/wait_show_condition.inc;
> > +
> > +let $sbm= query_get_value("SHOW SLAVE STATUS", Seconds_Behind_Master, 1);
> > +if (`SELECT $sbm < 3 or $sbm > 6`)
> > +{
> > +  -- echo #### Seconds behind master must show value approx. to 3
> > +  -- echo #### giving a 3 secs margin (machine where the test executes
> > +  -- echo #### may be slow). Best effort approach.
> > +  -- echo #### $sbm > 6 || $sbm < 3
> > +  -- die
> > +}
> 
> I and Sven discussed similar issue on #rep yesterday.
> Sure, assert($sbm >= 3) shall hold because of the explicit --sleep 3.
> However, there is no determined range - it all depends on exec env -  and therefore 
> the right margin assert($sbm <= 6).
> We can only wait for some reasonable time in 
> > +source include/wait_show_condition.inc;
> 
> So I suggest to relax the assert to if (`SELECT $sbm < 3) ...

OK, but then as you state: 

"Sure, assert($sbm >= 3) shall hold because of the explicit --sleep 3."

I guess the assertion will then be almost empty. However,
it provides some coverage... So I agree. I'll keep the  "< 3"
condition (the alternative would be to remove this part of the 
test altogether).

> > +if (`SELECT $sbm < 3 or $sbm > 6`)
> > +
> > +unlock tables;
> > +connection master;
> > +sync_slave_with_master;
> > +lock table t1 write;
> > +connection master;
> > +insert into t1 values (2);
> > +
> > +sleep 3;
> > +
> > +connection slave;
> > +# ok, now wait for the SQL thread to start the insert
> > +let $wait_timeout= 60; # Override default of 30 seconds with 60.
> > +let $show_statement= SHOW PROCESSLIST;
> > +let $field= Info;
> > +let $condition= = 'insert into t1 values (2)';
> > +source include/wait_show_condition.inc;
> > +
> > +flush logs /* should not reset the sbm value */;
> > +
> > +sleep 3;
> > +
> > +let $sbm= query_get_value("SHOW SLAVE STATUS", Seconds_Behind_Master, 1);
> > +if (`SELECT $sbm < 6 or $sbm >= 9`)
> > +{
> > +  -- echo #### Seconds behind master must show value approx. to 6
> > +  -- echo #### giving a 3 secs margin (machine where the test executes
> > +  -- echo #### may be slow). Best effort approach.
> > +  -- echo #### $sbm < 6 || $sbm >= 9 
> > +  -- die
> > +}
> 
> Same as above.

Yep, same as above.

> 
> > +
> > +unlock tables;
> > +connection master;
> > +drop table t1;
> > +sync_slave_with_master;
> > +
> > +#
> > +# BUG#52166 test case 
> > +#
> > +
> > +#  1. Check that seconds behind master is set to 0 when SQL 
> > +#     thread waits for more events.
> > +#  2. Check that seconds behind master is updated once after 
> > +#     a newly received event starts to be processed.
> > +
> > +-- source include/master-slave-reset.inc
> > +-- connection master
> > +
> > +# this will increase the timestamp +3600
> > +SET GLOBAL DEBUG= '+d,dec_event_time_by_1_hour';
> > +CREATE TABLE t1 (a INT);
> > +SET GLOBAL DEBUG= '-d,dec_event_time_by_1_hour';
> > +-- sync_slave_with_master
> > +
> > +# ok, now wait for the SQL thread to sleep
> > +let $wait_timeout= 60; # Override default of 30 seconds with 60.
> > +let $show_statement= SHOW PROCESSLIST;
> > +let $field= State;
> > +let $condition= = 'Slave has read all relay log; waiting for the slave I/O
> thread to update it';
> > +-- source include/wait_show_condition.inc
> > +
> > +-- let $sbm= query_get_value("SHOW SLAVE STATUS", Seconds_Behind_Master, 1)
> > +if (`SELECT $sbm <> 0`)
> > +{
> > +  -- echo #### Seconds behind master must show '0', because SQL thread is now
> waiting for new events.
> > +  -- echo #### SBM ($sbm seconds) != 0 !! SQL thread is waiting so should be
> zero.
> > +  -- die
> > +}
> > +
> > +-- connection master
> > +
> 
> > +# 5 seconds this will give us time at the slave to 
> 
> You mean to say 10, don't you?

I do. (I had 5 in the initial approach, then switched to 10
and forgot to update the comment).

> 
> > +# show that the Seconds_Behind_Master will increase
> > +# when this query starts to execute
> > +-- disable_warnings
> > +SET GLOBAL DEBUG= '+d,dec_event_time_by_1_hour';
> > +INSERT INTO t1 VALUES (sleep (10));
> > +SET GLOBAL DEBUG= '-d,dec_event_time_by_1_hour';
> > +DROP TABLE t1;
> > +-- enable_warnings
> > +
> > +-- connection slave
> > +
> > +# ok, now wait for the SQL thread to start the insert
> > +let $wait_timeout= 60; # Override default of 30 seconds with 60.
> > +let $show_statement= SHOW PROCESSLIST;
> > +let $field= Info;
> > +let $condition= = 'INSERT INTO t1 VALUES (sleep (10))';
> > +-- source include/wait_show_condition.inc
> > +
> 
> > +-- sleep 2
> > +
> > +-- let $sbm= query_get_value("SHOW SLAVE STATUS", Seconds_Behind_Master, 1)
> > +if (`SELECT $sbm <= 3600`)
> > +{
> > +  -- echo #### Seconds behind master must show > 3600 because the event that
> 
> > +  -- echo #### is being processed was issued 3600 seconds ago.
> > +  -- echo #### SBM ($sbm seconds) <= 3600 ! 
> > +  -- die
> > +}
> 
> I suggest to change this paragraph to
> 
> source include/wait_for_slave_param 
> with 
> $slave_param, $slave_param_value  Seconds_Behind_Master, 3600 and
> $slave_param_comparison = <=
> 
> to avoid the evil sleep.

Ok, but you mean > instead of <=... 

We need to wait until "Seconds_Behind_Master > 3600" or, 
in other words, while "Seconds_Behind_Master <= 3600"...

> 
> 
> > +
> > +-- connection master
> > +-- sync_slave_with_master
> > +-- connection master
> > +
> > +SET @@global.debug= @old_debug;
> > +
> > +-- source include/master-slave-end.inc
> >
> > === modified file 'sql/log_event.cc'
> > --- a/sql/log_event.cc	2010-03-03 14:43:35 +0000
> > +++ b/sql/log_event.cc	2010-04-15 16:39:31 +0000
> > @@ -235,10 +235,6 @@ private:
> >    flag_set m_flags;
> >  };
> >  
> > -#ifndef DBUG_OFF
> > -uint debug_not_change_ts_if_art_event= 1; // bug#29309 simulation
> > -#endif
> > -
> >  /*
> >    pretty_print_str()
> >  */
> > @@ -781,32 +777,7 @@ int Log_event::do_update_pos(Relay_log_i
> >      Matz: I don't think we will need this check with this refactoring.
> >    */
> >    if (rli)
> > -  {
> > -    /*
> > -      bug#29309 simulation: resetting the flag to force
> > -      wrong behaviour of artificial event to update
> > -      rli->last_master_timestamp for only one time -
> > -      the first FLUSH LOGS in the test.
> > -    */
> > -    DBUG_EXECUTE_IF("let_first_flush_log_change_timestamp",
> > -                    if (debug_not_change_ts_if_art_event == 1
> > -                        && is_artificial_event())
> > -                    {
> > -                      debug_not_change_ts_if_art_event= 0;
> > -                    });
> > -#ifndef DBUG_OFF
> > -    rli->stmt_done(log_pos, 
> > -                   is_artificial_event() &&
> > -                   debug_not_change_ts_if_art_event > 0 ? 0 : when);
> > -#else
> > -    rli->stmt_done(log_pos, is_artificial_event()? 0 : when);
> > -#endif
> > -    DBUG_EXECUTE_IF("let_first_flush_log_change_timestamp",
> > -                    if (debug_not_change_ts_if_art_event == 0)
> > -                    {
> > -                      debug_not_change_ts_if_art_event= 2;
> > -                    });
> > -  }
> > +    rli->stmt_done(log_pos);
> >    return 0;                                   // Cannot fail currently
> >  }
> >  
> > @@ -942,6 +913,16 @@ bool Log_event::write_header(IO_CACHE* f
> >    }
> >  
> >    now= (ulong) get_time();                              // Query start time
> 
> > +  if (DBUG_EVALUATE_IF("inc_event_time_by_1_hour",1,0)  &&
> 
> Do we really need inc_event_time_by_1_hour at all?
> The tests don't use it.

No, the tests don't use it. But tests for WL#4806 I think they do.
Can we keep this?

> > +      DBUG_EVALUATE_IF("dec_event_time_by_1_hour",1,0))
> > +  {
> 
> > +    DBUG_ASSERT(0);
> 
> According to the def of DBUG_EVALUATE_IF() it's a check if DBUG is
> compiled in. Hence the assert does not look as appropriate even though
> it's currently expanded to noop in DBUG_OFF builts.

If DBUG is not compiled in, then the compiler should optimize the
  if (0) { A; } else { B; }
to
  B;

So in this case A; (the DBUG_ASSERT) is compiled away in DBUG 
disabled builds.

Given that B relies on DBUG as well, it should also 
be compiled away, and the entire if-else block goes away
in DBUG off builds.

In any case, just to be sure, we can wrap it with 

#ifndef DBUG_OFF
#endif

as you had with in your error simulation, can't we?

> > +  } 
> > +  else
> > +  {
> > +    DBUG_EXECUTE_IF("inc_event_time_by_1_hour", now= now + 3600;);
> > +    DBUG_EXECUTE_IF("dec_event_time_by_1_hour", now= now - 3600;);
> > +  }
> >  
> >    /*
> >      Header will be of size LOG_EVENT_HEADER_LEN for all events, except for
> > @@ -7873,7 +7854,7 @@ Rows_log_event::do_update_pos(Relay_log_

[snip]

> >        /*
> 
> regards,
> 
> Andrei

Thanks.

Cheers,
Luís


Thread
bzr commit into mysql-5.1-rep+2 branch (luis.soares:3183) Bug#52166Luis Soares15 Apr
  • Re: bzr commit into mysql-5.1-rep+2 branch (luis.soares:3183) Bug#52166Sven Sandberg27 Apr
  • Re: bzr commit into mysql-5.1-rep+2 branch (luis.soares:3183) Bug#52166Sven Sandberg28 Apr
    • Re: bzr commit into mysql-5.1-rep+2 branch (luis.soares:3183) Bug#52166Luís Soares30 Apr
Re: bzr commit into mysql-5.1-rep+2 branch (luis.soares:3183) Bug#52166Luís Soares22 Apr