List:Commits« Previous MessageNext Message »
From:Magnus Svensson Date:May 21 2008 10:29am
Subject:Re: bk commit into 6.0 tree (aelkin:1.2646) BUG#34654
View as plain text  
Sven Sandberg skrev:
> Hi,
> 
> Thanks for your work, the patch is approved!
> 
> /Sven

Hi Andrei,

after having dechiffered out the real changes I find them ok! (You could
have replaced the "while () {" with a "{" for smaller diff).

Patch approved!

/Magnus



> 
> 
> Andrei Elkin wrote:
>> Sven Sandberg <sven@stripped> writes:
>>
>>> Hi Andrei!
>>>
>>> Thanks for fixing this.
>>>
>>> As we discussed on IRC, please replace 'while (!io_slave_killed)' by
>>> if (!io_slave_killed)' in the outer loop of handle_slave_io.
>>>
>>
>> I could not help it and changed further to eliminate the outermost
>> while altogether.
>>
>> Please look at newer patch.
>> Just hope you will stop me at this point :)
>>
>>> There is one place where errors are cleared redundantly. I would also
>>> like more explicit comments in the test case, and a test case for
>>> Last_SQL_Err*. See below.
>>>
>>> /Sven
>>>
>>>
>>> Andrei Elkin wrote:
>>>> Below is the list of changes that have just been committed into a local
>>>> 6.0 repository of aelkin.  When aelkin does a push these changes
>>>> will be propagated to the main repository and, within 24 hours after
>>>> the
>>>> push, to the public repository.
>>>> For information on how to access the public repository
>>>> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>>>>
>>>> ChangeSet@stripped, 2008-05-01 16:26:05+03:00,
>>>> aelkin@stripped +5 -0
>>>>   Bug #34654 RESET SLAVE and START SLAVE does not clear Last_IO_Errno
>>>>     There were no resetting of the io thread's error status at
>>>> reset_slave() and   at the beginning of handle_slave_io() that
>>>> corresponds to start slave.
>>>>     Fixed with deploying clean-up at necessary places of the code.
>>>>
>>>>   mysql-test/extra/rpl_tests/rpl_reset_slave.test@stripped, 2008-05-01
>>>> 16:26:01+03:00, aelkin@stripped +51 -0
>>>>     adding regression test for bug#34654 that tests clearing the IO
>>>> error status
>>>>     for start slave and reset slave.
>>>>
>>>>   mysql-test/suite/rpl/r/rpl_row_reset_slave.result@stripped, 2008-05-01
>>>> 16:26:01+03:00, aelkin@stripped +21 -0
>>>>     results changed
>>>>
>>>>   mysql-test/suite/rpl/r/rpl_stm_reset_slave.result@stripped, 2008-05-01
>>>> 16:26:01+03:00, aelkin@stripped +21 -0
>>>>     results changed
>>>>
>>>>   sql/slave.cc@stripped, 2008-05-01 16:26:01+03:00,
>>>> aelkin@stripped +3 -0
>>>>     explicitly clearing the last mi's error at
>>>>         1. entering handle_slave_io
>>>>     2. getting to the read loop after passing successfully
>>>>        connecting, checking master version etc, registering and
>>>>        requesting for the dump
>>>>     3. right after successful connecting
>>>>
>>>>   sql/sql_repl.cc@stripped, 2008-05-01 16:26:01+03:00,
>>>> aelkin@stripped +1 -0
>>>>     reset slave now resets the io thread error.
>>>>
>>>> diff -Nrup a/mysql-test/extra/rpl_tests/rpl_reset_slave.test
>>>> b/mysql-test/extra/rpl_tests/rpl_reset_slave.test
>>>> --- a/mysql-test/extra/rpl_tests/rpl_reset_slave.test    2008-02-05
>>>> 15:29:36 +02:00
>>>> +++ b/mysql-test/extra/rpl_tests/rpl_reset_slave.test    2008-05-01
>>>> 16:26:01 +03:00
>>>> @@ -42,3 +42,54 @@ reset slave;
>>>>  start slave;
>>>>  sync_with_master;
>>>>  show status like 'slave_open_temp_tables';
>>>> +
>>>> +#
>>>> +#Bug#34654      RESET SLAVE does not clear LAST_IO_Err* +#
>>>> +
>>>> +# clearing the status
>>>> +stop slave;
>>>> +reset slave;
>>>> +let $last_io_errno= query_get_value(SHOW SLAVE STATUS,
>>>> Last_IO_Errno, 1);
>>>> +echo *** errno must be zero: $last_io_errno ***;
>>>> +
>>>> +#
>>>> +# verifying start slave resets
>>>> +#
>>> Please be more explicit in the comment describing the purpose of the
>>> test. You can write something like:
>>>
>>> # Verify that "START SLAVE" resets Last_IO_Error and Last_IO_Errno.
>>>
>>>> +
>>>> +change master to master_user='impossible_user_name';
>>>> +start slave;
>>>> +source include/wait_for_slave_io_to_stop.inc;
>>>> +let $last_io_errno= query_get_value(SHOW SLAVE STATUS,
>>>> Last_IO_Errno, 1);
>>>> +--disable_query_log
>>>> +eval SELECT $last_io_errno > 0 as ONE;
>>>> +--enable_query_log
>>>> +
>>>> +stop slave;
>>>> +change master to master_user='root';
>>>> +start slave;
>>>> +source include/wait_for_slave_to_start.inc;
>>>> +let $last_io_errno= query_get_value(SHOW SLAVE STATUS,
>>>> Last_IO_Errno, 1);
>>>> +let $last_io_error= query_get_value(SHOW SLAVE STATUS,
>>>> Last_IO_Error, 1);
>>>> +--echo *** last errno must be  zero: $last_io_errno ***
>>>> +--echo *** last error must be blank: $last_io_error ***
>>>> +
>>>> +#
>>>> +# verifying reset slave resets
>>>> +#
>>> Please be more explicit in the comment (see above).
>>>
>>>> +
>>>> +stop slave;
>>>> +change master to master_user='impossible_user_name';
>>>> +start slave;
>>>> +source include/wait_for_slave_io_to_stop.inc;
>>>> +let $last_io_errno= query_get_value(SHOW SLAVE STATUS,
>>>> Last_IO_Errno, 1);
>>>> +--disable_query_log
>>>> +eval SELECT $last_io_errno > 0 as ONE;
>>>> +--enable_query_log
>>>> +
>>>> +stop slave;
>>>> +reset slave;
>>>> +let $last_io_errno= query_get_value(SHOW SLAVE STATUS,
>>>> Last_IO_Errno, 1);
>>>> +let $last_io_error= query_get_value(SHOW SLAVE STATUS,
>>>> Last_IO_Error, 1);
>>>> +--echo *** last errno must be  zero: $last_io_errno ***
>>>> +--echo *** last error must be blank: $last_io_error ***
>>> I think the test should also verify that "RESET SLAVE" resets
>>> Last_SQL_Err*.
>>>
>>
>> Well, the fixes have not done anything in that regard but why not.
>>
>> Added.
>>
>>>> diff -Nrup a/mysql-test/suite/rpl/r/rpl_row_reset_slave.result
>>>> b/mysql-test/suite/rpl/r/rpl_row_reset_slave.result
>>>> --- a/mysql-test/suite/rpl/r/rpl_row_reset_slave.result   
>>>> 2008-02-05 15:29:36 +02:00
>>>> +++ b/mysql-test/suite/rpl/r/rpl_row_reset_slave.result   
>>>> 2008-05-01 16:26:01 +03:00
>>>> @@ -175,3 +175,24 @@ start slave;
>>>>  show status like 'slave_open_temp_tables';
>>>>  Variable_name    Value
>>>>  Slave_open_temp_tables    0
>>>> +stop slave;
>>>> +reset slave;
>>>> +*** errno must be zero: 0 ***
>>>> +change master to master_user='impossible_user_name';
>>>> +start slave;
>>>> +ONE
>>>> +1
>>>> +stop slave;
>>>> +change master to master_user='root';
>>>> +start slave;
>>>> +*** last errno must be  zero: 0 ***
>>>> +*** last error must be blank:  ***
>>>> +stop slave;
>>>> +change master to master_user='impossible_user_name';
>>>> +start slave;
>>>> +ONE
>>>> +1
>>>> +stop slave;
>>>> +reset slave;
>>>> +*** last errno must be  zero: 0 ***
>>>> +*** last error must be blank:  ***
>>>> diff -Nrup a/mysql-test/suite/rpl/r/rpl_stm_reset_slave.result
>>>> b/mysql-test/suite/rpl/r/rpl_stm_reset_slave.result
>>>> --- a/mysql-test/suite/rpl/r/rpl_stm_reset_slave.result   
>>>> 2008-02-05 15:29:36 +02:00
>>>> +++ b/mysql-test/suite/rpl/r/rpl_stm_reset_slave.result   
>>>> 2008-05-01 16:26:01 +03:00
>>>> @@ -175,3 +175,24 @@ start slave;
>>>>  show status like 'slave_open_temp_tables';
>>>>  Variable_name    Value
>>>>  Slave_open_temp_tables    1
>>>> +stop slave;
>>>> +reset slave;
>>>> +*** errno must be zero: 0 ***
>>>> +change master to master_user='impossible_user_name';
>>>> +start slave;
>>>> +ONE
>>>> +1
>>>> +stop slave;
>>>> +change master to master_user='root';
>>>> +start slave;
>>>> +*** last errno must be  zero: 0 ***
>>>> +*** last error must be blank:  ***
>>>> +stop slave;
>>>> +change master to master_user='impossible_user_name';
>>>> +start slave;
>>>> +ONE
>>>> +1
>>>> +stop slave;
>>>> +reset slave;
>>>> +*** last errno must be  zero: 0 ***
>>>> +*** last error must be blank:  ***
>>>> diff -Nrup a/sql/slave.cc b/sql/slave.cc
>>>> --- a/sql/slave.cc    2008-04-01 13:56:42 +03:00
>>>> +++ b/sql/slave.cc    2008-05-01 16:26:01 +03:00
>>>> @@ -2114,6 +2114,7 @@ pthread_handler_t handle_slave_io(void *
>>>>     pthread_detach_this_thread();
>>>>    thd->thread_stack= (char*) &thd; // remember where our stack
> is
>>>> +  mi->clear_error();
>>>>    if (init_slave_thread(thd, SLAVE_THD_IO))
>>>>    {
>>>>      pthread_cond_broadcast(&mi->start_cond);
>>>> @@ -2228,6 +2229,7 @@ requesting master dump") ||
>>>>          goto connected;
>>>>        });
>>>>  +    mi->clear_error(); // getting into read loop with clear sheet
>>> This place is actually redundant: the error is already cleared at this
>>> point.
>>>
>>> The error was cleared in the beginning of handle_slave_io(). If new
>>> errors are reported after that, either the errors will be cleared
>>> again before reaching the read loop, or the read loop will not be
>>> reached: here is the list of all possible errors reported in
>>> handle_slave_io() before the read loop:
>>>
>>>  - in handle_slave_io() if mysql_init() fails, but then we execute
>>> goto err'.
>>>
>>>  - in get_master_version_and_clock(), but then we return from
>>> handle_slave_io().
>>>
>>>  - in register_slave_on_master(), but only if it fails. In this case,
>>> either 'goto err' is executed or try_to_reconnect() succeeds and
>>> clears the error (actually, try_to_reconnect calls safe_reconnect
>>> which calls connect_to_master which resets the error).
>>>
>>
>> You are correct that changes done in connect_to_master() and the
>> following events logics guarantee the status has been cleared by this
>> point.
>>
>> Let's guard from future intrusions with
>>
>> DBUG_ASSERT(mi->last_error().number == 0);
>>
>> instead.
>>
>>> Argh, the reconnect logic of handle_slave_io is so horrible :-( . I'd
>>> like to see this refactored in the future (not in this bug, of
>>> course).
>>>
>>>>      while (!io_slave_killed(thd,mi))
>>>>      {
>>>>        ulong event_len;
>>>> @@ -3411,6 +3413,7 @@ static int connect_to_master(THD* thd, M
>>>>     if (!slave_was_killed)
>>>>    {
>>>> +    mi->clear_error();
>>>>      if (reconnect)
>>>>      {
>>>>        if (!suppress_warnings &&
> global_system_variables.log_warnings)
>>>> diff -Nrup a/sql/sql_repl.cc b/sql/sql_repl.cc
>>>> --- a/sql/sql_repl.cc    2008-03-08 12:14:45 +02:00
>>>> +++ b/sql/sql_repl.cc    2008-05-01 16:26:01 +03:00
>>>> @@ -1114,6 +1114,7 @@ int reset_slave(THD *thd, Master_info* m
>>>>       Reset errors (the idea is that we forget about the
>>>>       old master).
>>>>    */
>>>> +  mi->clear_error();
>>>>    mi->rli.clear_error();
>>>>    mi->rli.clear_until_condition();
>>>>  
>>>>
>>> -- 
>>> Sven Sandberg, Software Engineer
>>> MySQL AB, www.mysql.com
>>>
>>> -- 
>>> MySQL Code Commits Mailing List
>>> For list archives: http://lists.mysql.com/commits
>>> To unsubscribe:    http://lists.mysql.com/commits?unsub=1
>>>
>>
>> a newer patch has been committed.
>>
>> regards,
>>
>> Andrei
> 

Thread
bk commit into 6.0 tree (aelkin:1.2646) BUG#34654Andrei Elkin1 May
  • Re: bk commit into 6.0 tree (aelkin:1.2646) BUG#34654Sven Sandberg2 May
    • Re: bk commit into 6.0 tree (aelkin:1.2646) BUG#34654Andrei Elkin8 May
      • Re: bk commit into 6.0 tree (aelkin:1.2646) BUG#34654Sven Sandberg9 May
        • Re: bk commit into 6.0 tree (aelkin:1.2646) BUG#34654Magnus Svensson21 May