Andrei Elkin wrote:
> Mats, hej.
>
>> Hi Andrei!
>>
>> Comments inline below.
>>
>
> Thanks for comments! I tagged with (todo) some of points of discussion.
> A newer patch will come later today.
>
>> Just my few cents,
>> Mats Kindahl
>>
>> Andrei Elkin wrote:
>>> Below is the list of changes that have just been committed into a local
>>> 6.0 repository of aelkin. When aelkin does a push these changes
>>> will be propagated to the main repository and, within 24 hours after the
>>> push, to the public repository.
>>> For information on how to access the public repository
>>> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>>>
>>> ChangeSet@stripped, 2008-05-08 21:00:35+03:00, aelkin@stripped +13
> -0
>>> Bug #25998 problems about circle replication
>>> Bug #27808 Infinite looping in circular replication
>>> In case of withdrawing one of the servers from the circular
>>> multi-master replication group
>>> events generated by the removed server could become unstoppable
> (bug#25998).
>>> That's because the originator had been the terminator of the own event
> flow.
>>>
>> "circular multi-master replication group" ? We don't have
>> multi-master.
>
> Perhaps you confused with multi-sourced that we don't have indeed.
> It's okay with multi-master.
> http://en.wikipedia.org/wiki/Multi-master_replication
>
>> If you mean something else, then please re-write this
>> paragraph to clarify the situation.
>>
>
> confirms we have multi-master, and I am proud of that -:)
Sorry?
[snip]
>
>> "Artifact" is not a good word in these situations, you should use
>> "problem" instead.
>
> I'd rather stay off a linguistic discussion.
> No problem with me if I need to change this place indeed.
> Still, http://en.wikipedia.org/wiki/Artifact_%28error%29.
That's up to you.
>>> MASTER_IGNORE_SERVER_ID= some_server_id
>>> can be repeated multiple times to list all the servers from
>>> which events would not be put into
>>> relay log and therefore would not be executed.
>>>
>> Why not have a comma-separated list of server ids?
>>
>
> There are many reasons actually.
>
> Practically it's going to be a single identifier at most;
> the comma is bound to separate the top level sub-clauses such as
> master_port=, master_host=, etc, so that engaging the comma with
> more job would require re-writing the current parsing
> rules, particularly for `master_defs';
> imo it would not be a perfect solution anyway without introducing
> a new token(s) for grouping, e.g parenthesis "(", ")"
>
> something = (item_1, item_2, \dots, item_n)
>
> \footnote{but I would prefer then the white space as separator in the list
> something = (item_1 item_2 \dots item_n)
> }
The syntax for writing sets in SQL is '1,2,3,4'. You can use that, i.e.,
a string with comma-separated integers inside. That means you do not
have to extend the syntax with anything complex, just accept a string
and then you parse it internally.
[snip]
>>> d. resetting the ignored server ids with RESET SLAVE;
>>>
>> RESET SLAVE does not change MASTER_HOST, nor any of the other
>> parameters, so it does not make sense that it should reset the list of
>> ignored server ids.
>
> A good note. Indeed, it does change:
>
> 13.6.2.5 `RESET SLAVE' Syntax
> .............................
>
> `RESET SLAVE' makes the slave forget its replication position in the
> master's binary logs. This statement is meant to be used for a clean
> start: It deletes the `master.info' and `relay-log.info' files, all the
> relay logs, and starts a new relay log.
>
> If after `reset slave' the slave server restarts, it won't find
> nothing about the last time configuration.
>
> The verb "reset" and more the quoted paragraph does not make sense
> for not changing MASTER_HOST etc, which we are used to observe when
> `reset slave' can be effectively continued with `start slave'.
> I find the current behavior not compliant to the docs.
> The two must be separated with `CHANGE MASTER' imo.
>
> So your note actually manifests a new bug that we have to fix.
Maybe so. I agree that the binlog positions do reset, and that the
manual seems to be wrong, but the host does not reset, nor the port, nor
any of the other "fixed" parameters. I find this list of ignored
identifier most similar to this list, and therefore I would like to have
it stay and not be reset, precisely so that one can do a START SLAVE
afterwards.
[snip]
>>> mysys/array.c@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped
> +37 -0
>>> Implementation of the search function.
>>> The third in-out arg is provided for the caller if it needs the index
>>> of the found search pattern.
>>>
>> "The third in-out arg"?! I cannot decode this without reading the
>> code. Give the function name, what parameter you added, and the
>> purpose of that parameter.
>
> I'd rather to remove the details of implementation from the comments.
> Really, one can read the synopsis in the sources.
You have already provided implementation details by talking about the
position of an argument to a function. I agree that it might be better
to keep implementation details out of the comment, but then the comment
needs rewriting anyway.
>
> (todo)
>
>>> sql/lex.h@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped +1
> -0
>>> new keyword
>>>
>> What new keyword?
>
> indeed, a new member of struct st_lex_master_info (todo)
Just name it, please.
>>> sql/rpl_mi.cc@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped
> +67 -4
>>> initialization, storing into master info, reading from the file,
> deallocation
>>> of the list of ignored server id:s.
>>>
>> "initialization"? What did you do here?
>> "storing into master info"? Storing what into master info? What did
>> you do here?
>> "reading from the file"? What file? What did you do here?
>>
>
> I am lost in the number of questions...
>
> I seriously doubt you had a hard time to find out what is supposed to be
> stored into, read from the master info file :-)
> Besides, it's said explicitly at the end of the clause - "of the list
> of ignored server id:s". The latter is the common object of several
> operations mentioned in the comment clause.
>
> One point of your critique I have to accept:
> "initialization" meant allocation actually (of what is said at the end of the
> clause). (todo)
>
> Yes, there is yet another initialization. Of the new `master_id'
> member of Master_info.
> That's not said. Neither it is worth to mention as there is the source
> of the patch.
>
> "Reading from the file" means reading from a file that was last time
> mentioned.
>
> "What did you do here?" obviously, the slave reads from the master
> info file, the new line with the ignored server id list.
The point is that the comment is a fragment and does not make any sense
as it stands. I would prefer if you could write out the comment so that
it makes sense without having to look at the code.
Normally one browses the file comments when trying to find a specific
change, like if there were any changes to how a variable was operated
on, functions that changed behavior, etc. Reading 100+ file comments all
saying that something was initialized does not really help in this
situation, and then the comment is pointless and is not needed.
[snip]
>>> sql/sql_lex.h@stripped, 2008-05-08 21:00:32+03:00,
> aelkin@stripped +1 -0
>>> parser time storage for variable size array of server ids
>>>
>>> sql/sql_repl.cc@stripped, 2008-05-08 21:00:32+03:00,
> aelkin@stripped +46 -13
>>> aux function for sorting the ignored server_id:s;
>>> sorting of the ignored servers implementation;
>>> failing change master do not forget to free lex's dynamical storage with
> parsed
>>> ignored server_ids;
>>> resetting the old list of ignored server if the parser detected the new
> list;
>>> shifting some pieces of code in order to comply with the above logics;
>>>
>>> sql/sql_yacc.yy@stripped, 2008-05-08 21:00:32+03:00,
> aelkin@stripped +8 -0
>>> allocation a dyn array for possible list of ignored server ids;
>>> filling the array;
>>> resetting (missed part of wl#342) heartbeat_opt at the beginning of
>>> CHANGE MASTER processing;
>>>
>> I don't follow why you need to reset the heartbeat option when doing
>> CHANGE MASTER and why you need to do this change in this patch.
>
> We have a convention that absence of MASTER_HEARTBEAT_PERIOD in the args
> means heartbeat cancellation. The easiest way to enforce the rule it to
> reset a possible last time non-zero value at the beginning of CHANGE MASTER
> parsing.
> I might have reported a bug, but think piggybacking on the current
> ticket would be okay.
OK. Could you please write out that we have this assumption both in the
code and in the cset so that nobody is confused.
[snip]
>>> +
>>> +source include/master-slave.inc;
>>> +
>>> +connection slave;
>>> +source include/wait_for_slave_to_start.inc;
>>>
>> Is this necessary?
>
> Sure.
> As the test is going show `master_id' the new member of `show slave
> status' which is the server id of the master.
>
>> Shouldn't this line be added to master-slave.inc
>> instead in that case?
>
> An interesting note.
> I'd rather it to be a part of `start slave', which is called from
> include/master-slave.inc as well.
I would say that it is expected that after master-slave.inc has been
executed, test is ready to run, including that the slave has started.
I would prefer you could add that line to master-slave.inc instead.
>
>>> +# a new line for master_id
>>> +let $master_id= query_get_value(SHOW SLAVE STATUS, Master_Server_Id, 1);
>>> +echo master_id: $master_id;
>>> +
>>> +stop slave;
>>>
>> wait for slave to stop.
>>
>
> I don't think we really need to wait for stop, actually.
> In the following the test is concerned with results of CHANGE MASTER
> and execution on the slave after its restarting.
> Later in the test, there is sync_slave_with_master synchoronization
> that supersedes wait for stop and wait for start both.
True. Do as you like.
[snip]
>
>>> +start slave;
>>>
>
>> Wait for slave to start. There is a slight chance that you execute
>> this statement, switch to the master, and when doing
>> sync_slave_with_master, the master_pos_wait() will return NULL because
>> the slave has not had a chance to start.
>>
>
> I understand your worries:
>
> " The function returns `NULL' if the slave SQL thread is not started "
>
> However, `slave start' returns OK to mysqltest only after SQL thread has been
> started.
Not true. Start slave just initiates the slave start, it returns before
the thread has been properly started, so the chance I mention above is
valid.
>
>
>>> +
>>> +connection master;
>>> +
>>> +#connection slave;
>>> +sync_slave_with_master;
>>> +let $slave_relay_pos0= query_get_value(SHOW SLAVE STATUS, Relay_Log_Pos,
> 1);
>>> +
>>> +connection master;
>>> +create table t1 (n int);
>>> +let $master_binlog_end= query_get_value(SHOW MASTER STATUS, Position, 1);
>>> +
>>> +connection slave;
>>> +let $slave_param= Exec_Master_Log_Pos;
>>> +let $slave_param_value= $master_binlog_end;
>>> +source include/wait_for_slave_param.inc;
>>> +--echo *** must be empty as the event is to be filtered out ***
>>> +show tables;
>>> +--echo *** allowing events from master ***
>>> +let $slave_relay_pos1= query_get_value(SHOW SLAVE STATUS, Relay_Log_Pos,
> 1);
>>> +#
>>> +# checking stability of relay log pos
>>> +#
>>> +if (`select $slave_relay_pos1 - $slave_relay_pos0`)
>>> +{
>>> + --echo Error: relay log position changed: $slave_relay_pos0,
> $slave_relay_pos1
>>> + query_vertical show slave status;
>>>
>
>> This will print a lot of info that will cause the test to fail in
>> pushbuild.
>
> Sorry,
> this will print a lot of info if the test will fail in pushbuild.
>
>> Use "include/show_slave_status.inc" or something equivalent
>> to filter out the pointless columns.
>>
>
> I think you agree that's how it should be in the case of an error of
> a test.
Hmm... OK.
>
>>> +}
>>> +stop slave;
>>> +source include/wait_for_slave_to_stop.inc;
>>> +reset slave;
>>> +--echo *** must be empty due to reset slave ***
>>> +let $ignore_list= query_get_value(SHOW SLAVE STATUS,
> Replicate_Ignore_Server_id, 1);
>>> +echo ignore server id list: $ignore_list;
>>> +--replace_result $MASTER_MYPORT MASTER_PORT
>>> +eval change master to master_host='127.0.0.1', master_user='root',
>>> + master_password='', master_port=$MASTER_MYPORT;
>>> +start slave;
>>>
>> Wait for slave to start.
>>
>
> sorry, no need as explained.
See above.
[snip]
>
>>> diff -Nrup a/sql/lex.h b/sql/lex.h
>>> --- a/sql/lex.h 2008-04-01 13:56:38 +03:00
>>> +++ b/sql/lex.h 2008-05-08 21:00:32 +03:00
>>> @@ -321,6 +321,7 @@ static SYMBOL symbols[] = {
>>> { "MASTER_SSL_VERIFY_SERVER_CERT",
> SYM(MASTER_SSL_VERIFY_SERVER_CERT_SYM)},
>>> { "MASTER_USER", SYM(MASTER_USER_SYM)},
>>> { "MASTER_HEARTBEAT_PERIOD", SYM(MASTER_HEARTBEAT_PERIOD_SYM)},
>>> + { "MASTER_IGNORE_SERVER_ID", SYM(MASTER_IGNORE_SERVER_ID_SYM)},
>>>
>
>> What is "MASTER" referring to here? For "MASTER_USER" it is clearly
>> the user on the master, but what does the "MASTER" denote for
>> "IGNORE_SERVER_ID"?
>
> a good point. I seemed to had been hypnotized with that all the options
> having MASTER_ prefix. It's not really so, there are two that do not
> have. This drop off my early hesitations.
>
> (todo)
Does this mean you remove the MASTER_ for this option?
[snip]
>>> Check that the master's global character_set_server and ours are the
> same.
>>> Not fatal if query fails (old master?).
>>> @@ -1238,6 +1334,10 @@ bool show_master_info(THD* thd, Master_i
>>> field_list.push_back(new Item_empty_string("Last_IO_Error", 20));
>>> field_list.push_back(new Item_return_int("Last_SQL_Errno", 4,
> MYSQL_TYPE_LONG));
>>> field_list.push_back(new Item_empty_string("Last_SQL_Error", 20));
>>> + field_list.push_back(new Item_empty_string("Replicate_Ignore_Server_id",
>>> + FN_REFLEN));
>>>
>
>> Please be consistent: "Replication_Ignore_Server_Id"
>
> I am not sure what would your version be consistent with.
> The idea of the feature is to filter. And show slave status displays
>
> Replicate_Ignore_DB, Replicate_Ignore_Table, other Replicate_, never
> Replication_.
Sorry, I was referring to the capitalization of the "id", not the choice
of "Replication", that was a typo.
Just my few cents,
Mats Kindahl
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com