Andrei Elkin wrote:
> Mats, hello.
>
>> Andrei Elkin wrote:
>>> Mats, hej.
>>>
>>>> Hi Andrei!
>>>>
>>>> Comments inline below.
>>>>
>>> Thanks for comments! I tagged with (todo) some of points of discussion.
>>> A newer patch will come later today.
>>>
>>>> Just my few cents,
>>>> Mats Kindahl
>>>>
>>>> Andrei Elkin wrote:
>>>>> Below is the list of changes that have just been committed into a
> local
>>>>> 6.0 repository of aelkin. When aelkin does a push these changes
>>>>> will be propagated to the main repository and, within 24 hours after
> the
>>>>> push, to the public repository.
>>>>> For information on how to access the public repository
>>>>> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>>>>>
>>>>> ChangeSet@stripped, 2008-05-08 21:00:35+03:00,
> aelkin@stripped +13 -0
>>>>> Bug #25998 problems about circle replication
>>>>> Bug #27808 Infinite looping in circular replication
>>>>> In case of withdrawing one of the servers from the circular
>>>>> multi-master replication group
>>>>> events generated by the removed server could become unstoppable
> (bug#25998).
>>>>> That's because the originator had been the terminator of the own
> event flow.
>>>>>
>>>> "circular multi-master replication group" ?
>
>
>>>> We don't have multi-master.
>
>>> Perhaps you confused with multi-sourced that we don't have indeed.
>>> It's okay with multi-master.
>
>>> http://en.wikipedia.org/wiki/Multi-master_replication
>
>>>> If you mean something else, then please re-write this
>>>> paragraph to clarify the situation.
>>>>
>
>>> confirms we have multi-master, and I am proud of that -:)
>
>> Sorry?
>
> I invited you to read the page. It says MySQL has multi-master.
> And I expected you would share the proud :-)
OK. Got what you mean...
Yeah, yeah... according to Wikipedia, we have multi-master... :)
[snip]
>>>>> MASTER_IGNORE_SERVER_ID= some_server_id
>>>>> can be repeated multiple times to list all the servers from
>>>>> which events would not be put into
>>>>> relay log and therefore would not be executed.
>>>>>
>>>> Why not have a comma-separated list of server ids?
>>>>
>
> Well, never mind on the paragraph below.
> I think you are right.
> If an identifier the parser assigns to is a list, let's make the rhs
> as the list.
OK, good. I like that better.
[snip]
>>>>> d. resetting the ignored server ids with RESET SLAVE;
>>>>>
>>>> RESET SLAVE does not change MASTER_HOST, nor any of the other
>>>> parameters, so it does not make sense that it should reset the list of
>>>> ignored server ids.
>>> A good note. Indeed, it does change:
>>>
>>> 13.6.2.5 `RESET SLAVE' Syntax
>>> .............................
>>>
>>> `RESET SLAVE' makes the slave forget its replication position in the
>>> master's binary logs. This statement is meant to be used for a clean
>>> start: It deletes the `master.info' and `relay-log.info' files, all the
>>> relay logs, and starts a new relay log.
>>>
>>> If after `reset slave' the slave server restarts, it won't find
>>> nothing about the last time configuration.
>>>
>>> The verb "reset" and more the quoted paragraph does not make sense
>>> for not changing MASTER_HOST etc, which we are used to observe when
>>> `reset slave' can be effectively continued with `start slave'.
>>> I find the current behavior not compliant to the docs.
>>> The two must be separated with `CHANGE MASTER' imo.
>>>
>>> So your note actually manifests a new bug that we have to fix.
>> Maybe so. I agree that the binlog positions do reset, and that the
>> manual seems to be wrong, but the host does not reset, nor the port, nor
>> any of the other "fixed" parameters. I find this list of ignored
>> identifier most similar to this list, and therefore I would like to have
>> it stay and not be reset, precisely so that one can do a START SLAVE
>> afterwards.
>
> I'd rather to be sympathetic to your claim, as it's in my principles
> not to do any extra (i.e nobody/nothing asks for or needs) work.
> However, i could not help make resetting as the docs I think is
> conceptually correct, implementation not.
>
> One of the arguments that we can consider is if the slave server is
> shut down after RESET SLAVE, it will have to re-new master info with
> CHANGE MASTER. Without shutdown, the slave server is maintaining the
> master connecting parameters in its volatile memory.
>
> I think we can leave the current undocumented behavior for
> backward compatibility. Maybe to document it, but then any new feature
> should be questioned about a need to be reset or left volatile.
> Imo, ultimately, RESET MASTER needs always to follow with CHANGE MASTER.
>
> I think we need to vote on this issue.
>
> 1. this bug issue voting agenda:
>
> Shall we enforce that a new option to CHANGE MASTER will survive
> reset master. I.e at the end of the following sequence of queries
>
> change master ignore_server_id=(1,2,3);
> reset master;
> start master;
>
> wrt to the current IGNORE_SERVER_ID option we shall have either:
>
> SHOW SLAVE STATUS like `ignore_server_id' => 1,2,3
>
> or another alternative
>
> SHOW SLAVE STATUS like `ignore_server_id' => empty
I vote that the list shall behave uniformly to how the other options
work: i.e., being retained over resets (without an intervening shutdown).
> 2. a general agenda for voting, (can be postponed till better time)
>
> Shall we continue with undocumented policy (convert it to
> documented) for the slave server to memorize in its volatile memory
> parameters responsible for successful connecting to the master like
> master_host, master_port, master_user, master_password (some more),
> so that RESET SLAVE can be followed immediately with START SLAVE
> without an intermediate CHANGE MASTER
I vote for this for existing options, and also for the new ignore server
id list.
>
> or
>
> we shall enforce the user after RESET SLAVE always follow
> with a new CHANGE MASTER so that
>
> RESET SLAVE, shutdown, START SLAVE
> RESET SLAVE, START SLAVE
>
> will work uniformly trying to connect to nowhere, and the
> only effective way to resume replication after RESET SLAVE is
>
> CHANGE MASTER; START SLAVE;
>
>
>
>> [snip]
>>
>>>>> mysys/array.c@stripped, 2008-05-08 21:00:32+03:00,
> aelkin@stripped +37 -0
>>>>> Implementation of the search function.
>>>>> The third in-out arg is provided for the caller if it needs the
> index
>>>>> of the found search pattern.
>>>>>
>>>> "The third in-out arg"?! I cannot decode this without reading the
>>>> code. Give the function name, what parameter you added, and the
>>>> purpose of that parameter.
>>> I'd rather to remove the details of implementation from the comments.
>>> Really, one can read the synopsis in the sources.
>> You have already provided implementation details by talking about the
>> position of an argument to a function. I agree that it might be better
>> to keep implementation details out of the comment, but then the comment
>> needs rewriting anyway.
>>
>>> (todo)
>>>
>>>>> sql/lex.h@stripped, 2008-05-08 21:00:32+03:00,
> aelkin@stripped +1 -0
>>>>> new keyword
>>>>>
>>>> What new keyword?
>>> indeed, a new member of struct st_lex_master_info (todo)
>> Just name it, please.
>
> ok
>
>>>>> sql/rpl_mi.cc@stripped, 2008-05-08 21:00:32+03:00,
> aelkin@stripped +67 -4
>>>>> initialization, storing into master info, reading from the file,
> deallocation
>>>>> of the list of ignored server id:s.
>>>>>
>>>> "initialization"? What did you do here?
>>>> "storing into master info"? Storing what into master info? What did
>>>> you do here?
>>>> "reading from the file"? What file? What did you do here?
>>>>
>>> I am lost in the number of questions...
>>>
>>> I seriously doubt you had a hard time to find out what is supposed to be
>>> stored into, read from the master info file :-)
>>> Besides, it's said explicitly at the end of the clause - "of the list
>>> of ignored server id:s". The latter is the common object of several
>>> operations mentioned in the comment clause.
>>>
>>> One point of your critique I have to accept:
>>> "initialization" meant allocation actually (of what is said at the end of
> the
>>> clause). (todo)
>>>
>>> Yes, there is yet another initialization. Of the new `master_id'
>>> member of Master_info.
>>> That's not said. Neither it is worth to mention as there is the source
>>> of the patch.
>>>
>>> "Reading from the file" means reading from a file that was last time
>>> mentioned.
>>>
>>> "What did you do here?" obviously, the slave reads from the master
>>> info file, the new line with the ignored server id list.
>
>> The point is that the comment is a fragment and does not make any sense
>> as it stands. I would prefer if you could write out the comment so that
>> it makes sense without having to look at the code.
>
> It's too strong requirement to provide comments eliminating need to
> look as the source code. Comments are by the definition a part of it.
> In my opinion, none, incl a reviewer, can demand such service as
> release from reading the sources.
You're nitpicking: the comment should be readable. It serves no purpose
otherwise when you browse the file history, which is what the comment is
used for.
[snip]
>> I would say that it is expected that after master-slave.inc has been
>> executed, test is ready to run, including that the slave has started.
>>
>> I would prefer you could add that line to master-slave.inc instead.
>
> Well, two things.
>
> The main, i did not notice last time that it's not the test of SQL thread
>
> include/wait_for_slave_to_start.inc
>
> but synchronization with the master
>
> sync_slave_with_master;
>
> which was meant, sorry.
>
> Second is a footnote{it would not be instead. "Instead" would be to do that
> in `start slave' handler because that suffice to
> `include/master-slave.inc', not vice versa}
Are you saying that you will add the line to master-slave.inc instead or
not?
[snip]
>>>>> +start slave;
>>>>>
>>>> Wait for slave to start. There is a slight chance that you execute
>>>> this statement, switch to the master, and when doing
>>>> sync_slave_with_master, the master_pos_wait() will return NULL because
>>>> the slave has not had a chance to start.
>
> Could you please test your claim scenario?
> Reading the code does not confirm it.
>
> Even the comments :-) in start_slave():
>
>
> slave_errno = start_slave_threads(0 /*no mutex */,
> 1 /* wait for start */,
> mi,
> master_info_file,relay_log_info_file,
> thread_mask);
OK.
>
>>> I understand your worries:
>>>
>>> " The function returns `NULL' if the slave SQL thread is not started "
>>>
>>> However, `slave start' returns OK to mysqltest only after SQL thread has
> been
>>> started.
>
> And my own test confirm that ^.
>
>
>> Not true. Start slave just initiates the slave start, it returns before
>> the thread has been properly started, so the chance I mention above is
>> valid.
>
> I think you need to test it yourself.
I've seen mismatches in PB runs...
[snio]
>>> (todo)
>> Does this mean you remove the MASTER_ for this option?
>>
>
> yes
OK.
>>>>> Check that the master's global character_set_server and ours are
> the same.
>>>>> Not fatal if query fails (old master?).
>>>>> @@ -1238,6 +1334,10 @@ bool show_master_info(THD* thd, Master_i
>>>>> field_list.push_back(new Item_empty_string("Last_IO_Error", 20));
>>>>> field_list.push_back(new Item_return_int("Last_SQL_Errno", 4,
> MYSQL_TYPE_LONG));
>>>>> field_list.push_back(new Item_empty_string("Last_SQL_Error",
> 20));
>>>>> + field_list.push_back(new
> Item_empty_string("Replicate_Ignore_Server_id",
>>>>> + FN_REFLEN));
>>>>>
>>>> Please be consistent: "Replication_Ignore_Server_Id"
>>> I am not sure what would your version be consistent with.
>>> The idea of the feature is to filter. And show slave status displays
>>>
>>> Replicate_Ignore_DB, Replicate_Ignore_Table, other Replicate_, never
>>> Replication_.
>> Sorry, I was referring to the capitalization of the "id", not the choice
>> of "Replication", that was a typo.
>
> ok, that was really hard to guess :-)
Just my few cents,
Mats Kindahl
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com