List:Commits« Previous MessageNext Message »
From:Andrei Elkin Date:May 22 2008 1:52pm
Subject:Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998
View as plain text  
Mats, hej.

> Hi Andrei!
>
> Comments inline below.
>

Thanks for comments! I tagged with (todo) some of points of discussion.
A newer patch will come later today.

> Just my few cents,
> Mats Kindahl
>
> Andrei Elkin wrote:
>> Below is the list of changes that have just been committed into a local
>> 6.0 repository of aelkin.  When aelkin does a push these changes
>> will be propagated to the main repository and, within 24 hours after the
>> push, to the public repository.
>> For information on how to access the public repository
>> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>>
>> ChangeSet@stripped, 2008-05-08 21:00:35+03:00, aelkin@stripped +13 -0
>>   Bug #25998 problems about circle replication
>>   Bug #27808 Infinite looping in circular replication
>>     In case of withdrawing one of the servers from the circular
>> multi-master replication group
>>   events generated by the removed server could become unstoppable (bug#25998).
>>   That's because the originator had been the terminator of the own event flow.
>>   
>
> "circular multi-master replication group" ? We don't have
> multi-master. 

Perhaps you confused with multi-sourced that we don't have indeed.
It's okay with multi-master. 
http://en.wikipedia.org/wiki/Multi-master_replication

> If you mean something else, then please re-write this
> paragraph to clarify the situation.
>

confirms we have multi-master, and I am proud of that -:)

>>   Other possibility of the unstoppable event is the cluster replication
> (bug#27808).
>>   In that case an event generated by a member of the same cluster and replicated
> to another member
>>   got accepted despite effects of the event had been propagated across the
> cluster via cluster
>>   communications.
>>     Both artifacts are fixable with introducing a new option for
>> CHANGE MASTER.
>>   
>
> It's very hard to read that paragraph. I suggest something along the
> line: "Another case can occur when using cluster replication
> (BUG#27808). In that case, the cluster can generate an event that is
> replicated to another member connected to the same cluster. Since the
> server id's are different, the event will be executed and the change
> applied to the cluster."
>

Thanks for your suggestion! I'd word it close to the original way and
hopefully more clearly this time:

  In that case an event generated by a member of a cluster was
  replicated to another member, got accepted and executed.
  By the same time effects of the event had been already propagated
  across the cluster via the cluster communications.
  In order to avoid double-applying, a replication event generated 
  by a co-member of the cluster should not be accepted.

(todo)

> "Artifact" is not a good word in these situations, you should use
> "problem" instead.

I'd rather stay off a linguistic discussion.
No problem with me if I need to change this place indeed.
Still, http://en.wikipedia.org/wiki/Artifact_%28error%29.

>
>>     MASTER_IGNORE_SERVER_ID= some_server_id
>>     can be repeated multiple times to list all the servers from
>> which events would not be put into
>>   relay log and therefore would not be executed.
>>   
>
> Why not have a comma-separated list of server ids?
>

There are many reasons actually.

Practically it's going to be a single identifier at most;
the comma is bound to separate the top level sub-clauses such as
master_port=, master_host=, etc, so that engaging the comma with
more job would require re-writing the current parsing
rules, particularly for `master_defs';
imo it would not be a perfect solution anyway without introducing
a new token(s) for grouping, e.g parenthesis "(", ")"

  something = (item_1, item_2, \dots, item_n)

\footnote{but I would prefer then the white space as separator in the list
  something = (item_1 item_2 \dots item_n)
}

>>     Fixed with implementing the feature.
>>     Properties of the feature:
>>      a. reporting an error if the id of an ignored server is the
>> slave itself and
>>       its configuration on startup was with --replicate-same-server-id;
>>   
>
>>    b. overriding the existing MASTER_IGNORE_SERVER_ID list by the
>> following       CHANGE MASTER ... MASTER_IGNORE_SERVER_ID=;
>>    c. preserving the existing list by CHANGE MASTER w/o MASTER_IGNORE_SERVER_ID;
>>   
>
> If you pass a comma-separated list, this complication of the logic
> goes away.
>

I would agree to speak on redundancy of using the keyword. It's not a
complication, esp. logics, but rather trade-off between a possible
redundancy and most probable practical usage (one server in the
ignored list at most).

>>    d. resetting the ignored server ids with RESET SLAVE;
>>   
>
> RESET SLAVE does not change MASTER_HOST, nor any of the other
> parameters, so it does not make sense that it should reset the list of
> ignored server ids.

A good note. Indeed, it does change:

   13.6.2.5 `RESET SLAVE' Syntax
   .............................

   `RESET SLAVE' makes the slave forget its replication position in the
   master's binary logs. This statement is meant to be used for a clean
   start: It deletes the `master.info' and `relay-log.info' files, all the
   relay logs, and starts a new relay log.

If after `reset slave' the slave server restarts, it won't find
nothing about the last time configuration.

The verb "reset" and more the quoted paragraph does not make sense
for not changing MASTER_HOST etc, which we are used to observe when
`reset slave' can be effectively continued with `start slave'.
I find the current behavior not compliant to the docs.
The two must be separated with `CHANGE MASTER' imo.

So your note actually manifests a new bug that we have to fix.
(todo)

>
>>    e. extending SHOW SLAVE STATUS with a new line listing ignored servers;
>>   
>
> OK.
>
>>    f. Differently from --replicate-same-server-id handling, the sql thread is
> not
>>       concerned with the ignored server ids, because it's supposed that
>>       the relay log consists only of events that can not be unstoppable.
>>       In order to guarantee that, e.g in case of the circular replication with a
> failing
>>       server DBA needs to change master necessarily using the new option.
>>   
>
> OK.
>
>>    g. Rotate and FD events originated by the current master listed
>>       in the ignored list are still relay-logged which does not create
>>       any termination issue.
>>   
>
> OK.
>
>>    h. The possible list of ignored servers is sorted for the fastest processing
> of filtering
>>       algorithm.
>>   
>
> OK.
>
>>     Two new lines to show slave status output are added: the list of
>> ignored servers and
>>   the current master server id.
>>   
>
> OK.
>
>>     Use cases for this feature can be found on the bug report page.
>>
>>   include/my_sys.h@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped
> +2 -0
>>     search function for a specific type of dynamic array with long type items.
>>
>>   mysql-test/suite/rpl/r/rpl_server_id_ignore.result@stripped, 2008-05-08
> 21:00:33+03:00, aelkin@stripped +44 -0
>>     new results
>>
>>   mysql-test/suite/rpl/r/rpl_server_id_ignore.result@stripped, 2008-05-08
> 21:00:33+03:00, aelkin@stripped +0 -0
>>
>>   mysql-test/suite/rpl/t/rpl_server_id_ignore-slave.opt@stripped, 2008-05-08
> 21:00:33+03:00, aelkin@stripped +1 -0
>>     deliberately requesting a clashing the new change-master option
>>     --replicate-same-server-id in order to show reaction.
>>   
>
> Please rewrite this comment. It doesn't make any sense.

not the best comments ever, agreed (todo).

>
>>   mysql-test/suite/rpl/t/rpl_server_id_ignore-slave.opt@stripped, 2008-05-08
> 21:00:33+03:00, aelkin@stripped +0 -0
>>
>>   mysql-test/suite/rpl/t/rpl_server_id_ignore.test@stripped, 2008-05-08
> 21:00:33+03:00, aelkin@stripped +107 -0
>>     The basic tests for the new feature.
>>
>>   mysql-test/suite/rpl/t/rpl_server_id_ignore.test@stripped, 2008-05-08
> 21:00:33+03:00, aelkin@stripped +0 -0
>>
>>   mysys/array.c@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped +37
> -0
>>     Implementation of the search function.
>>     The third in-out arg is provided for the caller if it needs the index
>>     of the found search pattern.
>>   
>
> "The third in-out arg"?! I cannot decode this without reading the
> code. Give the function name, what parameter you added, and the
> purpose of that parameter.

I'd rather to remove the details of implementation from the comments.
Really, one can read the synopsis in the sources.

(todo)

>
>>   sql/lex.h@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped +1 -0
>>     new keyword
>>   
>
> What new keyword?

indeed, a new member of struct st_lex_master_info (todo)

>
>>   sql/rpl_mi.cc@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped +67
> -4
>>     initialization, storing into master info, reading from the file,
> deallocation
>>     of the list of ignored server id:s.
>>   
>
> "initialization"? What did you do here?
> "storing into master info"? Storing what into master info? What did
> you do here?
> "reading from the file"? What file? What did you do here?
>

I am lost in the number of questions...

I seriously doubt you had a hard time to find out what is supposed to be
stored into, read from the master info file :-)
Besides, it's said explicitly at the end of the clause - "of the list
of ignored server id:s". The latter is the common object of several
operations mentioned in the comment clause.

One point of your critique I have to accept:
"initialization" meant allocation actually (of what is said at the end of the
clause). (todo)

Yes, there is yet another initialization. Of the new `master_id'
member of Master_info.
That's not said. Neither it is worth to mention as there is the source
of the patch.

"Reading from the file" means reading from a file that was last time
mentioned.

"What did you do here?" obviously, the slave reads from the master
info file, the new line with the ignored server id list.



>>   sql/rpl_mi.h@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped +4
> -0
>>     a new method answering filtering question;
>>     a new member to hold the current master server id.
>>
>>   sql/share/errmsg.txt@stripped, 2008-05-08 21:00:33+03:00,
> aelkin@stripped +2 -0
>>     a new error message about possible clashing of options.
>>
>>   sql/slave.cc@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped
> +144 -8
>>     a helper function to read an array of variable size of integers.
>>     Refining io-thread's filtering condition to consult with the list of ignored
> servers.
>>     Note, that FD and Rotate events from an ignored server being the
>> current master     are still accepted.
>>     Filtering basing on --replicate-same-server-id remains as before; if the
> ignored servers
>>     list is empty     Adding two new lines in show slave status.
>>   
>
> This is very hard to read: could you please re-write that cset
> comment. What helper function?
>

Something happens with punctuation, sorry.
I'll mention the name of the function. (todo)


>>   sql/sql_lex.h@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped +1
> -0
>>     parser time storage for variable size array of server ids
>>
>>   sql/sql_repl.cc@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped
> +46 -13
>>     aux function for sorting the ignored server_id:s;
>>     sorting of the ignored servers implementation;
>>     failing change master do not forget to free lex's dynamical storage with
> parsed
>>     ignored server_ids;
>>     resetting the old list of ignored server if the parser detected the new
> list;
>>     shifting some pieces of code in order to comply with the above logics;
>>
>>   sql/sql_yacc.yy@stripped, 2008-05-08 21:00:32+03:00, aelkin@stripped
> +8 -0
>>     allocation a dyn array for possible list of ignored server ids;
>>     filling the array;
>>     resetting (missed part of wl#342) heartbeat_opt at the beginning of
>>     CHANGE MASTER processing;
>>   
>
> I don't follow why you need to reset the heartbeat option when doing
> CHANGE MASTER and why you need to do this change in this patch.

We have a convention that absence of MASTER_HEARTBEAT_PERIOD in the args
means heartbeat cancellation. The easiest way to enforce the rule it to
reset a possible last time non-zero value at the beginning of CHANGE MASTER
parsing.
I might have reported a bug, but think piggybacking on the current
ticket would be okay.
>
>> diff -Nrup a/include/my_sys.h b/include/my_sys.h
>> --- a/include/my_sys.h	2008-04-01 13:56:27 +03:00
>> +++ b/include/my_sys.h	2008-05-08 21:00:32 +03:00
>> @@ -804,6 +804,8 @@ extern int  get_index_dynamic(DYNAMIC_AR
>>  #define push_dynamic(A,B) insert_dynamic((A),(B))
>>  #define reset_dynamic(array) ((array)->elements= 0)
>>  #define sort_dynamic(A,cmp) my_qsort((A)->buffer, (A)->elements,
> (A)->size_of_element, (cmp))
>> +extern my_bool dynamic_array_ulong_bin_search(DYNAMIC_ARRAY *a,
>> +                                           const ulong *pattern, uint * index);
>>   extern my_bool init_dynamic_string(DYNAMIC_STRING *str, const char
>> *init_str,
>>  				   size_t init_alloc,size_t alloc_increment);
>> diff -Nrup a/mysql-test/suite/rpl/r/rpl_server_id_ignore.result
> b/mysql-test/suite/rpl/r/rpl_server_id_ignore.result
>> --- /dev/null	Wed Dec 31 16:00:00 196900
>> +++ b/mysql-test/suite/rpl/r/rpl_server_id_ignore.result	2008-05-08 21:00:33
> +03:00
>> @@ -0,0 +1,44 @@
>> +stop slave;
>> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
>> +reset master;
>> +reset slave;
>> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
>> +start slave;
>> +master_id: 1
>> +stop slave;
>> +*** --replicate-same-server-id and change master option can clash ***
>> +change master to MASTER_IGNORE_SERVER_ID=2, MASTER_IGNORE_SERVER_ID=1;
>> +ERROR HY000: The requested server id 2 clashes with the slave startup option
> --replicate-same-server-id
>> +*** must be empty due to the error ***
>> +ignore server id list: +change master to
>> MASTER_IGNORE_SERVER_ID=10, MASTER_IGNORE_SERVER_ID=100;
>> +*** must be 10, 100 ***
>> +ignore server id list: 10, 100
>> +reset slave;
>> +*** must be empty due to reset slave ***
>> +ignore server id list: +change master to
>> MASTER_IGNORE_SERVER_ID=10, MASTER_IGNORE_SERVER_ID=100;
>> +*** CHANGE MASTER with MASTER_IGNORE_SERVER_ID option overrides (does not
> increment) the previous setup ***
>> +change master to MASTER_IGNORE_SERVER_ID=1, MASTER_IGNORE_SERVER_ID=3,
> MASTER_IGNORE_SERVER_ID=4;
>> +*** must be 1, 3, 4 due to overriding policy ***
>> +ignore server id list: 1, 3, 4
>> +*** ignore master (server 1) queries for a while ***
>> +start slave;
>> +create table t1 (n int);
>> +*** must be empty as the event is to be filtered out ***
>> +show tables;
>> +Tables_in_test
>> +*** allowing events from master ***
>> +stop slave;
>> +reset slave;
>> +*** must be empty due to reset slave ***
>> +ignore server id list: +change master to master_host='127.0.0.1',
>> master_user='root',
>> +master_password='', master_port=MASTER_PORT;
>> +start slave;
>> +*** must have caught create table ***
>> +show tables;
>> +Tables_in_test
>> +t1
>> +drop table t1;
>> +end of the tests
>>   
>
> Very good.

That's a relief :-)

>
>> diff -Nrup a/mysql-test/suite/rpl/t/rpl_server_id_ignore-slave.opt
> b/mysql-test/suite/rpl/t/rpl_server_id_ignore-slave.opt
>> --- /dev/null	Wed Dec 31 16:00:00 196900
>> +++ b/mysql-test/suite/rpl/t/rpl_server_id_ignore-slave.opt	2008-05-08 21:00:33
> +03:00
>> @@ -0,0 +1 @@
>> +--disable-log-slave-updates --replicate-same-server-id
>> diff -Nrup a/mysql-test/suite/rpl/t/rpl_server_id_ignore.test
> b/mysql-test/suite/rpl/t/rpl_server_id_ignore.test
>> --- /dev/null	Wed Dec 31 16:00:00 196900
>> +++ b/mysql-test/suite/rpl/t/rpl_server_id_ignore.test	2008-05-08 21:00:33
> +03:00
>> @@ -0,0 +1,107 @@
>> +# This test checks that the slave ignores queries originating
>> +# from the list of ignored originators (bug#27808 etc)
>> +#
>> +# phases of tests:
>> +#
>> +# 0. master_id new line in show slave status
>> +# 1. syntax related: +#    a. error reporting if options clash;
>> +#    b. overriding the old MASTER_IGNORE_SERVER_ID setup by the
>> following +#       CHANGE MASTER ... MASTER_IGNORE_SERVER_ID=;
>> +#    c. the old setup preserving by CHANGE MASTER w/o MASTER_IGNORE_SERVER_ID
>> +#    d. resetting the ignored server ids with RESET SLAVE
>> +# 2. run time related: +#    a. observing no processing events from
>> a master listed in MASTER_IGNORE_SERVER_ID:s
>> +#    b. nullifying the list and resuming of taking binlog from the very
> beginning with
>> +#       executing events this time
>> +
>> +source include/master-slave.inc;
>> +
>> +connection slave;
>> +source include/wait_for_slave_to_start.inc;
>>   
>
> Is this necessary? 

Sure.
As the test is going show `master_id' the new member of `show slave
status' which is the server id of the master.

> Shouldn't this line be added to master-slave.inc
> instead in that case?

An interesting note.
I'd rather it to be a part of `start slave', which is called from 
include/master-slave.inc as well.

>
>> +# a new line for master_id
>> +let $master_id= query_get_value(SHOW SLAVE STATUS, Master_Server_Id, 1);
>> +echo master_id: $master_id;
>> +
>> +stop slave;
>>   
>
> wait for slave to stop.
>

I don't think we really need to wait for stop, actually.
In the following the test is concerned with results of CHANGE MASTER
and execution on the slave after its restarting.
Later in the test, there is sync_slave_with_master synchoronization
that supersedes wait for stop and wait for start both.

>> +--echo *** --replicate-same-server-id and change master option can clash ***
>> +--error ER_SLAVE_IGNORE_SERVER_ID
>> +eval change master to MASTER_IGNORE_SERVER_ID=2, MASTER_IGNORE_SERVER_ID=1;
>>   
>
> There are no variables here, so you don't need the "eval".
>

right. I just have removed MASTER_PORT port etc. (todo)

>> +--echo *** must be empty due to the error ***
>> +let $ignore_list= query_get_value(SHOW SLAVE STATUS, Replicate_Ignore_Server_id,
> 1);
>> +echo ignore server id list: $ignore_list;
>> +
>> +eval change master to MASTER_IGNORE_SERVER_ID=10, MASTER_IGNORE_SERVER_ID=100;
>> +--echo *** must be 10, 100 ***
>> +let $ignore_list= query_get_value(SHOW SLAVE STATUS, Replicate_Ignore_Server_id,
> 1);
>> +echo ignore server id list: $ignore_list;
>> +reset slave;
>> +--echo *** must be empty due to reset slave ***
>> +let $ignore_list= query_get_value(SHOW SLAVE STATUS, Replicate_Ignore_Server_id,
> 1);
>> +echo ignore server id list: $ignore_list;
>> +eval change master to MASTER_IGNORE_SERVER_ID=10, MASTER_IGNORE_SERVER_ID=100;
>>   
>
> No "eval" necessary.
>

(todo)

>> +--echo *** CHANGE MASTER with MASTER_IGNORE_SERVER_ID option overrides (does not
> increment) the previous setup ***
>> +eval change master to MASTER_IGNORE_SERVER_ID=1, MASTER_IGNORE_SERVER_ID=3,
> MASTER_IGNORE_SERVER_ID=4;
>>   
>
> No "eval" necessary.

(todo)

>
>> +--echo *** must be 1, 3, 4 due to overriding policy ***
>> +let $ignore_list= query_get_value(SHOW SLAVE STATUS, Replicate_Ignore_Server_id,
> 1);
>> +echo ignore server id list: $ignore_list;
>> +--echo *** ignore master (server 1) queries for a while ***

>> +start slave;
>>   

>
> Wait for slave to start. There is a slight chance that you execute
> this statement, switch to the master, and when doing
> sync_slave_with_master, the master_pos_wait() will return NULL because
> the slave has not had a chance to start.
>

I understand your worries:

" The function returns `NULL' if the slave SQL thread is not started "

However, `slave start' returns OK to mysqltest only after SQL thread has been
started.


>> +
>> +connection master;
>> +
>> +#connection slave;
>> +sync_slave_with_master;
>> +let $slave_relay_pos0= query_get_value(SHOW SLAVE STATUS, Relay_Log_Pos, 1);
>> +
>> +connection master;
>> +create table t1 (n int);
>> +let $master_binlog_end= query_get_value(SHOW MASTER STATUS, Position, 1);
>> +
>> +connection slave;
>> +let $slave_param= Exec_Master_Log_Pos;
>> +let $slave_param_value= $master_binlog_end;
>> +source include/wait_for_slave_param.inc;
>> +--echo *** must be empty as the event is to be filtered out ***
>> +show tables;
>> +--echo *** allowing events from master ***
>> +let $slave_relay_pos1= query_get_value(SHOW SLAVE STATUS, Relay_Log_Pos, 1);
>> +#
>> +# checking stability of relay log pos
>> +#
>> +if (`select $slave_relay_pos1 - $slave_relay_pos0`)
>> +{
>> +    --echo Error: relay log position changed:  $slave_relay_pos0,
> $slave_relay_pos1
>> +    query_vertical show slave status;
>>   

>
> This will print a lot of info that will cause the test to fail in
> pushbuild.

Sorry, 
this will print a lot of info   if  the test will fail in pushbuild.

>  Use "include/show_slave_status.inc" or something equivalent
> to filter out the pointless columns.
>

I think you agree that's how it should be in the case of an error of
a test.

>> +}
>> +stop slave;
>> +source include/wait_for_slave_to_stop.inc;
>> +reset slave;
>> +--echo *** must be empty due to reset slave ***
>> +let $ignore_list= query_get_value(SHOW SLAVE STATUS, Replicate_Ignore_Server_id,
> 1);
>> +echo ignore server id list: $ignore_list;
>> +--replace_result $MASTER_MYPORT MASTER_PORT
>> +eval change master to master_host='127.0.0.1', master_user='root',
>> + master_password='', master_port=$MASTER_MYPORT;
>>   +start slave;
>>   
>
> Wait for slave to start.
>

sorry, no need as explained.

>> +
>> +connection master;
>> +
>> +#connection slave;
>> +sync_slave_with_master;
>> +--echo *** must have caught create table ***
>> +show tables;
>> +
>> +# cleanup
>> +connection master;
>> +drop table t1;
>> +#connection slave
>> +sync_slave_with_master;
>> +
>> +--echo end of the tests
>> +
>> +
>> +
>> +
>> +
>> +
>> diff -Nrup a/mysys/array.c b/mysys/array.c
>> --- a/mysys/array.c	2007-10-11 18:07:33 +03:00
>> +++ b/mysys/array.c	2008-05-08 21:00:32 +03:00
>> @@ -382,3 +382,40 @@ int get_index_dynamic(DYNAMIC_ARRAY *arr
>>    return ret;
>>   }
>> +
>> +/**
>> +   Binary search in a specific type of the array having ulong items.
>> +   The items are ascendant-sorted.
>> +
>> +   @param A             pointer to an array
>> +   @param search        pointer to a search pattern
>> +   @param index[in,out] pointer to a store for the index +
>> of a found element or NULL - not store
>> +   @return        +      @retval FALSE  if the pattern is found,
>> +      @retval TRUE    otherwise
>> + */
>> +
>> +my_bool dynamic_array_ulong_bin_search(DYNAMIC_ARRAY *array, const ulong
> *search,
>> +                                       uint * index)
>> +{
>> +     DBUG_ENTER("dynamic_array_bin_search");
>> +     long l= 0, i, r= array->elements - 1;
>> +     while (l <= r)
>> +     {
>> +          ulong found;
>> +          DBUG_ASSERT((r < 0 || (ulong) r < array->elements) &&
> l >=0);
>> +          i= (r + l)/2;
>> +          if ((found= (* (ulong*) dynamic_array_ptr(array, i))) == *search)
>> +          {
>> +               if (index) +                    *index= i;
>> +               DBUG_RETURN(FALSE);
>> +          }
>> +          if (found < *search)
>> +               l= i + 1;
>> +          else
>> +               r= i - 1;
>> +     }
>> +     DBUG_RETURN(TRUE);
>> +}
>>   

>
> If you are going to extend array.c with a function like this, make a
> generic version. If you want a specialized one, keep it in the file
> where you need it. Don't place a ulong-specific version in the mysys
> library.
>

a good suggestion (todo).


>> diff -Nrup a/sql/lex.h b/sql/lex.h
>> --- a/sql/lex.h	2008-04-01 13:56:38 +03:00
>> +++ b/sql/lex.h	2008-05-08 21:00:32 +03:00
>> @@ -321,6 +321,7 @@ static SYMBOL symbols[] = {
>>    { "MASTER_SSL_VERIFY_SERVER_CERT", SYM(MASTER_SSL_VERIFY_SERVER_CERT_SYM)},
>>    { "MASTER_USER",           SYM(MASTER_USER_SYM)},
>>    { "MASTER_HEARTBEAT_PERIOD", SYM(MASTER_HEARTBEAT_PERIOD_SYM)},
>> +  { "MASTER_IGNORE_SERVER_ID", SYM(MASTER_IGNORE_SERVER_ID_SYM)},
>>   

> What is "MASTER" referring to here? For "MASTER_USER" it is clearly
> the user on the master, but what does the "MASTER" denote for
> "IGNORE_SERVER_ID"?

a good point. I seemed to had been hypnotized with that all the options
having MASTER_ prefix. It's not really so, there are two that do not
have. This drop off my early hesitations.

(todo)

>
>>    { "MATCH",		SYM(MATCH)},
>>    { "MAX_CONNECTIONS_PER_HOUR", SYM(MAX_CONNECTIONS_PER_HOUR)},
>>    { "MAX_QUERIES_PER_HOUR", SYM(MAX_QUERIES_PER_HOUR)},
>> diff -Nrup a/sql/rpl_mi.cc b/sql/rpl_mi.cc
>> --- a/sql/rpl_mi.cc	2008-03-15 00:21:24 +02:00
>> +++ b/sql/rpl_mi.cc	2008-05-08 21:00:32 +03:00
>> @@ -28,17 +28,19 @@ int init_intvar_from_file(int* var, IO_C
>>  int init_strvar_from_file(char *var, int max_size, IO_CACHE *f,
>>  			  const char *default_val);
>>  int init_floatvar_from_file(float* var, IO_CACHE* f, float default_val);
>> +int init_dynarray_intvar_from_file(DYNAMIC_ARRAY* arr, IO_CACHE* f);
>>   Master_info::Master_info()
>>    :Slave_reporting_capability("I/O"),
>>     ssl(0), ssl_verify_server_cert(0), fd(-1),  io_thd(0), port(MYSQL_PORT),
>>     connect_retry(DEFAULT_CONNECT_RETRY), heartbeat_period(0),
>> -   received_heartbeats(0), inited(0),
>> +   received_heartbeats(0), inited(0), master_id(0),
>>     abort_slave(0), slave_running(0), slave_run_id(0)
>>  {
>>    host[0] = 0; user[0] = 0; password[0] = 0;
>>    ssl_ca[0]= 0; ssl_capath[0]= 0; ssl_cert[0]= 0;
>>    ssl_cipher[0]= 0; ssl_key[0]= 0;
>> +  my_init_dynamic_array(&ignore_server_ids, sizeof(::server_id), 16, 16);
>>     bzero((char*) &file, sizeof(file));
>>    pthread_mutex_init(&run_lock, MY_MUTEX_INIT_FAST);
>> @@ -50,6 +52,7 @@ Master_info::Master_info()
>>   Master_info::~Master_info()
>>  {
>> +  delete_dynamic(&ignore_server_ids);
>>    pthread_mutex_destroy(&run_lock);
>>    pthread_mutex_destroy(&data_lock);
>>    pthread_cond_destroy(&data_cond);
>> @@ -57,6 +60,27 @@ Master_info::~Master_info()
>>    pthread_cond_destroy(&stop_cond);
>>  }
>>  +/**
>> +   Reports if the s_id server has been configured to ignore events
>> +   it generates with
>> +
>> +      CHANGE MASTER MASTER_IGNORE_SERVER_ID= s_id
>> +
>> +   Method is called from the io thread event receiver filtering.
>> +
>> +   @param      s_id    the master server identifier
>> +   @return +     @retval   TRUE    if s_id is in the list of
>> ignored master  servers,
>> +     @retval   FALSE   otherwise.
>> + */
>> +bool Master_info::repl_server_id_ignore_p(ulong s_id)
>> +{
>> +  if (likely(ignore_server_ids.elements == 1))
>> +    return (* (ulong*) dynamic_array_ptr(&ignore_server_ids, 0)) == s_id;
>> +  else      +    return
>> !dynamic_array_ulong_bin_search(&ignore_server_ids,
>> +                                           (const ulong *) &s_id, NULL);
>> +}
>>   void init_master_log_pos(Master_info* mi)
>>  {
>> @@ -87,8 +111,11 @@ enum {
>>    /* 6.0 added value of master_heartbeat_period */
>>    LINE_FOR_MASTER_HEARTBEAT_PERIOD= 16,
>>  +  /* 6.0 added value of master_ignore_server_id */
>> +  LINE_FOR_MASTER_IGNORE_SERVER_ID= 17,
>> +
>>    /* Number of lines currently used when saving master info file */
>> -  LINES_IN_MASTER_INFO= LINE_FOR_MASTER_HEARTBEAT_PERIOD
>> +  LINES_IN_MASTER_INFO= LINE_FOR_MASTER_IGNORE_SERVER_ID
>>  };
>>   @@ -280,6 +307,17 @@ file '%s')", fname);
>>        if (lines >= LINE_FOR_MASTER_HEARTBEAT_PERIOD &&
>>            init_floatvar_from_file(&master_heartbeat_period,
> &mi->file, 0.0))
>>          goto errwithmsg;
>> +      /*
>> +        Starting from 6.0 list of server_id of ignorable servers might be
>> +        in the file
>> +      */
>> +      if (lines >= LINE_FOR_MASTER_IGNORE_SERVER_ID &&
>> +          init_dynarray_intvar_from_file(&mi->ignore_server_ids,
> &mi->file))
>> +      {
>> +        sql_print_error("Failed to initialize master info ignore_server_ids");
>> +        goto errwithmsg;
>> +      }
>> +
>>      }
>>   #ifndef HAVE_OPENSSL
>> @@ -379,16 +417,40 @@ int flush_master_info(Master_info* mi, b
>>    */
>>    char heartbeat_buf[sizeof(mi->heartbeat_period) * 4]; // buffer to suffice
> always
>>    my_sprintf(heartbeat_buf, (heartbeat_buf, "%.3f", mi->heartbeat_period));
>> +  /*
>> +    produce a line listing the total number and all the ignored server_id:s
>> +  */
>> +  char* ignore_server_ids_buf;
>> +  {
>> +    ignore_server_ids_buf=
>> +      (char *) my_malloc((sizeof(::server_id) * 3 + 1) *
>> +                         (1 + mi->ignore_server_ids.elements), MYF(MY_WME));
>> +    if (!ignore_server_ids_buf)
>> +      DBUG_RETURN(1);
>> +    my_sprintf(ignore_server_ids_buf,
>> +               (ignore_server_ids_buf, "%u",
> mi->ignore_server_ids.elements));
>> +    for (ulong i= 0; i < mi->ignore_server_ids.elements; i++)
>> +    {
>> +      ulong s_id;
>> +      ulong cur_len= strlen(ignore_server_ids_buf);
>>   
>
> You are calling the strlen() with each iteration when you can just
> keep track of it using the buffer pointer and maintain the a pointer
> to the current position by using the results of the
> my_sprintf(). Although this is not a critical loop, this is not
> something we should do.
>

I understand. (todo)

>> +      get_dynamic(&mi->ignore_server_ids, (uchar*) &s_id, i);
>> +      my_sprintf(ignore_server_ids_buf + cur_len,
>> +                 (ignore_server_ids_buf + cur_len,
>> +                  " %lu", s_id));
>> +    }
>> +  }
>>    my_b_seek(file, 0L);
>>    my_b_printf(file,
>> -             
> "%u\n%s\n%s\n%s\n%s\n%s\n%d\n%d\n%d\n%s\n%s\n%s\n%s\n%s\n%d\n%s\n",
>> +             
> "%u\n%s\n%s\n%s\n%s\n%s\n%d\n%d\n%d\n%s\n%s\n%s\n%s\n%s\n%d\n%s\n%s\n",
>>                LINES_IN_MASTER_INFO,
>>                mi->master_log_name, llstr(mi->master_log_pos, lbuf),
>>                mi->host, mi->user,
>>                mi->password, mi->port, mi->connect_retry,
>>                (int)(mi->ssl), mi->ssl_ca, mi->ssl_capath,
> mi->ssl_cert,
>>                mi->ssl_cipher, mi->ssl_key, mi->ssl_verify_server_cert,
>> -              heartbeat_buf);
>> +              heartbeat_buf,
>> +              ignore_server_ids_buf);
>> +  my_free(ignore_server_ids_buf, MYF(0));
>>    DBUG_RETURN(-flush_io_cache(file));
>>  }
>>  @@ -399,6 +461,7 @@ void end_master_info(Master_info* mi)
>>     if (!mi->inited)
>>      DBUG_VOID_RETURN;
>> +  reset_dynamic(&mi->ignore_server_ids);
>>    end_relay_log_info(&mi->rli);
>>    if (mi->fd >= 0)
>>    {
>> diff -Nrup a/sql/rpl_mi.h b/sql/rpl_mi.h
>> --- a/sql/rpl_mi.h	2008-02-03 11:00:47 +02:00
>> +++ b/sql/rpl_mi.h	2008-05-08 21:00:32 +03:00
>> @@ -20,6 +20,7 @@
>>   #include "rpl_rli.h"
>>  #include "rpl_reporting.h"
>> +#include "my_sys.h"
>>   
> /*****************************************************************************
>> @@ -60,6 +61,7 @@ class Master_info : public Slave_reporti
>>   public:
>>    Master_info();
>>    ~Master_info();
>> +  bool repl_server_id_ignore_p(ulong s_id);
>>     /* the variables below are needed because we can change masters
>> on the fly */
>>    char master_log_name[FN_REFLEN];
>> @@ -85,10 +87,12 @@ class Master_info : public Slave_reporti
>>    uint connect_retry;
>>    float heartbeat_period;         // interface with CHANGE MASTER or
> master.info
>>    ulonglong received_heartbeats;  // counter of received heartbeat events
>> +  DYNAMIC_ARRAY ignore_server_ids;
>>  #ifndef DBUG_OFF
>>    int events_till_disconnect;
>>  #endif
>>    bool inited;
>> +  volatile ulong master_id;
>>   
>
> Why volatile?

An idea was to be more precise with show slave status
that can display master_id concurrently with the setter thread.
But it'd be the same to ignore_server_ids. So, there is no good reason
for it.

(todo)

>
>>    volatile bool abort_slave;
>>    volatile uint slave_running;
>>    volatile ulong slave_run_id;
>> diff -Nrup a/sql/share/errmsg.txt b/sql/share/errmsg.txt
>> --- a/sql/share/errmsg.txt	2008-04-14 19:58:40 +03:00
>> +++ b/sql/share/errmsg.txt	2008-05-08 21:00:33 +03:00
>> @@ -6252,3 +6252,5 @@ ER_SLAVE_HEARTBEAT_FAILURE
>>    eng "Unexpected master's heartbeat data: %s"
>>  ER_SLAVE_HEARTBEAT_VALUE_OUT_OF_RANGE
>>    eng "The requested value for the heartbeat period %s %s"
>> +ER_SLAVE_IGNORE_SERVER_ID
>> +  eng "The requested server id %d clashes with the slave startup option
> --replicate-same-server-id"
>> diff -Nrup a/sql/slave.cc b/sql/slave.cc
>> --- a/sql/slave.cc	2008-04-01 13:56:42 +03:00
>> +++ b/sql/slave.cc	2008-05-08 21:00:32 +03:00
>> @@ -759,6 +759,95 @@ int init_floatvar_from_file(float* var,
>> DBUG_RETURN(1);
>>  }
>>  +/**
>> +   A master info read method
>> +
>> +   This function is called from @c init_master_info() @c along with
>> +   relatives to restore some of @c active_mi @c members.
>> +   Particularly, this function is responsible for restoring
>> +   MASTER_IGNORE_SERVER_ID list of servers whose events the slave is
>> +   going to ignore (to not log them in the relay log).
>> +   Items being read are supposed to be decimal output of values of a
>> +   type shorter or equal of @c long @c and separated by the single space.
>> +
>> +   @param arr         @c DYNAMIC_ARRAY @c pointer to storage for servers id
>> +   @param f           @c IO_CACHE @c pointer to the source file
>> +
>> +   @return
>> +     @retval 0         All OK
>> +     @retval non-zero  An error
>> +*/
>> +
>> +int init_dynarray_intvar_from_file(DYNAMIC_ARRAY* arr, IO_CACHE* f)
>> +{
>> +  int ret= 0;
>> +  char buf[16 * (sizeof(long)*4 + 1)]; // static buffer to use most of times
>> +  char *buf_act= buf; // actual buffer can be dynamic if static is short
>> +  char *token;
>> +  uint num_items;     // number of items of `arr'
>> +  size_t read_size;
>> +  DBUG_ENTER("init_dynarray_intvar_from_file");
>> +
>> +  if ((read_size= my_b_gets(f, buf_act, sizeof(buf))) == 0)
>> +  {
>> +    return 0; // no line in master.info
>> +  }
>> +  if (read_size + 1 == sizeof(buf) && buf[sizeof(buf) - 2] != '\n')
>> +  {
>> +    /*
>> +      short read happpend; allocate sufficient memory and make the 2nd read
>> +    */
>> +    char buf_work[(sizeof(long)*3 + 1)*16];
>> +    memcpy(buf_work, buf, sizeof(buf_work));
>> +    num_items= atoi(strtok(buf_work, " "));
>> +    size_t snd_size;
>> +    /*
>> +      max size lower bound approximate estimation bases on the formula:
>> +      (the items number + items themselves) * +          (decimal
>> size + space) - 1 + `\n' + '\0'
>> +    */
>> +    size_t max_size= (1 + num_items) * (sizeof(long)*3 + 1) + 1;
>> +    buf_act= (char*) my_malloc(max_size, MYF(MY_WME));
>> +    memcpy(buf_act, buf, read_size);
>> +    snd_size= my_b_gets(f, buf_act + read_size, max_size - read_size);
>> +    if (snd_size == 0 ||
>> +        (snd_size + 1 == max_size - read_size) &&  buf[max_size - 2] !=
> '\n')
>> +    {
>> +      /*
>> +        failure to make the 2nd read or short read again
>> +      */
>> +      ret= 1;
>> +      goto err;
>> +    }
>> +  }
>> +  token= strtok(buf_act, " ");
>>   

>
> strtok() is not thread-safe. Please stay away from it.
>

and that's my gross mistake ... (todo).

>> +  if (token == NULL)
>> +  {
>> +    ret= 1;
>> +    goto err;
>> +  }
>> +  num_items= atoi(token);
>> +  for (uint i=0; i < num_items; i++)
>> +  {
>> +    token= strtok(NULL, " ");
>> +    if (token == NULL)
>> +    {
>> +      ret= 1;
>> +      goto err;
>> +    }
>> +    else
>> +    {
>> +      ulong val= atol(token);
>> +      insert_dynamic(arr, (uchar *) &val);
>> +    }
>> +  }
>> +err:
>> +  if (buf_act != buf)
>> +    my_free(buf_act, MYF(0));
>> +  DBUG_RETURN(ret);
>> +}
>> +
>> +
>>  static bool check_io_slave_killed(THD *thd, Master_info *mi, const char *info)
>>  {
>>    if (io_slave_killed(thd, mi))
>> @@ -907,7 +996,7 @@ static int get_master_version_and_clock(
>>        (master_res= mysql_store_result(mysql)))
>>    {
>>      if ((master_row= mysql_fetch_row(master_res)) &&
>> -        (::server_id == strtoul(master_row[1], 0, 10)) &&
>> +        (::server_id == (mi->master_id= strtoul(master_row[1], 0, 10)))
> &&
>>          !mi->rli.replicate_same_server_id)
>>      {
>>        errmsg=
>> @@ -923,7 +1012,14 @@ static int get_master_version_and_clock(
>>      if (errmsg)
>>        goto err;
>>    }
>> -
>> +  if (mi->master_id == 0 && mi->ignore_server_ids.elements >
> 0)
>> +  {
>> +    errmsg="Slave configured with server id filtering could not detect the
> master server id.";
>> +    err_code= ER_SLAVE_FATAL_ERROR;
>> +    sprintf(err_buff, ER(err_code), errmsg);
>> +    err_msg.append(err_buff);
>> +    goto err;
>> +  }
>>    /*
>>      Check that the master's global character_set_server and ours are the same.
>>      Not fatal if query fails (old master?).
>> @@ -1238,6 +1334,10 @@ bool show_master_info(THD* thd, Master_i
>>    field_list.push_back(new Item_empty_string("Last_IO_Error", 20));
>>    field_list.push_back(new Item_return_int("Last_SQL_Errno", 4,
> MYSQL_TYPE_LONG));
>>    field_list.push_back(new Item_empty_string("Last_SQL_Error", 20));
>> +  field_list.push_back(new Item_empty_string("Replicate_Ignore_Server_id",
>> +                                             FN_REFLEN));
>>   

> Please be consistent: "Replication_Ignore_Server_Id" 

I am not sure what would your version be consistent with.
The idea of the feature is to filter. And show slave status displays

Replicate_Ignore_DB, Replicate_Ignore_Table, other Replicate_, never
Replication_.

> Also, I question
> the need for "Replication" and think that "Ignore_Server_Ids" or
> "Ignore_Server_Id_List" is more appropriate. Same for the variable
> given to CHANGE MASTER. In addition, they should have the same name
> (not counting case) to avoid confusion.

For the reason above, we are better stay consistent with other
semantically close filtering options.

>
>> +  field_list.push_back(new Item_return_int("Master_Server_Id", sizeof(ulong),
>> +                                           MYSQL_TYPE_LONG));
>>     if (protocol->send_fields(&field_list,
>>                              Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF))
>> @@ -1357,6 +1457,32 @@ bool show_master_info(THD* thd, Master_i
>>      protocol->store(mi->rli.last_error().number);
>>      // Last_SQL_Error
>>      protocol->store(mi->rli.last_error().message, &my_charset_bin);
>> +    // Replicate_Ignore_Server_id
>> +    {
>> +      char buff[FN_REFLEN];
>> +      ulong i, cur_len;
>> +      for (i= 0, buff[0]= 0,cur_len= 0;
>> +           i < mi->ignore_server_ids.elements; i++, 
> cur_len=strlen(buff))
>> +      {
>> +        ulong s_id;
>> +        char sbuff[FN_REFLEN];
>> +        get_dynamic(&mi->ignore_server_ids, (uchar*) &s_id, i);
>> +        my_sprintf(sbuff, (sbuff, (i==0? "%lu" : ", %lu"), s_id));
>> +        if (cur_len + strlen(sbuff) + 4 > FN_REFLEN)
>>   
>
> You're computing strlen() inside a loop again when you can just use
> keep the length available using the result of my_sprintf(). It's
> quadratic when it should be linear.
>

ack (todo)

>> +        {
>> +          /*
>> +            break the loop whenever remained space could not fit
>> +            ellipses on the next cycle
>> +          */
>> +          my_sprintf(buff + cur_len, (buff + cur_len, "..."));
>> +          break;
>> +        }
>> +        my_sprintf(buff + cur_len, (buff + cur_len, "%s", sbuff));
>> +      }
>> +      protocol->store(buff, &my_charset_bin);
>> +    }
>> +    // Master_Server_id
>> +    protocol->store((uint32) mi->master_id);
>>       pthread_mutex_unlock(&mi->rli.data_lock);
>>      pthread_mutex_unlock(&mi->data_lock);
>> @@ -3069,6 +3195,7 @@ static int queue_event(Master_info* mi,c
>>    ulong inc_pos;
>>    Relay_log_info *rli= &mi->rli;
>>    pthread_mutex_t *log_lock= rli->relay_log.get_log_lock();
>> +  ulong s_id;
>>    DBUG_ENTER("queue_event");
>>     LINT_INIT(inc_pos);
>> @@ -3215,9 +3342,14 @@ static int queue_event(Master_info* mi,c
>>    */
>>     pthread_mutex_lock(log_lock);
>> -
>> -  if ((uint4korr(buf + SERVER_ID_OFFSET) == ::server_id) &&
>> -      !mi->rli.replicate_same_server_id)
>> +  s_id= uint4korr(buf + SERVER_ID_OFFSET);
>> +  if ((s_id == ::server_id && !mi->rli.replicate_same_server_id) ||
>> +      (mi->repl_server_id_ignore_p(s_id) &&
>> +       /* everything is filtered out from non-master */
>> +       (s_id != mi->master_id ||
>> +        /* for the master meta information is necessary */
>> +        buf[EVENT_TYPE_OFFSET] != FORMAT_DESCRIPTION_EVENT &&
>> +        buf[EVENT_TYPE_OFFSET] != ROTATE_EVENT)))
>>    {
>>      /*
>>        Do not write it to the relay log.
>> @@ -3232,8 +3364,12 @@ static int queue_event(Master_info* mi,c
>>        But events which were generated by this slave and which do not exist in
>>        the master's binlog (i.e. Format_desc, Rotate & Stop) should not
> increment
>>        mi->master_log_pos.
>> +      If the event is originated remotely and is being filtered out by
>> +      MASTER_IGNORE_SERVER_ID it increments mi->master_log_pos
>> +      as well as rli->group_relay_log_pos.
>>      */
>> -    if (buf[EVENT_TYPE_OFFSET]!=FORMAT_DESCRIPTION_EVENT &&
>> +    if (!(s_id == ::server_id && !mi->rli.replicate_same_server_id)
> ||
>> +        buf[EVENT_TYPE_OFFSET]!=FORMAT_DESCRIPTION_EVENT &&
>>          buf[EVENT_TYPE_OFFSET]!=ROTATE_EVENT &&
>>          buf[EVENT_TYPE_OFFSET]!=STOP_EVENT)
>>   

>
> It's not something you did, but could you please add spaces around the != ?
>

ok (todo)

>>      {
>> @@ -3243,8 +3379,8 @@ static int queue_event(Master_info* mi,c
>>        rli->ign_master_log_pos_end= mi->master_log_pos;
>>      }
>>      rli->relay_log.signal_update(); // the slave SQL thread needs to
> re-check
>> -    DBUG_PRINT("info", ("master_log_pos: %lu, event originating from the same
> server, ignored",
>> -                        (ulong) mi->master_log_pos));
>> +    DBUG_PRINT("info", ("master_log_pos: %lu, event originating from %u server,
> ignored",
>> +                        (ulong) mi->master_log_pos, uint4korr(buf +
> SERVER_ID_OFFSET)));
>>    }
>>    else
>>    {
>> diff -Nrup a/sql/sql_lex.h b/sql/sql_lex.h
>> --- a/sql/sql_lex.h	2008-04-01 13:56:44 +03:00
>> +++ b/sql/sql_lex.h	2008-05-08 21:00:32 +03:00
>> @@ -211,6 +211,7 @@ typedef struct st_lex_master_info
>>    char *ssl_key, *ssl_cert, *ssl_ca, *ssl_capath, *ssl_cipher;
>>    char *relay_log_name;
>>    ulong relay_log_pos;
>> +  DYNAMIC_ARRAY repl_ignore_server_ids;  } LEX_MASTER_INFO;
>>   diff -Nrup a/sql/sql_repl.cc b/sql/sql_repl.cc
>> --- a/sql/sql_repl.cc	2008-03-08 12:14:45 +02:00
>> +++ b/sql/sql_repl.cc	2008-05-08 21:00:32 +03:00
>> @@ -1189,32 +1189,41 @@ void kill_zombie_dump_threads(uint32 sla
>>    }
>>  }
>>  +/**
>> +   A comparison function to be supplied as argument to @c sort_dynamic() @c
>> +   provide ascendant sort of a ulong-items dynamic array
>> +*/
>> +static int change_master_server_id_cmp(ulong *id1, ulong *id2)
>> +{
>> +  return (*id1) > (*id2);
>>   
>
> Parentheses are not needed here.
>

agreed. (todo)

>> +}
>>   bool change_master(THD* thd, Master_info* mi)
>>  {
>>    int thread_mask;
>>    const char* errmsg= 0;
>>    bool need_relay_log_purge= 1;
>> +  bool ret= FALSE;
>>    DBUG_ENTER("change_master");
>>     lock_slave_threads(mi);
>>    init_thread_mask(&thread_mask,mi,0 /*not inverse*/);
>> +  LEX_MASTER_INFO* lex_mi= &thd->lex->mi;
>>    if (thread_mask) // We refuse if any slave thread is running
>>    {
>>      my_message(ER_SLAVE_MUST_STOP, ER(ER_SLAVE_MUST_STOP), MYF(0));
>> -    unlock_slave_threads(mi);
>> -    DBUG_RETURN(TRUE);
>> +    ret= TRUE;
>> +    goto err;
>>    }
>>     thd_proc_info(thd, "Changing master");
>> -  LEX_MASTER_INFO* lex_mi= &thd->lex->mi;
>>    // TODO: see if needs re-write
>>    if (init_master_info(mi, master_info_file, relay_log_info_file, 0,
>>  		       thread_mask))
>>    {
>>      my_message(ER_MASTER_INFO, ER(ER_MASTER_INFO), MYF(0));
>> -    unlock_slave_threads(mi);
>> -    DBUG_RETURN(TRUE);
>> +    ret= TRUE;
>> +    goto err;
>>    }
>>     /*
>> @@ -1259,6 +1268,27 @@ bool change_master(THD* thd, Master_info
>>      mi->heartbeat_period= (float) min(SLAVE_MAX_HEARTBEAT_PERIOD,
>>                                        (slave_net_timeout/2.0));
>>    mi->received_heartbeats= LL(0); // counter lives until master is CHANGEd
>> +  /*
>> +    reset the last time server_id list if the current CHANGE MASTER
>> +    is mentioning MASTER_IGNORE_SERVER_ID
>> +  */
>> +  if (lex_mi->repl_ignore_server_ids.elements > 0)
>> +    reset_dynamic(&mi->ignore_server_ids);
>> +  for (uint i= 0; i < lex_mi->repl_ignore_server_ids.elements; i++)
>> +  {
>> +    ulong s_id;
>> +    get_dynamic(&lex_mi->repl_ignore_server_ids, (uchar*) &s_id, i);
>> +    if (s_id == ::server_id && replicate_same_server_id)
>> +    {
>> +      my_error(ER_SLAVE_IGNORE_SERVER_ID, MYF(0), s_id);
>> +      ret= TRUE;
>> +      goto err;
>> +    }
>> +    else
>> +      insert_dynamic(&mi->ignore_server_ids, (uchar*) &s_id);
>> +  }
>> +  sort_dynamic(&mi->ignore_server_ids, (qsort_cmp)
> change_master_server_id_cmp);
>>   
>

> I see no reason to reset this list when doing a CHANGE MASTER. Is
> there a good reason for it?

I hope I have explained that answering your question at the beginning of the mail.

>
>> +
>>    if (lex_mi->ssl != LEX_MASTER_INFO::LEX_MI_UNCHANGED)
>>      mi->ssl= (lex_mi->ssl == LEX_MASTER_INFO::LEX_MI_ENABLE);
>>  @@ -1336,8 +1366,8 @@ bool change_master(THD* thd, Master_info
>>    if (flush_master_info(mi, 0))
>>    {
>>      my_error(ER_RELAY_LOG_INIT, MYF(0), "Failed to flush master info file");
>> -    unlock_slave_threads(mi);
>> -    DBUG_RETURN(TRUE);
>> +    ret= TRUE;
>> +    goto err;
>>    }
>>    if (need_relay_log_purge)
>>    {
>> @@ -1348,8 +1378,8 @@ bool change_master(THD* thd, Master_info
>>  			 &errmsg))
>>      {
>>        my_error(ER_RELAY_LOG_FAIL, MYF(0), errmsg);
>> -      unlock_slave_threads(mi);
>> -      DBUG_RETURN(TRUE);
>> +      ret= TRUE;
>> +      goto err;
>>      }
>>    }
>>    else
>> @@ -1364,8 +1394,8 @@ bool change_master(THD* thd, Master_info
>>  			   &msg, 0))
>>      {
>>        my_error(ER_RELAY_LOG_INIT, MYF(0), msg);
>> -      unlock_slave_threads(mi);
>> -      DBUG_RETURN(TRUE);
>> +      ret= TRUE;
>> +      goto err;
>>      }
>>    }
>>    /*
>> @@ -1402,10 +1432,13 @@ bool change_master(THD* thd, Master_info
>>    pthread_cond_broadcast(&mi->data_cond);
>>    pthread_mutex_unlock(&mi->rli.data_lock);
>>  +err:
>>    unlock_slave_threads(mi);
>>    thd_proc_info(thd, 0);
>> -  my_ok(thd);
>> -  DBUG_RETURN(FALSE);
>> +  if (ret == FALSE)
>> +    my_ok(thd);
>> +  delete_dynamic(&lex_mi->repl_ignore_server_ids); //freeing of
> parser-time alloc
>> +  DBUG_RETURN(ret);
>>  }
>>   int reset_master(THD* thd)
>> diff -Nrup a/sql/sql_yacc.yy b/sql/sql_yacc.yy
>> --- a/sql/sql_yacc.yy	2008-04-01 16:19:00 +03:00
>> +++ b/sql/sql_yacc.yy	2008-05-08 21:00:32 +03:00
>> @@ -892,6 +892,7 @@ bool my_yyoverflow(short **a, YYSTYPE **
>>  %token  MASTER_SYM
>>  %token  MASTER_USER_SYM
>>  %token  MASTER_HEARTBEAT_PERIOD_SYM
>> +%token  MASTER_IGNORE_SERVER_ID_SYM
>>  %token  MATCH                         /* SQL-2003-R */
>>  %token  MAX_CONNECTIONS_PER_HOUR
>>  %token  MAX_QUERIES_PER_HOUR
>> @@ -1636,6 +1637,9 @@ change:
>>              LEX *lex = Lex;
>>              lex->sql_command = SQLCOM_CHANGE_MASTER;
>>              bzero((char*) &lex->mi, sizeof(lex->mi));
>> +            lex->mi.heartbeat_opt= LEX_MASTER_INFO::LEX_MI_UNCHANGED;
>> +            my_init_dynamic_array(&Lex->mi.repl_ignore_server_ids,
>> +                                  sizeof(::server_id), 16, 16);
>>            }
>>            master_defs
>>            {}
>> @@ -1735,6 +1739,10 @@ master_def:
>>                Lex->mi.heartbeat_opt=  LEX_MASTER_INFO::LEX_MI_DISABLE;
>>              }
>>              Lex->mi.heartbeat_opt=  LEX_MASTER_INFO::LEX_MI_ENABLE;
>> +          }
>> +        | MASTER_IGNORE_SERVER_ID_SYM EQ ulong_num
>> +          {
>> +            insert_dynamic(&Lex->mi.repl_ignore_server_ids, (uchar*)
> &($3));
>>            }
>>          |
>>          master_file_def
>>
>>   


cheers,

Andrei
Thread
bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Andrei Elkin8 May 2008
  • Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Mats Kindahl20 May 2008
    • Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Andrei Elkin22 May 2008
      • Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Mats Kindahl28 May 2008
        • Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Andrei Elkin30 May 2008
          • Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Mats Kindahl2 Jun 2008
Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Andrei Elkin3 Jun 2008
  • Re: bk commit into 6.0 tree (aelkin:1.2634) BUG#25998Mats Kindahl3 Jun 2008