| List: | Commits | « Previous MessageNext Message » | |
| From: | Andrei Elkin | Date: | July 7 2009 9:51am |
| Subject: | Re: bzr commit into mysql-5.1-bugteam branch (Dao-Gang.Qu:2941) Bug#45214 | ||
| View as plain text | |||
Dao Gang, hello. Thanks for improving the error messages! However, let us stay punctual. I understood we agreed yesterday that the warning + mi->report(WARNING_LEVEL, ER_UNKNOWN_SYSTEM_VARIABLE, + "Unknown system variable 'COLLATION_SERVER' on master, \ +maybe it is a *VERY OLD MASTER*."); is de-facto underestimation of severity. <jasonh> this is 5.0, I'd rather keep the old behavior, although I agree we can do this for 5.4+ <andrei> jasonh, however not fixing it in 5.0 means the user is under treat of inconsistency. ... in 5.4 we are better to look consistent having the only logics If (ver <= 4 && the master collation != the slave's) goto error. Could you please edit the text of the warning to look more adequate to the situation. I suggest: "Unknown system variable 'COLLATION_SERVER' on master, maybe it is a *VERY OLD MASTER*. *NOTE*: slave may experience inconsistency if replicated data deals with collation." And the 2nd, at merging with 5.4 we should have any excuse to keep this 5.0 exclusive compromise (see quoting above). cheers, Andrei > Hi Andrei and Zhengxing, > I have updated the code base on the last discussion. Please review the > newest patch. Thanks! > > Best Regards, > > Daogang > >> Dao Gang, hello. >> >> Getting back to COLLATION_SERVER topic I did not reply last time. >> The same logics remains in the newest patch: >> >> >>>>>>> + if (mysql_errno(mysql) == ER_UNKNOWN_SYSTEM_VARIABLE) >>>>>>> + sql_print_warning("Unknown system variable > 'COLLATION_SERVER' on master, \ >>>>>>> +maybe it is a very old master."); >>>>>>> >> >> Again, the above is worth of an error, not the warning. >> Unknown collation on an old master can't work with the modern slave. >> >> cheers, >> >> Andrei >> >> >> >>>>>>> >>>>>> I suggest to restore the only requirement COLLATION_SERVER must > be equal. >>>>>> The empty trivially is not equal to any non-empty, and then this > case is the same error as above. >>>>>> >>>>>> >>>>> NO. I think we should consider the very old master as the block's >>>>> comments said: >>>>> >>>> >>>>> /* >>>>> Check that the master's global character_set_server and ours are the > same. >>>>> >>>> Does not the above clause say along the lines I wrote? >>>> I.e if there was hypothetical master that was unaware of COLLATION_SERVER > then >>>> the selected results set would be empty. >>>> Further, `The empty trivially is not equal to any non-empty'. >>>> Therefore, you may not care of details like ER_UNKNOWN_SYSTEM_VARIABLE >>>> >>>> !mysql_real_query(SELECT @@GLOBAL.COLLATION_SERVER) => false >>>> >>>> and >>>> >>>> is_network_error() => false >>>> >>>> is equal to >>>> >>>> !mysql_real_query(SELECT @@GLOBAL.COLLATION_SERVER) => true >>>> >>>> and >>>> >>>> strcmp(master_row[0], global_system_variables.collation_server->name) > => true >>>> >>>> in sense they both "goto err" and the same errmsg="... The values must \ >>>> be equal for replication to work" is applicable. >>>> >>>> >>>> >>> The logic is right, but it confused reader easily. >>> So I update and make it more readable. Thanks! >>> Please review the update code. >>> >> >> >> >> >>>> >>>>> Not fatal if query fails (old master?). >>>>> Note that we don't check for equality of global character_set_client > and >>>>> collation_connection (neither do we prevent their setting in >>>>> set_var.cc). That's because from what I (Guilhem) have tested, the > global >>>>> values of these 2 are never used (new connections don't use them). >>>>> We don't test equality of global collation_database either as it's > is >>>>> going to be deprecated (made read-only) in 4.1 very soon. >>>>> The test is only relevant if master < 5.0.3 (we'll test only if > it's older >>>>> than the 5 branch; < 5.0.3 was alpha...), as >= 5.0.3 master > stores >>>>> charset info in each binlog event. >>>>> We don't do it for 3.23 because masters <3.23.50 hang on >>>>> SELECT @@unknown_var (BUG#7965 - see changelog of 3.23.50). So > finally we >>>>> test only if master is 4.x. >>>>> */ >>>>> >>>> >>>> >>>>> >>>>>>> + else >>>>>>> + { >>>>>>> + master_row= mysql_fetch_row(master_res); >>>>>>> + if (master_row && strcmp(master_row[0], >>>>>>> + > global_system_variables.collation_server->name)) >>>>>>> + { + errmsg= "The slave I/O thread stops > because >>>>>>> master and slave have \ >>>>>>> different values for the COLLATION_SERVER global variable. > The values must \ >>>>>>> be equal for replication to work"; >>>>>>> - err_code= ER_SLAVE_FATAL_ERROR; >>>>>>> - sprintf(err_buff, ER(err_code), errmsg); >>>>>>> + err_code= ER_SLAVE_FATAL_ERROR; >>>>>>> + sprintf(err_buff, ER(err_code), errmsg); >>>>>>> + } >>>>>>> + else if (!master_row && > is_network_error(mysql_errno(mysql))) >>>>>>> + { >>>>>>> + mysql_free_result(master_res); >>>>>>> + DBUG_RETURN(2); >>>>>>> + } >>>>>>> + mysql_free_result(master_res); >>>>>>> + if (errmsg) >>>>>>> + goto err; >>>>>>> } >>>>>>> - mysql_free_result(master_res); >>>>>>> - if (errmsg) >>>>>>> - goto err; >>>>>>> } >>>>>>> /* >>>>>>> @@ -1042,13 +1145,31 @@ be equal for replication to work"; >>>>>>> This check is only necessary for 4.x masters (and < > 5.0.4 masters but >>>>>>> those were alpha). >>>>>>> */ >>>>>>> - if ((*mysql->server_version == '4') && >>>>>>> - !mysql_real_query(mysql, STRING_WITH_LEN("SELECT > @@GLOBAL.TIME_ZONE")) && >>>>>>> - (master_res= mysql_store_result(mysql))) >>>>>>> - { >>>>>>> - if ((master_row= mysql_fetch_row(master_res)) > && >>>>>>> - strcmp(master_row[0], >>>>>>> - > global_system_variables.time_zone->get_name()->ptr())) >>>>>>> + if (*mysql->server_version == '4') >>>>>>> + { >>>>>>> + query_re= mysql_real_query(mysql, > STRING_WITH_LEN("SELECT @@GLOBAL.TIME_ZONE")); >>>>>>> + if (query_re || !(master_res= > mysql_store_result(mysql))) >>>>>>> + { >>>>>>> + if (is_network_error(mysql_errno(mysql))) >>>>>>> + { >>>>>>> + sql_print_warning("The slave I/O thread tries to > reconnect " >>>>>>> + "due to network error: %s (%d)", >>>>>>> + mysql_error(mysql), > mysql_errno(mysql)); >>>>>>> + DBUG_RETURN(2); >>>>>>> + } >>>>>>> >>>>>> >>>>>>> + /* Fatal error */ >>>>>>> + sql_print_error("The slave I/O thread stops. Error: %s > (%d)", >>>>>>> + mysql_error(mysql), > mysql_errno(mysql)); >>>>>>> + errmsg= "The slave I/O thread stops because a fatal > error is encountered \ >>>>>>> +when it try to get the value of TIME_ZONE global variable > from master."; >>>>>>> + err_code= ER_SLAVE_FATAL_ERROR; >>>>>>> + sprintf(err_buff, ER(err_code), errmsg); >>>>>>> + goto err; >>>>>>> + } >>>>>>> >>>>>> This block is fine. You must have noticed that this last case is > similar to the previous >>>>>> where, I think, you overdid with checking of COLLATION_SERVER. >>>>>> >>>>>> >>>>> NO. They are different situation. the 'COLLATION_SERVER' may be > empty >>>>> on very old master, >>>>> but the 'TIME_ZONE' can't be empty on very old master. >>>>> >>>> I meant to say the blocks are equal logically even though/if TIME_ZONE > might not be empty >>>> and COLLATION_SERVER might be empty. We still can strcmp()-compare >>>> the null set against a not-null one. >>>> >>>> >>> The 'COLLATION_SERVER' may be empty on very old master. >>> So If an ER_UNKNOWN_SYSTEM_VARIABLE error is encountered when query it. >>> The I/O thread just need report warning and pass successfully. >>> >>> The 'TIME_ZONE' can't be empty on any version master. >>> So If an ER_UNKNOWN_SYSTEM_VARIABLE error is encountered when query it. >>> The I/O thread must report a fatal error and stop. >>> >>> >>>>>>> + >>>>>>> + master_row= mysql_fetch_row(master_res); >>>>>>> + if (master_row && strcmp(master_row[0], >>>>>>> + > global_system_variables.time_zone->get_name()->ptr())) >>>>>>> { >>>>>>> errmsg= "The slave I/O thread stops because master and > slave have \ >>>>>>> different values for the TIME_ZONE global variable. The > values must \ >>>>>>> >>>>>> Here is an important question which was not supposed to be a part > of this bug. >>>>>> In row-based replication TIME_ZONE is irrelevant. >>>>>> However, the IO thread might not know on what type of event the > master will send. >>>>>> >>>>>> So, ideally we are better to defer the error until after the > slave has >>>>>> received a first query-event. It would be reasonable to stop by > the SQL >>>>>> slave only if the query deals with time_zone objects. >>>>>> >>>>>> I think you should not try to fix this rather complicated issue, > but >>>>>> rather to report a bug on this matter referring to this > analysis. >>>>>> >>>>>> >>>>>> -- Andrei Elkin, Software Developer Sun Database Group, Espoo, Finland, www.mysql.com Are you MySQL certified? www.mysql.com/certification
