From: Date: May 7 2008 9:32pm Subject: Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524 List-Archive: http://lists.mysql.com/commits/46480 Message-Id: <482203BE.2070208@mysql.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------090101060203070009030903" --------------090101060203070009030903 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Hi Andrei! Comments inline below. /Matz Andrei Elkin wrote: > Below is the list of changes that have just been committed into a local > 5.1 repository of aelkin. When aelkin does a push these changes > will be propagated to the main repository and, within 24 hours after the > push, to the public repository. > For information on how to access the public repository > see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html > > ChangeSet@stripped, 2008-05-07 15:49:18+03:00, aelkin@stripped +3 -0 > Bug #36524 incident of deadlock on slave is overdramatized > > In cases of temporary errors in replication event execution (deadlock, timeout) > the error log gained an overreacting error message whereas just a warning would be fine. > > Fixed with checking of the temporary status of the error > inside Slave_reporting_capability::report() to demote the error to the warning level if needed. > The warning gets converted into the error whenever number of successive attempts to re-apply > the event(s) gets equal to the global server variable slave_trans_retries. > > sql/rpl_reporting.cc@stripped, 2008-05-07 15:49:16+03:00, aelkin@stripped +16 -0 > putting burden of decision whether to report the message as the error or > as the warning to report() method. > > sql/slave.cc@stripped, 2008-05-07 15:49:16+03:00, aelkin@stripped +4 -1 > exporting the function to be used in log_event.cc; > promoting a temporary error to the fatal level so that > has_temporary_error() won't return TRUE if is_fatal_error is ON. > > sql/slave.h@stripped, 2008-05-07 15:49:16+03:00, aelkin@stripped +1 -0 > the function starts to be shared by log_event.cc as well. > > diff -Nrup a/sql/rpl_reporting.cc b/sql/rpl_reporting.cc > --- a/sql/rpl_reporting.cc 2007-06-11 23:15:28 +03:00 > +++ b/sql/rpl_reporting.cc 2008-05-07 15:49:16 +03:00 > @@ -2,6 +2,20 @@ > #include "mysql_priv.h" > #include "rpl_reporting.h" > > +/** > + The method composes a message for the error log and/or > + for future show slave status exposure. > + > + @param level designate seriousity of the message. > + @param err_code the user level error code > + @param msg the format string of the message, > + can be followed with more arguments > + providing the data for the format string > + > + @note level can be corrected inside. E.g when the error > + appears to be a temprorary the level is demoted > typo "temprorary" > + to WARNING_LEVEL > + */ > void > Slave_reporting_capability::report(loglevel level, int err_code, > const char *msg, ...) const > @@ -13,6 +27,8 @@ Slave_reporting_capability::report(logle > va_list args; > va_start(args, msg); > > + if (level == ERROR_LEVEL && has_temporary_error(current_thd)) > + level= WARNING_LEVEL; > Why do you put the logic here? Isn't the what log level should be used a decision to be made by the caller? This seems to put to many features into the function, which previously had the role of just reporting an error at a certain level. Why is this a problem? This will make all errors reported through this interface be demoted provided the current thread has a temporary error, regardless of what is actually being reported. This can cause two strange situations, to a naïve user of the reporting function: 1. A different error than the error present for the current thread is passed to the reporting function. This can be a decision on the callers behalf to regard a certain temporary error as a fatal error, but the reporting function will demote this error to a warning level just because the caller has "forgotten" to set the "fatal error" flag. 2. It is necessary to add code to prevent the error message to be repeated ad-nauseum, which means that part of the logic for deciding if this should be a warning is in the caller and part is inside the reporting function. This split just introduces an unnecessary coupling between the caller and the callee. I would like you to pass in the correct log-level to the reporting function instead, which will also keep the has_temporary_error() function inside the file and not exposing it outwards. > switch (level) > { > case ERROR_LEVEL: > diff -Nrup a/sql/slave.cc b/sql/slave.cc > --- a/sql/slave.cc 2008-03-31 11:55:42 +03:00 > +++ b/sql/slave.cc 2008-05-07 15:49:16 +03:00 > @@ -1780,7 +1780,7 @@ int check_expected_error(THD* thd, Relay > that the error is temporary by pushing a warning with the error code > ER_GET_TEMPORARY_ERRMSG, if the originating error is temporary. > */ > -static int has_temporary_error(THD *thd) > +int has_temporary_error(THD *thd) > { > DBUG_ENTER("has_temporary_error"); > > @@ -2121,10 +2121,13 @@ static int exec_relay_log_event(THD* thd > } > } > else > + { > sql_print_error("Slave SQL thread retried transaction %lu time(s) " > "in vain, giving up. Consider raising the value of " > "the slave_transaction_retries variable.", > slave_trans_retries); > + thd->is_fatal_error= 1; // to force has_temporary_error() return 0 > + } > } > else if (exec_res && !temp_err || > (opt_using_transactions && > diff -Nrup a/sql/slave.h b/sql/slave.h > --- a/sql/slave.h 2008-03-29 14:19:49 +02:00 > +++ b/sql/slave.h 2008-05-07 15:49:16 +03:00 > @@ -190,6 +190,7 @@ void set_slave_thread_default_charset(TH > void rotate_relay_log(Master_info* mi); > int apply_event_and_update_pos(Log_event* ev, THD* thd, Relay_log_info* rli, > bool skip); > +inline int has_temporary_error(THD *thd); > It is pointless to *declare* a function inline when it doesn't have a definition. Remove the keyword, it just adds confusion. > > pthread_handler_t handle_slave_io(void *arg); > pthread_handler_t handle_slave_sql(void *arg); > > -- Mats Kindahl Lead Software Developer Replication Team MySQL AB, www.mysql.com --------------090101060203070009030903--