From: Date: May 7 2008 2:49pm Subject: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524 List-Archive: http://lists.mysql.com/commits/46451 X-Bug: 36524 Message-Id: <200805071249.m47CnMRZ024154@mysql1000.dsl.inet.fi> Below is the list of changes that have just been committed into a local 5.1 repository of aelkin. When aelkin does a push these changes will be propagated to the main repository and, within 24 hours after the push, to the public repository. For information on how to access the public repository see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html ChangeSet@stripped, 2008-05-07 15:49:18+03:00, aelkin@stripped +3 -0 Bug #36524 incident of deadlock on slave is overdramatized In cases of temporary errors in replication event execution (deadlock, timeout) the error log gained an overreacting error message whereas just a warning would be fine. Fixed with checking of the temporary status of the error inside Slave_reporting_capability::report() to demote the error to the warning level if needed. The warning gets converted into the error whenever number of successive attempts to re-apply the event(s) gets equal to the global server variable slave_trans_retries. sql/rpl_reporting.cc@stripped, 2008-05-07 15:49:16+03:00, aelkin@stripped +16 -0 putting burden of decision whether to report the message as the error or as the warning to report() method. sql/slave.cc@stripped, 2008-05-07 15:49:16+03:00, aelkin@stripped +4 -1 exporting the function to be used in log_event.cc; promoting a temporary error to the fatal level so that has_temporary_error() won't return TRUE if is_fatal_error is ON. sql/slave.h@stripped, 2008-05-07 15:49:16+03:00, aelkin@stripped +1 -0 the function starts to be shared by log_event.cc as well. diff -Nrup a/sql/rpl_reporting.cc b/sql/rpl_reporting.cc --- a/sql/rpl_reporting.cc 2007-06-11 23:15:28 +03:00 +++ b/sql/rpl_reporting.cc 2008-05-07 15:49:16 +03:00 @@ -2,6 +2,20 @@ #include "mysql_priv.h" #include "rpl_reporting.h" +/** + The method composes a message for the error log and/or + for future show slave status exposure. + + @param level designate seriousity of the message. + @param err_code the user level error code + @param msg the format string of the message, + can be followed with more arguments + providing the data for the format string + + @note level can be corrected inside. E.g when the error + appears to be a temprorary the level is demoted + to WARNING_LEVEL + */ void Slave_reporting_capability::report(loglevel level, int err_code, const char *msg, ...) const @@ -13,6 +27,8 @@ Slave_reporting_capability::report(logle va_list args; va_start(args, msg); + if (level == ERROR_LEVEL && has_temporary_error(current_thd)) + level= WARNING_LEVEL; switch (level) { case ERROR_LEVEL: diff -Nrup a/sql/slave.cc b/sql/slave.cc --- a/sql/slave.cc 2008-03-31 11:55:42 +03:00 +++ b/sql/slave.cc 2008-05-07 15:49:16 +03:00 @@ -1780,7 +1780,7 @@ int check_expected_error(THD* thd, Relay that the error is temporary by pushing a warning with the error code ER_GET_TEMPORARY_ERRMSG, if the originating error is temporary. */ -static int has_temporary_error(THD *thd) +int has_temporary_error(THD *thd) { DBUG_ENTER("has_temporary_error"); @@ -2121,10 +2121,13 @@ static int exec_relay_log_event(THD* thd } } else + { sql_print_error("Slave SQL thread retried transaction %lu time(s) " "in vain, giving up. Consider raising the value of " "the slave_transaction_retries variable.", slave_trans_retries); + thd->is_fatal_error= 1; // to force has_temporary_error() return 0 + } } else if (exec_res && !temp_err || (opt_using_transactions && diff -Nrup a/sql/slave.h b/sql/slave.h --- a/sql/slave.h 2008-03-29 14:19:49 +02:00 +++ b/sql/slave.h 2008-05-07 15:49:16 +03:00 @@ -190,6 +190,7 @@ void set_slave_thread_default_charset(TH void rotate_relay_log(Master_info* mi); int apply_event_and_update_pos(Log_event* ev, THD* thd, Relay_log_info* rli, bool skip); +inline int has_temporary_error(THD *thd); pthread_handler_t handle_slave_io(void *arg); pthread_handler_t handle_slave_sql(void *arg);