From:
Date: December 13 2006 3:21pm
Subject: bk commit into 5.0 tree (aelkin:1.2347) BUG#20435
List-Archive: http://lists.mysql.com/commits/16882
X-Bug: 20435
Message-Id: <200612131421.kBDELLMo004049@dsl-hkibras-fe30f900-107.dhcp.inet.fi>

Below is the list of changes that have just been committed into a local
5.0 repository of elkin. When elkin does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2006-12-13 16:21:14+02:00, aelkin@stripped +1 -0
  Bug #20435 Relay logs are rotated at slave_net_timeout when there's no activity

  Rotate events were generated locally on the slave after it reconnected to
  the master when slave_net_timeout expired with no events arriving from the
  master. That is how failure detection for the master was originally
  implemented.

  Leaving aside the failure detection algorithm (the first patch tries to
  solve the rotation problem from that perspective), we refine the behaviour
  on the slave's side: relay log files are not rotated when the master has
  not rotated its own binlog, that is, when the rotate event it sends carries
  the same binlog position the slave already knows.

  This remains valid even if the master was stopped and downgraded. After
  reconnecting, the slave would first receive the rotate and FD (format
  description) events and the remaining events of the last binlog it was
  interrupted while reading, and only after that the rotate and FD events of
  the new binlog in the downgraded format version.

  If the slave reconnects to a master that stayed online the whole time and
  receives a rotate event with the same position it already knows, the rotate
  event is discarded: relay log files remain untouched and the event is not
  put into the current log. The latter applies to reconnecting after
  slave_net_timeout, which fixes the bug.

  sql/slave.cc@stripped, 2006-12-13 16:21:12+02:00, aelkin@stripped +41 -4
    do not call process_io_rotate if master is sending `fake' reconnecting
    rotate event. Effective for all 3 binlog versions.

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	aelkin
# Host:	dsl-hkibras-fe30f900-107.dhcp.inet.fi
# Root:	/home/elkin/MySQL/TEAM/FIXES/5.0/bug20435_relay_rot_reconn_fix2

--- 1.286/sql/slave.cc	2006-12-13 16:21:21 +02:00
+++ 1.287/sql/slave.cc	2006-12-13 16:21:21 +02:00
@@ -4234,7 +4234,17 @@ static int queue_binlog_ver_1_event(MAST
     inc_pos= event_len;
     break;
   case ROTATE_EVENT:
-    if (unlikely(process_io_rotate(mi,(Rotate_log_event*)ev)))
+    Rotate_log_event *rev= (Rotate_log_event*) ev;
+    ignore_event= /* if master reported the current pos */
+      (strcmp(mi->master_log_name, rev->new_log_ident) == 0) &&
+      (mi->master_log_pos == rev->pos);
+    if (ignore_event)
+    {
+      delete ev;
+      pthread_mutex_unlock(&mi->data_lock);
+      DBUG_RETURN(0);
+    }
+    if (unlikely(process_io_rotate(mi, rev)))
     {
       delete ev;
       pthread_mutex_unlock(&mi->data_lock);
@@ -4318,7 +4328,17 @@ static int queue_binlog_ver_3_event(MAST
   case STOP_EVENT:
     goto err;
   case ROTATE_EVENT:
-    if (unlikely(process_io_rotate(mi,(Rotate_log_event*)ev)))
+    Rotate_log_event *rev= (Rotate_log_event*) ev;
+    bool ignore_event= /* if master reported the current pos */
+      (strcmp(mi->master_log_name, rev->new_log_ident) == 0) &&
+      (mi->master_log_pos == rev->pos);
+    if (ignore_event)
+    {
+      delete ev;
+      pthread_mutex_unlock(&mi->data_lock);
+      DBUG_RETURN(0);
+    }
+    if (unlikely(process_io_rotate(mi, rev)))
     {
       delete ev;
       pthread_mutex_unlock(&mi->data_lock);
@@ -4414,8 +4434,25 @@ int queue_event(MASTER_INFO* mi,const ch
     goto err;
   case ROTATE_EVENT:
   {
-    Rotate_log_event rev(buf,event_len,mi->rli.relay_log.description_event_for_queue);
-    if (unlikely(process_io_rotate(mi,&rev)))
+    Rotate_log_event
+      rev(buf, event_len, mi->rli.relay_log.description_event_for_queue);
+    bool ignore_event= /* if master reported the current pos */
+      (strcmp(mi->master_log_name, rev.new_log_ident) == 0) &&
+      (mi->master_log_pos == rev.pos);
+    if (ignore_event)
+    {
+      /*
+        master's not moved its binlog position since last event slave
+        received. such `fake' rotate event caused by reconnection
+        does not require: rotating relay logs, event queueing, nor
+        advancing mi->master_log_pos.
+      */
+      error= 0;
+      pthread_mutex_unlock(&mi->data_lock);
+      DBUG_RETURN(0);
+    }
+
+    if (unlikely(process_io_rotate(mi, &rev)))
     {
       error= 1;
       goto err;
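
All three hunks repeat the same test: a rotate event is treated as `fake' when the binlog name and position it carries equal what the slave has already recorded. The sketch below isolates that comparison so it can be read and exercised on its own. It is not part of the patch: MasterInfoSketch, RotateEventSketch and is_fake_rotate are hypothetical names introduced only for illustration, and the two structs model nothing beyond the fields the diff reads (master_log_name, master_log_pos, new_log_ident, pos).

```cpp
#include <cstring>

// Hypothetical stand-in for the MASTER_INFO fields the patch reads.
struct MasterInfoSketch {
  char master_log_name[512];          // binlog file the slave last read from
  unsigned long long master_log_pos;  // position in that file the slave has reached
};

// Hypothetical stand-in for the Rotate_log_event fields the patch reads.
struct RotateEventSketch {
  const char *new_log_ident;  // binlog file named by the rotate event
  unsigned long long pos;     // position named by the rotate event
};

// Returns true when the rotate event reports exactly the coordinates the
// slave already holds, i.e. the master has not moved since the last event.
// This mirrors the ignore_event expression in the diff, nothing more.
static bool is_fake_rotate(const MasterInfoSketch &mi,
                           const RotateEventSketch &rev) {
  return std::strcmp(mi.master_log_name, rev.new_log_ident) == 0 &&
         mi.master_log_pos == rev.pos;
}
```

Under this reading, a rotate event naming a different file, or the same file at a different position, is still passed to process_io_rotate as before; only an exact match on both fields is dropped.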