| List: | Commits | « Previous MessageNext Message » | |
| From: | alfranio correia | Date: | October 31 2008 10:27am |
| Subject: | Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705) Bug#40337 | ||
| View as plain text | |||
alfranio correia wrote: > Andrei Elkin wrote: > >> Alfranio, hello. >> >> >> >>> #At > file:///home/acorreia/workspace.sun/playground.mysql/mysql-5.0.68-custom_build/ >>> >>> 2705 Alfranio Correia 2008-10-29 >>> Introduced a recovery mechanism that throws away relay-log.bin* > retrieved >>> from a master in case of a crash and updates the master.info based on >>> the relay.info. This is done because such files may be corrupted after >>> a crash. We are assuming that relay.info is fsynced per transaction. >>> See BUG#40337. >>> >>> >> As I confessed the suggest fix of 1-3 items is principally okay with >> me except some details. >> And here they are. >> >> This part of bug#40337 looks as the most difficult. >> >> I have not found in your patch that there is a detection of a crash. >> >> > There is a need for that. I agree with you. > I can do that in fact I would like to do that but it depends on Lars and > his priorities. > This task can take time and I cannot measure how many hours :( > >> I am in strong opinion that we really need one prior to call your new >> function. >> >> > You are 100% right. > >> Otherwise, as we talked, there is a chance to loose of lot of data the >> IO thread received but the SQL thread has not processed. >> Notice, that the SQL slave could be simply offline. >> As I see you are trying to force the user to provide crash evidence >> through the new option relay_log_recovery. >> > This is not true because that function is only called right after the > startup > and as such if you stop and start the replication threads nothing > happens regarding > the recovery. > > >> Would not it be much >> cheaper to write a stand-alone script to fix the coordinates in the >> master.info? >> >> > Considering only the current status of the patch I would say that > a script could do the same. But recall that the procedure should be > automatic and in the feature do all the things that we have been > discussing. > > >> An approach that I can see we need to follow is >> 1. to find out evidence of a crash, >> 2. next to find out the last event (if any) executed from that relay log, >> 3. and finally to set up the binlog retrieval coordinates for the master to be >> the next event after it. >> >> If there is no evidence of a crash, no need to mess around with >> master, relay-log.info. >> If no event was executed from the last relay log then the binlog >> retrieval coordinates can be set to the next event after the last executed >> in the case the last executed is in the proceeding relay >> log. Otherwise, if there are some more unprocessed relay log we will >> have to find out the last event in the last properly closed relay log. >> >> >> Evidence of crash can be found out through a special flag bit >> (check out LOG_EVENT_BINLOG_IN_USE_F) that is stored at the log closing time. >> Alas, it works only for the binlog but not for the relay log. >> Still, this technique can be rather easily adopted to the relay log. >> and that's what I see we really need to implement. >> >> So I am offering now to implement pp. 1-3 where p.1 implies extending >> the relay log open/close operations with the binlog-like flagging. >> >> > Great Andrei !!! > But don't forget that we also need to read from the last relay log > information on > the master_log_pos and master_log_name ;) > > Andrei, one last comment. When I came up with this recovery idea I thought that would be great to have a mark written periodically to disk as a checkpoint. The mark at the end of the file is a great improvement when compared to what we have know, however, we should do better and target a periodic mark. The challenge in this case consists in ignoring from one point of the file until the end. Is this feasible? What do you think about this idea? >> regards, >> >> Andrei >> >> >> >>> modified: >>> mysql-test/r/rpl_flush_log_loop.result >>> sql/mysql_priv.h >>> sql/mysqld.cc >>> sql/set_var.cc >>> sql/slave.cc >>> >>> per-file messages: >>> mysql-test/r/rpl_flush_log_loop.result >>> Fixed a test case broken after creating the patch. >>> sql/mysql_priv.h >>> Created parameter --relay-log-recovery, default = 0 thus keeping the > original behavior. >>> sql/mysqld.cc >>> Created parameter --relay-log-recovery, default = 0 thus keeping the > original behavior. >>> sql/set_var.cc >>> Created parameter --relay-log-recovery, default = 0 thus keeping the > original behavior. >>> sql/slave.cc >>> Provided a recovery mechanism for the master.info and relay-log.bin* > based on the relay.info. >>> === modified file 'mysql-test/r/rpl_flush_log_loop.result' >>> --- a/mysql-test/r/rpl_flush_log_loop.result 2007-10-10 07:21:11 +0000 >>> +++ b/mysql-test/r/rpl_flush_log_loop.result 2008-10-29 16:37:16 +0000 >>> @@ -10,6 +10,7 @@ relay_log MYSQLTEST_VARDIR/master-data/r >>> relay_log_index >>> relay_log_info_file relay-log.info >>> relay_log_purge ON >>> +relay_log_recovery OFF >>> relay_log_space_limit 0 >>> stop slave; >>> change master to master_host='127.0.0.1',master_user='root', >>> >>> === modified file 'sql/mysql_priv.h' >>> --- a/sql/mysql_priv.h 2008-10-29 12:27:59 +0000 >>> +++ b/sql/mysql_priv.h 2008-10-29 16:37:16 +0000 >>> @@ -1314,7 +1314,7 @@ extern ulong expire_logs_days; >>> extern uint sync_binlog_period, sync_relaylog_period; >>> extern ulong opt_tc_log_size, tc_log_max_pages_used, tc_log_page_size; >>> extern ulong tc_log_page_waits; >>> -extern my_bool relay_log_purge, opt_innodb_safe_binlog, opt_innodb; >>> +extern my_bool relay_log_purge, relay_log_recovery, opt_innodb_safe_binlog, > opt_innodb; >>> extern uint test_flags,select_errors,ha_open_options; >>> extern uint protocol_version, mysqld_port, dropping_tables; >>> extern uint delay_key_write_options, lower_case_table_names; >>> >>> === modified file 'sql/mysqld.cc' >>> --- a/sql/mysqld.cc 2008-10-29 12:27:59 +0000 >>> +++ b/sql/mysqld.cc 2008-10-29 16:37:16 +0000 >>> @@ -404,7 +404,7 @@ ulong opt_ndb_cache_check_time; >>> const char *opt_ndb_mgmd; >>> ulong opt_ndb_nodeid; >>> #endif >>> -my_bool opt_readonly, use_temp_pool, relay_log_purge; >>> +my_bool opt_readonly, use_temp_pool, relay_log_purge, relay_log_recovery; >>> my_bool opt_sync_frm, opt_allow_suspicious_udfs; >>> my_bool opt_secure_auth= 0; >>> char* opt_secure_file_priv= 0; >>> @@ -4919,6 +4919,7 @@ enum options_mysqld >>> OPT_QUERY_CACHE_TYPE, OPT_QUERY_CACHE_WLOCK_INVALIDATE, > OPT_RECORD_BUFFER, >>> OPT_RECORD_RND_BUFFER, OPT_DIV_PRECINCREMENT, OPT_RELAY_LOG_SPACE_LIMIT, >>> OPT_RELAY_LOG_PURGE, >>> + OPT_RELAY_LOG_RECOVERY, >>> OPT_SLAVE_NET_TIMEOUT, OPT_SLAVE_COMPRESSED_PROTOCOL, > OPT_SLOW_LAUNCH_TIME, >>> OPT_SLAVE_TRANS_RETRIES, OPT_READONLY, OPT_DEBUGGING, >>> OPT_SORT_BUFFER, OPT_TABLE_CACHE, >>> @@ -6304,6 +6305,11 @@ The minimum value for this variable is 4 >>> (gptr*) &relay_log_purge, >>> (gptr*) &relay_log_purge, 0, GET_BOOL, NO_ARG, >>> 1, 0, 1, 0, 1, 0}, >>> + {"relay_log_recovery", OPT_RELAY_LOG_RECOVERY, >>> + "Enables automatic relay log recovery right after the database startup, > which means that the IO Thread starts re-fetching from the master right after the last > transaction processed. Use this parameter together with the sync relay log.", >>> + (gptr*) &relay_log_recovery, >>> + (gptr*) &relay_log_recovery, 0, GET_BOOL, NO_ARG, >>> + 0, 0, 1, 0, 1, 0}, >>> {"relay_log_space_limit", OPT_RELAY_LOG_SPACE_LIMIT, >>> "Maximum space to use for all relay logs.", >>> (gptr*) &relay_log_space_limit, >>> >>> === modified file 'sql/set_var.cc' >>> --- a/sql/set_var.cc 2008-10-29 12:27:59 +0000 >>> +++ b/sql/set_var.cc 2008-10-29 16:37:16 +0000 >>> @@ -323,6 +323,8 @@ sys_var_thd_ulong sys_div_precincrement( >>> #ifdef HAVE_REPLICATION >>> sys_var_bool_ptr sys_relay_log_purge("relay_log_purge", >>> &relay_log_purge); >>> +sys_var_bool_ptr sys_relay_log_recovery("relay_log_recovery", >>> + &relay_log_recovery); >>> #endif >>> sys_var_long_ptr sys_rpl_recovery_rank("rpl_recovery_rank", >>> &rpl_recovery_rank); >>> @@ -734,6 +736,7 @@ sys_var *sys_variables[]= >>> &sys_read_rnd_buff_size, >>> #ifdef HAVE_REPLICATION >>> &sys_relay_log_purge, >>> + &sys_relay_log_recovery, >>> #endif >>> &sys_rpl_recovery_rank, >>> &sys_safe_updates, >>> @@ -1054,6 +1057,7 @@ struct show_var_st init_vars[]= { >>> {"relay_log_index", (char*) &opt_relaylog_index_name, SHOW_CHAR_PTR}, >>> {"relay_log_info_file", (char*) &relay_log_info_file, SHOW_CHAR_PTR}, >>> {sys_relay_log_purge.name, (char*) &sys_relay_log_purge, > SHOW_SYS}, >>> + {sys_relay_log_recovery.name, (char*) &sys_relay_log_recovery, > SHOW_SYS}, >>> {"relay_log_space_limit", (char*) &relay_log_space_limit, > SHOW_LONGLONG}, >>> #endif >>> {sys_rpl_recovery_rank.name,(char*) &sys_rpl_recovery_rank, > SHOW_SYS}, >>> >>> === modified file 'sql/slave.cc' >>> --- a/sql/slave.cc 2008-10-29 12:27:59 +0000 >>> +++ b/sql/slave.cc 2008-10-29 16:37:16 +0000 >>> @@ -67,6 +67,7 @@ static inline bool io_slave_killed(THD* >>> static inline bool sql_slave_killed(THD* thd,RELAY_LOG_INFO* rli); >>> static int count_relay_log_space(RELAY_LOG_INFO* rli); >>> static int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type); >>> +static int init_recovery(MASTER_INFO* mi); >>> static int safe_connect(THD* thd, MYSQL* mysql, MASTER_INFO* mi); >>> static int safe_reconnect(THD* thd, MYSQL* mysql, MASTER_INFO* mi, >>> bool suppress_warnings); >>> @@ -174,6 +175,9 @@ int init_slave() >>> if (server_id && !master_host && active_mi->host[0]) >>> master_host= active_mi->host; >>> >>> + if (init_recovery(active_mi)) >>> + goto err; >>> + >>> /* If server id is not set, start_slave_thread() will say it */ >>> >>> if (master_host && !opt_skip_slave_start) >>> @@ -197,6 +201,73 @@ err: >>> DBUG_RETURN(1); >>> } >>> >>> +/* >>> +* Updates the master info based on the information stored in the relay info > and >>> +* ignores relay logs previously retrieved by the IO thread starting fetching > again >>> +* based on to the group_master_log_pos and group_master_log_name. >>> +* >>> +* There is no need for a mutex >>> +* as the caller (i.e. init_slave) already has one acquired. >>> +* >>> +* The recovery is done just after the database startup which means that >>> +* start slave and stop slave commands do not throw away any relay log >>> +* retrieved in the meantime. >>> +* >>> +* In the feature, we can improve this routine in order to avoid throwing >>> +* away logs that are safely stored in the disk. Eventually, the old >>> +* relay logs will be purged by the normal purge mechanism. >>> +* >>> +* Specifically, >>> +* >>> +* 1 - mi->master_log_pos <-- rli->group_master_log_pos >>> +* 2 - mi->master_log_name <-- rli->group_master_log_name >>> +* 3 - It moves the relay info to the new relay log file, by >>> +* rli->group_relay_log_pos <-- rli->group_relay_log_pos <-- > BIN_LOG_HEADER_SIZE; >>> +* rli->group_relay_log_name, rli->group_relay_log_pos >>> +* >>> +* If there is an error, it returns (1), otherwise returns (0). >>> +*/ >>> +static int init_recovery(MASTER_INFO* mi) >>> +{ >>> + const char* errmsg=0; >>> + DBUG_ENTER("init_recovery"); >>> + >>> + RELAY_LOG_INFO* rli = &mi->rli; >>> + if (relay_log_recovery && rli->group_master_log_name[0]) >>> + { >>> + mi->master_log_pos = max(BIN_LOG_HEADER_SIZE, >>> + rli->group_master_log_pos); >>> + strmake(mi->master_log_name, rli->group_master_log_name, >>> + sizeof(mi->master_log_name)-1); >>> + >>> + strmake(rli->group_relay_log_name, > rli->relay_log.get_log_fname(), >>> + sizeof(rli->group_relay_log_name)-1); >>> + strmake(rli->event_relay_log_name, > rli->relay_log.get_log_fname(), >>> + sizeof(mi->rli.event_relay_log_name)-1); >>> + >>> + rli->group_relay_log_pos = rli->group_relay_log_pos = > BIN_LOG_HEADER_SIZE; >>> + >>> + if (init_relay_log_pos(rli, >>> + rli->group_relay_log_name, >>> + rli->group_relay_log_pos, >>> + 0 /*no data lock*/, >>> + &errmsg, 0)) >>> + DBUG_RETURN(1); >>> + >>> + if (flush_master_info(mi, 0)) >>> + { >>> + sql_print_error("Failed to flush master info file"); >>> + DBUG_RETURN(1); >>> + } >>> + if (flush_relay_log_info(rli)) >>> + { >>> + sql_print_error("Failed to flush master info file"); >>> + DBUG_RETURN(1); >>> + } >>> + } >>> + >>> + DBUG_RETURN(0); >>> +} >>> >>> static void free_table_ent(TABLE_RULE_ENT* e) >>> { >>> >>> >>> -- >>> MySQL Code Commits Mailing List >>> For list archives: http://lists.mysql.com/commits >>> To unsubscribe: http://lists.mysql.com/commits?unsub=1 >>> >>> >> >> > > > Cheers.
