Alfranio, salut!
> Andrei Elkin wrote:
>> Alfranio, hello.
>>
>>
>>> #At file:///home/acorreia/workspace.sun/playground.mysql/mysql-5.0.68-custom_build/
>>>
>>> 2705 Alfranio Correia 2008-10-29
>>> Introduced a recovery mechanism that throws away relay-log.bin* retrieved
>>> from a master in case of a crash and updates the master.info based on
>>> the relay.info. This is done because such files may be corrupted after
>>> a crash. We are assuming that relay.info is fsynced per transaction.
>>> See BUG#40337.
>>>
>>
>> As I said, the suggested fix for items 1-3 is okay with me in
>> principle, except for some details.
>> And here they are.
>>
>> This part of bug#40337 looks like the most difficult one.
>>
>> I have not found any crash detection in your patch.
>>
> There is a need for that. I agree with you.
> I can do that; in fact I would like to, but it depends on Lars and
> his priorities.
Sure, please check this part with Lars. Common sense tells me
the detection should go first and the recovery next.
> This task can take time and I cannot measure how many hours :(
I can't estimate your time either. It might be slow at the beginning, but
the head fears and the hands do, as we say in Russian.
A trivial piece of advice: don't try to figure everything out on your
own. Ask questions early instead; that will save time for both of us.
>> I am strongly of the opinion that we really need one before calling
>> your new function.
>>
> You are 100% right.
>> Otherwise, as we discussed, there is a chance of losing a lot of data that
>> the IO thread received but the SQL thread has not processed.
>> Note that the SQL slave could simply be offline.
>> As I see you are trying to force the user to provide crash evidence
>> through the new option relay_log_recovery.
> This is not true because that function is only called right after the
> startup and as such if you stop and start the replication threads nothing
> happens regarding the recovery.
Well, I meant that the user is responsible for setting relay_log_recovery
to TRUE, and by doing so he provides the crash evidence.
Certainly we need to make the detection automatic. The relay_log_recovery
parameter then makes no sense as an on/off switch, because we are going
to detect a crash always. The policy for what to do next can vary,
though, so relay_log_recovery should be converted to select that policy,
with the default being to only print a warning that the relay log was
not closed due to a crash.
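To make the policy idea concrete, here is a minimal sketch. It assumes the
crash detection already yields a boolean; the enum, its values and the
function name are hypothetical and appear nowhere in the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical policy values that relay_log_recovery could select.
enum class RelayRecoveryPolicy { WarnOnly, RefetchFromMaster };

// Returns the action to take at startup as text, so it is easy to inspect.
std::string on_startup(bool crash_detected, RelayRecoveryPolicy policy)
{
  if (!crash_detected)
    return "none";     // clean shutdown: leave master.info and relay-log.info alone
  if (policy == RelayRecoveryPolicy::WarnOnly)
    return "warn";     // default: only report that the relay log was not closed
  return "refetch";    // discard the relay logs and re-fetch from the master
}
```

With WarnOnly as the default, a crash never silently throws relay logs away;
the user has to ask for the re-fetch explicitly.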
>
>> Would not it be much
>> cheaper to write a stand-alone script to fix the coordinates in the
>> master.info?
>>
> Considering only the current status of the patch I would say that
> a script could do the same. But recall that the procedure should be
> automatic and in the future do all the things that we have been
> discussing.
>
My question was rather rhetorical.
>> An approach that I can see we need to follow is
>> 1. to find out evidence of a crash,
>> 2. next to find out the last event (if any) executed from that relay log,
>> 3. and finally to set up the binlog retrieval coordinates for the master to be
>> the next event after it.
>>
>> If there is no evidence of a crash, no need to mess around with
>> master, relay-log.info.
>> If no event was executed from the last relay log then the binlog
>> retrieval coordinates can be set to the next event after the last executed
>> one, provided the last executed event is in the preceding relay
>> log. Otherwise, if there are more unprocessed relay logs, we will
>> have to find out the last event in the last properly closed relay log.
>>
>>
>> Evidence of a crash can be found through a special flag bit
>> (check out LOG_EVENT_BINLOG_IN_USE_F) that is stored at the log closing time.
>> Alas, it works only for the binlog but not for the relay log.
>> Still, this technique can be rather easily adapted to the relay log,
>> and that's what I see we really need to implement.
>>
>> So I am offering now to implement pp. 1-3 where p.1 implies extending
>> the relay log open/close operations with the binlog-like flagging.
>>
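A rough sketch of the flagging in p.1, on a toy in-memory log rather than
the real relay log code. It assumes the flag is set while the log is open
and cleared on a clean close, so a flag still set at startup is the crash
evidence. All names here are hypothetical; only the idea is borrowed from
LOG_EVENT_BINLOG_IN_USE_F.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bit, standing in for the binlog-like in-use flag.
const std::uint16_t TOY_LOG_IN_USE_F = 0x1;

// Toy stand-in for the flags kept in a relay log header.
struct ToyRelayLog {
  std::uint16_t header_flags;
};

// Opening the log marks it as in use.
void toy_open(ToyRelayLog &log)
{
  log.header_flags = static_cast<std::uint16_t>(log.header_flags | TOY_LOG_IN_USE_F);
}

// A clean close clears the mark; a crash never reaches this point.
void toy_close(ToyRelayLog &log)
{
  log.header_flags = static_cast<std::uint16_t>(log.header_flags & ~TOY_LOG_IN_USE_F);
}

// At startup: a log still marked as in use was not closed properly.
bool toy_was_not_closed(const ToyRelayLog &log)
{
  return (log.header_flags & TOY_LOG_IN_USE_F) != 0;
}
```

Only when toy_was_not_closed() is true for the last relay log would pp. 2
and 3 run; otherwise master.info and relay-log.info stay untouched.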
> Great Andrei !!!
> But don't forget that we also need to read from the last relay log
> information on the master_log_pos and master_log_name ;)
That's what you asked how to do yesterday, right?
Some facts that might be helpful for implementing this part.
1. the master binlog file name can be extracted from the first rotate
event that follows the very first event (FormatDescription) in the
relay log.
2. the master binlog pos can be extracted from the next-to-last
event in the relay log (the last one being a Rotate event).
However, we can try to extract both the file and the pos from the *last*
non-closed relay log. Only if that log happens not to contain the
first Rotate (very improbable) do we need to perform pp. 1 and 2 for the
previous relay log.
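The two facts above can be sketched on a toy event list. The types and the
function below are hypothetical and are not the server's Log_event classes.
The sketch also assumes that a non-closed log has no trailing Rotate yet, so
its last event carries the most recent master position.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy event model, for illustration only.
enum class ToyType { FormatDescription, Rotate, Other };

struct ToyEvent {
  ToyType type;
  std::string rotate_target;      // Rotate only: master binlog file name
  unsigned long long master_pos;  // master binlog position carried by the event
};

struct MasterCoords {
  std::string log_name;
  unsigned long long pos;
};

// Tries to read the master coordinates from one non-closed relay log.
// Returns false when the leading FormatDescription + Rotate pair is missing,
// in which case the caller would fall back to the previous relay log.
bool coords_from_unclosed_log(const std::vector<ToyEvent> &events,
                              MasterCoords *out)
{
  assert(out != nullptr);
  if (events.size() < 2 ||
      events[0].type != ToyType::FormatDescription ||
      events[1].type != ToyType::Rotate)
    return false;
  out->log_name = events[1].rotate_target;  // fact 1: name from the first Rotate
  out->pos = events.back().master_pos;      // assumption: last event of an unclosed log
  return true;
}
```

For a properly closed log the position would come from the next-to-last
event instead, since the last one is the closing Rotate.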
cheers,
Andrei
>
>> regards,
>>
>> Andrei
>>
>>
>>> modified:
>>> mysql-test/r/rpl_flush_log_loop.result
>>> sql/mysql_priv.h
>>> sql/mysqld.cc
>>> sql/set_var.cc
>>> sql/slave.cc
>>>
>>> per-file messages:
>>> mysql-test/r/rpl_flush_log_loop.result
>>> Fixed a test case broken after creating the patch.
>>> sql/mysql_priv.h
>>> Created parameter --relay-log-recovery, default = 0 thus keeping the original behavior.
>>> sql/mysqld.cc
>>> Created parameter --relay-log-recovery, default = 0 thus keeping the original behavior.
>>> sql/set_var.cc
>>> Created parameter --relay-log-recovery, default = 0 thus keeping the original behavior.
>>> sql/slave.cc
>>> Provided a recovery mechanism for the master.info and relay-log.bin* based on the relay.info.
>>> === modified file 'mysql-test/r/rpl_flush_log_loop.result'
>>> --- a/mysql-test/r/rpl_flush_log_loop.result 2007-10-10 07:21:11 +0000
>>> +++ b/mysql-test/r/rpl_flush_log_loop.result 2008-10-29 16:37:16 +0000
>>> @@ -10,6 +10,7 @@ relay_log MYSQLTEST_VARDIR/master-data/r
>>> relay_log_index
>>> relay_log_info_file relay-log.info
>>> relay_log_purge ON
>>> +relay_log_recovery OFF
>>> relay_log_space_limit 0
>>> stop slave;
>>> change master to master_host='127.0.0.1',master_user='root',
>>>
>>> === modified file 'sql/mysql_priv.h'
>>> --- a/sql/mysql_priv.h 2008-10-29 12:27:59 +0000
>>> +++ b/sql/mysql_priv.h 2008-10-29 16:37:16 +0000
>>> @@ -1314,7 +1314,7 @@ extern ulong expire_logs_days;
>>> extern uint sync_binlog_period, sync_relaylog_period;
>>> extern ulong opt_tc_log_size, tc_log_max_pages_used, tc_log_page_size;
>>> extern ulong tc_log_page_waits;
>>> -extern my_bool relay_log_purge, opt_innodb_safe_binlog, opt_innodb;
>>> +extern my_bool relay_log_purge, relay_log_recovery, opt_innodb_safe_binlog, opt_innodb;
>>> extern uint test_flags,select_errors,ha_open_options;
>>> extern uint protocol_version, mysqld_port, dropping_tables;
>>> extern uint delay_key_write_options, lower_case_table_names;
>>>
>>> === modified file 'sql/mysqld.cc'
>>> --- a/sql/mysqld.cc 2008-10-29 12:27:59 +0000
>>> +++ b/sql/mysqld.cc 2008-10-29 16:37:16 +0000
>>> @@ -404,7 +404,7 @@ ulong opt_ndb_cache_check_time;
>>> const char *opt_ndb_mgmd;
>>> ulong opt_ndb_nodeid;
>>> #endif
>>> -my_bool opt_readonly, use_temp_pool, relay_log_purge;
>>> +my_bool opt_readonly, use_temp_pool, relay_log_purge, relay_log_recovery;
>>> my_bool opt_sync_frm, opt_allow_suspicious_udfs;
>>> my_bool opt_secure_auth= 0;
>>> char* opt_secure_file_priv= 0;
>>> @@ -4919,6 +4919,7 @@ enum options_mysqld
>>> OPT_QUERY_CACHE_TYPE, OPT_QUERY_CACHE_WLOCK_INVALIDATE, OPT_RECORD_BUFFER,
>>> OPT_RECORD_RND_BUFFER, OPT_DIV_PRECINCREMENT, OPT_RELAY_LOG_SPACE_LIMIT,
>>> OPT_RELAY_LOG_PURGE,
>>> + OPT_RELAY_LOG_RECOVERY,
>>> OPT_SLAVE_NET_TIMEOUT, OPT_SLAVE_COMPRESSED_PROTOCOL, OPT_SLOW_LAUNCH_TIME,
>>> OPT_SLAVE_TRANS_RETRIES, OPT_READONLY, OPT_DEBUGGING,
>>> OPT_SORT_BUFFER, OPT_TABLE_CACHE,
>>> @@ -6304,6 +6305,11 @@ The minimum value for this variable is 4
>>> (gptr*) &relay_log_purge,
>>> (gptr*) &relay_log_purge, 0, GET_BOOL, NO_ARG,
>>> 1, 0, 1, 0, 1, 0},
>>> + {"relay_log_recovery", OPT_RELAY_LOG_RECOVERY,
>>> + "Enables automatic relay log recovery right after the database startup, which means that the IO Thread starts re-fetching from the master right after the last transaction processed. Use this parameter together with the sync relay log.",
>>> + (gptr*) &relay_log_recovery,
>>> + (gptr*) &relay_log_recovery, 0, GET_BOOL, NO_ARG,
>>> + 0, 0, 1, 0, 1, 0},
>>> {"relay_log_space_limit", OPT_RELAY_LOG_SPACE_LIMIT,
>>> "Maximum space to use for all relay logs.",
>>> (gptr*) &relay_log_space_limit,
>>>
>>> === modified file 'sql/set_var.cc'
>>> --- a/sql/set_var.cc 2008-10-29 12:27:59 +0000
>>> +++ b/sql/set_var.cc 2008-10-29 16:37:16 +0000
>>> @@ -323,6 +323,8 @@ sys_var_thd_ulong sys_div_precincrement(
>>> #ifdef HAVE_REPLICATION
>>> sys_var_bool_ptr sys_relay_log_purge("relay_log_purge",
>>> &relay_log_purge);
>>> +sys_var_bool_ptr sys_relay_log_recovery("relay_log_recovery",
>>> + &relay_log_recovery);
>>> #endif
>>> sys_var_long_ptr sys_rpl_recovery_rank("rpl_recovery_rank",
>>> &rpl_recovery_rank);
>>> @@ -734,6 +736,7 @@ sys_var *sys_variables[]=
>>> &sys_read_rnd_buff_size,
>>> #ifdef HAVE_REPLICATION
>>> &sys_relay_log_purge,
>>> + &sys_relay_log_recovery,
>>> #endif
>>> &sys_rpl_recovery_rank,
>>> &sys_safe_updates,
>>> @@ -1054,6 +1057,7 @@ struct show_var_st init_vars[]= {
>>> {"relay_log_index", (char*) &opt_relaylog_index_name, SHOW_CHAR_PTR},
>>> {"relay_log_info_file", (char*) &relay_log_info_file, SHOW_CHAR_PTR},
>>> {sys_relay_log_purge.name, (char*) &sys_relay_log_purge, SHOW_SYS},
>>> + {sys_relay_log_recovery.name, (char*) &sys_relay_log_recovery, SHOW_SYS},
>>> {"relay_log_space_limit", (char*) &relay_log_space_limit, SHOW_LONGLONG},
>>> #endif
>>> {sys_rpl_recovery_rank.name,(char*) &sys_rpl_recovery_rank, SHOW_SYS},
>>>
>>> === modified file 'sql/slave.cc'
>>> --- a/sql/slave.cc 2008-10-29 12:27:59 +0000
>>> +++ b/sql/slave.cc 2008-10-29 16:37:16 +0000
>>> @@ -67,6 +67,7 @@ static inline bool io_slave_killed(THD*
>>> static inline bool sql_slave_killed(THD* thd,RELAY_LOG_INFO* rli);
>>> static int count_relay_log_space(RELAY_LOG_INFO* rli);
>>> static int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type);
>>> +static int init_recovery(MASTER_INFO* mi);
>>> static int safe_connect(THD* thd, MYSQL* mysql, MASTER_INFO* mi);
>>> static int safe_reconnect(THD* thd, MYSQL* mysql, MASTER_INFO* mi,
>>> bool suppress_warnings);
>>> @@ -174,6 +175,9 @@ int init_slave()
>>> if (server_id && !master_host && active_mi->host[0])
>>> master_host= active_mi->host;
>>>
>>> + if (init_recovery(active_mi))
>>> + goto err;
>>> +
>>> /* If server id is not set, start_slave_thread() will say it */
>>>
>>> if (master_host && !opt_skip_slave_start)
>>> @@ -197,6 +201,73 @@ err:
>>> DBUG_RETURN(1);
>>> }
>>>
>>> +/*
>>> +* Updates the master info based on the information stored in the relay info and
>>> +* ignores relay logs previously retrieved by the IO thread, starting fetching again
>>> +* based on the group_master_log_pos and group_master_log_name.
>>> +*
>>> +* There is no need for a mutex
>>> +* as the caller (i.e. init_slave) already has one acquired.
>>> +*
>>> +* The recovery is done just after the database startup which means that
>>> +* start slave and stop slave commands do not throw away any relay log
>>> +* retrieved in the meantime.
>>> +*
>>> +* In the future, we can improve this routine in order to avoid throwing
>>> +* away logs that are safely stored on disk. Eventually, the old
>>> +* relay logs will be purged by the normal purge mechanism.
>>> +*
>>> +* Specifically,
>>> +*
>>> +* 1 - mi->master_log_pos <-- rli->group_master_log_pos
>>> +* 2 - mi->master_log_name <-- rli->group_master_log_name
>>> +* 3 - It moves the relay info to the new relay log file, by
>>> +* rli->group_relay_log_pos <-- rli->event_relay_log_pos <-- BIN_LOG_HEADER_SIZE;
>>> +* rli->group_relay_log_name <-- rli->event_relay_log_name <-- rli->relay_log.get_log_fname()
>>> +*
>>> +* If there is an error, it returns (1), otherwise returns (0).
>>> +*/
>>> +static int init_recovery(MASTER_INFO* mi)
>>> +{
>>> + const char* errmsg=0;
>>> + DBUG_ENTER("init_recovery");
>>> +
>>> + RELAY_LOG_INFO* rli = &mi->rli;
>>> + if (relay_log_recovery && rli->group_master_log_name[0])
>>> + {
>>> + mi->master_log_pos = max(BIN_LOG_HEADER_SIZE,
>>> + rli->group_master_log_pos);
>>> + strmake(mi->master_log_name, rli->group_master_log_name,
>>> + sizeof(mi->master_log_name)-1);
>>> +
>>> + strmake(rli->group_relay_log_name, rli->relay_log.get_log_fname(),
>>> + sizeof(rli->group_relay_log_name)-1);
>>> + strmake(rli->event_relay_log_name, rli->relay_log.get_log_fname(),
>>> + sizeof(mi->rli.event_relay_log_name)-1);
>>> +
>>> + rli->group_relay_log_pos = rli->event_relay_log_pos = BIN_LOG_HEADER_SIZE;
>>> +
>>> + if (init_relay_log_pos(rli,
>>> + rli->group_relay_log_name,
>>> + rli->group_relay_log_pos,
>>> + 0 /*no data lock*/,
>>> + &errmsg, 0))
>>> + DBUG_RETURN(1);
>>> +
>>> + if (flush_master_info(mi, 0))
>>> + {
>>> + sql_print_error("Failed to flush master info file");
>>> + DBUG_RETURN(1);
>>> + }
>>> + if (flush_relay_log_info(rli))
>>> + {
>>> + sql_print_error("Failed to flush relay info file");
>>> + DBUG_RETURN(1);
>>> + }
>>> + }
>>> +
>>> + DBUG_RETURN(0);
>>> +}
>>>
>>> static void free_table_ent(TABLE_RULE_ENT* e)
>>> {
>>>
>>>
>>> --
>>> MySQL Code Commits Mailing List
>>> For list archives: http://lists.mysql.com/commits
>>> To unsubscribe: http://lists.mysql.com/commits?unsub=1
>>>
>>
>>
>
>