Hi,
He Zhenxing wrote:
> Hi Alfranio
>
> Maybe I did not make my point very clear in my previous mail, here is
> a summary of the things that I think you need to consider:
>
> 1) Only include the code of your approach in your patch, remove all the
> code from patches for BUG#31665
>
I agree with that, but my patch needs to be applied right after yours
because the two patches overlap in flush_relay_log_info().
See the quoted code below.
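
To make the overlap concrete, here is a minimal, self-contained sketch of
the sync decision for relay-log.info when both options are taken into
account. It is not code from either patch: RelayInfoSyncPolicy and
should_sync are hypothetical names, the counter handling mirrors what
flush_master_info does in the quoted diff, and relay_log_throw is assumed
to force a sync on every error-free flush.

```cpp
#include <cassert>

// Hypothetical model of the decision made at the end of
// flush_relay_log_info(); none of these names exist in the server.
struct RelayInfoSyncPolicy {
  unsigned sync_relaylog_period;  // models --sync-relay-log, 0 = no periodic sync
  bool relay_log_throw;           // models --relay-log-throw
  unsigned count;                 // flush counter, starts at 1 as in the patch

  RelayInfoSyncPolicy(unsigned period, bool throw_logs)
      : sync_relaylog_period(period), relay_log_throw(throw_logs), count(1) {}

  // Returns true when this flush should be followed by an fsync.
  // 'error' is true when writing or flushing the IO_CACHE already failed.
  bool should_sync(bool error) {
    assert(count >= 1);
    const bool periodic =
        sync_relaylog_period != 0 && count > sync_relaylog_period;
    const bool do_sync = (periodic || relay_log_throw) && !error;
    if (do_sync)
      count = 1;  // restart the period after a sync
    count++;      // incremented on every flush, synced or not
    return do_sync;
  }
};
```

With a period of 2 and relay_log_throw off, the third flush is the first
one that syncs. With relay_log_throw on, every error-free flush syncs
regardless of the period, so relay-log.info is synced once per transaction
even when both options are enabled.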
> 2) I think we should only discard the relay log when we restart the
> slave after a crash, or when we detect some error in the relay log or
> master.info; otherwise the slave should act as normal.
>
> 3) the discarded relay-logs need to be purged correctly, so that there
> will not be duplicated binlog events in the relay-log. (You already
> commented on this, if it does, then it's OK)
>
> alfranio correia wrote:
>
>> He Zhenxing wrote:
>>
>>> Hi Alfranio
>>>
>>> Thank you for your work!
>>>
>>> Here are my thoughts about these two approaches:
>>> 1) The first approach (BUG#35542, BUG#31665) introduces
>>> --sync-relay-log, which fsyncs the relay-log and master.info every
>>> sync_relay_log events, and fsyncs the relay-log.info every
>>> sync_relay_log transactions.
>>>
>>> 2) The second approach (BUG#40337) introduces --relay-log-throw, which
>>> fsyncs the relay-log.info for EVERY transaction. When restarting after
>>> a server crash, the slave purges any relay-logs after the executed
>>> master log position recorded in relay-log.info and re-fetches the
>>> master binlogs from that position.
>>>
>>> I think most parts of these two approaches are orthogonal, except that
>>> we have to make sure that when both are enabled, the relay-log.info is
>>> only flushed once for every transaction.
>>>
>>>
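
As an inline illustration of the second approach, the restart step can be
sketched roughly as follows. This is a simplified model, not the patch
itself: the two structs, reposition_io_thread and kBinLogHeaderSize are
hypothetical stand-ins for RELAY_LOG_INFO, MASTER_INFO, init_recovery and
BIN_LOG_HEADER_SIZE (assumed here to be 4 bytes), and the purging of the
old relay logs is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

typedef unsigned long long pos_t;

// Assumption: size of the binlog magic header, i.e. the first valid
// event position in a binary log file.
const pos_t kBinLogHeaderSize = 4;

// Hypothetical, reduced view of what relay-log.info records.
struct RelayLogInfoModel {
  std::string group_master_log_name;  // master binlog of last executed group
  pos_t group_master_log_pos;         // position after that group
};

// Hypothetical, reduced view of what master.info records.
struct MasterInfoModel {
  std::string master_log_name;  // master binlog the I/O thread reads from
  pos_t master_log_pos;         // position the I/O thread has fetched up to
};

// Points the I/O thread back at the last executed position, so that
// everything after it is fetched again from the master.
// Returns true when the positions were rewritten.
bool reposition_io_thread(bool relay_log_throw, const RelayLogInfoModel &rli,
                          MasterInfoModel &mi) {
  // Nothing to recover if the option is off or nothing was executed yet.
  if (!relay_log_throw || rli.group_master_log_name.empty())
    return false;

  mi.master_log_name = rli.group_master_log_name;
  // Never restart from inside the binlog header.
  mi.master_log_pos = std::max(kBinLogHeaderSize, rli.group_master_log_pos);
  assert(mi.master_log_pos >= kBinLogHeaderSize);
  return true;
}
```

The point of the sketch is that only relay-log.info has to be trusted
after a crash: master.info is overwritten from it, which is why
relay-log.info must be synced for every transaction in this approach.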
>> Jasonh, I partially agree with that. In order to ensure that the
>> relay-log.info is synced after every transaction
>> I've changed some lines in your patch:
>>
>> yours--->
>>
>> @@ -4885,10 +4890,16 @@ bool flush_relay_log_info(RELAY_LOG_INFO
>> error=1;
>> if (flush_io_cache(file))
>> error=1;
>> + static unsigned int count= 1;
>> if (sync_relaylog_period &&
>> !error &&
>> - my_sync(rli->info_fd, MYF(MY_WME)))
>> - error=1;
>> + count > sync_relaylog_period)
>> + {
>> + if (my_sync(rli->info_fd, MYF(MY_WME)))
>> + error=1;
>> + count= 1;
>> + }
>>
>> ----------------------
>> mine--->
>>
>> - if (sync_relaylog_period &&
>> - !error &&
>> - count > sync_relaylog_period)
>> + if (sync_relaylog_period &&
>> + !error)
>> ------------------------
>>
>> I can do something like this:
>>
>> if (((sync_relaylog_period && count > sync_relaylog_period) || relay_log_throw)
>> && !error)
>>
>>
>>> So I think you do not need to base your patch on mine (actually
>>> Sinisa's). I think your approach and the approach proposed by Sinisa
>>> can be separate. It should be possible to add only your approach to
>>> the code. So I think your approach should only depend on option
>>> --relay-log-throw, and should not have a dependency on option
>>> --sync-relay-log.
>>>
>>> BTW, since the performance of previous approach is acceptable if we turn
>>> on hard drive cache, and it won't delete the relay-log that's already
>>> been saved on slave, which some users may prefer, so I think the first
>>> approach still has its use case. So I think we can have both and the
>>> users can choose any one of them or both of them.
>>>
>>> There are also some inline comments.
>>>
>>> Alfranio Correia wrote:
>>>
>>>
>>>> #At file:///home/acorreia/workspace.sun/playground.mysql/mysql-5.0.68-custom_build/
>>>>
>>>> 2704 Alfranio Correia 2008-10-27
>>>> Merged patch that was proposed by Sinisa and modified by He Zhenxing
>>>> to fix bugs: BUG#35542 and BUG#31665. This is an update to BUG#40337.
>>>> And introduced a new recovery process that disregards old
>>>> relay-log.bin(s) and starts re-fetching from the master based on the
>>>> information in relay-log.info which may be flushed and synced after
>>>> each transaction commit.
>>>> modified:
>>>> mysql-test/r/rpl_flush_log_loop.result
>>>> sql/log.cc
>>>> sql/mysql_priv.h
>>>> sql/mysqld.cc
>>>> sql/set_var.cc
>>>> sql/set_var.h
>>>> sql/slave.cc
>>>> sql/sql_class.h
>>>>
>>>> per-file messages:
>>>> mysql-test/r/rpl_flush_log_loop.result
>>>> Changed result file for a test case affected by fix proposed for
>>>> BUG#40337
>>>> sql/log.cc
>>>> Added support to the checkpoint interval through the parameter
>>>> --sync-relay-log.
>>>> sql/mysql_priv.h
>>>> Created parameter --relay_log_throw, default = 0 thus keeping the
>>>> original behavior. And added support to the checkpoint interval
>>>> through the parameter --sync-relay-log, default = 0 thus keeping the
>>>> original behavior.
>>>>
>>>>
>>> Usually, we use a hyphen (-) instead of an underscore (_) in option
>>> names, but the corresponding parameter name will use an underscore.
>>>
>>>
>> I will fix this in the next commit.
>>
>>>
>>>
>>>> sql/mysqld.cc
>>>> Created parameter --relay_log_throw, default = 0 thus keeping the
>>>> original behavior. And added support to the checkpoint interval
>>>> through the parameter --sync-relay-log, default = 0 thus keeping the
>>>> original behavior.
>>>> sql/set_var.cc
>>>> Created parameter --relay_log_throw, default = 0 thus keeping the
>>>> original behavior. And added support to the checkpoint interval
>>>> through the parameter --sync-relay-log, default = 0 thus keeping the
>>>> original behavior.
>>>> sql/set_var.h
>>>> Added support to the checkpoint interval through the parameter
>>>> --sync-relay-log.
>>>> sql/slave.cc
>>>> Introduced the function init_recovery that is fired right after the
>>>> startup and updates the master.info according to the relay-log.info
>>>> and throws away (i.e. ignores) the old relay-log.bin. And also added
>>>> support to the checkpoint interval through the parameter
>>>> --sync-relay-log.
>>>>
>>>> Note that when --sync-relay-log is on, the relay-log.info is flushed
>>>> and synced per transaction commit:
>>>>
>>>> if (sync_relaylog_period &&
>>>> !error)
>>>> {
>>>> if (my_sync(rli->info_fd, MYF(MY_WME)))
>>>> error=1;
>>>> }
>>>>
>>>> This is a bit different from the patch proposed for BUG#35542 and
>>>> BUG#31665.
>>>> sql/sql_class.h
>>>> Created sync_period that determines how often the master.info and
>>>> relay-log.bin are flushed and synced. This parameter should be seen
>>>> as a checkpoint based on number of events.
>>>> === modified file 'mysql-test/r/rpl_flush_log_loop.result'
>>>> --- a/mysql-test/r/rpl_flush_log_loop.result 2007-10-10 07:21:11 +0000
>>>> +++ b/mysql-test/r/rpl_flush_log_loop.result 2008-10-27 15:09:43 +0000
>>>> @@ -10,6 +10,7 @@ relay_log MYSQLTEST_VARDIR/master-data/r
>>>> relay_log_index
>>>> relay_log_info_file relay-log.info
>>>> relay_log_purge ON
>>>> +relay_log_throw OFF
>>>> relay_log_space_limit 0
>>>> stop slave;
>>>> change master to master_host='127.0.0.1',master_user='root',
>>>>
>>>> === modified file 'sql/log.cc'
>>>> --- a/sql/log.cc 2008-07-24 12:28:21 +0000
>>>> +++ b/sql/log.cc 2008-10-27 15:09:43 +0000
>>>> @@ -32,8 +32,7 @@
>>>> #include "message.h"
>>>> #endif
>>>>
>>>> -MYSQL_LOG mysql_log, mysql_slow_log, mysql_bin_log;
>>>> -ulong sync_binlog_counter= 0;
>>>> +MYSQL_LOG mysql_log, mysql_slow_log, mysql_bin_log(&sync_binlog_period);
>>>>
>>>> static Muted_query_log_event invisible_commit;
>>>>
>>>> @@ -402,10 +401,11 @@ static int find_uniq_filename(char *name
>>>> }
>>>>
>>>>
>>>> -MYSQL_LOG::MYSQL_LOG()
>>>> +MYSQL_LOG::MYSQL_LOG(uint *sync_period)
>>>> :bytes_written(0), last_time(0), query_start(0), name(0),
>>>> prepared_xids(0), log_type(LOG_CLOSED), file_id(1), open_count(1),
>>>> write_error(FALSE), inited(FALSE), need_start_event(TRUE),
>>>> + sync_period_ptr(sync_period),
>>>> description_event_for_exec(0), description_event_for_queue(0)
>>>> {
>>>> /*
>>>> @@ -1622,6 +1622,8 @@ bool MYSQL_LOG::append(Log_event* ev)
>>>> }
>>>> bytes_written+= ev->data_written;
>>>> DBUG_PRINT("info",("max_size: %lu",max_size));
>>>> + if (flush_and_sync())
>>>> + goto err;
>>>> if ((uint) my_b_append_tell(&log_file) > max_size)
>>>> new_file(0);
>>>>
>>>> @@ -1652,6 +1654,8 @@ bool MYSQL_LOG::appendv(const char* buf,
>>>> bytes_written += len;
>>>> } while ((buf=va_arg(args,const char*)) && (len=va_arg(args,uint)));
>>>> DBUG_PRINT("info",("max_size: %lu",max_size));
>>>> + if (flush_and_sync())
>>>> + goto err;
>>>> if ((uint) my_b_append_tell(&log_file) > max_size)
>>>> new_file(0);
>>>>
>>>> @@ -1758,9 +1762,10 @@ bool MYSQL_LOG::flush_and_sync()
>>>> safe_mutex_assert_owner(&LOCK_log);
>>>> if (flush_io_cache(&log_file))
>>>> return 1;
>>>> - if (++sync_binlog_counter >= sync_binlog_period && sync_binlog_period)
>>>> + uint sync_period= get_sync_period();
>>>> + if (sync_period && ++sync_counter >= sync_period)
>>>> {
>>>> - sync_binlog_counter= 0;
>>>> + sync_counter= 0;
>>>> err=my_sync(fd, MYF(MY_WME));
>>>> }
>>>> return err;
>>>>
>>>> === modified file 'sql/mysql_priv.h'
>>>> --- a/sql/mysql_priv.h 2008-10-02 11:57:52 +0000
>>>> +++ b/sql/mysql_priv.h 2008-10-27 15:09:43 +0000
>>>> @@ -1310,10 +1310,11 @@ extern ulong max_binlog_size, max_relay_
>>>> extern ulong rpl_recovery_rank, thread_cache_size;
>>>> extern ulong back_log;
>>>> extern ulong specialflag, current_pid;
>>>> -extern ulong expire_logs_days, sync_binlog_period, sync_binlog_counter;
>>>> +extern ulong expire_logs_days;
>>>> +extern uint sync_binlog_period, sync_relaylog_period;
>>>> extern ulong opt_tc_log_size, tc_log_max_pages_used, tc_log_page_size;
>>>> extern ulong tc_log_page_waits;
>>>> -extern my_bool relay_log_purge, opt_innodb_safe_binlog, opt_innodb;
>>>> +extern my_bool relay_log_purge, relay_log_throw, opt_innodb_safe_binlog, opt_innodb;
>>>> extern uint test_flags,select_errors,ha_open_options;
>>>> extern uint protocol_version, mysqld_port, dropping_tables;
>>>> extern uint delay_key_write_options, lower_case_table_names;
>>>>
>>>> === modified file 'sql/mysqld.cc'
>>>> --- a/sql/mysqld.cc 2008-08-26 08:32:43 +0000
>>>> +++ b/sql/mysqld.cc 2008-10-27 15:09:43 +0000
>>>> @@ -404,7 +404,7 @@ ulong opt_ndb_cache_check_time;
>>>> const char *opt_ndb_mgmd;
>>>> ulong opt_ndb_nodeid;
>>>> #endif
>>>> -my_bool opt_readonly, use_temp_pool, relay_log_purge;
>>>> +my_bool opt_readonly, use_temp_pool, relay_log_purge, relay_log_throw;
>>>> my_bool opt_sync_frm, opt_allow_suspicious_udfs;
>>>> my_bool opt_secure_auth= 0;
>>>> char* opt_secure_file_priv= 0;
>>>> @@ -466,7 +466,8 @@ ulong max_prepared_stmt_count;
>>>> */
>>>> ulong prepared_stmt_count=0;
>>>> ulong thread_id=1L,current_pid;
>>>> -ulong slow_launch_threads = 0, sync_binlog_period;
>>>> +ulong slow_launch_threads = 0;
>>>> +uint sync_binlog_period= 0, sync_relaylog_period= 0;
>>>> ulong expire_logs_days = 0;
>>>> ulong rpl_recovery_rank=0;
>>>>
>>>> @@ -4918,6 +4919,7 @@ enum options_mysqld
>>>> OPT_QUERY_CACHE_TYPE, OPT_QUERY_CACHE_WLOCK_INVALIDATE, OPT_RECORD_BUFFER,
>>>> OPT_RECORD_RND_BUFFER, OPT_DIV_PRECINCREMENT, OPT_RELAY_LOG_SPACE_LIMIT,
>>>> OPT_RELAY_LOG_PURGE,
>>>> + OPT_RELAY_LOG_THROW,
>>>> OPT_SLAVE_NET_TIMEOUT, OPT_SLAVE_COMPRESSED_PROTOCOL, OPT_SLOW_LAUNCH_TIME,
>>>> OPT_SLAVE_TRANS_RETRIES, OPT_READONLY, OPT_DEBUGGING,
>>>> OPT_SORT_BUFFER, OPT_TABLE_CACHE,
>>>> @@ -4995,7 +4997,8 @@ enum options_mysqld
>>>> OPT_SECURE_FILE_PRIV,
>>>> OPT_KEEP_FILES_ON_CREATE,
>>>> OPT_INNODB_ADAPTIVE_HASH_INDEX,
>>>> - OPT_FEDERATED
>>>> + OPT_FEDERATED,
>>>> + OPT_SYNC_RELAY_LOG
>>>> };
>>>>
>>>>
>>>> @@ -6302,6 +6305,11 @@ The minimum value for this variable is 4
>>>> (gptr*) &relay_log_purge,
>>>> (gptr*) &relay_log_purge, 0, GET_BOOL, NO_ARG,
>>>> 1, 0, 1, 0, 1, 0},
>>>> + {"relay_log_throw", OPT_RELAY_LOG_THROW,
>>>> + "0 = do not ignore retrieved relay logs. 1 = ignore retrieved relay logs.",
>>>> + (gptr*) &relay_log_throw,
>>>> + (gptr*) &relay_log_throw, 0, GET_BOOL, NO_ARG,
>>>> + 0, 0, 1, 0, 1, 0},
>>>> {"relay_log_space_limit", OPT_RELAY_LOG_SPACE_LIMIT,
>>>> "Maximum space to use for all relay logs.",
>>>> (gptr*) &relay_log_space_limit,
>>>> @@ -6342,8 +6350,13 @@ The minimum value for this variable is 4
>>>> {"sync-binlog", OPT_SYNC_BINLOG,
>>>> "Synchronously flush binary log to disk after every #th event. "
>>>> "Use 0 (default) to disable synchronous flushing.",
>>>> - (gptr*) &sync_binlog_period, (gptr*) &sync_binlog_period, 0, GET_ULONG,
>>>> - REQUIRED_ARG, 0, 0, ULONG_MAX, 0, 1, 0},
>>>> + (gptr*) &sync_binlog_period, (gptr*) &sync_binlog_period, 0, GET_UINT,
>>>> + REQUIRED_ARG, 0, 0, UINT_MAX, 0, 1, 0},
>>>> + {"sync-relay-log", OPT_SYNC_RELAY_LOG,
>>>> + "Synchronously flush relay log to disk after every #th event. "
>>>> + "Use 0 (default) to disable synchronous flushing.",
>>>> + (gptr *) &sync_relaylog_period, (gptr *) &sync_relaylog_period, 0, GET_UINT,
>>>> + REQUIRED_ARG, 0, 0, UINT_MAX, 0, 1, 0},
>>>> {"sync-frm", OPT_SYNC_FRM, "Sync .frm to disk on create. Enabled by default.",
>>>> (gptr*) &opt_sync_frm, (gptr*) &opt_sync_frm, 0, GET_BOOL, NO_ARG, 1, 0,
>>>> 0, 0, 0, 0},
>>>>
>>>> === modified file 'sql/set_var.cc'
>>>> --- a/sql/set_var.cc 2008-08-25 12:11:59 +0000
>>>> +++ b/sql/set_var.cc 2008-10-27 15:09:43 +0000
>>>> @@ -323,6 +323,8 @@ sys_var_thd_ulong sys_div_precincrement(
>>>> #ifdef HAVE_REPLICATION
>>>> sys_var_bool_ptr sys_relay_log_purge("relay_log_purge",
>>>> &relay_log_purge);
>>>> +sys_var_bool_ptr sys_relay_log_throw("relay_log_throw",
>>>> + &relay_log_throw);
>>>> #endif
>>>> sys_var_long_ptr sys_rpl_recovery_rank("rpl_recovery_rank",
>>>> &rpl_recovery_rank);
>>>> @@ -402,7 +404,8 @@ sys_var_thd_table_type sys_table_type("
>>>> sys_var_thd_storage_engine sys_storage_engine("storage_engine",
>>>> &SV::table_type);
>>>> #ifdef HAVE_REPLICATION
>>>> -sys_var_sync_binlog_period sys_sync_binlog_period("sync_binlog", &sync_binlog_period);
>>>> +sys_var_int_ptr sys_sync_binlog_period("sync_binlog", &sync_binlog_period);
>>>> +sys_var_int_ptr sys_sync_relaylog_period("sync_relay_log", &sync_relaylog_period);
>>>> #endif
>>>> sys_var_bool_ptr sys_sync_frm("sync_frm", &opt_sync_frm);
>>>> sys_var_const_str sys_system_time_zone("system_time_zone",
>>>> @@ -733,6 +736,7 @@ sys_var *sys_variables[]=
>>>> &sys_read_rnd_buff_size,
>>>> #ifdef HAVE_REPLICATION
>>>> &sys_relay_log_purge,
>>>> + &sys_relay_log_throw,
>>>> #endif
>>>> &sys_rpl_recovery_rank,
>>>> &sys_safe_updates,
>>>> @@ -762,6 +766,7 @@ sys_var *sys_variables[]=
>>>> &sys_storage_engine,
>>>> #ifdef HAVE_REPLICATION
>>>> &sys_sync_binlog_period,
>>>> + &sys_sync_relaylog_period,
>>>> #endif
>>>> &sys_sync_frm,
>>>> &sys_system_time_zone,
>>>> @@ -1052,6 +1057,7 @@ struct show_var_st init_vars[]= {
>>>> {"relay_log_index", (char*) &opt_relaylog_index_name, SHOW_CHAR_PTR},
>>>> {"relay_log_info_file", (char*) &relay_log_info_file, SHOW_CHAR_PTR},
>>>> {sys_relay_log_purge.name, (char*) &sys_relay_log_purge, SHOW_SYS},
>>>> + {sys_relay_log_throw.name, (char*) &sys_relay_log_throw, SHOW_SYS},
>>>> {"relay_log_space_limit", (char*) &relay_log_space_limit, SHOW_LONGLONG},
>>>> #endif
>>>> {sys_rpl_recovery_rank.name,(char*) &sys_rpl_recovery_rank, SHOW_SYS},
>>>> @@ -1090,6 +1096,7 @@ struct show_var_st init_vars[]= {
>>>> {sys_storage_engine.name, (char*) &sys_storage_engine, SHOW_SYS},
>>>> #ifdef HAVE_REPLICATION
>>>> {sys_sync_binlog_period.name,(char*) &sys_sync_binlog_period, SHOW_SYS},
>>>> + {sys_sync_relaylog_period.name,(char*) &sys_sync_relaylog_period, SHOW_SYS},
>>>> #endif
>>>> {sys_sync_frm.name, (char*) &sys_sync_frm, SHOW_SYS},
>>>> #ifdef HAVE_TZNAME
>>>> @@ -1486,6 +1493,23 @@ static bool get_unsigned(THD *thd, set_v
>>>> }
>>>>
>>>>
>>>> +bool sys_var_int_ptr::check(THD *thd, set_var *var)
>>>> +{
>>>> + var->save_result.ulong_value= (ulong) var->value->val_int();
>>>> + return 0;
>>>> +}
>>>> +
>>>> +bool sys_var_int_ptr::update(THD *thd, set_var *var)
>>>> +{
>>>> + *value= (uint) var->save_result.ulong_value;
>>>> + return 0;
>>>> +}
>>>> +
>>>> +void sys_var_int_ptr::set_default(THD *thd, enum_var_type type)
>>>> +{
>>>> + *value= (uint) option_limits->def_value;
>>>> +}
>>>> +
>>>> sys_var_long_ptr::
>>>> sys_var_long_ptr(const char *name_arg, ulong *value_ptr_arg,
>>>> sys_after_update_func after_update_arg)
>>>> @@ -2759,13 +2783,6 @@ bool sys_var_slave_skip_counter::update(
>>>> pthread_mutex_unlock(&LOCK_active_mi);
>>>> return 0;
>>>> }
>>>> -
>>>> -
>>>> -bool sys_var_sync_binlog_period::update(THD *thd, set_var *var)
>>>> -{
>>>> - sync_binlog_period= (ulong) var->save_result.ulonglong_value;
>>>> - return 0;
>>>> -}
>>>> #endif /* HAVE_REPLICATION */
>>>>
>>>> bool sys_var_rand_seed1::update(THD *thd, set_var *var)
>>>>
>>>> === modified file 'sql/set_var.h'
>>>> --- a/sql/set_var.h 2008-01-23 15:03:58 +0000
>>>> +++ b/sql/set_var.h 2008-10-27 15:09:43 +0000
>>>> @@ -110,6 +110,26 @@ public:
>>>> { return (byte*) value; }
>>>> };
>>>>
>>>> +/**
>>>> + Unsigned int system variable class
>>>> + */
>>>> +class sys_var_int_ptr :public sys_var
>>>> +{
>>>> +public:
>>>> + sys_var_int_ptr(const char *name_arg, uint *value_ptr_arg,
>>>> + sys_after_update_func after_update_arg= NULL)
>>>> + :sys_var(name_arg, after_update_arg),
>>>> + value(value_ptr_arg)
>>>> + {}
>>>> + bool check(THD *thd, set_var *var);
>>>> + bool update(THD *thd, set_var *var);
>>>> + void set_default(THD *thd, enum_var_type type);
>>>> + SHOW_TYPE show_type() { return SHOW_INT; }
>>>> + byte *value_ptr(THD *thd, enum_var_type type, LEX_STRING *base)
>>>> + { return (byte*) value; }
>>>> +private:
>>>> + uint *value;
>>>> +};
>>>>
>>>> /*
>>>> A global ulong variable that is protected by LOCK_global_system_variables
>>>> @@ -550,11 +570,11 @@ public:
>>>> */
>>>> };
>>>>
>>>> -class sys_var_sync_binlog_period :public sys_var_long_ptr
>>>> +class sys_var_sync_binlog_period :public sys_var_int_ptr
>>>> {
>>>> public:
>>>> - sys_var_sync_binlog_period(const char *name_arg, ulong *value_ptr_arg)
>>>> - :sys_var_long_ptr(name_arg,value_ptr_arg) {}
>>>> + sys_var_sync_binlog_period(const char *name_arg, uint *value_ptr_arg)
>>>> + :sys_var_int_ptr(name_arg,value_ptr_arg) {}
>>>> bool update(THD *thd, set_var *var);
>>>> };
>>>> #endif
>>>>
>>>> === modified file 'sql/slave.cc'
>>>> --- a/sql/slave.cc 2008-03-28 20:01:05 +0000
>>>> +++ b/sql/slave.cc 2008-10-27 15:09:43 +0000
>>>> @@ -66,6 +66,7 @@ static bool wait_for_relay_log_space(REL
>>>> static inline bool io_slave_killed(THD* thd,MASTER_INFO* mi);
>>>> static inline bool sql_slave_killed(THD* thd,RELAY_LOG_INFO* rli);
>>>> static int count_relay_log_space(RELAY_LOG_INFO* rli);
>>>> +static int init_recovery(MASTER_INFO* mi);
>>>> static int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type);
>>>> static int safe_connect(THD* thd, MYSQL* mysql, MASTER_INFO* mi);
>>>> static int safe_reconnect(THD* thd, MYSQL* mysql, MASTER_INFO* mi,
>>>> @@ -174,6 +175,9 @@ int init_slave()
>>>> if (server_id && !master_host && active_mi->host[0])
>>>> master_host= active_mi->host;
>>>>
>>>> + if (init_recovery(active_mi))
>>>> + goto err;
>>>> +
>>>> /* If server id is not set, start_slave_thread() will say it */
>>>>
>>>> if (master_host && !opt_skip_slave_start)
>>>> @@ -197,6 +201,48 @@ err:
>>>> DBUG_RETURN(1);
>>>> }
>>>>
>>>> +static int init_recovery(MASTER_INFO* mi)
>>>> +{
>>>> + const char* errmsg=0;
>>>> + DBUG_ENTER("recovery_init");
>>>> +
>>>>
>>>>
>>> A typo, should be "init_recovery"
>>>
>>>
>> done in my own tree.
>>
>>>
>>>
>>>> + RELAY_LOG_INFO* rli = &mi->rli;
>>>> + if (relay_log_throw && rli->group_master_log_name[0])
>>>> + {
>>>> + mi->master_log_pos = max(BIN_LOG_HEADER_SIZE,
>>>> + rli->group_master_log_pos);
>>>> + strmake(mi->master_log_name, rli->group_master_log_name,
>>>> + sizeof(mi->master_log_name)-1);
>>>> +
>>>> + strmake(rli->group_relay_log_name, rli->relay_log.get_log_fname(),
>>>> + sizeof(rli->group_relay_log_name)-1);
>>>> + strmake(rli->event_relay_log_name, rli->relay_log.get_log_fname(),
>>>> + sizeof(mi->rli.event_relay_log_name)-1);
>>>> +
>>>> + rli->group_relay_log_pos = rli->event_relay_log_pos = BIN_LOG_HEADER_SIZE;
>>>> +
>>>> + if (init_relay_log_pos(rli,
>>>> + rli->group_relay_log_name,
>>>> + rli->group_relay_log_pos,
>>>> + 0 /*no data lock*/,
>>>> + &errmsg, 0))
>>>> + DBUG_RETURN(1);
>>>> +
>>>> + if (flush_master_info(mi, 0))
>>>> + {
>>>> + sql_print_error("Failed to flush master info file");
>>>> + DBUG_RETURN(1);
>>>> + }
>>>> + if (flush_relay_log_info(rli))
>>>> + {
>>>> + sql_print_error("Failed to flush relay log info file");
>>>> + DBUG_RETURN(1);
>>>> + }
>>>>
>>>>
>>> I think you need to purge the discarded relay-log files here.
>>>
>>>
>> Eventually, it will be discarded in the next call to the purge method.
>> At this point I don't have a reference to a THD which is required to call
>> the purge method.
>>
>>>
>>>
>>>> + }
>>>> +
>>>> + DBUG_RETURN(0);
>>>> +}
>>>> +
>>>>
>>>> static void free_table_ent(TABLE_RULE_ENT* e)
>>>> {
>>>> @@ -2582,6 +2628,7 @@ int flush_master_info(MASTER_INFO* mi, b
>>>> {
>>>> IO_CACHE* file = &mi->file;
>>>> char lbuf[22];
>>>> + int err= 0;
>>>> DBUG_ENTER("flush_master_info");
>>>> DBUG_PRINT("enter",("master_pos: %ld", (long) mi->master_log_pos));
>>>>
>>>> @@ -2597,9 +2644,12 @@ int flush_master_info(MASTER_INFO* mi, b
>>>> When we come to this place in code, relay log may or not be initialized;
>>>> the caller is responsible for setting 'flush_relay_log_cache' accordingly.
>>>> */
>>>> - if (flush_relay_log_cache &&
>>>> - flush_io_cache(mi->rli.relay_log.get_log_file()))
>>>> - DBUG_RETURN(2);
>>>> + if (flush_relay_log_cache)
>>>> + {
>>>> + IO_CACHE *log_file= mi->rli.relay_log.get_log_file();
>>>> + if (flush_io_cache(log_file))
>>>> + DBUG_RETURN(2);
>>>> + }
>>>>
>>>> /*
>>>> We flushed the relay log BEFORE the master.info file, because if we crash
>>>> @@ -2626,12 +2676,21 @@ int flush_master_info(MASTER_INFO* mi, b
>>>> mi->password, mi->port, mi->connect_retry,
>>>> (int)(mi->ssl), mi->ssl_ca, mi->ssl_capath, mi->ssl_cert,
>>>> mi->ssl_cipher, mi->ssl_key);
>>>> - DBUG_RETURN(-flush_io_cache(file));
>>>> + err= flush_io_cache(file);
>>>> + static unsigned int count=1;
>>>> + if (sync_relaylog_period && !err && count > sync_relaylog_period)
>>>> + {
>>>> + err= my_sync(mi->fd, MYF(MY_WME));
>>>> + count= 1;
>>>> + }
>>>> + count++;
>>>> + DBUG_RETURN(-err);
>>>> }
>>>>
>>>>
>>>> st_relay_log_info::st_relay_log_info()
>>>> - :info_fd(-1), cur_log_fd(-1), save_temporary_tables(0),
>>>> + :info_fd(-1), cur_log_fd(-1), relay_log(&sync_relaylog_period),
>>>> + save_temporary_tables(0),
>>>> cur_log_old_open_count(0), group_master_log_pos(0), log_space_total(0),
>>>> ignore_log_space_limit(0), last_master_timestamp(0), slave_skip_counter(0),
>>>> abort_pos_wait(0), slave_run_id(0), sql_thd(0), last_slave_errno(0),
>>>> @@ -4872,6 +4931,14 @@ bool flush_relay_log_info(RELAY_LOG_INFO
>>>> error=1;
>>>> if (flush_io_cache(file))
>>>> error=1;
>>>> + static unsigned int count= 1;
>>>> + if (sync_relaylog_period &&
>>>> + !error)
>>>> + {
>>>> + if (my_sync(rli->info_fd, MYF(MY_WME)))
>>>> + error=1;
>>>> + }
>>>> + count++;
>>>> /* Flushing the relay log is done by the slave I/O thread */
>>>> return error;
>>>> }
>>>>
>>>> === modified file 'sql/sql_class.h'
>>>> --- a/sql/sql_class.h 2008-09-17 06:34:00 +0000
>>>> +++ b/sql/sql_class.h 2008-10-27 15:09:43 +0000
>>>> @@ -238,6 +238,18 @@ class MYSQL_LOG: public TC_LOG
>>>> bool no_auto_events;
>>>> friend class Log_event;
>>>>
>>>> + /* pointer to the sync period variable, for binlog this will be
>>>> + sync_binlog_period, for relay log this will be
>>>> + sync_relay_log_period
>>>> + */
>>>> + uint *sync_period_ptr;
>>>> + uint sync_counter;
>>>> +
>>>> + inline uint get_sync_period()
>>>> + {
>>>> + return *sync_period_ptr;
>>>> + }
>>>> +
>>>> public:
>>>> /*
>>>> These describe the log's format. This is used only for relay logs.
>>>> @@ -250,7 +262,7 @@ public:
>>>> Format_description_log_event *description_event_for_exec,
>>>> *description_event_for_queue;
>>>>
>>>> - MYSQL_LOG();
>>>> + MYSQL_LOG(uint *sync_period=0);
>>>> /*
>>>> note that there's no destructor ~MYSQL_LOG() !
>>>> The reason is that we don't want it to be automatically called
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>