List:Commits« Previous MessageNext Message »
From:alfranio correia Date:October 31 2008 10:27am
Subject:Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)
Bug#40337
View as plain text  
alfranio correia wrote:
> Andrei Elkin wrote:
>   
>> Alfranio, hello.
>>
>>   
>>     
>>> #At
> file:///home/acorreia/workspace.sun/playground.mysql/mysql-5.0.68-custom_build/
>>>
>>>  2705 Alfranio Correia	2008-10-29
>>>       Introduced a recovery mechanism that throws away relay-log.bin*
> retrieved
>>>       from a master in case of a crash and updates the master.info based on
>>>       the relay.info. This is done because such files may be corrupted after
>>>       a crash. We are assuming that relay.info is fsynced per transaction.
>>>       See BUG#40337.
>>>     
>>>       
>> As I confessed the suggest fix of 1-3 items is principally okay with
>> me except some details.
>> And here they are.
>>
>> This part of bug#40337 looks as the most difficult.
>>
>> I have not found in your patch that there is a detection of a crash.
>>   
>>     
> There is a need for that. I agree with you.
> I can do that in fact I would like to do that but it depends on Lars and
> his priorities.
> This task can take time and I cannot measure how many hours :(
>   
>> I am in strong opinion that we really need one prior to call your new
>> function.
>>   
>>     
> You are 100% right.
>   
>> Otherwise, as we talked, there is a chance to loose of lot of data the
>> IO thread received but the SQL thread has not processed.
>> Notice, that the SQL slave could be simply offline.
>> As I see you are trying to force the user to provide crash evidence
>> through the new option relay_log_recovery. 
>>     
> This is not true because that function is only called right after the
> startup
> and as such if you stop and start the replication threads nothing
> happens regarding
> the recovery.
>
>   
>> Would not it be much
>> cheaper to write a stand-alone script to fix the coordinates in the
>> master.info?
>>   
>>     
> Considering only the current status of the patch I would say that
> a script could do the same. But recall that the procedure should be
> automatic and in the feature do all the things that we have been
> discussing.
>
>   
>> An approach that I can see we need to follow is 
>> 1. to find out evidence of a crash,
>> 2. next to find out the last event (if any) executed from that relay log, 
>> 3. and finally to set up the binlog retrieval coordinates for the master to be
>>    the next event after it.
>>
>> If there is no evidence of a crash, no need to mess around with
>> master, relay-log.info.
>> If no event was executed from the last relay log then the binlog
>> retrieval coordinates can be set to the next event after the last executed
>> in the case the last executed is in the proceeding relay
>> log. Otherwise, if there are some more unprocessed relay log we will
>> have to find out the last event in the last properly closed relay log.
>>
>>
>> Evidence of crash can be found out through a special flag bit 
>> (check out LOG_EVENT_BINLOG_IN_USE_F) that is stored at the log closing time.
>> Alas, it works only for the binlog but not for the relay log.
>> Still, this technique can be rather easily adopted to the relay log.
>> and that's what I see we really need to implement.
>>
>> So I am offering now to implement pp. 1-3 where p.1 implies extending
>> the relay log open/close operations with the binlog-like flagging.
>>   
>>     
> Great Andrei !!!
> But don't forget that we also need to read from the last relay log
> information on
> the master_log_pos and master_log_name ;)
>
>   
Andrei, one last comment.
When I came up with this recovery idea I thought that would be great to
have a mark written periodically
to disk as a checkpoint.
The mark at the end of the file is a great improvement when compared to
what we have know,
however, we should do better and target a periodic mark.

The challenge in this case consists in ignoring from one point of the
file until the end.
Is this feasible? What do you think about this idea?

>> regards,
>>
>> Andrei
>>
>>   
>>     
>>> modified:
>>>   mysql-test/r/rpl_flush_log_loop.result
>>>   sql/mysql_priv.h
>>>   sql/mysqld.cc
>>>   sql/set_var.cc
>>>   sql/slave.cc
>>>
>>> per-file messages:
>>>   mysql-test/r/rpl_flush_log_loop.result
>>>     Fixed a test case broken after creating the patch.
>>>   sql/mysql_priv.h
>>>     Created parameter --relay-log-recovery, default = 0 thus keeping the
> original behavior.
>>>   sql/mysqld.cc
>>>     Created parameter --relay-log-recovery, default = 0 thus keeping the
> original behavior.
>>>   sql/set_var.cc
>>>     Created parameter --relay-log-recovery, default = 0 thus keeping the
> original behavior.
>>>   sql/slave.cc
>>>     Provided a recovery mechanism for the master.info and relay-log.bin*
> based on the relay.info.
>>> === modified file 'mysql-test/r/rpl_flush_log_loop.result'
>>> --- a/mysql-test/r/rpl_flush_log_loop.result	2007-10-10 07:21:11 +0000
>>> +++ b/mysql-test/r/rpl_flush_log_loop.result	2008-10-29 16:37:16 +0000
>>> @@ -10,6 +10,7 @@ relay_log	MYSQLTEST_VARDIR/master-data/r
>>>  relay_log_index	
>>>  relay_log_info_file	relay-log.info
>>>  relay_log_purge	ON
>>> +relay_log_recovery	OFF
>>>  relay_log_space_limit	0
>>>  stop slave;
>>>  change master to master_host='127.0.0.1',master_user='root',
>>>
>>> === modified file 'sql/mysql_priv.h'
>>> --- a/sql/mysql_priv.h	2008-10-29 12:27:59 +0000
>>> +++ b/sql/mysql_priv.h	2008-10-29 16:37:16 +0000
>>> @@ -1314,7 +1314,7 @@ extern ulong expire_logs_days;
>>>  extern uint sync_binlog_period, sync_relaylog_period;
>>>  extern ulong opt_tc_log_size, tc_log_max_pages_used, tc_log_page_size;
>>>  extern ulong tc_log_page_waits;
>>> -extern my_bool relay_log_purge, opt_innodb_safe_binlog, opt_innodb;
>>> +extern my_bool relay_log_purge, relay_log_recovery, opt_innodb_safe_binlog,
> opt_innodb;
>>>  extern uint test_flags,select_errors,ha_open_options;
>>>  extern uint protocol_version, mysqld_port, dropping_tables;
>>>  extern uint delay_key_write_options, lower_case_table_names;
>>>
>>> === modified file 'sql/mysqld.cc'
>>> --- a/sql/mysqld.cc	2008-10-29 12:27:59 +0000
>>> +++ b/sql/mysqld.cc	2008-10-29 16:37:16 +0000
>>> @@ -404,7 +404,7 @@ ulong opt_ndb_cache_check_time;
>>>  const char *opt_ndb_mgmd;
>>>  ulong opt_ndb_nodeid;
>>>  #endif
>>> -my_bool opt_readonly, use_temp_pool, relay_log_purge;
>>> +my_bool opt_readonly, use_temp_pool, relay_log_purge, relay_log_recovery;
>>>  my_bool opt_sync_frm, opt_allow_suspicious_udfs;
>>>  my_bool opt_secure_auth= 0;
>>>  char* opt_secure_file_priv= 0;
>>> @@ -4919,6 +4919,7 @@ enum options_mysqld
>>>    OPT_QUERY_CACHE_TYPE, OPT_QUERY_CACHE_WLOCK_INVALIDATE,
> OPT_RECORD_BUFFER,
>>>    OPT_RECORD_RND_BUFFER, OPT_DIV_PRECINCREMENT, OPT_RELAY_LOG_SPACE_LIMIT,
>>>    OPT_RELAY_LOG_PURGE,
>>> +  OPT_RELAY_LOG_RECOVERY,
>>>    OPT_SLAVE_NET_TIMEOUT, OPT_SLAVE_COMPRESSED_PROTOCOL,
> OPT_SLOW_LAUNCH_TIME,
>>>    OPT_SLAVE_TRANS_RETRIES, OPT_READONLY, OPT_DEBUGGING,
>>>    OPT_SORT_BUFFER, OPT_TABLE_CACHE,
>>> @@ -6304,6 +6305,11 @@ The minimum value for this variable is 4
>>>     (gptr*) &relay_log_purge,
>>>     (gptr*) &relay_log_purge, 0, GET_BOOL, NO_ARG,
>>>     1, 0, 1, 0, 1, 0},
>>> +  {"relay_log_recovery", OPT_RELAY_LOG_RECOVERY,
>>> +   "Enables automatic relay log recovery right after the database startup,
> which means that the IO Thread starts re-fetching from the master right after the last
> transaction processed. Use this parameter together with the sync relay log.",
>>> +   (gptr*) &relay_log_recovery,
>>> +   (gptr*) &relay_log_recovery, 0, GET_BOOL, NO_ARG,
>>> +   0, 0, 1, 0, 1, 0},
>>>    {"relay_log_space_limit", OPT_RELAY_LOG_SPACE_LIMIT,
>>>     "Maximum space to use for all relay logs.",
>>>     (gptr*) &relay_log_space_limit,
>>>
>>> === modified file 'sql/set_var.cc'
>>> --- a/sql/set_var.cc	2008-10-29 12:27:59 +0000
>>> +++ b/sql/set_var.cc	2008-10-29 16:37:16 +0000
>>> @@ -323,6 +323,8 @@ sys_var_thd_ulong	sys_div_precincrement(
>>>  #ifdef HAVE_REPLICATION
>>>  sys_var_bool_ptr	sys_relay_log_purge("relay_log_purge",
>>>                                              &relay_log_purge);
>>> +sys_var_bool_ptr	sys_relay_log_recovery("relay_log_recovery",
>>> +                                            &relay_log_recovery);
>>>  #endif
>>>  sys_var_long_ptr	sys_rpl_recovery_rank("rpl_recovery_rank",
>>>  					      &rpl_recovery_rank);
>>> @@ -734,6 +736,7 @@ sys_var *sys_variables[]=
>>>    &sys_read_rnd_buff_size,
>>>  #ifdef HAVE_REPLICATION
>>>    &sys_relay_log_purge,
>>> +  &sys_relay_log_recovery,
>>>  #endif
>>>    &sys_rpl_recovery_rank,
>>>    &sys_safe_updates,
>>> @@ -1054,6 +1057,7 @@ struct show_var_st init_vars[]= {
>>>    {"relay_log_index", (char*) &opt_relaylog_index_name, SHOW_CHAR_PTR},
>>>    {"relay_log_info_file", (char*) &relay_log_info_file, SHOW_CHAR_PTR},
>>>    {sys_relay_log_purge.name,  (char*) &sys_relay_log_purge,        
> SHOW_SYS},
>>> +  {sys_relay_log_recovery.name,  (char*) &sys_relay_log_recovery,  
> SHOW_SYS},
>>>    {"relay_log_space_limit",  (char*) &relay_log_space_limit,       
> SHOW_LONGLONG},
>>>  #endif
>>>    {sys_rpl_recovery_rank.name,(char*) &sys_rpl_recovery_rank,      
> SHOW_SYS},
>>>
>>> === modified file 'sql/slave.cc'
>>> --- a/sql/slave.cc	2008-10-29 12:27:59 +0000
>>> +++ b/sql/slave.cc	2008-10-29 16:37:16 +0000
>>> @@ -67,6 +67,7 @@ static inline bool io_slave_killed(THD* 
>>>  static inline bool sql_slave_killed(THD* thd,RELAY_LOG_INFO* rli);
>>>  static int count_relay_log_space(RELAY_LOG_INFO* rli);
>>>  static int init_slave_thread(THD* thd, SLAVE_THD_TYPE thd_type);
>>> +static int init_recovery(MASTER_INFO* mi);
>>>  static int safe_connect(THD* thd, MYSQL* mysql, MASTER_INFO* mi);
>>>  static int safe_reconnect(THD* thd, MYSQL* mysql, MASTER_INFO* mi,
>>>  			  bool suppress_warnings);
>>> @@ -174,6 +175,9 @@ int init_slave()
>>>    if (server_id && !master_host && active_mi->host[0])
>>>      master_host= active_mi->host;
>>>  
>>> +  if (init_recovery(active_mi))
>>> +    goto err;
>>> +
>>>    /* If server id is not set, start_slave_thread() will say it */
>>>  
>>>    if (master_host && !opt_skip_slave_start)
>>> @@ -197,6 +201,73 @@ err:
>>>    DBUG_RETURN(1);
>>>  }
>>>  
>>> +/*
>>> +* Updates the master info based on the information stored in the relay info
> and
>>> +* ignores relay logs previously retrieved by the IO thread starting fetching
> again
>>> +* based on to the group_master_log_pos and group_master_log_name.
>>> +*
>>> +*  There is no need for a mutex 
>>> +*  as the caller (i.e. init_slave) already has one acquired.
>>> +*
>>> +*  The recovery is done just after the database startup which means that
>>> +*  start slave and stop slave commands do not throw away any relay log 
>>> +*  retrieved in the meantime.
>>> +*
>>> +*  In the feature, we can improve this routine in order to avoid throwing
>>> +*  away logs that are safely stored in the disk. Eventually, the old
>>> +*  relay logs will be purged by the normal purge mechanism.
>>> +*
>>> +* Specifically,
>>> +*
>>> +*  1 - mi->master_log_pos  <-- rli->group_master_log_pos
>>> +*  2 - mi->master_log_name <-- rli->group_master_log_name
>>> +*  3 - It moves the relay info to the new relay log file, by
>>> +*   rli->group_relay_log_pos <-- rli->group_relay_log_pos <--
> BIN_LOG_HEADER_SIZE;
>>> +*   rli->group_relay_log_name, rli->group_relay_log_pos
>>> +*
>>> +* If there is an error, it returns (1), otherwise returns (0). 
>>> +*/
>>> +static int init_recovery(MASTER_INFO* mi)
>>> +{
>>> +  const char* errmsg=0;
>>> +  DBUG_ENTER("init_recovery");
>>> +
>>> +  RELAY_LOG_INFO* rli = &mi->rli;
>>> +  if (relay_log_recovery && rli->group_master_log_name[0])
>>> +  {
>>> +    mi->master_log_pos = max(BIN_LOG_HEADER_SIZE,
>>> +                              rli->group_master_log_pos);
>>> +    strmake(mi->master_log_name, rli->group_master_log_name,
>>> +      sizeof(mi->master_log_name)-1);
>>> +
>>> +    strmake(rli->group_relay_log_name,
> rli->relay_log.get_log_fname(),
>>> +            sizeof(rli->group_relay_log_name)-1);
>>> +    strmake(rli->event_relay_log_name,
> rli->relay_log.get_log_fname(),
>>> +            sizeof(mi->rli.event_relay_log_name)-1);
>>> +
>>> +    rli->group_relay_log_pos = rli->group_relay_log_pos =
> BIN_LOG_HEADER_SIZE;
>>> +
>>> +    if (init_relay_log_pos(rli,
>>> +                           rli->group_relay_log_name,
>>> +                           rli->group_relay_log_pos,
>>> +                           0 /*no data lock*/,
>>> +                           &errmsg, 0))
>>> +      DBUG_RETURN(1);
>>> +
>>> +    if (flush_master_info(mi, 0))
>>> +    {
>>> +      sql_print_error("Failed to flush master info file");
>>> +      DBUG_RETURN(1);
>>> +    }
>>> +    if (flush_relay_log_info(rli))
>>> +    {
>>> +       sql_print_error("Failed to flush master info file");
>>> +       DBUG_RETURN(1);
>>> +    }
>>> +  }
>>> +
>>> +  DBUG_RETURN(0);
>>> +}
>>>  
>>>  static void free_table_ent(TABLE_RULE_ENT* e)
>>>  {
>>>
>>>
>>> -- 
>>> MySQL Code Commits Mailing List
>>> For list archives: http://lists.mysql.com/commits
>>> To unsubscribe:    http://lists.mysql.com/commits?unsub=1
>>>     
>>>       
>>   
>>     
>
>
>   

Cheers.
Thread
bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705) Bug#40337Alfranio Correia29 Oct
  • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705) Bug#40337Andrei Elkin30 Oct
    • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)Bug#40337alfranio correia31 Oct
      • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)Bug#40337alfranio correia31 Oct
      • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705) Bug#40337Andrei Elkin31 Oct
  • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)Bug#40337He Zhenxing3 Nov
    • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)Bug#40337alfranio correia12 Nov
Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)Bug#40337He Zhenxing15 Nov
  • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)Bug#40337alfranio correia15 Nov
    • Re: bzr commit into mysql-5.0-bugteam branch (alfranio.correia:2705)Bug#40337He Zhenxing15 Nov