List:Commits« Previous MessageNext Message »
From:Mats Kindahl Date:May 13 2008 2:08pm
Subject:Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524
View as plain text  
Andrei Elkin wrote:
> Mats, hello.
>
>   
>> Hi Andrei!
>>
>> Comments inline below.
>>
>>     
>
> Let's talk it over.
>
>   
>> /Matz
>>
>> Andrei Elkin wrote:
>>     
>>

[snip]

>>> +          to WARNING_LEVEL
>>> + */
>>>  void
>>>  Slave_reporting_capability::report(loglevel level, int err_code,
>>>                                     const char *msg, ...) const
>>> @@ -13,6 +27,8 @@ Slave_reporting_capability::report(logle
>>>    va_list args;
>>>    va_start(args, msg);
>>>  +  if (level == ERROR_LEVEL && has_temporary_error(current_thd))
>>> +    level= WARNING_LEVEL;
>>>   
>>>       
>
>   
>> Why do you put the logic here? Isn't the what log level should be used
>> a decision to be made by the caller? This seems to put to many
>> features into the function, which previously had the role of just
>> reporting an error at a certain level.
>>     
>
> When I made the first patch where
> Rows_log_event,Query_log_event::do_apply_event
> composed the correct arg for rli->report(),
> I had thought there would be just two occurances where rli->report()
> needs to be careful.
> But overnight I realized a simple thing that load data is prone to the
> timeout, and moreover, there is the general problem.
>
> Anywhere in the slave sql thread execution there might happen a situation
> that we would not like to report as an error for some time.
> As there are many sources I think it's wiser to implement a general
> mechanism.
>   

I don't understand what you're saying here...

> The mechanism that can block reporting and resume it later could be as
> simple as the patch offers, which is in
>
> 1. check if has_temporary_error() and block (demote) the error from
>    showing up as an error;
>
> 2. raise is_fatal_error as soon as there is certainty that a blocked
>    error reporting needs resuming.
>
>   
>> Why is this a problem? 
>>     
>
> I hope you see my concern to avoid duplication of the same logics all
> around the code (mentioned above as done in the first patch).
>   

Duplicating the logic might be an issue, but changing the report() 
function for this purpose strikes me as fixing the problem at the wrong 
place.

Is there any particular reason to why you don't construct a special 
function for this file only with the functionality you want? That way, 
you avoid code duplication and the changes are kept local to the file, 
which eliminates the risk that you are changing behavior of code that 
should not be affected by this change. The other option is to get all 
the information from the THD instance instead of part of the error as 
parameter and part from the THD instance.

>   
>> This will make all errors reported through this
>> interface be demoted provided the current thread has a temporary
>> error, regardless of what is actually being reported.
>>     
>
> It should not. Let's look at your cases.
>
>   
>> This can cause
>> two strange situations, to a naïve user of the reporting function:
>>
>>   1. A different error than the error present for the current thread is
>>      passed to the reporting function. This can be a decision on the
>>      callers behalf to regard a certain temporary error as a fatal
>>      error, but the reporting function will demote this error to a
>>      warning level just because the caller has "forgotten" to set the
>>      "fatal error" flag.
>>     
>
> That's a good question.
>
> In other words, if there had been a temporary error and after that
> the 2nd error that is not meant to be temporary. 
> Is that the concern?
>   

No, that is not the case I'm describing. The report() function can be 
used to report any error, not just one that has been generated.

> The current handling in rli->report() would respect
> the current error, coming as an argument, so that the 2nd error
> remains a legal error as long as if has_temporary_error() confirms
> that.

... but what I'm describing is a case where has_temporary_error() looks 
at the last generated error and not at the error that was supplied to 
the reporting function.

> The latter function is interested only in the latest error.
>   

Yes, but that might not be what is passed to report().

> This observation tells me that has_temporary_error() is combinable
> with rli->report() that also deals only with the latest.
>   

Again, this is not the scenario I describe. You are changing the 
behavior of the report() function to *all* users, not just the case you 
are trying to "fix".

>   
>>   2. It is necessary to add code to prevent the error message to be
>>      repeated ad-nauseum, 
>>     
>
> there was no such intention, I change only for demotion.  But logics
> of rli->report() can be extended to add such a rule. Again, it would
> be an advantage to do that in the method not on the caller level.
>   

I cannot see any advantage. What would the advantage be?

>   
>>      which means that part of the logic for
>>      deciding if this should be a warning is in the caller and part is
>>      inside the reporting function. This split just introduces an
>>      unnecessary coupling between the caller and the callee.
>>     
>
> The caller has is_fatal_error always available to express his strong
> preference. And that idea has been worked so far. And then let's leave
> it to the caller but release the caller from expressing the soft
> opinion because that soft opinion is just a default.
>   

I don't understand this...

>   
>> I would like you to pass in the correct log-level to the reporting
>> function instead,
>>     
>
> I hope my comments will make it more clear the idea of this patch.
>   
>   
>> which will also keep the has_temporary_error()
>> function inside the file and not exposing it outwards.
>>
>>     
>
> You can compare one one-line hunk with what we would have to do -
> finding all case prone to the temprorary errors, duplicate the same dumb
> lines all over the cases, and after that sit and wait then there will
> show up yet another mtr->report() where the caller forgets to check
> the temporary status.
>   

The role of the reporting function is to report errors... *any* error, 
not just the latest generated error.

If you want to change the logic, then information should be collected so 
that the risk of incompatible information is minimal. The 
has_temporary_error() checks one error, and the error coming through the 
parameter list is another error. If they don't match, then something is 
wrong. Either you pass both items through as parameters *or* you get all 
information from the THD instance, but don't try to mix the two parts of 
the error, because they may or may not be compatible.

*If* you decide to take the information from the THD instance instead 
(*both* the has_temporary_error() information and the actual error code 
that has_temporary_error() has decided is the temporary error, then you 
should make sure to pass the THD structure as parameter since using 
"current_thd" is very expensive (it starts digging in the thread-local 
store).

Just my few cents,
Mats Kindahl

-- 
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com


Thread
bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Andrei Elkin7 May
  • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Mats Kindahl7 May
    • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Andrei Elkin8 May
      • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Mats Kindahl13 May
        • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Andrei Elkin13 May
          • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Andrei Elkin17 Jun 2009
            • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Mats Kindahl25 Jun 2009
              • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Andrei Elkin25 Jun 2009
                • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Mats Kindahl25 Jun 2009
                  • Re: bk commit into 5.1 tree (aelkin:1.2612) BUG#36524Andrei Elkin29 Jun 2009