List:Commits« Previous MessageNext Message »
From:Mats Kindahl Date:December 13 2007 7:45pm
Subject:Re: bk commit into 5.1 tree (sven:1.2651) BUG#32407
View as plain text  
Hi Sven!

I'll read the new patch. Here are some comments on the comments.

Just my few cents,
Mats Kindahl

Sven Sandberg wrote:
> Hi Mats!
>
> Many thanks for a very thorough review! See my comments below.
>
> /Sven
>
>
> Mats Kindahl wrote:
>   
[snip]
>>>   
>>>       
>> "should"?!
>>
>> If you are unsure if you can make the refactored code behave
>> identically, then refactoring is dangerous.
>>     
>
> Sorry: it *is* functionally identical, except:
>  - the return value, which is documented above, and
>  - skipping is now optional (the parameter 'skip'). See further down,
> and attached diff.
>   

I suspected that. :)

> I'll be more clear in the re-commit.
>   

Good.

[snip]

>> Instead of introducing a new variable, you could check the mode. In AUTO
>> and ALWAYS mode, a FD event shall have been printed earlier, so by just
>> checking the mode, you can decide if you should print an error or not.
>>     
>
> This is technically possible; however, I think the flag is more readable
> and avoids a dependency in the code. The flag is set in the place where
> the event is printed, which is what we need to check. So verifying
> correctness is trivial. The alternative you suggest depends on the logic
> that decides when FD events are printed. This logic is complicated, and
> if we ever change that, we would have to rewrite this part of the code.
>   

OK.

[snip]

>> and don't give advice on how the option should
>> be used in the description of the option, since this makes it difficult
>> to maintain. The documentation is for giving suggestions on how and when
>> to use options, not the code. IMHO, the option description should
>> contain what the option does, and nothing else.
>>     
>
>   

[snip]

> This is a very dangerous option - see also BUG#18337. I'd suggest we
> keep the warning. This flag is useful only in our tests (we use it
> because it removes all traces of timestamps, so when the output of
> mysqlbinlog ends up in a result file, the result file is deterministic).
> Users really should not use this flag. Possible user scenario:
>
>  - User writes script that pipes output of mysqlbinlog to client
>  - User wants to optimize their script by removing unnecessary output.
>  - User sees "--short-form" flag which removes comments. User tries it
> on a binlog with no row events, and it seems to do this.
>  - Later, user switches to row-based replication, and the script breaks.
>
> At some point, I think we should deprecate the short-form flag and
> introduce two safer flags like this:
>  --no-comments                Do not print comments.
>  --deterministic-output flag  Print all events as if they happened
>                               at time 0.
>   

OK.

[snip]

>> It is more useful to have the actual value of the event type offset in
>> the debug output.
>>     
>
> I got the opposite comment in another review, and care very little
> myself, so to please everyone I will do
>
> DBUG_PRINT("info",("buf[EVENT_TYPE_OFFSET=%d]=%d",
>                    EVENT_TYPE_OFFSET, buf[EVENT_TYPE_OFFSET]));
>   

:)

>   
>>>        /* always test for a Start_v3, even if no --start-position */
>>> -      if (buf[4] == START_EVENT_V3)       /* This is 3.23 or 4.x */
>>> +      if (buf[EVENT_TYPE_OFFSET] == START_EVENT_V3)
>>>        {
>>> +        /* This is 3.23 or 4.x */
>>>          if (uint4korr(buf + EVENT_LEN_OFFSET) <             
>>> (LOG_EVENT_MINIMAL_HEADER_LEN + START_V3_HEADER_LEN))
>>>          {
>>> @@ -1384,8 +1452,9 @@ Could not read entry at offset %lu : Err
>>>        }
>>>        else if (tmp_pos >= start_position)
>>>          break;
>>> -      else if (buf[4] == FORMAT_DESCRIPTION_EVENT) /* This is 5.0 */
>>> +      else if (buf[EVENT_TYPE_OFFSET] == FORMAT_DESCRIPTION_EVENT)
>>>        {
>>> +        /* This is 5.0 */
>>>          Format_description_log_event *new_description_event;
>>>          my_b_seek(file, tmp_pos); /* seek back to event's start */
>>>          if (!(new_description_event= (Format_description_log_event*)
>>> @@ -1397,11 +1466,22 @@ Could not read entry at offset %lu : Err
>>>  at offset %lu ; this could be a log format error or read error",
>>>                tmp_pos);          }
>>> -        delete *description_event;
>>> -        *description_event= new_description_event;
>>> +        if (opt_base64_output_mode == BASE64_OUTPUT_AUTO
>>> +            || opt_base64_output_mode == BASE64_OUTPUT_ALWAYS)
>>> +          /*
>>> +            process_event will delete *description_event and set it to
>>> +            the new one, so we should not do it ourselves in this
>>> +            case.
>>> +          */
>>>   
>>>       
>> If feels kind of strange to have one branch of the if-statement turning
>> over the responsibility of deleting the event to another function and
>> the other one not. Makes maintenance harder.
>>
>>     
>>> +          process_event(print_event_info, new_description_event,
>>> tmp_pos);
>>> +        else
>>> +        {
>>> +          delete *description_event;
>>> +          *description_event= new_description_event;
>>> +        }
>>>       
>> Don't you need the FD event event even for NEVER mode? Don't if work if
>> the FD event is always available? If you always set the FD event, it
>> makes this code easier and more generic and also gives the possibility
>> of using the FD event for checking versions even in NEVER mode.
>>     
>
> Logically, it's not about deleting but about switching. There is always
> a "current" FD event. The responsibility for creating it is at the top
> of check_header() / check_master_version() (depending on if we are
> printing a local or remote binlog). The responsibility for deleting it
> is... uhm... er... *should be* somewhere near the shutdown (currently
> it's not always done, that's a bug, but it's not my patch's fault).
> Between creation and deletion, the FD event may be switched any number
> of times. This is safe to do in a totally irresponsible way. Here,
> process_event will switch, but since we don't call process_event in
> NEVER mode, we have to switch manually.
>
> To summarize: there is one create and one delete in this function, and
> the new object replaces the old, so the sum is +-0.
>   

OK.

>   
>> Also, it seems that you are using the existence of the FD event as an
>> indicator for whether to print an error or not, and this logic will
>> break down if the FD event for some reason is needed in the future; for
>> example, if there is a need to do version checking even in NEVER mode.
>>
>> I would suggest to always set the FD event and look at the mode instead
>> to decide when an error should be printed. This makes the code simpler,
>> more generic, and less prone to introducing bugs with future changes.
>> This also rhymes well with the fact that preconditions should be as weak
>> as possible, but not weaker.
>>     
>
> What part of the code are you referring to here? I see no error checking
> here, and there is no place that tests the existence of FD events to
> check errors. Possibly a misunderstanding? Does the above paragraph make
> things more clear?
>   

Yes, definitely.

[snip]

>> Please enable the test. If mysqlbinlog contain a bug, we should report
>> it and fix it. If there is no bug reported for this, please create one.
>>     
>
> Basically, the bug is that we don't use exceptions ;-) More seriously,
> when the program dies abruptly is IMHO one of few times that it is
> usually better to *not* clean up - making sure that everything is freed
> is a significant amount of work, and the OS does it for us anyway.
>   

Not on Windows, sorry. Windows is notoriously bad at cleaning up after 
processes that die unexpectedly.


> Should I uncomment the test even if it fails? Doesn't it have to be
> disabled in the suite then?
>   

With a reported bug, yes.

[snip]

>>>  #ifdef MYSQL_CLIENT
>>> +enum enum_base64_output_mode {
>>> +  BASE64_OUTPUT_NEVER= 0,
>>> +  BASE64_OUTPUT_AUTO= 1,
>>> +  BASE64_OUTPUT_ALWAYS= 2,
>>> +  BASE64_OUTPUT_UNSPEC= 3,
>>> +  /* insert new output modes here */
>>> +  BASE64_OUTPUT_LAST_TYPE
>>> +};
>>>   
>>>       
>> Since variables declared with this enum defaults to 0 (unless auto, of
>> course), I would suggest to place the "UNSPEC" as 0, because that allows
>> you to get a sensible and detectable value if you forget to assign a
>> value to a variable of this type.
>>
>> Then, since the enumeration only needs a sequence of number, and it does
>> not matter which ones, I would say that you can remove the explicit 
>> values for the enumeration constants.
>>     
>
> These numbers are used as array indices in mysqlbinlog.cc, function
> get_one_option, case OPT_BASE64_OUTPUT_MODE, in the call to
> find_type_or_exit(). I know, it's horrible, but it's how the
> --binlog-format flag is handled for mysqld and I see no way to get rid
> of it using our current command line parsing library.
>   

OK, but I still suggest that you use UNSPEC first.

>> Skip the periods last in the sentences, they are not needed.
>>     
>
> I don't want to argue too much about this kind of things, but our
> internal coding guidelines explicitly state that there should be a
> period: see
> http://forge.mysql.com/wiki/MySQL_Internals_Coding_Guidelines#Commenting_Code
>   

It is not important. Do what you feel is best.

[snip]

>>>   
>>>       
>> I assume this is just a cut and paste of the code removed below. If that
>> is not the case, please tell me and I will take a more thorough look.
>>     
>
> It is a cut and paste with small modifications. See the attached patch:
> it is a 'diff -b' between the old fraction of exec_relay_log_event and
> the new apply_event_and_update_pos.
>   

OK.

[snip]

>>>  }
>>>       
>> If you add a prototype (declaration) for a new function early in the
>> file, and add the new function definition *last* in the file, you will
>> have significantly smaller and more readable diffs.
>>     
>
> I think patch readability is a matter of giving the right arguments to
> diff (use diff -c -b). Optimizing the code for patch readability should
> have low priority. IMHO, it is more important to have the functions
> organized in a logical manner in the file.
>   

Unimportant, do what you feel is best.

> --- apply-old	2007-12-13 14:31:38.000000000 +0100
> +++ apply-new	2007-12-13 14:32:17.000000000 +0100
> @@ -1,14 +1,18 @@
> -    int const type_code= ev->get_type_code();
> +int apply_event_and_update_pos(Log_event* ev, THD* thd, Relay_log_info* rli,
> +                               bool skip)
> +{
>      int exec_res= 0;
>  
> +  DBUG_ENTER("apply_event_and_update_pos");
> +
>      DBUG_PRINT("exec_event",("%s(type_code: %d; server_id: %d)",
> -                       ev->get_type_str(), type_code, ev->server_id));
> +                           ev->get_type_str(), ev->get_type_code(),
> +                           ev->server_id));
>      DBUG_PRINT("info", ("thd->options: %s%s; rli->last_event_start_time:
> %lu",
>                          FLAGSTR(thd->options, OPTION_NOT_AUTOCOMMIT),
>                          FLAGSTR(thd->options, OPTION_BEGIN),
>                          rli->last_event_start_time));
>  
> -
>      /*
>        Execute the event to change the database and update the binary
>        log coordinates, but first we set some data that is needed for
> @@ -40,6 +44,8 @@
>        ev->when= my_time(0);
>      ev->thd = thd; // because up to this point, ev->thd == 0
>  
> +  if (skip)
> +  {
>      int reason= ev->shall_skip(rli);
>      if (reason == Log_event::EVENT_SKIP_COUNT)
>        --rli->slave_skip_counter;
> @@ -66,6 +72,9 @@
>      DBUG_PRINT("skip_event", ("%s event was %s",
>                                ev->get_type_str(), explain[reason]));
>  #endif
> +  }
> +  else
> +    exec_res= ev->apply_event(rli);
>  
>      DBUG_PRINT("info", ("apply_event error = %d", exec_res));
>      if (exec_res == 0)
> @@ -95,6 +104,9 @@
>                      " Stopped in %s position %s",
>                      rli->group_relay_log_name,
>                      llstr(rli->group_relay_log_pos, buf));
> -        DBUG_RETURN(1);
> +      DBUG_RETURN(2);
>        }
>      }
> +
> +  DBUG_RETURN(exec_res ? 1 : 0);
> +}
>   


-- 
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com


Thread
bk commit into 5.1 tree (sven:1.2651) BUG#32407Sven Sandberg10 Dec
  • Re: bk commit into 5.1 tree (sven:1.2651) BUG#32407Mats Kindahl13 Dec
    • Re: bk commit into 5.1 tree (sven:1.2651) BUG#32407Sven Sandberg13 Dec
      • Re: bk commit into 5.1 tree (sven:1.2651) BUG#32407Mats Kindahl13 Dec