List:Commits« Previous MessageNext Message »
From:Luís Soares Date:March 26 2010 11:12am
Subject:Re: WL#5096 review
View as plain text  
Hi Sven,

On Wed, 2009-12-02 at 19:07 +0100, Sven Sandberg wrote:
> Hi Luís,
> 
> This is a *partial* review of your worklog. I haven't had time to look 
> at everything, but since I'm away Thu and Fri, I send it now to make the 
> work more pipelined. If you prefer to have the whole review in one 
> email, let me know and I'll just extend on this when I continue with it 
> next week.
> 
> Btw, thanks for the improvements, I think it became much easier to 
> follow the logic now.

Well... Thank you for your good suggestions!

> 
> (S11) Test the default value of binlog_row_image (see comment inline)

Done.

> (S12) Move diff_tables2.inc to diff_tables.inc (see comment inline)

Done.

> (S13) Remove duplicate test rpl_record_find.test (see comment inline)

Removed.

> (S14) Refactor rpl_row_img_sanity_set_img.test (see comment inline)

Removed and refactored rpl_row_img_set.inc to be used in all
scenarios.

> (S15) Please don't refer to review comments in test cases. The comments 
> should be self-contained and just describe the current version of the 
> test. How it worked in previous versions of the tests and how it changed 
> after review comments will not be interesting for anyone else than you 
> and me or at any other point in time than now. (see comment inline)

Removed.

> (S16) Ensure that the matching part of the output from mysqlbinlog 
> really belongs to the previous statement in 
> rpl_row_mysqlbinlog_img_check.inc. (See comments inline.)

Refactored. Only dump and parse binary log from the position that
relates to the statement executed.

> (S17) Move code repeated in rpl_row_img_sanity.test into 
> rpl_row_mysqlbinlog_img_check.inc to avoid code repetition. (see comment 
> inline)

Moved. (Small impact here: mysqlbinlog will be called per include)

> (S18) Add comments to the top of all files (see comments inline)

Added.

> (S19) rpl_row_img_set.inc is a generic file used for the chained 3 hosts 
> setup - maybe it should be named rpl_chained_3_hosts_set_img.inc or 
> something like that? Also, what's the logic of the division into 
> mysql-test/extra/rpl_tests vs mysql-test/suite/rpl/include ? Can't we 
> have all files in one place?

rpl_row_img_set.inc is now usable in all scenarios.
I usually reason like this:
  - accessories/auxiliar stuff put in mysql-test/suite/rpl/include
  - more strictly related to testing (assertion wise) put in
    mysql-test/extra/rpl_tests and mysql-test/suite/rpl/t

> (S20) Suggested simplification (?) of rpl_row_mysqlbinlog_img_check.inc 
> (see comments inline)

Thanks for suggestion. Done.

The incremental commit is at:
http://lists.mysql.com/commits/104427

It is based on mysql-5.1-rpl-wl5092 tree, available
at the usual URL ;) .

See some more comments inline.

> 
> Luís Soares wrote:
> >> Suggestions:
> >>
> >> (S1) Please state for each test (1) what server property is tested; (2) 
> >> how is it tested (see comments inline)
> >>
> >> (S2) I think a little refactoring of rpl_row_img_*.test would make the 
> >> test even more transparent (see comments inline)
> >>
> >> (S3) I refuse to review rpl_row_sanity :-) Seriously, I'm impressed that 
> >> you could write something that looks so complex, but I think it is 
> >> possible to refactor it to make it more easy to decipher. See comments 
> >> inline.
> > 
> > Ok.
> > 
> >> (S4) I think some tests are missing. In particular, can you add tests 
> >> for the following:
> >>
> >>   (S4-1) In rpl_row_sanity, please make sure all cases listed in the 
> >> tables in WL#5072 are tested. Currently, it misses e.g., the autoinc case.
> > 
> > Hmm... OK. WL#5092 does not change anything regarding autoincs, but we
> > should check them anyway. Done.
> > 
> >>   (S4-2) Test that a slave with missing columns does not write columns 
> >> that it does not have
> > 
> > OK.
> > 
> >>   (S4-3) Test that if master has binlog_row_image=minimal and slave has 
> >> binlog_row_image=noblob|full, then slave logs the correct columns.
> > 
> > OK.
> > 
> >>   (S4-4) Test that an error is generated when the slave does not have 
> >> the columns included in the AI.
> > 
> > OK.
> > 
> >>   (S4-5) Test that correct default values are inserted in extra slave 
> >> columns.
> > 
> > OK
> > 
> >>   (S4-6) Test that extra master columns are correctly ignored by the 
> >> slave as long as there is at least one usable column on the slave. E.g.:
> >>
> >>    master> CREATE TABLE t1 (a INT NOT NULL, b INT, c INT
> >>                             PRIMARY KEY (a,c), UNIQUE KEY (a));
> >>    slave> CREATE TABLE t1 (a INT NOT NULL, b INT,
> >>                            UNIQUE KEY (a));
> >>
> >> AIUI, the above should always replicate correctly even with 
> >> binlog_row_image=minimal?
> > 
> > Yes, it should. The slave will receive a BI with data for columns 
> > 'a' and 'c'. Since it knows nothing about 'c' it will use the available
> > data for 'a' so that the unique index is used. I have added a case
> > to cover this.
> > 
> >>   (S4-7) Test that the default value for binlog_row_image is 'full'.
> > 
> > Hmmm.... I believe that this was already covered implicitly in the
> > sanity test. However, I've made it explicit.
> 
> (S11) I meant that the test should do:
> 
> SELECT @@global.binlog_row_image;
> 
> I couldn't see this in the new patch.

Added to the beginning of rpl_row_img_sanity.test

> > 
> >> (S5) Instead of the connect/disconnect hack in 
> >> rpl_chained_3_hosts_diff_tables.inc , could you make 
> >> include/diff_tables.inc capable of diffing tables in arbitrary 
> >> connections? I think diff_tables.inc should also reset the original 
> >> connection at the end. (Ah, finally! I've been waiting for a chance to 
> >> scratch this itch ever since BUG#35701 was fixed :-)
> > 
> > I can, and I have. However, I would prefer to create diff_tables2.inc,
> > so that we don't cause much disruption with what is currently accepted
> > among developers wrt diff_tables.inc.
> 
> (S12) All use cases of the old version are covered by the new version, 
> so I don't think there is any need to fork. For test cases, we don't 
> need to maintain backward compatibility, we can just make it work in the 
> best possible way. Could you remove the old test case and rename the new 
> one to diff_tables.inc?

OK. Moved the content of diff_tables2.inc to diff_tables.inc and removed
the former one.

> > 
> >> (S6) Some tests seem redundant and/or it is unclear what is being 
> >> tested. See comments inline.
> > 
> > OK.
> > 
> >> (S7) Possible typo, see comments inline.
> > 
> > OK.
> > 
> >> (S8) Remove DELAYED from INSERT DELAYED (see comments inline)
> > 
> > No, I don't want to remove it. I'll explain inline.
> 
> OK
> 
> > 
> >> (S9) It looks like a lot of code is duplicated between 
> >> rpl_record_find.test and rpl_row_img.test. Isn't rpl_record_find.test a 
> >> special case of rpl_row_img.test, only with one server less? If not, 
> >> please explain!
> > 
> > rpl_record_find.test was part of BUG#40045 which I had to backport. As
> > such, I want to keep it.
> 
> (S13) AFAICS all that we test for BUG#40045 in rpl_record_find.test is 
> also tested in rpl_row_img.test. I think there is too much duplication 
> between the tests to keep both. If rpl_record_find.test has anything 
> that rpl_row_img.test does not have, can it be merged to rpl_row_img.test?

I have double checked and rpl_row_record_find seems to be
superseded by rpl_row_img test. So I am OK with removing
mostly everything on rpl_row_record_find. However, I kept
the two sanity checks (the ones used in the bug report) 
for BUG#40007 and BUG#40045 (see: rpl_row_record_find_myisam).

> 
> 
> rpl_row_img_sanity_set_img:
> > -- let $old_con= $CURRENT_CONNECTION
> > 
> > if (`SELECT $iteration = 1`)
> > {
> >   -- connection master
> >   -- echo ON MASTER
> >   # global so that flush tables restarts the 
> >   # insert dealyed thread
> >   SET GLOBAL binlog_row_image='FULL';
> >   SET SESSION binlog_row_image='FULL';
> >   SHOW VARIABLES LIKE 'binlog_row_image';
> >   FLUSH TABLES;
> > 
> >   -- connection slave
> >   -- echo ON SLAVE
> >   # global so that SQL thread also gets the change
> >   SET GLOBAL binlog_row_image='FULL';
> >   SET SESSION binlog_row_image='FULL';
> >   SHOW VARIABLES LIKE 'binlog_row_image';
> > }
> [...]
> 
> (S14) The code above is duplicated three times. I would suggest to refactor:
> 
>    if (`SELECT $iteration = 1`) {
>      --set $img_type= 'FULL'
>    }
>    if (`SELECT $iteration = 2`) {
>      --set $img_type= 'NOBLOB'
>    }
>    if (`SELECT $iteration = 3`) {
>      --set $img_type= 'MINIMAL'
>    }
>    [...]
>    --eval SET GLOBAL binlog_row_image= $img_type
> 
> etc. Also, this file is only used once. Can't it be copied to the place 
> where it's used? It is slightly confusing since this file does almost 
> exactly the same as rpl_row_img_set.inc
> 

Removed rpl_row_img_sanity_set_img.inc and made 
rpl_row_img_set generic. The latter is now shared between two 
tests scripts: rpl_row_img_sanity and rpl_row_img.test (inside one
of its includes).

> 
> rpl_row_img_set.inc:
> > -- connection mysqld_a
> 
> (S18) Please add a (short) description of what this file does and how to 
> use it.

Added.

> > -- eval SET SESSION binlog_row_image= $row_image_param
> > -- eval SET GLOBAL binlog_row_image= $row_image_param
> > FLUSH TABLES;
> > SHOW VARIABLES LIKE 'binlog_row_image';
> > 
> > -- connection mysqld_b
> > -- eval SET SESSION binlog_row_image= $row_image_param
> > -- eval SET GLOBAL binlog_row_image= $row_image_param
> > -- source include/stop_slave.inc
> > -- source include/start_slave.inc
> > FLUSH TABLES;
> > SHOW VARIABLES LIKE 'binlog_row_image';
> > 
> > -- connection mysqld_c
> > -- eval SET SESSION binlog_row_image= $row_image_param
> > -- eval SET GLOBAL binlog_row_image= $row_image_param
> > -- source include/stop_slave.inc
> > -- source include/start_slave.inc
> > FLUSH TABLES;
> > SHOW VARIABLES LIKE 'binlog_row_image';
> > 
> > -- connection mysqld_a
> > 
> 
> 
> 
> rpl_row_mysqlbinlog_img_check.inc:

Renamed to rpl_row_img_parts_assertion.inc.

> > -- perl END_OF_FILE
> > 
> 
> (S17) I suggest to add this to the top of rpl_row_mysqlbinlog_img_check.inc:
> 
>    -- sync_slave_with_master
>    -- let $MYSQLD_DATADIR= `select @@datadir;`
>    -- exec $MYSQL_BINLOG -v $MYSQLD_DATADIR/slave-bin.000001 > $TMP_FILE
> 
> Then these commands can be eliminated from the rpl_row_img_sanity.test, 
> where they are repeated a lot.

Done. However, I must note that now mysqlbinlog is called on every
rpl_row_img_parts_assertion.inc include. I think that's probably one
thing we can't escape if we want to reduce the scope of the pattern
matching (I provide details later on in this email).

> 
> (S18) Please also add a comment to the top of the file, explaining what 
> it does and how to use it.

Added.

> > sub match_imgs
> > {
> >   my ($contents, $ai_ref, $bi_ref, $stmt_type) = @_;
> > 
> >   my $match= 0;
> >   my $error= 0;
> > 
> >   %ai= %{$ai_ref};
> >   %bi= %{$bi_ref};
> > 
> >   my $composed_bi= "";
> >   for my $k (sort (keys(%bi)))
> >   {
> >     $composed_bi = $composed_bi . "###   \@$k=" . $bi{$k} . "\n";
> >   }
> > 
> >   my $composed_ai= "";
> >   for my $k (sort (keys(%ai)))
> >   {
> >     $composed_ai = $composed_ai . "###   \@$k=" . $ai{$k} . "\n";
> >   }
> > 
> >   if($stmt_type=~/INSERT/)
> >   {
> >     $match= ($contents =~ /### INSERT.*\n### SET\n$composed_ai# at/m) ? 1 : 0;
> >   }
> >   elsif ($stmt_type=~/UPDATE/)
> >   {
> >     $match= ($contents =~ /### UPDATE.*\n### WHERE\n$composed_bi###
> SET\n$composed_ai# at/m) ? 1 : 0;
> >   }
> >   elsif ($stmt_type=~/DELETE/)
> >   {
> >     $match= ($contents =~ /### DELETE.*\n### WHERE\n$composed_bi# at/m) ? 1 :
> 0;
> >   }
> >   else
> >   {
> >     $error= 1;
> >   }
> 
> (S20) This is mostly a matter of style, but here is a version that takes 
> a shorter syntax as input:

Thanks for this. I think it is more clear as you put it. I have merged
your suggestion. Again thanks.

>    # Save IMG_EXPECTED in $img and check if it has correct format
>    my $img = $ENV{'IMG_EXPECTED'};
>    $img =~ /^([0-9]+:\S+ )*\|( [0-9]+:\S+)*$/
>      or die "Invalid format of IMG_EXPECTED parameter";
>    # Turn $img into the format of the binlog, and get BI and AI
>    $img =~ s/ *([0-9]+):(\S*) */###   \@$1=$2\n/g;
>    my ($ai, $bi)= split(/\|/, $img);

I reversed ($ai and $bi) here. Makes it more intuitive while 
reading/writing the expected values before the include... 
Any special reason for you to put it $ai, $bi ?

>    # Generate regular expression
>    if ($ai) {
>      if ($bi) {
>        $pattern= "### UPDATE.*\n### WHERE\n$bi### SET\n$ai";
>      } else {
>        $pattern= "### INSERT.*\n### SET\n$ai";
>      }
>    } else {
>      $pattern= "### DELETE.*\n### WHERE\n$bi";
>    }
>    $match= ($contents =~ /$pattern/);
> 
> 
> The point is, we could abbreviate this:
> 
>    -- let IMG_STMT_TYPE=UPDATE
>    -- let IMG_BI_EXPECTED={1=>1, 3=>"'foo'"}
>    -- let IMG_AI_EXPECTED={2=>4}
> 
> ... to this:
> 
>    -- let IMG_EXPECTED=1:1 3:'foo' | 2:4
> 
> ... which is more brief.
> 
> Hmm, I don't know if this is better or anything, use whatever version 
> you prefer. :-)

I'll take yours... :)

> > 
> >   if ($error)
> >   {
> >     print "Unexpected statement type: $stmt_type.\n"
> >   }
> > 
> >   if (!$match)
> >   {
> >     print "Unexpected columns in $stmt_type.\n";
> >     print "EXPECTED BEFORE IMAGE:\n$composed_bi\nEXPECTED AFTER
> IMAGE:\n$composed_ai\nBINLOG CONTENTS:" . $contents;
> >   }
> > 
> >   return $match;
> > }
> > 
> > 
> > my $ai;
> > my $bi;
> > 
> > if (not defined $ENV{'IMG_AI_EXPECTED'})
> > {
> >   $ai= {};
> > }
> > else
> > {
> >   $ai= eval ($ENV{'IMG_AI_EXPECTED'}) or die("Invalid AI parameter: ". 
> $ENV{'IMG_AI_EXPECTED'});
> > }
> > 
> > if (not defined $ENV{'IMG_BI_EXPECTED'})
> > {
> >   $bi= {};
> > }
> > else
> > {
> >   $bi= eval ($ENV{'IMG_BI_EXPECTED'}) or die("Invalid BI parameter: ". 
> $ENV{'IMG_BI_EXPECTED'});
> > }
> > 
> > my $binlog= $ENV{'IMG_BINLOG_FILE'};
> > my $stmt_type= $ENV{'IMG_STMT_TYPE'};
> > 
> > open(FILE, "$binlog") or die("Unable to open $binlog: $!\n");
> > my $contents = do { local $/; <FILE> };
> > close(FILE) or die("Unable to close file.");
> > 
> > match_imgs($contents, $ai, $bi, $stmt_type);
> > 
> > END_OF_FILE
> 
> (S16) I suggest to add "RESET MASTER" to the bottom of 
> rpl_row_mysqlbinlog_img_check.inc. See comment (S16) in 
> rpl_row_img_sanity.test, below.

I think it's better to take other approach. Let this script 
take the start position of the binary log for the event one
is scanning. (If we have the need we can later had the stop
position as well). 

Then we won't risk removing the binary log each time this
file is included. Also... calling reset master unconditionally
may result in undefined behavior (what if there is a slave 
attached and replicating by the time one does it?).

So instead of calling RESET MASTER, I have added the start
position as an argument to reduce the output of mysqlbinlog
therefore reducing the scope of the pattern matching.

> 
> 
> rpl_row_img_sanity.test:
> > #
> > #  Description
> > #  ===========
> > #
> > #  This test case checks whether binlog files contain
> > #  Before and After image values as expected.
> > #
> > #  Configuration is done using the --binlog-row-image
> > #  variable.
> > #
> > #  How it works
> > #  ============
> > #
> > #  The test case is implemented such that master and slave
> > #  create a table with different types of keys. Then, deletes
> > #  and updates are performed in the master. The resulting
> > #  output of mysqlbinlog -v issued on the binlogs generated 
> > #  is then searched for the pattern expected wrt before and 
> > #  after images.
> > #
> > #  Some special cases are also considered. A small description
> > #  for each one is provided inline.
> > #
> > #  See also WL#5096.
> > #
> > #  NOTE: cases marked with S#(-#) address review comments in:
> > #        http://lists.mysql.com/commits/89844        
> 
> (S15) Please don't refer to review comments. Instead, state explicitly 
> what properties we test.

Removed.

> > 
> > -- source include/master-slave.inc
> > -- source include/have_binlog_format_row.inc
> > 
> > -- connection slave
> > call mtr.add_suppression("Slave: Can\'t find record in \'t\' Error_code:
> 1032");
> > -- connection master
> > 
> > -- let $TMP_FILE= $MYSQLTEST_VARDIR/tmp/rpl_row_img_sanity.tmp
> > -- let IMG_BINLOG_FILE= $TMP_FILE
> > 
> > -- source include/master-slave-reset.inc
> > -- connection master
> > 
> > -- echo #####################################################
> > -- echo # (S4-7) basic assertion that binlog_row_image='FULL' is the
> > -- echo ######################################################
> 
> (S15) Please avoid the reference to (S4-7), it's not understandable 
> unless you read the review email. Better to describe what is tested.

Removed.

> > 
> > CREATE TABLE t (c1 int, c2 int, c3 blob, primary key(c1));
> > INSERT INTO t(c1,c3) VALUES (1, 'a');
> > UPDATE t SET c1=2 WHERE c1=1;
> > DELETE FROM t;
> > 
> > -- sync_slave_with_master
> > 
> > -- let $MYSQLD_DATADIR= `select @@datadir;`
> > -- exec $MYSQL_BINLOG -v $MYSQLD_DATADIR/slave-bin.000001 > $TMP_FILE
> > 
> > -- let IMG_STMT_TYPE=INSERT
> > -- let IMG_AI_EXPECTED={1=>1, 2=>NULL, 3=>"\'a\'"}
> 
> (Just a style comment: single quotes withing double quotes don't need to 
> be escaped in perl. We can just write 3=>"'a'".)

OK.

> > -- source suite/rpl/include/rpl_row_mysqbinlog_img_check.inc
> 
> (S16) Note that the test currently does not check *which* statement 
> generated the INSERT in the binlog. If we execute 
> rpl_row_mysqlbinlog_img_check.inc after we have executed N statements, 
> then it will report succes if any of the N statements matches.
> 
> I suggest that we instead execute rpl_row_mysqlbinlog_img_check.inc 
> immediately after the INSERT statement, and run RESET MASTER at the end 
> of rpl_row_mysqlbinlog_img_check.inc. Then we are sure that the matching 
> output from mysqlbinlog really belongs to the last INSERT statement and 
> not to any other INSERT statement further up in the file. See also 
> comment (S16) in rpl_row_mysqlbinlog_img_check.inc, above.

Moved every call to right after the statement execution.
mysqlbinlog output is controlled by specifying the start position
instead of calling RESET MASTER.

> > 
> > -- let IMG_STMT_TYPE=UPDATE
> > -- let IMG_BI_EXPECTED={1=>1, 2=>NULL, 3=>"\'a\'"}
> > -- let IMG_AI_EXPECTED={1=>2, 2=>NULL, 3=>"\'a\'"}
> > -- source suite/rpl/include/rpl_row_mysqbinlog_img_check.inc
> > 
> > -- let IMG_STMT_TYPE=DELETE
> > -- let IMG_BI_EXPECTED={1=>2, 2=>NULL, 3=>"\'a\'"}
> > -- source suite/rpl/include/rpl_row_mysqbinlog_img_check.inc
> > 
> [...]
> 

Thanks Sven for thorough review.

Regards,
Luís


Thread
WL#5096 reviewSven Sandberg9 Nov
  • Re: WL#5096 reviewLuís Soares24 Nov
    • Re: WL#5096 reviewLuís Soares25 Nov
    • Re: WL#5096 reviewSven Sandberg2 Dec
      • Re: WL#5096 reviewLuís Soares26 Mar
        • Re: WL#5096 reviewSven Sandberg30 Mar
          • Re: WL#5096 reviewLuís Soares30 Mar