List:Commits« Previous MessageNext Message »
From:Mats Kindahl Date:February 20 2009 10:28am
Subject:Re: bzr commit into mysql-5.1-bugteam branch (luis.soares:2801)
Bug#39701
View as plain text  
Hi Luis!

Comments inline. Basically, the actual fix is OK, but the test needs re-writing.

Just my few cents,
Mats Kindahl

Luis Soares wrote:
> #At file:///home/lsoares/Workspace/mysql-server/bugfix/39701/5.1-bt/ based on
> revid:alexey.kopytov@stripped
> 
>  2801 Luis Soares	2009-02-17
>       BUG#39701: Mixed binlog format does not switch to row mode on LOAD_FILE
>       
>       MIXED mode binlog_format should switch to ROW mode whenever there is a risk of
> breaking 
>       replication. However, LOAD_FILE, which depends on a file on the same host, does
> not trigger 
>       a switch to ROW based mode. This leads to scenarios on which the slave
> replicates the 
>       statement with 'load_file' and it will try to load the file from local file
> system. Given that 
>       the file is mostly likely inexistent in slave filesystem the operation will not
> succeed 
>       (probably returning NULL), causing master and slave(s) to diverge.

Well... strictly speaking there is nothing that says that the master should
switch to row format because of this, since it doesn't work for statement-based
replication.

I think you should describe the situation first (including that it does not work
for statement-based replication), and then say that for mixed mode, we can make
this work by switching the row-format.

>       This patch addresses this bug by marking the load_file function as unsafe. When
> in mixed mode 
>       logging, this will make the procedure that decides on which logging format to
> use to be aware of
>       the load_file function and decide accordingly.

First sentence is fine, but the rest is kind of vague. What does "decide
accordingly" mean? That is, what is the decision?

>       added:
>         mysql-test/suite/rpl/r/rpl_mix_load_file.result
>         mysql-test/suite/rpl/t/rpl_mix_load_file.test
>       modified:
>         sql/item_create.cc
> 
> per-file messages:
>   mysql-test/suite/rpl/t/rpl_mix_load_file.test
>     Simple test to assert that the patch fixes this bug.
>   sql/item_create.cc
>     Added the set_stmt_unsafe call to lex.
> === added file 'mysql-test/suite/rpl/r/rpl_mix_load_file.result'
> --- a/mysql-test/suite/rpl/r/rpl_mix_load_file.result	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/r/rpl_mix_load_file.result	2009-02-17 00:49:33 +0000
> @@ -0,0 +1,29 @@
> +stop slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +reset master;
> +reset slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +start slave;
> +SELECT repeat('x',20) INTO OUTFILE
> '/home/lsoares/Workspace/mysql-server/bugfix/39701/5.1-bt/mysql-test/var/tmp/bug_39701.data';

You cannot have your path in the result file, it will cause a result mismatch on
other machines. Use replace_string to replace (part of) the path with something
machine neutral.

> +DROP TABLE IF EXISTS t1;
> +Warnings:
> +Note	1051	Unknown table 't1'

Turn of the warnings for this statement.

Since you write "IF EXISTS" it means that the table may or may not exist, and in
the even that it did exist, you will not get a warning and hence a result
mismatch for the test.

> +CREATE TABLE t1 (t text);
> +INSERT INTO t1 VALUES
> (load_file('/home/lsoares/Workspace/mysql-server/bugfix/39701/5.1-bt/mysql-test/var/tmp/bug_39701.data'));;

Same here.

> +INSERT INTO t1 VALUES ('1');
> +SET BINLOG_FORMAT=STATEMENT;

Don't set the format explicitly in the test, use the directive

	source include/have_binlog_format_statement.inc;

in the beginning of the file to just run the test under statement mode.
Otherwise, you will run the test three times: once for each mode, which is
unnecessary.

> +INSERT INTO t1 VALUES (LOAD_FILE('$MYSQL_TMP_DIR/inexistent'));
> +Warnings:
> +Warning	1592	Statement is not safe to log in statement format.
> +show binlog events from <binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000001	#	Query	#	#	use `test`; DROP TABLE IF EXISTS t1
> +master-bin.000001	#	Query	#	#	use `test`; CREATE TABLE t1 (t text)
> +master-bin.000001	#	Query	#	#	use `test`; BEGIN
> +master-bin.000001	#	Table_map	#	#	table_id: # (test.t1)
> +master-bin.000001	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000001	#	Query	#	#	use `test`; COMMIT
> +master-bin.000001	#	Query	#	#	use `test`; INSERT INTO t1 VALUES ('1')
> +master-bin.000001	#	Query	#	#	use `test`; INSERT INTO t1 VALUES
> (LOAD_FILE('$MYSQL_TMP_DIR/inexistent'))

We avoid showing the contents of the binary log in replication tests and do that
in the binlog test suite. The reason is that by showing contents of the binary
log, any tests have to be updated whenever there is a minor change of the format
which does not affect replication. Another reason is that the binary log is
different between formats, which means that it is necessary to write three
different tests instead of a single one where just the effects of replication is
compared (and the actual contents of the binary log is disregarded).

Since this is more an issue of how the LOAD_FILE() is logged, I would suggest to
move this test to the binlog suite, which means that any checks with the slave
is skipped.

> +Comparing tables master:test.t1 and slave:test.t1
> +DROP TABLE t1;
> 
> === added file 'mysql-test/suite/rpl/t/rpl_mix_load_file.test'
> --- a/mysql-test/suite/rpl/t/rpl_mix_load_file.test	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/t/rpl_mix_load_file.test	2009-02-17 00:49:33 +0000
> @@ -0,0 +1,62 @@
> +#  BUG#39701: Mixed binlog format does not switch to row mode on LOAD_FILE
> +#
> +#  DESCRIPTION
> +#
> +#    Problem: when using load_file string function and mixed binlogging format
> +#             there was no switch to row based binlogging format. This leads
> +#             to scenarios on which the slave replicates the statement and it
> +#             will try to load the file from local file system, which in most
> +#             likely it will not exist.
> +#
> +#    Solution:
> +#             Marking this function as unsafe for statement format, makes the
> +#             statement using it to be logged in row based format. As such, data 
> +#             replicated from the master, becomes the content of the loaded file.
> +#             Consequently, the slave receives the necessary data to complete
> +#             the load_file instruction correctly.
> +#
> +#  IMPLEMENTATION
> +#
> +#    The test is implemented as follows:
> +#
> +#      On Master,
> +#       i) write to file the desired content.
> +#      ii) create table, and load into it the file contents
> +#     iii) execute load_file and safe statement (wrt to mixed mode binlogging)
> +#      iv) show binlogevents to assert that tere are rows for load_file and
> +#          statement for safe statement from iii)
> +#       v) force format to statement and check that the statement is still
> +#          logged, but this time with a warning.
> +#      vi) remove the file used in load_file
> +#
> +#      On Slave,
> +#     vii) check that it has replicated the correct values from master (should 
> +#          load_file not switch to row mode, then slave differs from master, 
> +#          because the file was removed in vi) and the load_file returns NULL
> +#          when file does not exist) 
> +#
> +
> +--source include/master-slave.inc
> +--source include/have_binlog_format_mixed.inc
> +
> +let $file= $MYSQL_TMP_DIR/bug_39701.data;
> +--eval SELECT repeat('x',20) INTO OUTFILE '$file'
> +
> +DROP TABLE IF EXISTS t1;
> +CREATE TABLE t1 (t text);
> +--eval INSERT INTO t1 VALUES (load_file('$file'));
> +INSERT INTO t1 VALUES ('1');
> +
> +SET BINLOG_FORMAT=STATEMENT;
> +INSERT INTO t1 VALUES (LOAD_FILE('$MYSQL_TMP_DIR/inexistent'));
> +--source include/show_binlog_events.inc
> +
> +remove_file $file;
> +sync_slave_with_master;
> +
> +let $diff_table_1=master:test.t1;
> +let $diff_table_2=slave:test.t1;
> +--source include/diff_tables.inc

The test is significantly simpler if you just check the contents of the binary
log. Since you only care about it for mixed and row, you can write a single test
for both modes (there are some other binlog tests in a similar vein, so you can
check those).

> +
> +DROP TABLE t1;
> +sync_slave_with_master;
> 
> === modified file 'sql/item_create.cc'
> --- a/sql/item_create.cc	2008-04-11 03:27:24 +0000
> +++ b/sql/item_create.cc	2009-02-17 00:49:33 +0000
> @@ -3791,6 +3791,7 @@ Create_func_load_file Create_func_load_f
>  Item*
>  Create_func_load_file::create(THD *thd, Item *arg1)
>  {
> +  thd->lex->set_stmt_unsafe();
>    thd->lex->uncacheable(UNCACHEABLE_SIDEEFFECT);
>    return new (thd->mem_root) Item_load_file(arg1);
>  }

This is fine. :)

-- 
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com

Thread
bzr commit into mysql-5.1-bugteam branch (luis.soares:2801) Bug#39701Luis Soares17 Feb 2009
  • Re: bzr commit into mysql-5.1-bugteam branch (luis.soares:2801)Bug#39701Andrei Elkin19 Feb 2009
  • Re: bzr commit into mysql-5.1-bugteam branch (luis.soares:2801)Bug#39701Mats Kindahl20 Feb 2009