Hi Zhenxing,
Thanks for fixing this, it's a hard bug and a creative solution. Your
analysis seems sound and I think the strategy to add more information to
the binlog is right. I think the architecture can be modified slightly
though (see next paragraph). There were also some minor mistakes in the
patch and some things that need polishing (see comments inline).
I think your suggestion on the bug page, to include the information in
Query_log_event, is feasible, and better than adding a user variable.
You can do it by adding a "status variable" to the query log event.
Status variables which are not known to a server version are just
ignored, so the change is safe for replicating between server versions.
See the comment starting "Here there could be code like" in
log_event.cc, and see the doxygen help for Query_log_event. You'll need
to add the variable to Query_log_event, update Query_log_event
constructors, Query_log_event::write, and
Query_log_event::do_apply_event, and update the doxygen comments for
Query_log_event.
/Sven
He Zhenxing wrote:
> #At file:///media/sda3/work/mysql/bzrwork/b37051/5.1-rpl-new/
>
> 2632 He Zhenxing 2008-07-25
> BUG#37051 Replication rules not evaluated correctly
>
> The problem of this bug is that we need to get the list of tables
> to be updated for a multi-update statement, which requires to
> open all the tables referenced by the statement and resolve all
> the fields involved in update in order to figure out the list of
> tables for update. However if there are replicate filter rules,
> some tables might not exist on slave and result in a failure
> before we could examine the filter rules.
>
> I think the whole problem can not be solved on slave alone,
> the master must record and send the information of tables
> involved for update to slave, so that the slave do not need to
> open all the tables referencec by the multi-update statement to
> figure out which tables are involved for udpate.
>
> So I added an internal user variable to store the
> value of table map for update on master and written it to binlog
> before the multi-update query log event. And on slave, it will
> try to get the value of this variable and use it to examine
> filter rules without opening any tables on slave.
> added:
> mysql-test/suite/rpl/r/rpl_bug37051.result
> mysql-test/suite/rpl/t/rpl_bug37051-slave.opt
> mysql-test/suite/rpl/t/rpl_bug37051.test
> modified:
> sql/log.cc
> sql/sql_class.h
> sql/sql_parse.cc
> sql/sql_update.cc
>
> per-file messages:
> sql/log.cc
> Write the user var event of table_map_for_update for multi-upate
> sql/sql_class.h
> add member table_map_for_update to THD
> sql/sql_parse.cc
> check filter rules by using table_map_for_update value
> sql/sql_update.cc
> save the value of table_map_for_update
> === added file 'mysql-test/suite/rpl/r/rpl_bug37051.result'
> --- a/mysql-test/suite/rpl/r/rpl_bug37051.result 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/r/rpl_bug37051.result 2008-07-25 08:23:42 +0000
> @@ -0,0 +1,29 @@
> +stop slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +reset master;
> +reset slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +start slave;
> +CREATE TABLE t1 (id int, a int);
> +CREATE TABLE t2 (id int, b int);
> +CREATE TABLE t3 (id int, c int);
> +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3);
> +INSERT INTO t2 VALUES (1, 1), (2, 2), (3, 3);
> +INSERT INTO t3 VALUES (1, 1), (2, 2), (3, 3);
> +UPDATE t2 LEFT JOIN t1 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> +UPDATE t1 LEFT JOIN t2 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> +set global sql_slave_skip_counter=1;
> +start slave;
> +UPDATE t1 NATURAL JOIN t2 SET b = a+100 where id = 1;
> +UPDATE t1 NATURAL JOIN t2 SET a = a+100 where id = 1;
> +set global sql_slave_skip_counter=1;
> +start slave;
> +UPDATE t3 LEFT JOIN t1 ON (t1.id = t3.id) SET c = c+1000 where t1.id = 1;
> +UPDATE t1 NATURAL JOIN t3 SET c = c+10000 where id = 1;
> +UPDATE t3 LEFT JOIN t2 ON (t2.id = t3.id) SET b=b+100000, c=c+100000 where t2.id =
> 1;
> +UPDATE t2 NATURAL JOIN t3 SET b=b+1000000, c=c+1000000 where id = 1;
> +SELECT * FROM t1;
> +id a
> +1 1
> +2 2
> +3 3
>
> === added file 'mysql-test/suite/rpl/t/rpl_bug37051-slave.opt'
> --- a/mysql-test/suite/rpl/t/rpl_bug37051-slave.opt 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/t/rpl_bug37051-slave.opt 2008-07-25 08:23:42 +0000
> @@ -0,0 +1 @@
> +--replicate-do-table=test.t1 --replicate-ignore-table=test.t2
> --replicate-ignore-table=test.t3
>
> === added file 'mysql-test/suite/rpl/t/rpl_bug37051.test'
I think you could make the test name describe the purpose of the test
better. Maybe rpl_ignore_table.test? (that's taken though, so if you
can't test what you want to test with the server options for the
existing test, you can always make it rpl_ignore_table_2.test :-)
> --- a/mysql-test/suite/rpl/t/rpl_bug37051.test 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/t/rpl_bug37051.test 2008-07-25 08:23:42 +0000
> @@ -0,0 +1,43 @@
> +source include/have_binlog_format_statement.inc;
> +source include/master-slave.inc;
Please begin each test case with comments describing:
- The purpose of the test (e.g., something like "Verify that
replication rules specified with --replicate-*-table are correctly
evaluated for UPDATE statements that modify some tables and use other
tables only for reading.")
- A description of how the test is implemented (e.g., something like
"The master creates tables t1, t2, t3, and the slave runs with
--replicate-do-table=t1, --replicate-ignore-table=t2,
--replicate-ignore-table=t3. It verifies the following situations:
UPDATE that modifies a do-table and reads an ignore-table, UPDATE
that... etc")
- A list of related bugs (BUG#37051).
> +
> +CREATE TABLE t1 (id int, a int);
> +CREATE TABLE t2 (id int, b int);
> +CREATE TABLE t3 (id int, c int);
> +
> +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3);
> +INSERT INTO t2 VALUES (1, 1), (2, 2), (3, 3);
> +INSERT INTO t3 VALUES (1, 1), (2, 2), (3, 3);
> +
> +# This will be ignored by slave
> +UPDATE t2 LEFT JOIN t1 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> +
> +# This will cause error on slave
> +UPDATE t1 LEFT JOIN t2 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> +connection slave;
Please do "--echo [connection slave]" each time the connection changes
(connection slave, connection master, and sync_slave_with_master). That
makes the result file more readable.
> +source include/wait_for_slave_sql_to_stop.inc;
For clarity, I think it would be even better to do 'source
include/wait_for_slave_sql_error.inc', and then
let $error= query_get_value("SHOW SLAVE STATUS", Last_SQL_Error, 1);
--echo Last_SQL_Error = $error
> +set global sql_slave_skip_counter=1;
> +start slave;
Please use include/start_slave.inc instead.
> +
> +connection master;
> +
> +# This will be ignored by slave
> +UPDATE t1 NATURAL JOIN t2 SET b = a+100 where id = 1;
Please sync_slave_with_master to verify that this is ignored by the
slave (otherwise, the test will pass if this statement fails and the
next is ignored).
> +
> +# This will cause error on slave
> +UPDATE t1 NATURAL JOIN t2 SET a = a+100 where id = 1;
> +connection slave;
> +source include/wait_for_slave_sql_to_stop.inc;
Same as above: please use wait_for_slave_sql_error and print the error
message.
> +set global sql_slave_skip_counter=1;
> +start slave;
> +
> +connection master;
> +
> +# These will be ignored by slave
> +UPDATE t3 LEFT JOIN t1 ON (t1.id = t3.id) SET c = c+1000 where t1.id = 1;
> +UPDATE t1 NATURAL JOIN t3 SET c = c+10000 where id = 1;
> +UPDATE t3 LEFT JOIN t2 ON (t2.id = t3.id) SET b=b+100000, c=c+100000 where t2.id =
> 1;
> +UPDATE t2 NATURAL JOIN t3 SET b=b+1000000, c=c+1000000 where id = 1;
To test more execution paths, I think it would be good if the test also
used a table not mentioned by any --replicate-*-table options. So you
could remove --replicate-ignore-table=t3. I think all tests above will
be executed the same way if you do that.
Some more scenarios you may test:
- An ignore-table and a do-table are modified and the statement is
ignored by the slave because the ignore-table is mentioned before the
do-table.
- A table not mentioned by replication rules and a do-table are
updated and the statement is executed (and fails) on the slave (no
matter what order the tables appear in).
> +
> +sync_slave_with_master;
> +SELECT * FROM t1;
Please check that Last_SQL_Error is empty.
Please clean up (drop all tables and sync_slave_with_master) at the end
of the test.
>
> === modified file 'sql/log.cc'
> --- a/sql/log.cc 2008-05-21 12:44:30 +0000
> +++ b/sql/log.cc 2008-07-25 08:23:42 +0000
> @@ -3965,6 +3965,17 @@ bool MYSQL_BIN_LOG::write(Log_event *eve
> goto err;
> }
> }
> + if (thd->lex->sql_command == SQLCOM_UPDATE_MULTI)
> + {
> + User_var_log_event e(thd,
> + (char *)STRING_WITH_LEN("__TABLE_MAP_FOR_UPDATE__"),
> + (char *)&thd->table_map_for_update,
> + sizeof(ulonglong),
> + INT_RESULT,
> + my_charset_bin.number);
> + if (e.write(file))
> + goto err;
> + }
> }
> }
>
>
> === modified file 'sql/sql_class.h'
> --- a/sql/sql_class.h 2008-05-20 07:38:17 +0000
> +++ b/sql/sql_class.h 2008-07-25 08:23:42 +0000
> @@ -1196,6 +1196,7 @@ class THD :public Statement,
> public Open_tables_state
> {
> public:
Please add a comment here, explaining the purpose of new member variable.
> + table_map table_map_for_update;
> /* Used to execute base64 coded binlog events in MySQL server */
> Relay_log_info* rli_fake;
>
>
> === modified file 'sql/sql_parse.cc'
> --- a/sql/sql_parse.cc 2008-05-20 16:36:26 +0000
> +++ b/sql/sql_parse.cc 2008-07-25 08:23:42 +0000
> @@ -1953,6 +1953,40 @@ mysql_execute_command(THD *thd)
> all_tables->updating= 1;
> }
>
Please add a comment here, explaining what the block of code below does.
> + if (lex->sql_command == SQLCOM_UPDATE_MULTI)
> + {
> + my_bool null_value;
> + table_map table_map_for_update= 0;
> + LEX_STRING name= {(char *)STRING_WITH_LEN("__TABLE_MAP_FOR_UPDATE__")};
> + user_var_entry *var_entry= (user_var_entry
> *)hash_search(&thd->user_vars,
> + (const uchar
> *)name.str,
> + name.length);
> + if (var_entry)
> + table_map_for_update= (table_map)var_entry->val_int(&null_value);
> +
> + uint nr= 0;
> + TABLE_LIST *table;
> + for (table=all_tables; table; table=table->next_global)
> + {
> + if (table_map_for_update & (1 << nr))
I think you have to use ((table_map)1) << nr. Otherwise it will not work
when nr >= 32 on 32 bit machines.
> + table->updating= TRUE;
> + else
> + table->updating= FALSE;
> + }
I think you forgot nr++ in the loop?
> +
> + if (all_tables_not_ok(thd, all_tables))
> + {
> + /* we warn the slave SQL thread */
> + my_message(ER_SLAVE_IGNORED_TABLE, ER(ER_SLAVE_IGNORED_TABLE), MYF(0));
> + if (thd->one_shot_set)
> + reset_one_shot_variables(thd);
> + DBUG_RETURN(0);
> + }
> +
> + for (table=all_tables; table; table=table->next_global)
> + table->updating= TRUE;
> + }
> +
> /*
> Check if statment should be skipped because of slave filtering
> rules
>
> === modified file 'sql/sql_update.cc'
> --- a/sql/sql_update.cc 2008-05-20 07:38:17 +0000
> +++ b/sql/sql_update.cc 2008-07-25 08:23:42 +0000
> @@ -1000,7 +1000,7 @@ reopen_tables:
> DBUG_RETURN(TRUE);
> }
>
> - tables_for_update= get_table_map(fields);
> + thd->table_map_for_update= tables_for_update= get_table_map(fields);
>
> /*
> Setup timestamp handling and locking mode
>
>
--
Sven Sandberg, Software Engineer
MySQL AB, www.mysql.com