MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:He Zhenxing Date:July 29 2008 2:47am
Subject:Re: bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051
View as plain text  
Sven Sandberg wrote:
> Hi Zhenxing,
> 
> Thanks for fixing this, it's a hard bug and a creative solution. Your 
> analysis seems sound and I think the strategy to add more information to 
> the binlog is right. I think the architecture can be modified slightly 
> though (see next paragraph). There were also some minor mistakes in the 
> patch and some things that need polishing (see comments inline).
> 
> I think your suggestion on the bug page, to include the information in 
> Query_log_event, is feasible, and better than adding a user variable. 
> You can do it by adding a "status variable" to the query log event. 
> Status variables which are not known to a server version are just 
> ignored, so the change is safe for replicating between server versions. 
> See the comment starting "Here there could be code like" in 
> log_event.cc, and see the doxygen help for Query_log_event. You'll need 
> to add the variable to Query_log_event, update Query_log_event 
> constructors, Query_log_event::write, and 
> Query_log_event::do_apply_event, and update the doxygen comments for 
> Query_log_event.

Good point, I'll check that, thanks!

> 
> /Sven
> 
> 
> He Zhenxing wrote:
> > #At file:///media/sda3/work/mysql/bzrwork/b37051/5.1-rpl-new/
> > 
> >  2632 He Zhenxing	2008-07-25
> >       BUG#37051 Replication rules not evaluated correctly
> >       
> >       The problem of this bug is that we need to get the list of tables
> >       to be updated for a multi-update statement, which requires to
> >       open all the tables referenced by the statement and resolve all
> >       the fields involved in update in order to figure out the list of
> >       tables for update. However if there are replicate filter rules,
> >       some tables might not exist on slave and result in a failure
> >       before we could examine the filter rules.
> >       
> >       I think the whole problem can not be solved on slave alone,
> >       the master must record and send the information of tables
> >       involved for update to slave, so that the slave do not need to
> >       open all the tables referencec by the multi-update statement to
> >       figure out which tables are involved for udpate.
> >       
> >       So I added an internal user variable to store the
> >       value of table map for update on master and written it to binlog
> >       before the multi-update query log event. And on slave, it will
> >       try to get the value of this variable and use it to examine
> >       filter rules without opening any tables on slave.
> > added:
> >   mysql-test/suite/rpl/r/rpl_bug37051.result
> >   mysql-test/suite/rpl/t/rpl_bug37051-slave.opt
> >   mysql-test/suite/rpl/t/rpl_bug37051.test
> > modified:
> >   sql/log.cc
> >   sql/sql_class.h
> >   sql/sql_parse.cc
> >   sql/sql_update.cc
> > 
> > per-file messages:
> >   sql/log.cc
> >     Write the user var event of table_map_for_update for multi-upate
> >   sql/sql_class.h
> >     add member table_map_for_update to THD
> >   sql/sql_parse.cc
> >     check filter rules by using table_map_for_update value
> >   sql/sql_update.cc
> >     save the value of table_map_for_update
> > === added file 'mysql-test/suite/rpl/r/rpl_bug37051.result'
> > --- a/mysql-test/suite/rpl/r/rpl_bug37051.result	1970-01-01 00:00:00 +0000
> > +++ b/mysql-test/suite/rpl/r/rpl_bug37051.result	2008-07-25 08:23:42 +0000
> > @@ -0,0 +1,29 @@
> > +stop slave;
> > +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> > +reset master;
> > +reset slave;
> > +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> > +start slave;
> > +CREATE TABLE t1 (id int, a int);
> > +CREATE TABLE t2 (id int, b int);
> > +CREATE TABLE t3 (id int, c int);
> > +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3);
> > +INSERT INTO t2 VALUES (1, 1), (2, 2), (3, 3);
> > +INSERT INTO t3 VALUES (1, 1), (2, 2), (3, 3);
> > +UPDATE t2 LEFT JOIN t1 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> > +UPDATE t1 LEFT JOIN t2 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> > +set global sql_slave_skip_counter=1;
> > +start slave;
> > +UPDATE t1 NATURAL JOIN t2 SET b = a+100 where id = 1;
> > +UPDATE t1 NATURAL JOIN t2 SET a = a+100 where id = 1;
> > +set global sql_slave_skip_counter=1;
> > +start slave;
> > +UPDATE t3 LEFT JOIN t1 ON (t1.id = t3.id) SET c = c+1000 where t1.id = 1;
> > +UPDATE t1 NATURAL JOIN t3 SET c = c+10000 where id = 1;
> > +UPDATE t3 LEFT JOIN t2 ON (t2.id = t3.id) SET b=b+100000, c=c+100000 where
> t2.id = 1;
> > +UPDATE t2 NATURAL JOIN t3 SET b=b+1000000, c=c+1000000 where id = 1;
> > +SELECT * FROM t1;
> > +id	a
> > +1	1
> > +2	2
> > +3	3
> > 
> > === added file 'mysql-test/suite/rpl/t/rpl_bug37051-slave.opt'
> > --- a/mysql-test/suite/rpl/t/rpl_bug37051-slave.opt	1970-01-01 00:00:00 +0000
> > +++ b/mysql-test/suite/rpl/t/rpl_bug37051-slave.opt	2008-07-25 08:23:42 +0000
> > @@ -0,0 +1 @@
> > +--replicate-do-table=test.t1 --replicate-ignore-table=test.t2
> --replicate-ignore-table=test.t3
> > 
> > === added file 'mysql-test/suite/rpl/t/rpl_bug37051.test'
> 
> I think you could make the test name describe the purpose of the test 
> better. Maybe rpl_ignore_table.test? (that's taken though, so if you 
> can't test what you want to test with the server options for the 
> existing test, you can always make it rpl_ignore_table_2.test :-)
> 

OK, how about rpl_filter_tables.test

> > --- a/mysql-test/suite/rpl/t/rpl_bug37051.test	1970-01-01 00:00:00 +0000
> > +++ b/mysql-test/suite/rpl/t/rpl_bug37051.test	2008-07-25 08:23:42 +0000
> > @@ -0,0 +1,43 @@
> > +source include/have_binlog_format_statement.inc;
> > +source include/master-slave.inc;
> 
> Please begin each test case with comments describing:
> 
>   - The purpose of the test (e.g., something like "Verify that 
> replication rules specified with --replicate-*-table are correctly 
> evaluated for UPDATE statements that modify some tables and use other 
> tables only for reading.")
> 
>   - A description of how the test is implemented (e.g., something like 
> "The master creates tables t1, t2, t3, and the slave runs with 
> --replicate-do-table=t1, --replicate-ignore-table=t2, 
> --replicate-ignore-table=t3. It verifies the following situations: 
> UPDATE that modifies a do-table and reads an ignore-table, UPDATE 
> that... etc")
> 
>   - A list of related bugs (BUG#37051).
> 

OK

> > +
> > +CREATE TABLE t1 (id int, a int);
> > +CREATE TABLE t2 (id int, b int);
> > +CREATE TABLE t3 (id int, c int);
> > +
> > +INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3);
> > +INSERT INTO t2 VALUES (1, 1), (2, 2), (3, 3);
> > +INSERT INTO t3 VALUES (1, 1), (2, 2), (3, 3);
> > +
> > +# This will be ignored by slave
> > +UPDATE t2 LEFT JOIN t1 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> > +
> > +# This will cause error on slave
> > +UPDATE t1 LEFT JOIN t2 ON (t1.id = t2.id) SET a=a+10, b=a+10 where t1.id = 1;
> > +connection slave;
> 
> Please do "--echo [connection slave]" each time the connection changes 
> (connection slave, connection master, and sync_slave_with_master). That 
> makes the result file more readable.
> 

Good

> > +source include/wait_for_slave_sql_to_stop.inc;
> 
> For clarity, I think it would be even better to do 'source 
> include/wait_for_slave_sql_error.inc', and then
> 
> let $error= query_get_value("SHOW SLAVE STATUS", Last_SQL_Error, 1);
> --echo Last_SQL_Error = $error
> 

Right, thanks!

> > +set global sql_slave_skip_counter=1;
> > +start slave;
> 
> Please use include/start_slave.inc instead.
> 

good

> > +
> > +connection master;
> > +
> > +# This will be ignored by slave
> > +UPDATE t1 NATURAL JOIN t2 SET b = a+100 where id = 1;
> 
> Please sync_slave_with_master to verify that this is ignored by the 
> slave (otherwise, the test will pass if this statement fails and the 
> next is ignored).
> 

actually this is not necessary, because this and the statement below
updates different tables, so if this statement fails and the next is
ignored, then the value of column 'a' in 't1' will be different, maybe I
need to write some comments to describe this.

> > +
> > +# This will cause error on slave
> > +UPDATE t1 NATURAL JOIN t2 SET a = a+100 where id = 1;
> > +connection slave;
> > +source include/wait_for_slave_sql_to_stop.inc;
> 
> Same as above: please use wait_for_slave_sql_error and print the error 
> message.
> 

OK

> > +set global sql_slave_skip_counter=1;
> > +start slave;
> > +
> > +connection master;
> > +
> > +# These will be ignored by slave
> > +UPDATE t3 LEFT JOIN t1 ON (t1.id = t3.id) SET c = c+1000 where t1.id = 1;
> > +UPDATE t1 NATURAL JOIN t3 SET c = c+10000 where id = 1;
> > +UPDATE t3 LEFT JOIN t2 ON (t2.id = t3.id) SET b=b+100000, c=c+100000 where
> t2.id = 1;
> > +UPDATE t2 NATURAL JOIN t3 SET b=b+1000000, c=c+1000000 where id = 1;
> 
> To test more execution paths, I think it would be good if the test also 
> used a table not mentioned by any --replicate-*-table options. So you 
> could remove --replicate-ignore-table=t3. I think all tests above will 
> be executed the same way if you do that.
> 
> Some more scenarios you may test:
> 
>   - An ignore-table and a do-table are modified and the statement is 
> ignored by the slave because the ignore-table is mentioned before the 
> do-table.
> 
>   - A table not mentioned by replication rules and a do-table are 
> updated and the statement is executed (and fails) on the slave (no 
> matter what order the tables appear in).
> 

yes, you're right, this test case need to be expanded, and even several
tests run with different filter rules.

> > +
> > +sync_slave_with_master;
> > +SELECT * FROM t1;
> 
> Please check that Last_SQL_Error is empty.
> 

I don't think this is neccesary, if there are SQL errors, then
sync_slave_with_master would timeout and fail the test case, right?

> Please clean up (drop all tables and sync_slave_with_master) at the end 
> of the test.
> 

OK

> > 
> > === modified file 'sql/log.cc'
> > --- a/sql/log.cc	2008-05-21 12:44:30 +0000
> > +++ b/sql/log.cc	2008-07-25 08:23:42 +0000
> > @@ -3965,6 +3965,17 @@ bool MYSQL_BIN_LOG::write(Log_event *eve
> >                goto err;
> >            }
> >          }
> > +        if (thd->lex->sql_command == SQLCOM_UPDATE_MULTI)
> > +        {
> > +          User_var_log_event e(thd,
> > +                               (char
> *)STRING_WITH_LEN("__TABLE_MAP_FOR_UPDATE__"),
> > +                               (char *)&thd->table_map_for_update,
> > +                               sizeof(ulonglong),
> > +                               INT_RESULT,
> > +                               my_charset_bin.number);
> > +          if (e.write(file))
> > +            goto err;
> > +        }
> >        }
> >      }
> >  
> > 
> > === modified file 'sql/sql_class.h'
> > --- a/sql/sql_class.h	2008-05-20 07:38:17 +0000
> > +++ b/sql/sql_class.h	2008-07-25 08:23:42 +0000
> > @@ -1196,6 +1196,7 @@ class THD :public Statement,
> >             public Open_tables_state
> >  {
> >  public:
> 
> Please add a comment here, explaining the purpose of new member variable.
> 

OK

> > +  table_map table_map_for_update;
> >    /* Used to execute base64 coded binlog events in MySQL server */
> >    Relay_log_info* rli_fake;
> >  
> > 
> > === modified file 'sql/sql_parse.cc'
> > --- a/sql/sql_parse.cc	2008-05-20 16:36:26 +0000
> > +++ b/sql/sql_parse.cc	2008-07-25 08:23:42 +0000
> > @@ -1953,6 +1953,40 @@ mysql_execute_command(THD *thd)
> >        all_tables->updating= 1;
> >      }
> >      
> 
> Please add a comment here, explaining what the block of code below does.
> 

right

> > +    if (lex->sql_command == SQLCOM_UPDATE_MULTI)
> > +    {
> > +      my_bool null_value;
> > +      table_map table_map_for_update= 0;
> > +      LEX_STRING name= {(char *)STRING_WITH_LEN("__TABLE_MAP_FOR_UPDATE__")};
> > +      user_var_entry *var_entry= (user_var_entry
> *)hash_search(&thd->user_vars,
> > +                                                               (const uchar
> *)name.str,
> > +                                                               name.length);
> > +      if (var_entry)
> > +        table_map_for_update=
> (table_map)var_entry->val_int(&null_value);
> > +
> > +      uint nr= 0;
> > +      TABLE_LIST *table;
> > +      for (table=all_tables; table; table=table->next_global)
> > +      {
> > +        if (table_map_for_update & (1 << nr))
> 
> I think you have to use ((table_map)1) << nr. Otherwise it will not work 
> when nr >= 32 on 32 bit machines.
> 

correct, thanks for catching this.

> > +          table->updating= TRUE;
> > +        else
> > +          table->updating= FALSE;
> > +      }
> 
> I think you forgot nr++ in the loop?
> 

right, thanks! And it also suggest that I need to do test with more
tables.

> > +
> > +      if (all_tables_not_ok(thd, all_tables))
> > +      {
> > +        /* we warn the slave SQL thread */
> > +        my_message(ER_SLAVE_IGNORED_TABLE, ER(ER_SLAVE_IGNORED_TABLE),
> MYF(0));
> > +        if (thd->one_shot_set)
> > +          reset_one_shot_variables(thd);
> > +        DBUG_RETURN(0);
> > +      }
> > +      
> > +      for (table=all_tables; table; table=table->next_global)
> > +        table->updating= TRUE;
> > +    }
> > +    
> >      /*
> >        Check if statment should be skipped because of slave filtering
> >        rules
> > 
> > === modified file 'sql/sql_update.cc'
> > --- a/sql/sql_update.cc	2008-05-20 07:38:17 +0000
> > +++ b/sql/sql_update.cc	2008-07-25 08:23:42 +0000
> > @@ -1000,7 +1000,7 @@ reopen_tables:
> >      DBUG_RETURN(TRUE);
> >    }
> >  
> > -  tables_for_update= get_table_map(fields);
> > +  thd->table_map_for_update= tables_for_update= get_table_map(fields);
> >  
> >    /*
> >      Setup timestamp handling and locking mode
> > 
> > 
> 

Thread
bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051He Zhenxing25 Jul
  • Re: bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051Sven Sandberg28 Jul
    • Re: bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051He Zhenxing29 Jul
    • Re: bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051Ingo Strüwing29 Jul
      • Re: bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051Sven Sandberg29 Jul
        • Re: bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051He Zhenxing30 Jul
      • Re: bzr commit into mysql-5.1 branch (hezx:2632) Bug#37051He Zhenxing30 Jul