List:Commits« Previous MessageNext Message »
From:He Zhenxing Date:November 27 2008 6:11am
Subject:Re: bzr commit into mysql-5.1 branch (aelkin:2701) Bug#40221
View as plain text  
Hi Andrei

Andrei Elkin wrote:
> He Zhenxing, hello.
> 
> > Hi Andrei
> >
> > Thank you for you work! And sorry for the delay!
> >
> > I tried my best to understand the code dealing with binlog and
> > transaction, but still not very clear about that, and I think your patch
> > probably not correct, although it seems get the right result. So I'd
> > suggest to write a dedicated test case to test all the possibilities
> > involved binlog and transaction to make sure the fix is correct and not
> > introducing new hidden bugs.
> >
> 
> To presume the worst is your job, sure :-)
> 
> > I think we can use the decision table for commit in the comment of
> > binlog_commit function as a guide line for the test case we should test,
> > and also do this for rollback.
> >
> > I might be wrong, so don't hesitate to argue!
> >
> > see also inline comments below.
> >
> > Andrei Elkin wrote:
> >> #At file:///home/andrei/MySQL/BZR/FIXES/bug40221-RBR_UPDATE_pk/
> >> 
> >>  2701 Andrei Elkin	2008-11-21
> >>       Bug #40221 Replication failure on RBR + UPDATE the primary key
> >>       
> >>       A transaction could result in having an extra event after a query that
> errored e.g because of 
> >>       a dup key. Such a query is rolled back in innodb, as specified, but
> has not been in binlog.
> >>       It appeares that the binlog engine did not always register for a query
> (statement) because 
> >>       the previous query had not reset  at its statement commit time.
> Because of that fact there was no
> >>       roll-back to the trx_data->before_stmt_pos position and a the
> pending event of the errorred query
> >>       could become flushed to the binlog file.
> >
> > I think the real problem is that when running a statement in a
> > transaction, the thd->transaction.stmt.ha_list will not include the
> > binlog_hton, and so when that statement is rolled back, it will not be
> > rolled back in the binlog, only in the storage engine, 
> 
> The patch comments don't speak the way you do, but they are supposed to
> say the same. Look, "the binlog engine did not always register for a
> query "  ==  yours "the thd->transaction.stmt.ha_list will not include
> the  binlog_hton"
> 

Yes, I noticed that too.

> > so I think the
> > correct fix should to add binlog_hton to thd->transaction.stmt.ha_list.
> >
> 
> There is nothing to disagree again, my lines "Fixed with deploying the
> reset of trx_data->before_stmt_pos at the end of the query processing"
> have a consequence such that the binlog handlerton will be registered
> with the fix I am suggesting; just scan for references of
> `before_stmt_pos' to find out the point in the code thar registers:
> THD::binlog_start_trans_and_stmt().
> 
> Maybe I should have said explplicitly in the comments.
> 

Right, I noticed that your patch actually did what I was thinking, but
the problem is why not do it in a direct way?

In your patch, by setting 
	rx_data->before_stmt_pos = MY_OFF_T_UNDEF;
Then when the next statment starts and the
THD::binlog_start_trans_and_stmt is executed, then the following will be
executed:

  if (trx_data == NULL ||
      trx_data->before_stmt_pos == MY_OFF_T_UNDEF)
  {
    this->binlog_set_stmt_begin();
    if (options & (OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN))
      trans_register_ha(this, TRUE, binlog_hton);
    trans_register_ha(this, FALSE, binlog_hton);
    ha_data[binlog_hton->slot].ha_info[0].set_trx_read_write();
  }

But what I am thinking is why not just remove the if statement in the above block, 
I think we should do this for every statement, and I think this is actually the effect
after
your patch (right?).

I also tested the following to check if it works with CREATE-SELECT after remove the check
of if statement.
and I see no problem.

-----------------------------------------------------------
SET BINLOG_FORMAT=ROW;
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
SET AUTOCOMMIT=0;
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=INNODB;
INSERT INTO t1 VALUES (1,2), (2,2), (3,4), (4,4);
error ER_DUP_ENTRY;
CREATE TABLE t2(UNIQUE(b)) SELECT * FROM t1;
COMMIT;
SHOW BINLOG EVENTS;
-----------------------------------------------------------

The result of above test is:
---------------------------------------------------------------------------------
SET BINLOG_FORMAT=ROW;
DROP TABLE IF EXISTS t1;
Warnings:
Note	1051	Unknown table 't1'
DROP TABLE IF EXISTS t2;
Warnings:
Note	1051	Unknown table 't2'
SET AUTOCOMMIT=0;
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=INNODB;
INSERT INTO t1 VALUES (1,2), (2,2), (3,4), (4,4);
CREATE TABLE t2(UNIQUE(b)) SELECT * FROM t1;
ERROR 23000: Duplicate entry '2' for key 'b'
COMMIT;
SHOW BINLOG EVENTS;
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000001	4	Format_desc	1	106	Server ver: 5.1.31-debug-log, Binlog ver: 4
master-bin.000001	106	Query	1	192	use `test`; DROP TABLE IF EXISTS t1
master-bin.000001	192	Query	1	278	use `test`; DROP TABLE IF EXISTS t2
master-bin.000001	278	Query	1	397	use `test`; CREATE TABLE t1 (a INT PRIMARY KEY, b INT)
ENGINE=INNODB
master-bin.000001	397	Query	1	465	use `test`; BEGIN
master-bin.000001	465	Table_map	1	507	table_id: 23 (test.t1)
master-bin.000001	507	Write_rows	1	572	table_id: 23 flags: STMT_END_F
master-bin.000001	572	Xid	1	599	COMMIT /* xid=26 */
----------------------------------------------------------------------------------------

So I'd suggest to remove the if check instead of set before_stmt_pos. What do you think?

> > I think the original code handling trx_data->beore_stmt_pos is correct.
> >
> 
> Sorry, I believe it behaves against intentions. Really, if there is an
> assignment trx_data->beore_stmt_pos := pos at the begining of the
> statement for the sake of use `pos' as a savepoint at rollback of the
> statement then  there should exist resetting 
> trx_data->beore_stmt_pos := RESET_VALUE 
> to release the savepoint at the end.
> 
> 
> >>       
> >>       Fixed with deploying the reset of trx_data->before_stmt_pos at the
> end of the query processing.
> >> modified:
> >>   mysql-test/suite/rpl/r/rpl_innodb.result
> >>   mysql-test/suite/rpl/t/rpl_innodb.test
> >>   sql/log.cc
> >> 
> >> per-file messages:
> >>   mysql-test/suite/rpl/r/rpl_innodb.result
> >>     results gained bug#40221 regression test part.
> 
> BTW, the regression test has been moved to binlog_innodb.test for
> several reasons, please look at my answer to Mats' review for details.
> 
> >>   mysql-test/suite/rpl/t/rpl_innodb.test
> >>     bug#40221 regression test is added.
> >>   sql/log.cc
> >>     Flushing the pending row event at binlog_end_trans() is moved down to
> the commit branch.
> >>     Resetting of trx_data->before_stmt_pos to the uninitialized value at
> the end of the statement is
> >>     implemented so that both binlog_commit() and binlog_rollback()
> execututes it.
> >> === modified file 'mysql-test/suite/rpl/r/rpl_innodb.result'
> >> --- a/mysql-test/suite/rpl/r/rpl_innodb.result	2008-10-29 13:25:03 +0000
> >> +++ b/mysql-test/suite/rpl/r/rpl_innodb.result	2008-11-21 12:41:52 +0000
> >> @@ -81,4 +81,27 @@ FLUSH LOGS;
> >>  -------- switch to master --------
> >>  FLUSH LOGS;
> >>  DROP DATABASE mysqltest1;
> >> +DROP TEMPORARY TABLE mysqltest1.tmp2;
> >> +SET @@session.binlog_format = 'row';
> >> +drop table if exists t1;
> >> +Warnings:
> >> +Note	1051	Unknown table 't1'
> >> +CREATE TABLE t1 (pk int auto_increment primary key) ENGINE=innodb;
> >> +reset master;
> >> +show binlog events from <binlog_start>;
> >> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> >> +begin;
> >> +insert into t1 values (1),(2);
> >> +*** the following UPDATE query wont generate any updates for the binlog
> ***
> >> +update t1 set pk = 3 where pk < 3;
> >> +ERROR 23000: Duplicate entry '3' for key 'PRIMARY'
> >> +commit;
> >> +*** the binlog must have only Write_rows events not any Update_rows ***
> >> +show binlog events from <binlog_start>;
> >> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> >> +master-bin.000001	#	Query	#	#	use `test`; BEGIN
> >> +master-bin.000001	#	Table_map	#	#	table_id: # (test.t1)
> >> +master-bin.000001	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> >> +master-bin.000001	#	Xid	#	#	COMMIT /* XID */
> >> +drop table t1;
> >>  End of 5.1 tests
> >> 
> >> === modified file 'mysql-test/suite/rpl/t/rpl_innodb.test'
> >> --- a/mysql-test/suite/rpl/t/rpl_innodb.test	2008-10-29 13:25:03 +0000
> >> +++ b/mysql-test/suite/rpl/t/rpl_innodb.test	2008-11-21 12:41:52 +0000
> >> @@ -120,6 +120,31 @@ connection master;
> >>  FLUSH LOGS;
> >>  
> >>  DROP DATABASE mysqltest1;
> >> +DROP TEMPORARY TABLE mysqltest1.tmp2;
> >>  -- source include/master-slave-end.inc
> >>  
> >> +#
> >> +# Bug #40221 Replication failure on RBR + UPDATE the primary key
> >> +#
> >> +
> >> +connection master;
> >> +
> >> +SET @@session.binlog_format = 'row';
> >> +drop table if exists t1;
> >> +CREATE TABLE t1 (pk int auto_increment primary key) ENGINE=innodb;
> >> +reset master;
> >> +source include/show_binlog_events.inc;
> >> +
> >> +begin;
> >> +insert into t1 values (1),(2);
> >> +--echo *** the following UPDATE query wont generate any updates for the
> binlog ***
> >> +--error ER_DUP_ENTRY
> >> +update t1 set pk = 3 where pk < 3;
> >> +commit;
> >> +--echo *** the binlog must have only Write_rows events not any Update_rows
> ***
> >> +source include/show_binlog_events.inc;
> >> +
> >> +drop table t1;
> >> +
> >> +
> >>  --echo End of 5.1 tests
> >> 
> >> === modified file 'sql/log.cc'
> >> --- a/sql/log.cc	2008-11-12 17:51:47 +0000
> >> +++ b/sql/log.cc	2008-11-21 12:41:52 +0000
> >> @@ -1377,8 +1377,6 @@ binlog_end_trans(THD *thd, binlog_trx_da
> >>                        FLAGSTR(thd->options, OPTION_NOT_AUTOCOMMIT),
> >>                        FLAGSTR(thd->options, OPTION_BEGIN)));
> >>  
> >> -  thd->binlog_flush_pending_rows_event(TRUE);
> >> -
> >
> > No problem for this.
> >
> >>    /*
> >>      NULL denotes ROLLBACK with nothing to replicate: i.e., rollback of
> >>      only transactional tables.  If the transaction contain changes to
> >> @@ -1387,6 +1385,7 @@ binlog_end_trans(THD *thd, binlog_trx_da
> >>    */
> >>    if (end_ev != NULL)
> >>    {
> >> +    thd->binlog_flush_pending_rows_event(TRUE);
> >
> > No problem for this.
> >
> >>      /*
> >>        Doing a commit or a rollback including non-transactional tables,
> >>        i.e., ending a transaction where we might write the transaction
> >> @@ -1434,7 +1433,8 @@ binlog_end_trans(THD *thd, binlog_trx_da
> >>      */
> >>      mysql_bin_log.update_table_map_version();
> >>    }
> >> -
> >> +  if (!all)
> >> +    trx_data->before_stmt_pos = MY_OFF_T_UNDEF;
> >
> > If change this to trx_data->reset as Mats suggested in his review, the
> > result will be wrong.
> 
> Please find my arguments in my answer to that suggestion: the
> statement rollback should clear out the transaction cache.
> 

I don't see your reply to Mats' review on commit list. Have you replied
to all?

> >
> >>    DBUG_RETURN(error);
> >>  }
> >>  
> >> @@ -1553,6 +1553,8 @@ static int binlog_commit(handlerton *hto
> >>      int error= binlog_end_trans(thd, trx_data, &qev, all);
> >>      DBUG_RETURN(error);
> >>    }
> >> +  if (!all)
> >> +    trx_data->before_stmt_pos = MY_OFF_T_UNDEF;
> >
> > The same as above.
> 
> as said.
> 
> >
> >>    DBUG_RETURN(0);
> >>  }
> 
> cheers,
> 
> Andrei

Thread
bzr commit into mysql-5.1 branch (aelkin:2701) Bug#40221Andrei Elkin21 Nov
  • Re: bzr commit into mysql-5.1 branch (aelkin:2701) Bug#40221Mats Kindahl25 Nov
  • Re: bzr commit into mysql-5.1 branch (aelkin:2701) Bug#40221He Zhenxing26 Nov
Re: bzr commit into mysql-5.1 branch (aelkin:2701) Bug#40221He Zhenxing27 Nov