Hi Andrei
Andrei Elkin wrote:
> He Zhenxing, hello.
>
> > Hi Andrei
> >
> > Thank you for you work! And sorry for the delay!
> >
> > I tried my best to understand the code dealing with binlog and
> > transaction, but still not very clear about that, and I think your patch
> > probably not correct, although it seems get the right result. So I'd
> > suggest to write a dedicated test case to test all the possibilities
> > involved binlog and transaction to make sure the fix is correct and not
> > introducing new hidden bugs.
> >
>
> To presume the worst is your job, sure :-)
>
> > I think we can use the decision table for commit in the comment of
> > binlog_commit function as a guide line for the test case we should test,
> > and also do this for rollback.
> >
> > I might be wrong, so don't hesitate to argue!
> >
> > see also inline comments below.
> >
> > Andrei Elkin wrote:
> >> #At file:///home/andrei/MySQL/BZR/FIXES/bug40221-RBR_UPDATE_pk/
> >>
> >> 2701 Andrei Elkin 2008-11-21
> >> Bug #40221 Replication failure on RBR + UPDATE the primary key
> >>
> >> A transaction could result in having an extra event after a query that
> errored e.g because of
> >> a dup key. Such a query is rolled back in innodb, as specified, but
> has not been in binlog.
> >> It appeares that the binlog engine did not always register for a query
> (statement) because
> >> the previous query had not reset at its statement commit time.
> Because of that fact there was no
> >> roll-back to the trx_data->before_stmt_pos position and a the
> pending event of the errorred query
> >> could become flushed to the binlog file.
> >
> > I think the real problem is that when running a statement in a
> > transaction, the thd->transaction.stmt.ha_list will not include the
> > binlog_hton, and so when that statement is rolled back, it will not be
> > rolled back in the binlog, only in the storage engine,
>
> The patch comments don't speak the way you do, but they are supposed to
> say the same. Look, "the binlog engine did not always register for a
> query " == yours "the thd->transaction.stmt.ha_list will not include
> the binlog_hton"
>
Yes, I noticed that too.
> > so I think the
> > correct fix should to add binlog_hton to thd->transaction.stmt.ha_list.
> >
>
> There is nothing to disagree again, my lines "Fixed with deploying the
> reset of trx_data->before_stmt_pos at the end of the query processing"
> have a consequence such that the binlog handlerton will be registered
> with the fix I am suggesting; just scan for references of
> `before_stmt_pos' to find out the point in the code thar registers:
> THD::binlog_start_trans_and_stmt().
>
> Maybe I should have said explplicitly in the comments.
>
Right, I noticed that your patch actually did what I was thinking, but
the problem is why not do it in a direct way?
In your patch, by setting
rx_data->before_stmt_pos = MY_OFF_T_UNDEF;
Then when the next statment starts and the
THD::binlog_start_trans_and_stmt is executed, then the following will be
executed:
if (trx_data == NULL ||
trx_data->before_stmt_pos == MY_OFF_T_UNDEF)
{
this->binlog_set_stmt_begin();
if (options & (OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN))
trans_register_ha(this, TRUE, binlog_hton);
trans_register_ha(this, FALSE, binlog_hton);
ha_data[binlog_hton->slot].ha_info[0].set_trx_read_write();
}
But what I am thinking is why not just remove the if statement in the above block,
I think we should do this for every statement, and I think this is actually the effect
after
your patch (right?).
I also tested the following to check if it works with CREATE-SELECT after remove the check
of if statement.
and I see no problem.
-----------------------------------------------------------
SET BINLOG_FORMAT=ROW;
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
SET AUTOCOMMIT=0;
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=INNODB;
INSERT INTO t1 VALUES (1,2), (2,2), (3,4), (4,4);
error ER_DUP_ENTRY;
CREATE TABLE t2(UNIQUE(b)) SELECT * FROM t1;
COMMIT;
SHOW BINLOG EVENTS;
-----------------------------------------------------------
The result of above test is:
---------------------------------------------------------------------------------
SET BINLOG_FORMAT=ROW;
DROP TABLE IF EXISTS t1;
Warnings:
Note 1051 Unknown table 't1'
DROP TABLE IF EXISTS t2;
Warnings:
Note 1051 Unknown table 't2'
SET AUTOCOMMIT=0;
CREATE TABLE t1 (a INT PRIMARY KEY, b INT) ENGINE=INNODB;
INSERT INTO t1 VALUES (1,2), (2,2), (3,4), (4,4);
CREATE TABLE t2(UNIQUE(b)) SELECT * FROM t1;
ERROR 23000: Duplicate entry '2' for key 'b'
COMMIT;
SHOW BINLOG EVENTS;
Log_name Pos Event_type Server_id End_log_pos Info
master-bin.000001 4 Format_desc 1 106 Server ver: 5.1.31-debug-log, Binlog ver: 4
master-bin.000001 106 Query 1 192 use `test`; DROP TABLE IF EXISTS t1
master-bin.000001 192 Query 1 278 use `test`; DROP TABLE IF EXISTS t2
master-bin.000001 278 Query 1 397 use `test`; CREATE TABLE t1 (a INT PRIMARY KEY, b INT)
ENGINE=INNODB
master-bin.000001 397 Query 1 465 use `test`; BEGIN
master-bin.000001 465 Table_map 1 507 table_id: 23 (test.t1)
master-bin.000001 507 Write_rows 1 572 table_id: 23 flags: STMT_END_F
master-bin.000001 572 Xid 1 599 COMMIT /* xid=26 */
----------------------------------------------------------------------------------------
So I'd suggest to remove the if check instead of set before_stmt_pos. What do you think?
> > I think the original code handling trx_data->beore_stmt_pos is correct.
> >
>
> Sorry, I believe it behaves against intentions. Really, if there is an
> assignment trx_data->beore_stmt_pos := pos at the begining of the
> statement for the sake of use `pos' as a savepoint at rollback of the
> statement then there should exist resetting
> trx_data->beore_stmt_pos := RESET_VALUE
> to release the savepoint at the end.
>
>
> >>
> >> Fixed with deploying the reset of trx_data->before_stmt_pos at the
> end of the query processing.
> >> modified:
> >> mysql-test/suite/rpl/r/rpl_innodb.result
> >> mysql-test/suite/rpl/t/rpl_innodb.test
> >> sql/log.cc
> >>
> >> per-file messages:
> >> mysql-test/suite/rpl/r/rpl_innodb.result
> >> results gained bug#40221 regression test part.
>
> BTW, the regression test has been moved to binlog_innodb.test for
> several reasons, please look at my answer to Mats' review for details.
>
> >> mysql-test/suite/rpl/t/rpl_innodb.test
> >> bug#40221 regression test is added.
> >> sql/log.cc
> >> Flushing the pending row event at binlog_end_trans() is moved down to
> the commit branch.
> >> Resetting of trx_data->before_stmt_pos to the uninitialized value at
> the end of the statement is
> >> implemented so that both binlog_commit() and binlog_rollback()
> execututes it.
> >> === modified file 'mysql-test/suite/rpl/r/rpl_innodb.result'
> >> --- a/mysql-test/suite/rpl/r/rpl_innodb.result 2008-10-29 13:25:03 +0000
> >> +++ b/mysql-test/suite/rpl/r/rpl_innodb.result 2008-11-21 12:41:52 +0000
> >> @@ -81,4 +81,27 @@ FLUSH LOGS;
> >> -------- switch to master --------
> >> FLUSH LOGS;
> >> DROP DATABASE mysqltest1;
> >> +DROP TEMPORARY TABLE mysqltest1.tmp2;
> >> +SET @@session.binlog_format = 'row';
> >> +drop table if exists t1;
> >> +Warnings:
> >> +Note 1051 Unknown table 't1'
> >> +CREATE TABLE t1 (pk int auto_increment primary key) ENGINE=innodb;
> >> +reset master;
> >> +show binlog events from <binlog_start>;
> >> +Log_name Pos Event_type Server_id End_log_pos Info
> >> +begin;
> >> +insert into t1 values (1),(2);
> >> +*** the following UPDATE query wont generate any updates for the binlog
> ***
> >> +update t1 set pk = 3 where pk < 3;
> >> +ERROR 23000: Duplicate entry '3' for key 'PRIMARY'
> >> +commit;
> >> +*** the binlog must have only Write_rows events not any Update_rows ***
> >> +show binlog events from <binlog_start>;
> >> +Log_name Pos Event_type Server_id End_log_pos Info
> >> +master-bin.000001 # Query # # use `test`; BEGIN
> >> +master-bin.000001 # Table_map # # table_id: # (test.t1)
> >> +master-bin.000001 # Write_rows # # table_id: # flags: STMT_END_F
> >> +master-bin.000001 # Xid # # COMMIT /* XID */
> >> +drop table t1;
> >> End of 5.1 tests
> >>
> >> === modified file 'mysql-test/suite/rpl/t/rpl_innodb.test'
> >> --- a/mysql-test/suite/rpl/t/rpl_innodb.test 2008-10-29 13:25:03 +0000
> >> +++ b/mysql-test/suite/rpl/t/rpl_innodb.test 2008-11-21 12:41:52 +0000
> >> @@ -120,6 +120,31 @@ connection master;
> >> FLUSH LOGS;
> >>
> >> DROP DATABASE mysqltest1;
> >> +DROP TEMPORARY TABLE mysqltest1.tmp2;
> >> -- source include/master-slave-end.inc
> >>
> >> +#
> >> +# Bug #40221 Replication failure on RBR + UPDATE the primary key
> >> +#
> >> +
> >> +connection master;
> >> +
> >> +SET @@session.binlog_format = 'row';
> >> +drop table if exists t1;
> >> +CREATE TABLE t1 (pk int auto_increment primary key) ENGINE=innodb;
> >> +reset master;
> >> +source include/show_binlog_events.inc;
> >> +
> >> +begin;
> >> +insert into t1 values (1),(2);
> >> +--echo *** the following UPDATE query wont generate any updates for the
> binlog ***
> >> +--error ER_DUP_ENTRY
> >> +update t1 set pk = 3 where pk < 3;
> >> +commit;
> >> +--echo *** the binlog must have only Write_rows events not any Update_rows
> ***
> >> +source include/show_binlog_events.inc;
> >> +
> >> +drop table t1;
> >> +
> >> +
> >> --echo End of 5.1 tests
> >>
> >> === modified file 'sql/log.cc'
> >> --- a/sql/log.cc 2008-11-12 17:51:47 +0000
> >> +++ b/sql/log.cc 2008-11-21 12:41:52 +0000
> >> @@ -1377,8 +1377,6 @@ binlog_end_trans(THD *thd, binlog_trx_da
> >> FLAGSTR(thd->options, OPTION_NOT_AUTOCOMMIT),
> >> FLAGSTR(thd->options, OPTION_BEGIN)));
> >>
> >> - thd->binlog_flush_pending_rows_event(TRUE);
> >> -
> >
> > No problem for this.
> >
> >> /*
> >> NULL denotes ROLLBACK with nothing to replicate: i.e., rollback of
> >> only transactional tables. If the transaction contain changes to
> >> @@ -1387,6 +1385,7 @@ binlog_end_trans(THD *thd, binlog_trx_da
> >> */
> >> if (end_ev != NULL)
> >> {
> >> + thd->binlog_flush_pending_rows_event(TRUE);
> >
> > No problem for this.
> >
> >> /*
> >> Doing a commit or a rollback including non-transactional tables,
> >> i.e., ending a transaction where we might write the transaction
> >> @@ -1434,7 +1433,8 @@ binlog_end_trans(THD *thd, binlog_trx_da
> >> */
> >> mysql_bin_log.update_table_map_version();
> >> }
> >> -
> >> + if (!all)
> >> + trx_data->before_stmt_pos = MY_OFF_T_UNDEF;
> >
> > If change this to trx_data->reset as Mats suggested in his review, the
> > result will be wrong.
>
> Please find my arguments in my answer to that suggestion: the
> statement rollback should clear out the transaction cache.
>
I don't see your reply to Mats' review on commit list. Have you replied
to all?
> >
> >> DBUG_RETURN(error);
> >> }
> >>
> >> @@ -1553,6 +1553,8 @@ static int binlog_commit(handlerton *hto
> >> int error= binlog_end_trans(thd, trx_data, &qev, all);
> >> DBUG_RETURN(error);
> >> }
> >> + if (!all)
> >> + trx_data->before_stmt_pos = MY_OFF_T_UNDEF;
> >
> > The same as above.
>
> as said.
>
> >
> >> DBUG_RETURN(0);
> >> }
>
> cheers,
>
> Andrei