Hi Andrei
Thank you for you work! And sorry for the delay!
I tried my best to understand the code dealing with binlog and
transaction, but still not very clear about that, and I think your patch
probably not correct, although it seems get the right result. So I'd
suggest to write a dedicated test case to test all the possibilities
involved binlog and transaction to make sure the fix is correct and not
introducing new hidden bugs.
I think we can use the decision table for commit in the comment of
binlog_commit function as a guide line for the test case we should test,
and also do this for rollback.
I might be wrong, so don't hesitate to argue!
see also inline comments below.
Andrei Elkin wrote:
> #At file:///home/andrei/MySQL/BZR/FIXES/bug40221-RBR_UPDATE_pk/
>
> 2701 Andrei Elkin 2008-11-21
> Bug #40221 Replication failure on RBR + UPDATE the primary key
>
> A transaction could result in having an extra event after a query that errored
> e.g because of
> a dup key. Such a query is rolled back in innodb, as specified, but has not
> been in binlog.
> It appeares that the binlog engine did not always register for a query
> (statement) because
> the previous query had not reset at its statement commit time. Because of that
> fact there was no
> roll-back to the trx_data->before_stmt_pos position and a the pending event
> of the errorred query
> could become flushed to the binlog file.
I think the real problem is that when running a statement in a
transaction, the thd->transaction.stmt.ha_list will not include the
binlog_hton, and so when that statement is rolled back, it will not be
rolled back in the binlog, only in the storage engine, so I think the
correct fix should to add binlog_hton to thd->transaction.stmt.ha_list.
I think the original code handling trx_data->beore_stmt_pos is correct.
>
> Fixed with deploying the reset of trx_data->before_stmt_pos at the end of
> the query processing.
> modified:
> mysql-test/suite/rpl/r/rpl_innodb.result
> mysql-test/suite/rpl/t/rpl_innodb.test
> sql/log.cc
>
> per-file messages:
> mysql-test/suite/rpl/r/rpl_innodb.result
> results gained bug#40221 regression test part.
> mysql-test/suite/rpl/t/rpl_innodb.test
> bug#40221 regression test is added.
> sql/log.cc
> Flushing the pending row event at binlog_end_trans() is moved down to the commit
> branch.
> Resetting of trx_data->before_stmt_pos to the uninitialized value at the end
> of the statement is
> implemented so that both binlog_commit() and binlog_rollback() execututes it.
> === modified file 'mysql-test/suite/rpl/r/rpl_innodb.result'
> --- a/mysql-test/suite/rpl/r/rpl_innodb.result 2008-10-29 13:25:03 +0000
> +++ b/mysql-test/suite/rpl/r/rpl_innodb.result 2008-11-21 12:41:52 +0000
> @@ -81,4 +81,27 @@ FLUSH LOGS;
> -------- switch to master --------
> FLUSH LOGS;
> DROP DATABASE mysqltest1;
> +DROP TEMPORARY TABLE mysqltest1.tmp2;
> +SET @@session.binlog_format = 'row';
> +drop table if exists t1;
> +Warnings:
> +Note 1051 Unknown table 't1'
> +CREATE TABLE t1 (pk int auto_increment primary key) ENGINE=innodb;
> +reset master;
> +show binlog events from <binlog_start>;
> +Log_name Pos Event_type Server_id End_log_pos Info
> +begin;
> +insert into t1 values (1),(2);
> +*** the following UPDATE query wont generate any updates for the binlog ***
> +update t1 set pk = 3 where pk < 3;
> +ERROR 23000: Duplicate entry '3' for key 'PRIMARY'
> +commit;
> +*** the binlog must have only Write_rows events not any Update_rows ***
> +show binlog events from <binlog_start>;
> +Log_name Pos Event_type Server_id End_log_pos Info
> +master-bin.000001 # Query # # use `test`; BEGIN
> +master-bin.000001 # Table_map # # table_id: # (test.t1)
> +master-bin.000001 # Write_rows # # table_id: # flags: STMT_END_F
> +master-bin.000001 # Xid # # COMMIT /* XID */
> +drop table t1;
> End of 5.1 tests
>
> === modified file 'mysql-test/suite/rpl/t/rpl_innodb.test'
> --- a/mysql-test/suite/rpl/t/rpl_innodb.test 2008-10-29 13:25:03 +0000
> +++ b/mysql-test/suite/rpl/t/rpl_innodb.test 2008-11-21 12:41:52 +0000
> @@ -120,6 +120,31 @@ connection master;
> FLUSH LOGS;
>
> DROP DATABASE mysqltest1;
> +DROP TEMPORARY TABLE mysqltest1.tmp2;
> -- source include/master-slave-end.inc
>
> +#
> +# Bug #40221 Replication failure on RBR + UPDATE the primary key
> +#
> +
> +connection master;
> +
> +SET @@session.binlog_format = 'row';
> +drop table if exists t1;
> +CREATE TABLE t1 (pk int auto_increment primary key) ENGINE=innodb;
> +reset master;
> +source include/show_binlog_events.inc;
> +
> +begin;
> +insert into t1 values (1),(2);
> +--echo *** the following UPDATE query wont generate any updates for the binlog ***
> +--error ER_DUP_ENTRY
> +update t1 set pk = 3 where pk < 3;
> +commit;
> +--echo *** the binlog must have only Write_rows events not any Update_rows ***
> +source include/show_binlog_events.inc;
> +
> +drop table t1;
> +
> +
> --echo End of 5.1 tests
>
> === modified file 'sql/log.cc'
> --- a/sql/log.cc 2008-11-12 17:51:47 +0000
> +++ b/sql/log.cc 2008-11-21 12:41:52 +0000
> @@ -1377,8 +1377,6 @@ binlog_end_trans(THD *thd, binlog_trx_da
> FLAGSTR(thd->options, OPTION_NOT_AUTOCOMMIT),
> FLAGSTR(thd->options, OPTION_BEGIN)));
>
> - thd->binlog_flush_pending_rows_event(TRUE);
> -
No problem for this.
> /*
> NULL denotes ROLLBACK with nothing to replicate: i.e., rollback of
> only transactional tables. If the transaction contain changes to
> @@ -1387,6 +1385,7 @@ binlog_end_trans(THD *thd, binlog_trx_da
> */
> if (end_ev != NULL)
> {
> + thd->binlog_flush_pending_rows_event(TRUE);
No problem for this.
> /*
> Doing a commit or a rollback including non-transactional tables,
> i.e., ending a transaction where we might write the transaction
> @@ -1434,7 +1433,8 @@ binlog_end_trans(THD *thd, binlog_trx_da
> */
> mysql_bin_log.update_table_map_version();
> }
> -
> + if (!all)
> + trx_data->before_stmt_pos = MY_OFF_T_UNDEF;
If change this to trx_data->reset as Mats suggested in his review, the
result will be wrong.
> DBUG_RETURN(error);
> }
>
> @@ -1553,6 +1553,8 @@ static int binlog_commit(handlerton *hto
> int error= binlog_end_trans(thd, trx_data, &qev, all);
> DBUG_RETURN(error);
> }
> + if (!all)
> + trx_data->before_stmt_pos = MY_OFF_T_UNDEF;
The same as above.
> DBUG_RETURN(0);
> }
>
>