List:Commits« Previous MessageNext Message »
From:Mats Kindahl Date:December 14 2007 11:35am
Subject:Re: bk commit into 5.1 tree (sven:1.2651) BUG#26395
View as plain text  
Hi Sven!

I have some minor comments, but after than the patch is approved. You 
can just handle the items below, and then push.

Just my few cents,
Mats Kindahl

Sven Sandberg wrote:
> Below is the list of changes that have just been committed into a local
> 5.1 repository of sven. When sven does a push these changes will
> be propagated to the main repository and, within 24 hours after the
> push, to the public repository.
> For information on how to access the public repository
> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>
> ChangeSet@stripped, 2007-12-13 19:56:17+01:00, sven@riska.(none) +21 -0
>   BUG#26395: if crash during autocommit update to transactional table on master,
> slave fails
>   Now, every transaction (including autocommit transactions) start with
>   a BEGIN and end with a COMMIT/ROLLBACK in the binlog.
>   Added a test case, and updated lots of test case result files.
>
>   mysql-test/r/multi_update.result@stripped, 2007-12-13 19:56:12+01:00, sven@riska.(none)
> +2 -2
>     Updated result file
>
>   mysql-test/r/sp_trans_log.result@stripped, 2007-12-13 19:56:12+01:00, sven@riska.(none)
> +1 -0
>     Updated result file
>
>   mysql-test/suite/binlog/r/binlog_innodb.result@stripped, 2007-12-13 19:56:12+01:00,
> sven@riska.(none) +1 -0
>     Updated result file
>
>   mysql-test/suite/binlog/r/binlog_multi_engine.result@stripped, 2007-12-13
> 19:56:12+01:00, sven@riska.(none) +8 -0
>     Updated result file
>
>   mysql-test/suite/binlog/r/binlog_stm_blackhole.result@stripped, 2007-12-13
> 19:56:12+01:00, sven@riska.(none) +16 -0
>     Updated result file
>
>   mysql-test/suite/binlog/r/binlog_stm_mix_innodb_myisam.result@stripped, 2007-12-13
> 19:56:12+01:00, sven@riska.(none) +15 -0
>     Updated result file
>
>   mysql-test/suite/ndb/r/ndb_binlog_format.result@stripped, 2007-12-13 19:56:12+01:00,
> sven@riska.(none) +4 -0
>     Updated result file
>
>   mysql-test/suite/rpl/r/rpl_innodb_mixed_dml.result@stripped, 2007-12-13 19:56:12+01:00,
> sven@riska.(none) +71 -0
>     Updated result file
>
>   mysql-test/suite/rpl/r/rpl_row_charset_innodb.result@stripped, 2007-12-13
> 19:56:12+01:00, sven@riska.(none) +12 -0
>     Updated result file
>
>   mysql-test/suite/rpl/r/rpl_row_create_table.result@stripped, 2007-12-13 19:56:12+01:00,
> sven@riska.(none) +20 -19
>     Updated result file
>
>   mysql-test/suite/rpl/r/rpl_row_log_innodb.result@stripped, 2007-12-13 19:56:12+01:00,
> sven@riska.(none) +16 -8
>     Updated result file
>
>   mysql-test/suite/rpl/r/rpl_switch_stm_row_mixed.result@stripped, 2007-12-13
> 19:56:12+01:00, sven@riska.(none) +12 -0
>     Updated result file
>
>   mysql-test/suite/rpl/r/rpl_truncate_3innodb.result@stripped, 2007-12-13 19:56:12+01:00,
> sven@riska.(none) +12 -0
>     Updated result file
>
>   mysql-test/suite/rpl/t/rpl_row_create_table.test@stripped, 2007-12-13 19:56:12+01:00,
> sven@riska.(none) +4 -4
>     Updated result file
>
>   mysql-test/suite/rpl_ndb/r/rpl_ndb_stm_innodb.result@stripped, 2007-12-13
> 19:56:12+01:00, sven@riska.(none) +2 -2
>     Updated result file
>
>   mysql-test/suite/rpl_ndb/r/rpl_ndb_transaction.result@stripped, 2007-12-13
> 19:56:13+01:00, sven@riska.(none) +101 -0
>     Results for new test case
>
>   mysql-test/suite/rpl_ndb/r/rpl_ndb_transaction.result@stripped, 2007-12-13
> 19:56:13+01:00, sven@riska.(none) +0 -0
>
>   mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction-master.opt@stripped, 2007-12-13
> 19:56:13+01:00, sven@riska.(none) +1 -0
>     Options for new test case
>
>   mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction-master.opt@stripped, 2007-12-13
> 19:56:13+01:00, sven@riska.(none) +0 -0
>
>   mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction-slave.opt@stripped, 2007-12-13
> 19:56:13+01:00, sven@riska.(none) +1 -0
>     Options for new test case
>
>   mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction-slave.opt@stripped, 2007-12-13
> 19:56:13+01:00, sven@riska.(none) +0 -0
>
>   mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction.test@stripped, 2007-12-13 19:56:13+01:00,
> sven@riska.(none) +123 -0
>     Added new test case.
>
>   mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction.test@stripped, 2007-12-13 19:56:13+01:00,
> sven@riska.(none) +0 -0
>
>   sql/log.cc@stripped, 2007-12-13 19:56:13+01:00, sven@riska.(none) +65 -29
>      - Always write BEGIN and COMMIT around statements, even in autocommit
>        mode.
>      - Added comments for binlog_commit and binlog_rollback.
>
>   sql/log_event.cc@stripped, 2007-12-13 19:56:13+01:00, sven@riska.(none) +1 -0
>     Added debug trigger to avoid writing xid events to the binlog.
>   

[snip]

> diff -Nrup a/mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction.test
> b/mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction.test
> --- /dev/null	Wed Dec 31 16:00:00 196900
> +++ b/mysql-test/suite/rpl_ndb/t/rpl_ndb_transaction.test	2007-12-13 19:56:13 +01:00
> @@ -0,0 +1,123 @@
> +# Tests that transactions are replicated correctly, with various
> +# combinations of non-transactional and transactional non-XA tables.
> +# Also tests that an XA transaction where the master crashes just
> +# before writing the XID log event is executed correctly.  See below
> +# for implementation details.
> +#
> +# Written by Sven.
>   

Skip your name, for the same reasons as in the previous patch (you don't 
want to be associated with any changes done to this file later.)

> +
> +source include/ndb_master-slave.inc;
> +source include/have_ndb.inc;
> +
> +
> +CREATE TABLE tmyisam (a int) ENGINE = MYISAM;
> +CREATE TABLE tinnodb (a int) ENGINE = INNODB;
> +CREATE TABLE tndb (a int) ENGINE = NDB;
> +
> +SHOW CREATE TABLE tmyisam;
> +SHOW CREATE TABLE tinnodb;
> +SHOW CREATE TABLE tndb;
> +
> +
> +--echo ==== Test 1: Non-XA Engines ====
> +# Test that everything works fine with non-XA engines.  We just try
> +# all ways to do transactions involving ndb and/or myisam, with
> +# rollback or commit.
> +
> +--echo --- on master ---
> +
> +SET AUTOCOMMIT = 1;
> +
> +INSERT INTO tndb VALUES (1);
> +INSERT INTO tmyisam VALUES (1);
> +
> +BEGIN;
> +INSERT INTO tndb VALUES (2);
> +INSERT INTO tndb VALUES (3);
> +COMMIT;
> +
> +BEGIN;
> +INSERT INTO tmyisam VALUES (2);
> +INSERT INTO tmyisam VALUES (3);
> +COMMIT;
> +
> +BEGIN;
> +INSERT INTO tndb VALUES (4);
> +INSERT INTO tmyisam VALUES (4);
> +COMMIT;
> +
> +BEGIN;
> +INSERT INTO tndb VALUES (5);
> +INSERT INTO tndb VALUES (6);
> +ROLLBACK;
> +
> +BEGIN;
> +INSERT INTO tmyisam VALUES (5);
> +INSERT INTO tmyisam VALUES (6);
> +--warning 1196
> +ROLLBACK;
> +
> +BEGIN;
> +INSERT INTO tndb VALUES (7);
> +INSERT INTO tmyisam VALUES (7);
> +--warning 1196
> +ROLLBACK;
> +
> +SELECT * FROM tndb ORDER BY a;
> +SELECT * FROM tmyisam ORDER BY a;
> +
> +--echo --- on slave ---
> +--sync_slave_with_master
> +SELECT * FROM tndb ORDER BY a;
> +SELECT * FROM tmyisam ORDER BY a;
> +
> +
> +--echo ==== Test 2: Master crash before writing XID event on XA engine ====
> +# We now want to test the following scenario, to verify that BUG#26395
> +# has been fixed:
> +
> +#   "master and slave have a transactional table that uses XA. Master
> +#    has AUTOCOMMIT on and executes a statement (in this case an
> +#    INSERT). Master crashes just before writing the XID event."
> +
> +# In this scenario, master will roll back, so slave should not execute
> +# the statement, and slave should roll back later when master is
> +# restarted.
> +
> +# However, we the master to be alive so that we are sure it replicates
> +# the statement to the slave. So in the test case, we must therefore
> +# not crash the master. Instead, we fake the crash by just not writing
> +# the XID event to the binlog. This is done by the
> +# --debug=d,do_not_write_xid flag in the .opt file.
> +
> +# So, unlike if the master had crashed, the master *will* execute the
> +# statement. But the slave should not execute it. Hence, after the
> +# first test is executed, the expected result on master is a table
> +# with one row, and on slave a table with no rows.
> +
> +# To simulate the slave correctly, we wait until everything up to the
> +# XID is replicated. We cannot sync_slave_with_master, because that
> +# would wait for the transaction to end. Instead, we wait for
> +# "sufficiently long time". Then we stop the slave.
> +
> +# Note: since this puts the master binlog in an inconsistent state,
> +# this should be the last test of the file.
>   

It is kind of a hack. We run the risk of this test failing in PB when 
the "sufficiently long time" is not sufficient.

> +
> +--echo --- on master ---
> +--connection master
> +
> +INSERT INTO tinnodb VALUES (1);
> +SELECT * FROM tinnodb ORDER BY a;
> +
> +--echo --- on slave ---
> +--connection slave
> +--sleep 3
> +STOP SLAVE;
> +# the following statement should show that nothing has been replicated
> +SELECT * FROM tinnodb ORDER BY a;
>   

Please show the status of the slave as well by including 
"include/show_slave_status.inc". This will show that the slave has 
executed all the events up to the non-existent XID event. This will give 
us a result mismatch of the "sufficiently long time" is not sufficient.

The fact that it's hard to test these kinds of scenarios is a separate 
problem.

> +
> +
> +# clean up
> +DROP TABLE tmyisam;
> +DROP TABLE tinnodb;
> +DROP TABLE tndb;
> diff -Nrup a/sql/log.cc b/sql/log.cc
> --- a/sql/log.cc	2007-11-14 11:01:42 +01:00
> +++ b/sql/log.cc	2007-12-13 19:56:13 +01:00
> @@ -1402,6 +1402,13 @@ static int binlog_prepare(handlerton *ht
>    return 0;
>  }
>  
> +/**
> +  This function is called when a transaction involving a transactional
> +  table is committed.
> +
> +  It has the responsibility to flush the transaction cache to the
> +  binlog file.  See handlerton::commit for more details.
> +*/
>   

Add parameters and return value to the Doxygen comment.

Use @see handlerton::commit to reference another symbol in Doxygen.

>  static int binlog_commit(handlerton *hton, THD *thd, bool all)
>  {
>    DBUG_ENTER("binlog_commit");
> @@ -1414,7 +1421,15 @@ static int binlog_commit(handlerton *hto
>      trx_data->reset();
>      DBUG_RETURN(0);
>    }
> -  if (all)
> +  /*
> +    Write commit event if at least one of the following holds:
> +     - the user sends an explicit COMMIT; or
> +     - (we are not inside a BEGIN) and (the autocommit flag is on).
>   

Strange use of parentheses. This is plain English, not "Attempto 
Controlled English".

Minor issue, fix it if you feel inclined.

> +    However, if the user has not sent an explicit COMMIT, and we are
> +    either inside a BEGIN or run with autocommit off, then this is not
> +    the end of a transaction and we should not write a commit event.
> +  */
> +  if (all || !(thd->options & (OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN)))
>    {
>      Query_log_event qev(thd, STRING_WITH_LEN("COMMIT"), TRUE, FALSE);
>      qev.error_code= 0; // see comment in MYSQL_LOG::write(THD, IO_CACHE)
> @@ -1428,6 +1443,15 @@ static int binlog_commit(handlerton *hto
>    }
>  }
>  
> +/**
> +  This function is called when a transaction involving a transactional
> +  table is rolled back.
> +
> +  It has the responsibility to flush the transaction cache to the
> +  binlog file. However, if the transaction does not involve
> +  non-transactional tables, nothing needs to be logged. See
> +  handlerton::rollback for more details.
> +*/
>   

Add parameters and return value to the Doxygen comment. Use @see for 
referencing other symbols in Doxygen.

>  static int binlog_rollback(handlerton *hton, THD *thd, bool all)
>  {
>    DBUG_ENTER("binlog_rollback");
> @@ -3967,9 +3991,11 @@ int MYSQL_BIN_LOG::write_cache(IO_CACHE 
>    IMPLEMENTATION
>      - To support transaction over replication, we wrap the transaction
>        with BEGIN/COMMIT or BEGIN/ROLLBACK in the binary log.
> -      We want to write a BEGIN/ROLLBACK block when a non-transactional table
> -      was updated in a transaction which was rolled back. This is to ensure
> -      that the same updates are run on the slave.
> +      If a transaction that only involves transactional tables is
> +      rolled back, we do not binlog it. However, we write a
> +      BEGIN/ROLLBACK block when a non-transactional table was updated
> +      in a transaction which was rolled back. This is to ensure that
> +      the same updates are run on the slave.
>  */
>  
>  bool MYSQL_BIN_LOG::write(THD *thd, IO_CACHE *cache, Log_event *commit_event)
> @@ -3990,32 +4016,42 @@ bool MYSQL_BIN_LOG::write(THD *thd, IO_C
>      if (my_b_tell(cache) > 0)
>      {
>        /*
> -        Log "BEGIN" at the beginning of the transaction.
> -        which may contain more than 1 SQL statement.
> +        Log "BEGIN" at the beginning of every transaction.  Here, a
> +        transaction is either a BEGIN..COMMIT block or a single
> +        statement in autocommit mode.
>        */
> -      if (thd->options & (OPTION_NOT_AUTOCOMMIT | OPTION_BEGIN))
> -      {
> -        Query_log_event qinfo(thd, STRING_WITH_LEN("BEGIN"), TRUE, FALSE);
> -        /*
> -          Imagine this is rollback due to net timeout, after all statements of
> -          the transaction succeeded. Then we want a zero-error code in BEGIN.
> -          In other words, if there was a really serious error code it's already
> -          in the statement's events, there is no need to put it also in this
> -          internally generated event, and as this event is generated late it
> -          would lead to false alarms.
> -          This is safer than thd->clear_error() against kills at shutdown.
> -        */
> -        qinfo.error_code= 0;
> -        /*
> -          Now this Query_log_event has artificial log_pos 0. It must be adjusted
> -          to reflect the real position in the log. Not doing it would confuse the
> -          slave: it would prevent this one from knowing where he is in the
> -          master's binlog, which would result in wrong positions being shown to
> -          the user, MASTER_POS_WAIT undue waiting etc.
> -        */
> -        if (qinfo.write(&log_file))
> -          goto err;
> -      }
> +      Query_log_event qinfo(thd, STRING_WITH_LEN("BEGIN"), TRUE, FALSE);
> +      /*
> +        Imagine this is rollback due to net timeout, after all
> +        statements of the transaction succeeded. Then we want a
> +        zero-error code in BEGIN.  In other words, if there was a
> +        really serious error code it's already in the statement's
> +        events, there is no need to put it also in this internally
> +        generated event, and as this event is generated late it would
> +        lead to false alarms.
> +
> +        This is safer than thd->clear_error() against kills at shutdown.
> +      */
> +      qinfo.error_code= 0;
>   

Yes, BEGIN cannot fail. Maybe you should clear the error code before 
constructing the BEGIN event instead, but I really have no strong 
opinion on this. It seems to be just a copy and paste of the code above, 
so just leave it as it is.

> +      /*
> +        Now this Query_log_event has artificial log_pos 0. It must be
> +        adjusted to reflect the real position in the log. Not doing it
> +        would confuse the slave: it would prevent this one from
> +        knowing where he is in the master's binlog, which would result
> +        in wrong positions being shown to the user, MASTER_POS_WAIT
> +        undue waiting etc.
> +      */
> +      if (qinfo.write(&log_file))
> +        goto err;
> +
> +      DBUG_EXECUTE_IF("crash_before_writing_xid",
> +                      {
> +                        if ((write_error= write_cache(cache, false, true)))
> +                          DBUG_PRINT("error", ("error writing binlog cache: %d",
> +                                               write_error));
> +                        DBUG_PRINT("warning", ("crashing before writing xid"));
>   

The symbols in the DBUG_PRINT() should not be "warning" or "error". Use 
"info" instead.

> +                        abort();
> +                      });
>  
>        if ((write_error= write_cache(cache, false, false)))
>          goto err;
> diff -Nrup a/sql/log_event.cc b/sql/log_event.cc
> --- a/sql/log_event.cc	2007-11-28 12:11:13 +01:00
> +++ b/sql/log_event.cc	2007-12-13 19:56:13 +01:00
> @@ -4265,6 +4265,7 @@ Xid_log_event(const char* buf,
>  #ifndef MYSQL_CLIENT
>  bool Xid_log_event::write(IO_CACHE* file)
>  {
> +  DBUG_EXECUTE_IF("do_not_write_xid", return 0;);
>    return write_header(file, sizeof(xid)) ||
>           my_b_safe_write(file, (uchar*) &xid, sizeof(xid));
>  }
>
>   


-- 
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com


Thread
bk commit into 5.1 tree (sven:1.2651) BUG#26395Sven Sandberg13 Dec
  • Re: bk commit into 5.1 tree (sven:1.2651) BUG#26395Mats Kindahl14 Dec