List:Replication« Previous MessageNext Message »
From:He Zhenxing Date:May 14 2009 9:46am
Subject:Re: Feedback needed for WL#2687
View as plain text  
Alfranio Correia wrote:
> Hi all,
> 
> Find below a proposal on how to handle WL#2687.
> There is also a draft of a patch available in a related bug (BUG#40278).
> 
> ACTIONS:
> 
> 1 - Feedback
> 2 - Review of the patch (Jasonh, Luis).
> Mats and Andrei, could you both also take a look?
> 
> cheers.
> 
> BACKGROUND
> ----------
> Mixing transactional and non-transactional tables is not supported by the
> current code. In a nutshell, non-transactional operations are logged upon
> commit or rollback thus generating invalid sequence of operations.
> For instance, assume the following statements and sessions (connections):
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con1: START TRANSACTION;
> con2: START TRANSACTION;
> con1: UPDATE test SET a = 10;
> con2: DELETE FROM test;
> con2: COMMIT;
> con1: COMMIT;
> 
> In RBR mode, the binary log produced would look like this:
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con2: START TRANSACTION;
> con2: DELETE FROM test WHERE a = 10;
> con2: COMMIT;
> con1: START TRANSACTION;
> con1: UPDATE test SET a = 10 WHERE a = 1;
> con1: COMMIT;
> 
> Clearly, this would generate an error in the slave as the row (a = 10) to be
> deleted would not be found.
> 
> 
> 
> PROPOSAL
> --------
> To correctly handle the mix of transactional and non-transactional
> tables, we
> propose what follows.
> 
> In STMT and MIXED modes, statements are copied to the binary log upon
> commit or rollback thus preserving any semantic among the statements in a
> transaction. Unfortunately, we cannot ensure that mixing transactional and
> non-transactional tables will be handled correctly and as such we should
> properly document this and discourage such modes under this circumstance.
> Find below an example of what may go wrong:
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con1: START TRANSACTION;
> con2: START TRANSACTION;
> con1: UPDATE test SET a = 10 where a = 1;
> con2: DELETE FROM test where a = 10;
> con2: COMMIT;
> con1: COMMIT;
> 
> In STMT or MIXED mode, the binary log produced would look like this:
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con2: START TRANSACTION;
> con2: DELETE FROM test WHERE a = 10;
> con2: COMMIT;
> 
> Clearly, the master and the slave would be out of sync.
> 
> Note that we are not proposing to distinguish the case that
> non-transactional
> changes are processed before any transactional change, what could be safely
> handled with minor changes to our proposal. We do not print out any warning
> message or throw an error either. This is done to discourage the use of such
> modes when mixing transactional and non-transactional tables.
> 
> In RBR mode, non-transactional changes are logged to the binary log upon
> committing or rolling back a statement. On the other hand, transactional
> changes
> are logged upon committing any implicit (i.e autocommit = 1) or explicit
> (autocommit = 0/COMMIT or START/COMMIT) transaction and ignored upon
> rolling it
> back. To understand the details of this approach, let's use T to represent a
> statement that only changes transactional tables, N to represent a statement
> that only changes non-transactional tables, and M to represent a
> statement that
> changes both (i.e. multi-table update).
> 
> We have what follows:
> 
> 1) T...T: all the statements are transactional.
> 
> In this case, the changes are copied to the binary log upon commit and
> ignored
> upon rollback.
> 

OK

> 2) N...N: all the statements are non-transactional.
> 
> In this case, the changes are logged to the binary log as soon as each
> statement
> is completed.
> 

OK

> 3) N T...: the beginning part is non-transactional, followed by
> transactional
> statements.

In this case, the beginning part of non-transactional statements should
be logged separately and written to binlog directly in all modes. There
is nothing unsafe while binlogging the first non-transactional part. And
then the reset part T... will be either a T...T or TN..., which are
handled by 1) or 4).

> 4) T N...: the beginning part is transactional, followed by
> non-transactional
> statements.
> 

In this case, the transaction is not safe to log in STATEMENT format, a
warning should be issued in STATEMENT mode, and a switch to ROW format
should be done in MIXED mode.

When in or after switched to ROW format, rows to non-transactional
tables should be logged immediately to binlog. rows to transactional
tables should be logged to the transaction cache first, and flush to
binlog when commit or throw away when rollback.

> In both cases, same as above.
> 
> 5) M...: where M is either N T or T N.
> 

M is one single statement that modifies both transactional and
non-transactional tables, so it is either NT or TN *only* in ROW format.
In STATEMENT format, we cannot split it, and an M in a transaction
should be treated the same way as 4).

> 
> 
> IMPROVEMENTS
> ------------
> In STMT and MIXED modes, if we have the 1), 2) or 3) cases described
> above, the
> replication is completely safe. Otherwise, there may be inconsistencies
> among
> master and slaves.
> 
> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 1
> 1.1 - Should we distinguish between safe and unsafe cases?

yes

> 1.2 - Should we either print out warning messages or throw an error for the
> unsafe cases?
> 

yes

> 
> TESTS
> -----
> The test case should check if transactions that mixes transactional and
> non-transactional tables are correctly handled.
> 
> In STMT and MIXED modes, statements are copied to the binary log upon
> commit or rollback thus preserving any semantic among the statements in a
> transaction.
> 
> In RBR, changes on non-transactional tables must be directly logged into the
> binary log while changes on transactional tables must be put into a
> cache and
> moved into the binary log upon commit. The changes in the code has
> impact not
> only on simple transactions but on how concurrent statements are handled,
> creation of tables through select, updates on multiple-tables, functions and
> triggers.
> 
> Thus to check the behavior provided by the current code we divided
> the test as follows:
> 
> 1 - MIXING TRANSACTIONAL and NON-TRANSACTIONAL TABLES
>      1.1 - NN TT
>      1.2 - TT NN
>      1.3 - TT NN TT
> 2 - CONCURRENCY:
>      2.1 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 0  | COMMIT
>      2.1 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 1
>      2.2 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 1  | START - COMMIT
> 3 - CREATING TABLES through SELECT * FROM
> 4 - UPDATING MULTIPLE TABLES
> 5 - PROCEDURES, FUNCTIONS AND TRIGGERS
> 6 - EVENTS
> 
> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 2
> 2.1 Are the tests proposed for procedures, functions and triggers enough?
> 2.2 What should we test for events?
> 

I'm not quite sure I understand events correctly, I think an event is
just an automatically scheculed client session, so I don't think we need
to test for events because it should already been handled by other
cases.

> 
> IMPLEMENTATION DETAILS
> ----------------------
> 
> There are two different sets of pending events. One for
> non-transactional events
> which are flushed to the binary log upon committing or rolling back a
> statement;
> The other one for transactional events which are flushed to the binary
> log upon
> committing a transaction and ignored upon rolling back it.
> 
> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 3
> 3.1 Do we really need a cache for non-transactional events? IMHO, a cache to
> address the only the current worklog is overkilling.
> 


Thread