List:Internals« Previous MessageNext Message »
From:Mats Kindahl Date:May 13 2009 6:55am
Subject:Re: Feedback needed for WL#2687
View as plain text  
Alfranio Correia wrote:
> Hi all,
> 
> Find below a proposal on how to handle WL#2687.
> There is also a draft of a patch available in a related bug (BUG#40278).

There is also BUG#28976 to consider when implementing WL#2687.

> ACTIONS:
> 
> 1 - Feedback
> 2 - Review of the patch (Jasonh, Luis).
> Mats and Andrei, could you both also take a look?

See my comments below.

> 
> cheers.
> 
> BACKGROUND
> ----------
> Mixing transactional and non-transactional tables is not supported by the
> current code. In a nutshell, non-transactional operations are logged upon
> commit or rollback thus generating invalid sequence of operations.
> For instance, assume the following statements and sessions (connections):
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con1: START TRANSACTION;
> con2: START TRANSACTION;
> con1: UPDATE test SET a = 10;
> con2: DELETE FROM test;
> con2: COMMIT;
> con1: COMMIT;
> 
> In RBR mode, the binary log produced would look like this:
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con2: START TRANSACTION;
> con2: DELETE FROM test WHERE a = 10;
> con2: COMMIT;
> con1: START TRANSACTION;
> con1: UPDATE test SET a = 10 WHERE a = 1;
> con1: COMMIT;
> 
> Clearly, this would generate an error in the slave as the row (a = 10) to be
> deleted would not be found.
> 
> 
> 
> PROPOSAL
> --------
> To correctly handle the mix of transactional and non-transactional
> tables, we
> propose what follows.
> 
> In STMT and MIXED modes, statements are copied to the binary log upon
> commit or rollback thus preserving any semantic among the statements in a
> transaction.

You need to clarify this and point out that the non-transactional statements are
*not* written out of order but will appear in the same order in the transaction
as they are written.

However, I am not sure why this need to apply to mixed mode. We can easily
switch to row-based, and write the T(s) part to the transaction cache and the
N(s) part directly to the binary log.

> Unfortunately, we cannot ensure that mixing transactional and
> non-transactional tables will be handled correctly

... in statement-based replication...

> and as such we should
> properly document this and discourage such modes under this circumstance.

Clarify "this circumstance." Do you mean when using statement-based replication?

> Find below an example of what may go wrong:
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con1: START TRANSACTION;
> con2: START TRANSACTION;
> con1: UPDATE test SET a = 10 where a = 1;
> con2: DELETE FROM test where a = 10;
> con2: COMMIT;
> con1: COMMIT;
> 
> In STMT or MIXED mode, the binary log produced would look like this:
> 
> con0: CREATE TABLE test (a int);
> con0: INSERT INTO test VALUES(1);
> 
> con2: START TRANSACTION;
> con2: DELETE FROM test WHERE a = 10;
> con2: COMMIT;
> 
> Clearly, the master and the slave would be out of sync.
> 
> Note that we are not proposing to distinguish the case that
> non-transactional changes are processed before any transactional change, what could
> be safely
> handled with minor changes to our proposal.

I don't understand this sentence.

> We do not print out any warning
> message or throw an error either. This is done to discourage the use of such
> modes when mixing transactional and non-transactional tables.

The second sentence does not follow from the first. Why would we not print a
warning or throw an error?

> In RBR mode, non-transactional changes are logged to the binary log upon
> committing or rolling back a statement. On the other hand, transactional
> changes are logged upon committing any implicit (i.e autocommit = 1) or
> explicit (autocommit = 0/COMMIT or START/COMMIT) transaction and ignored
> upon rolling it back. To understand the details of this approach, let's use
> T to represent a statement that only changes transactional tables, N to
> represent a statement that only changes non-transactional tables, and M to
> represent a statement that changes both (i.e. multi-table update).
> 
> We have what follows:
> 
> 1) T...T: all the statements are transactional.
> 
> In this case, the changes are copied to the binary log upon commit and
> ignored upon rollback.

OK.

> 2) N...N: all the statements are non-transactional.
> 
> In this case, the changes are logged to the binary log as soon as each
> statement is completed.

OK.

> 3) N T...: the beginning part is non-transactional, followed by
> transactional
> statements.
> 4) T N...: the beginning part is transactional, followed by
> non-transactional
> statements.
> 
> In both cases, same as above.
> 
> 5) M...: where M is either N T or T N.

You're not mentioning how to handle this.


> IMPROVEMENTS
> ------------
> In STMT and MIXED modes, if we have the 1), 2) or 3) cases described
> above, the replication is completely safe. Otherwise, there may be
> inconsistencies among master and slaves.
> 
> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 1
> 1.1 - Should we distinguish between safe and unsafe cases?

Do you mean unsafe as in "statements containing, e.g., non-deterministic
changes" or in any other sense. You were using "safe" in the preceding sentence,
so I am not sure which one you are referring to.

> 1.2 - Should we either print out warning messages or throw an error for the
> unsafe cases?

Yes?

> TESTS
> -----
> The test case should check if transactions that mixes transactional and
> non-transactional tables are correctly handled.
> 
> In STMT and MIXED modes, statements are copied to the binary log upon
> commit or rollback thus preserving any semantic among the statements in a
> transaction.
> 
> In RBR, changes on non-transactional tables must be directly logged into the
> binary log while changes on transactional tables must be put into a
> cache and
> moved into the binary log upon commit. The changes in the code has
> impact not
> only on simple transactions but on how concurrent statements are handled,
> creation of tables through select, updates on multiple-tables, functions and
> triggers.
> 
> Thus to check the behavior provided by the current code we divided
> the test as follows:
> 
> 1 - MIXING TRANSACTIONAL and NON-TRANSACTIONAL TABLES
>      1.1 - NN TT
>      1.2 - TT NN
>      1.3 - TT NN TT
> 2 - CONCURRENCY:
>      2.1 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 0  | COMMIT
>      2.1 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 1
>      2.2 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 1  | START - COMMIT
> 3 - CREATING TABLES through SELECT * FROM
> 4 - UPDATING MULTIPLE TABLES
> 5 - PROCEDURES, FUNCTIONS AND TRIGGERS
> 6 - EVENTS
> 
> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 2
> 2.1 Are the tests proposed for procedures, functions and triggers enough?

You need to ensure that you have different nesting levels for stored routines:
the behavior is different depending on the nesting level (not for RBR, but for SBR).

> 2.2 What should we test for events?

Some mixed cases as outlined in 1.

> IMPLEMENTATION DETAILS
> ----------------------
> 
> There are two different sets of pending events. One for
> non-transactional events
> which are flushed to the binary log upon committing or rolling back a
> statement;
> The other one for transactional events which are flushed to the binary
> log upon
> committing a transaction and ignored upon rolling back it.
> 
> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 3
> 3.1 Do we really need a cache for non-transactional events? IMHO, a cache to
> address the only the current worklog is overkilling.

I don't think so, and here is why:

Suppose that we start a statement "s" that contain transactional ( T(s) ) and
non-transactional ( N(s) ) changes so that the non-transactional changes span
more than one event. If we flush the N(s) part immediately, the binary log will
look something like this:

Table-map  n1
Write-rows N(s)  STMT_END_F
Table-map  n1
Write-rows N(s)  STMT_END_F
BEGIN;
Table-map  t2
Write-rows T(s)
Write-rows T(s)
COMMIT;

When we execute these statements on the master, any concurrent operations on n1
will be blocked, waiting for the statement to complete. However, on the slave,
concurrent operation on n1 will block until the first part has been executed,
then the lock will be released, and it will be allowed to execute. In that
sense, the changes to n1 will not be atomically executed, which can have all
sorts of consequences.

If we instead cache the N(s) changes, and then write them upon completing the
statement, we will have the following sequence.

Table-map  n1
Write-rows N(s)
Write-rows N(s)  STMT_END_F
BEGIN;
Table-map  t2
Write-rows T(s)
Write-rows T(s)
COMMIT;

which will execute correctly even in the presence of concurrent operations on
the same objects.

Just my few cemts,
Mats Kindahl
-- 
Mats Kindahl
Senior Software Engineer
Database Technology Group
Sun Microsystems
Thread
Feedback needed for WL#2687Alfranio Correia13 May
  • Re: Feedback needed for WL#2687Mats Kindahl13 May
    • Re: Feedback needed for WL#2687Alfranio Correia13 May
  • Re: Feedback needed for WL#2687He Zhenxing14 May
    • Re: Feedback needed for WL#2687He Zhenxing14 May
    • Re: Feedback needed for WL#2687Alfranio Correia15 May
      • Re: Feedback needed for WL#2687He Zhenxing18 May
  • re: Feedback needed for WL#2687Michael Widenius14 May