List:Internals« Previous MessageNext Message »
From:Alfranio Correia Date:May 15 2009 6:34pm
Subject:Re: Feedback needed for WL#2687
View as plain text  
Hi all,

Find some comments in-line.

Cheers.

He Zhenxing wrote:
> Alfranio Correia wrote:
>   
>> Hi all,
>>
>> Find below a proposal on how to handle WL#2687.
>> There is also a draft of a patch available in a related bug (BUG#40278).
>>
>> ACTIONS:
>>
>> 1 - Feedback
>> 2 - Review of the patch (Jasonh, Luis).
>> Mats and Andrei, could you both also take a look?
>>
>> cheers.
>>
>> BACKGROUND
>> ----------
>> Mixing transactional and non-transactional tables is not supported by the
>> current code. In a nutshell, non-transactional operations are logged upon
>> commit or rollback thus generating invalid sequence of operations.
>> For instance, assume the following statements and sessions (connections):
>>
>> con0: CREATE TABLE test (a int);
>> con0: INSERT INTO test VALUES(1);
>>
>> con1: START TRANSACTION;
>> con2: START TRANSACTION;
>> con1: UPDATE test SET a = 10;
>> con2: DELETE FROM test;
>> con2: COMMIT;
>> con1: COMMIT;
>>
>> In RBR mode, the binary log produced would look like this:
>>
>> con0: CREATE TABLE test (a int);
>> con0: INSERT INTO test VALUES(1);
>>
>> con2: START TRANSACTION;
>> con2: DELETE FROM test WHERE a = 10;
>> con2: COMMIT;
>> con1: START TRANSACTION;
>> con1: UPDATE test SET a = 10 WHERE a = 1;
>> con1: COMMIT;
>>
>> Clearly, this would generate an error in the slave as the row (a = 10) to be
>> deleted would not be found.
>>
>>
>>
>> PROPOSAL
>> --------
>> To correctly handle the mix of transactional and non-transactional
>> tables, we
>> propose what follows.
>>
>> In STMT and MIXED modes, statements are copied to the binary log upon
>> commit or rollback thus preserving any semantic among the statements in a
>> transaction. Unfortunately, we cannot ensure that mixing transactional and
>> non-transactional tables will be handled correctly and as such we should
>> properly document this and discourage such modes under this circumstance.
>> Find below an example of what may go wrong:
>>
>> con0: CREATE TABLE test (a int);
>> con0: INSERT INTO test VALUES(1);
>>
>> con1: START TRANSACTION;
>> con2: START TRANSACTION;
>> con1: UPDATE test SET a = 10 where a = 1;
>> con2: DELETE FROM test where a = 10;
>> con2: COMMIT;
>> con1: COMMIT;
>>
>> In STMT or MIXED mode, the binary log produced would look like this:
>>
>> con0: CREATE TABLE test (a int);
>> con0: INSERT INTO test VALUES(1);
>>
>> con2: START TRANSACTION;
>> con2: DELETE FROM test WHERE a = 10;
>> con2: COMMIT;
>>
>> Clearly, the master and the slave would be out of sync.
>>
>> Note that we are not proposing to distinguish the case that
>> non-transactional
>> changes are processed before any transactional change, what could be safely
>> handled with minor changes to our proposal. We do not print out any warning
>> message or throw an error either. This is done to discourage the use of such
>> modes when mixing transactional and non-transactional tables.
>>
>> In RBR mode, non-transactional changes are logged to the binary log upon
>> committing or rolling back a statement. On the other hand, transactional
>> changes
>> are logged upon committing any implicit (i.e autocommit = 1) or explicit
>> (autocommit = 0/COMMIT or START/COMMIT) transaction and ignored upon
>> rolling it
>> back. To understand the details of this approach, let's use T to represent a
>> statement that only changes transactional tables, N to represent a statement
>> that only changes non-transactional tables, and M to represent a
>> statement that
>> changes both (i.e. multi-table update).
>>
>> We have what follows:
>>
>> 1) T...T: all the statements are transactional.
>>
>> In this case, the changes are copied to the binary log upon commit and
>> ignored
>> upon rollback.
>>
>>     
>
> OK
>
>   
>> 2) N...N: all the statements are non-transactional.
>>
>> In this case, the changes are logged to the binary log as soon as each
>> statement
>> is completed.
>>
>>     
>
> OK
>
>   
>> 3) N T...: the beginning part is non-transactional, followed by
>> transactional
>> statements.
>>     
>
> In this case, the beginning part of non-transactional statements should
> be logged separately and written to binlog directly in all modes. There
> is nothing unsafe while binlogging the first non-transactional part. And
> then the reset part T... will be either a T...T or TN..., which are
> handled by 1) or 4).
>   
I really don't like this idea of handling this case differently.
However, this is my personal opinion.
>   
>> 4) T N...: the beginning part is transactional, followed by
>> non-transactional
>> statements.
>>
>>     
>
> In this case, the transaction is not safe to log in STATEMENT format, a
> warning should be issued in STATEMENT mode, and a switch to ROW format
> should be done in MIXED mode.
>
> When in or after switched to ROW format, rows to non-transactional
> tables should be logged immediately to binlog. rows to transactional
> tables should be logged to the transaction cache first, and flush to
> binlog when commit or throw away when rollback.
>
>   
>> In both cases, same as above.
>>
>> 5) M...: where M is either N T or T N.
>>
>>     
>
> M is one single statement that modifies both transactional and
> non-transactional tables, so it is either NT or TN *only* in ROW format.
> In STATEMENT format, we cannot split it, and an M in a transaction
> should be treated the same way as 4).
>
>   
>> IMPROVEMENTS
>> ------------
>> In STMT and MIXED modes, if we have the 1), 2) or 3) cases described
>> above, the
>> replication is completely safe. Otherwise, there may be inconsistencies
>> among
>> master and slaves.
>>
>> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 1
>> 1.1 - Should we distinguish between safe and unsafe cases?
>>     
>
> yes
>
>   
>> 1.2 - Should we either print out warning messages or throw an error for the
>> unsafe cases?
>>
>>     
>
> yes
>
>   
>> TESTS
>> -----
>> The test case should check if transactions that mixes transactional and
>> non-transactional tables are correctly handled.
>>
>> In STMT and MIXED modes, statements are copied to the binary log upon
>> commit or rollback thus preserving any semantic among the statements in a
>> transaction.
>>
>> In RBR, changes on non-transactional tables must be directly logged into the
>> binary log while changes on transactional tables must be put into a
>> cache and
>> moved into the binary log upon commit. The changes in the code has
>> impact not
>> only on simple transactions but on how concurrent statements are handled,
>> creation of tables through select, updates on multiple-tables, functions and
>> triggers.
>>
>> Thus to check the behavior provided by the current code we divided
>> the test as follows:
>>
>> 1 - MIXING TRANSACTIONAL and NON-TRANSACTIONAL TABLES
>>      1.1 - NN TT
>>      1.2 - TT NN
>>      1.3 - TT NN TT
>> 2 - CONCURRENCY:
>>      2.1 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 0  | COMMIT
>>      2.1 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 1
>>      2.2 - NON-TRANSACT TABLES -  SET AUTOCOMMIT = 1  | START - COMMIT
>> 3 - CREATING TABLES through SELECT * FROM
>> 4 - UPDATING MULTIPLE TABLES
>> 5 - PROCEDURES, FUNCTIONS AND TRIGGERS
>> 6 - EVENTS
>>
>> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 2
>> 2.1 Are the tests proposed for procedures, functions and triggers enough?
>> 2.2 What should we test for events?
>>
>>     
>
> I'm not quite sure I understand events correctly, I think an event is
> just an automatically scheculed client session, so I don't think we need
> to test for events because it should already been handled by other
> cases.
>
>   
Me neither.
>> IMPLEMENTATION DETAILS
>> ----------------------
>>
>> There are two different sets of pending events. One for
>> non-transactional events
>> which are flushed to the binary log upon committing or rolling back a
>> statement;
>> The other one for transactional events which are flushed to the binary
>> log upon
>> committing a transaction and ignored upon rolling back it.
>>
>> TODO ------- WE NEED TO DISCUSS THIS CASE --------- DISCUSSION 3
>> 3.1 Do we really need a cache for non-transactional events? IMHO, a cache to
>> address the only the current worklog is overkilling.
>>
>>     
>
>
>   

Thread
Feedback needed for WL#2687Alfranio Correia13 May
  • Re: Feedback needed for WL#2687Mats Kindahl13 May
    • Re: Feedback needed for WL#2687Alfranio Correia13 May
  • Re: Feedback needed for WL#2687He Zhenxing14 May
    • Re: Feedback needed for WL#2687He Zhenxing14 May
    • Re: Feedback needed for WL#2687Alfranio Correia15 May
      • Re: Feedback needed for WL#2687He Zhenxing18 May
  • re: Feedback needed for WL#2687Michael Widenius14 May