List:Internals« Previous MessageNext Message »
From:Michael Widenius Date:May 19 2009 9:58pm
Subject:Re: Feedback needed for WL#2687
View as plain text  
Hi!

>>>>> "Mats" == Mats Kindahl <mats@stripped> writes:

<cut>

Alfranio> Mixing transactional and non-transactional tables is not supported by the
Alfranio> current code. In a nutshell, non-transactional operations are logged upon
Alfranio> commit or rollback thus generating invalid sequence of operations.
Alfranio> For instance, assume the following statements and sessions (connections):
>> 
Alfranio> con0: CREATE TABLE test (a int);
Alfranio> con0: INSERT INTO test VALUES(1);
>> 
Alfranio> con1: START TRANSACTION;
Alfranio> con2: START TRANSACTION;
Alfranio> con1: UPDATE test SET a = 10;
Alfranio> con2: DELETE FROM test;
Alfranio> con2: COMMIT;
Alfranio> con1: COMMIT;
>> 
>> <cut>
>> 
>> The above assumption wasn't the case for MySQL 5.0-GA (but was changed
>> recently without any good reason).

Mats> Not so. There has been a change in the behavior regarding non-transactional
Mats> changes that were *first* in the transaction, but this behavior was already
Mats> present for non-transactional changes that were inside a transaction.

What exactly is the recent change ?

I am a bit confused here as a customer's told me how the change
in 5.0 had broken the data on their slaves and when discussing this
got to belive that we before wrote all non-transactional statements
directly to the binary log.

Looking at the code and the comments in 4.1 it looks like we wrote
them directly only of the cache was empty.

>> MySQL used to always log changes to non-transactional tables at once
>> and only buffer transactional changes.  This is something that MySQL
>> users have known and taken into account when writing their
>> applications.

Mats> Non-transactional changes were written to the binary log only when they were
Mats> first in the transaction. Also, the notion of when a statement was transactional
Mats> or not was decided on very loose grounds.

Define loose grounds.

>> Now things seams to work that everything is non-transactional tables
>> are logged at once until we change a non-transactional table in which
>> case everything is logged at commit.

Mats> I assume that you mean "transactional" instead of "non-transactional" in the
Mats> second case.

Yes.

>> The other change that was recently introduced, probably at the same
>> time, is that in case of ROLLBACK we send the whole statement to the
>> binary log, but with a ROLLBACK at end.  This will crash all setups
>> where you have transactional tables on the master and
>> non-transactional tables on the slave.

Mats> I assume that you means "the whole transaction" above. I don't know if you
Mats> consider that recent, but the code for this is in 4.1.

When testing this, it looked as if we would got ROLLBACK in more cases
than before in 5.1, but I may be mistaken.  This was however not what
the customer was complaining about, this was just an observation while
examining the logs.

>> The changed behavior in MySQL 5.0-GA have caused some concerns among
>> MySQL users as it has caused their data to be inconsistent on their
>> slaves.  I got several requests at the MySQL users conference what
>> could be done to get back the old behavior.

Mats> As mentioned, this behavior was in 4.1, and I don't know what inconsistencies
Mats> you are referring to, so it is hard for me to answer that.

It goes back to me understanding the exact nature of the change in
MySQL 5.0 that breaks customers applications.

Mats> I don't question that inconsistencies are there, but basically, statement based
Mats> replication cannot be made to behave in a consistent manner when mixing
Mats> non-transactional and transactional tables except in some controlled and limited
Mats> circumstances. This is what we are striving to change.

Yes, I understand that.
However doing changes like this in a GA release is not something one
should do.

The correct way, I assume to fix this, is to go over to row level
logging at once when there is mix of transactional followed by
non-transactional tables under one transaction.

>> It's clear that one can't mix non-transactional and transactional able
>> safely in any imaginable way;  Some are safe and some are not.
>> 
>> However the current way to do it will cause more problems as any
>> 'random' concurrent usage of the non transactional tables will show up
>> with wrong data one the slave.
>> 
>> With random I mean that depending on in which order tables are used,
>> we get different results on the slave.

Mats> Exactly.

Mats> As I said, this behavior was already in 4.1, and we want to change that so that
Mats> we are not dependent on the order and can replicate the data correctly to the
Mats> slave regardless of how transactional and non-transactional changes are made.

Change that, but in what version ?

>> Example:
>> 
>> CREATE TABLE t1 (a int) engine=innodb;
>> CREATE TABLE n1 (a int) engine=myisam;
>> CREATE TABLE n2 (a int) engine=myisam;
>> 
>> T1: BEGIN;
>> T1: INSERT into t1 values (1);
>> T1: INSERT into n1 values (1);
>> T2: INSERT INTO n2 select * from n1;
>> T1: COMMIT;
>> 
>> In this case you will get wrong data on the slave for n2 as the insert
>> of 1 into n1 is not logged until the commit.

Mats> ... and in the following scenario, you will have inconsistency on the slave
Mats> because n2 is logged before the transaction:

Mats> T1: BEGIN;
Mats> T1: INSERT into t1 values (1);
Mats> T1: INSERT INTO n2 select * from t1;
Mats> T1: COMMIT;

That is correct, but that is something our customers knows about in
5.0/5.1.

There is nothing that says which of the above versions is better as
both can break things.

There is however one really serious problem with not logging anything
until commit:

If you are ONLY using non-transactional tables, this will cause a
serious problem if you are using BEGIN/COMMIT in your code!

>> Bascily all INSERT SELECT, multi table updates and multi table delets
>> may now produce wrong results on the slave.

Mats> Correction: will since 4.1 produce the wrong results on the slave, yes.

Yes, in some cases if you don't wrap things mixeding of different
table types with LOCK TABLES.

>> One should NEVER change behavior in a GA release (especially behavior
>> that customers are relying on), so I suggest strongly that you should
>> change back replication how it worked in earlier MySQL 5.0 releases.

Mats> You mean pre-4.1, I assume.

No, I am talking about the change done in replication behavior in
MySQL 5.0 / 5.1 that was done recently.

So the question still stands: What, if anything, changed in logging
recently in these releases?

There may of course a chance that the customer was misinformed and
was able to confused me on this, but this particular customer is very
seldom misinformed about things like this.

Regards,
Monty
Thread
Feedback needed for WL#2687Alfranio Correia13 May
  • Re: Feedback needed for WL#2687Mats Kindahl13 May
    • Re: Feedback needed for WL#2687Alfranio Correia13 May
  • Re: Feedback needed for WL#2687He Zhenxing14 May
    • Re: Feedback needed for WL#2687He Zhenxing14 May
    • Re: Feedback needed for WL#2687Alfranio Correia15 May
      • Re: Feedback needed for WL#2687He Zhenxing18 May
  • re: Feedback needed for WL#2687Michael Widenius14 May