Andrei Elkin wrote:
> Mats,
>
>
[snip]
>>>>> binlog_commit()'s argument
>>>>>
>>>>> @param all true if this is the last statement before a COMMIT
>>>>> statement; false
>>>>> if either this is a statement in a
>>>>> transaction but not the last, or if this is a
> statement
>>>>> not inside a BEGIN block and autocommit is on.
>>>>>
>>>>> the false's meaning has extended with your patch. Could you please
> update the
>>>>> comments?
>>>>>
>>>>>
>>>> Well... actually, it changed with the introduction of patch for
>>>> BUG#12713, but I will update the comment.
>>>>
>>>>
>>> I thought post-bug#12713 server executed binlog_commit() same way as
>>> the pre- version.
>>> That's your patch forces committing even though `all' being false.
>>>
>>>
>> No, but AIUI, with the introduction of the patch for BUG#12713, every
>> statement is terminated with a statement commit, and every real
>> transaction is terminated with a transaction commit. Previously, the
>> last statement commit could be skipped, but it shouldn't be skipped
>> any more.
>>
>
> What i meant is that either before and after bug#12717 fixes
> binlog_commit(all==false) does not do any effective work.
> But with your fixes the function gains an extra rule and starts handling
> all==false with doing some work.
>
No, binlog_commit() has always been able to do some work with all ==
false. For example, executing a statement outside of a transaction will
call binlog_commit() with all == false. I'm not sure what you're aiming at.
>> What do you mean by "forces committing even though `all' being false"?
>> Of course we can commit a statement (and even regardless of whether it
>> is part of a transaction or not).
>>
>
> I meant the technical side of committing.
>
> In either format an autocommit-ON query on a trans table has got
> committed to binlog via
>
> TC_LOG_BINLOG::log_xid ->
> binlog_end_trans(all==TRUE)
> bypassing (effectively - not making any
> work, not actually - still being invoked) binlog_commit(all==FALSE).
>
> Should not be there any tricks related to XA, all==TRUE would be
> passed to binlog_commit even though it's the autocommit=ON query not a
> part of multi-statement transaction.
>
> So, that's your changes force committing - flushing the cache to the
> binlog file - when all==false.
>
No, no. The parameter 'all' is not used for this purpose. If
TC_LOG_BINLOG::log_xid() has been called, that will result in an empty
transaction cache, causing binlog_commit() to not do any work. The value
of the all flag depends on whether we are committing a real transaction
or whether we are committing a statement transaction.
>>>>> the original logics do not mention (thd->options &
> OPTION_KEEP_LOG)
>>>>> in the following if arglist.
>>>>> What is the reason on add it?
>>>>>
>>>>>
>>>> It was there from the beginning in both binlog_rollback() and
>>>> binlog_savepoint_rollback(), so I have just added that as a
>>>> condition. The option is used to force the statement to be written to
>>>> the binary log in the case of a rollback, so it may be partially
>>>> redundant (it's main purpose seems to be to ensure that DROP TEMPORARY
>>>> TABLE and CREATE TEMPORARY TABLE goes to the binary log). I think we
>>>> should go over it's usage, but that is beyond this patch.
>>>>
>>>>
>>>>
>>> But that's binlog_commit() that i have meant, and it did no have the
>>> option referred.
>>>
>>>
>> True. As I mentioned on IRC, I tried not using it and semantically
>> there is no difference. There is a difference in the binary log in
>> that the BEGIN is logged either as "use test; BEGIN" or as just
>> "BEGIN". Safer to keep it the way it was though, so I'll change it to
>> the original use of OPTION_KEEP_LOG (i.e., not use it).
>>
>
> ok
>
>
>
>
>>>>>> + in_transaction &&
>>>>>> + (all
>>>>>> + ||
>>>>>> + (!trx_data->at_least_one_stmt &&
>>>>>> + thd->transaction.stmt.modified_non_trans_table)) ||
>>>>>>
>>>>>>
>>>>> Previously binlog_commit called binlog_end_trans(thd, trx_data,
> &qev)
>>>>> having as warrants to do that either
>>>>> all or ! in_transaction
>>>>>
>>>>> Now i see that in the case
>>>>>
>>>>> in_transaction is true,
>>>>> all is false,
>>>>> the current statement has made what the identifier says -
>>>>> thd->transaction.stmt.modified_non_trans_table
>>>>> and the trans cache is empty
>>>>>
>>>>> binlog_commit() is going to inject COMMIT into the binlog.
>>>>>
>>>>> And what is a use case scenario for such combination?
>>>>>
>>>>>
>>>>
>>>>
>
> ...
>
>
>>> I think you mean that if the first query after BEGIN changes on a
>>> non-ta table the query binlogged ahead of BEGIN with binlog format
>>> stmt, correct?
>>>
>>>
>> Yes.
>>
>>
>>> But there was no special reason behind that policy. It has remained for
>>> statement format as a consequence of binlogging algirthm which starts
>>> filling the trans cache at first appearence of a transactional table.
>>>
>>>
>
>
>> The reason for this policy is that there is no obvious reason to hold
>> not send that change over to the slave since it has already
>> occurred. The following scenario would fail if we did not do that
>> (both threads are on the master):
>>
>> T-1 T-2
>> W(t) |
>> | W(t)
>> | |
>> | C
>> R(t)
>> W(n)
>> |
>> C
>>
>> This scenario would move the T-2 W(t) ahead of the T-1 W(t), and the
>> only reason that we do *not* write all non-transactional writes
>> directly to the binary log is because it would potentially cause the
>> slave to stop hard with a syntax error.
>>
>
> I know that, but neither original statement replication nor your
> attempt to mimic that behaviour help with Bug#28976 Mixing trans and
> non-trans tables in one transaction results in incorrect binlog, the
> bug case you must be perfectly aware of.
>
Yes, of course, but I don't think that we should change the behavior in
the sense that you propose and instead move towards a more correct
behavior by logging non-transactional writes if we can. The behavior has
been there for quite some time also for row-based replication, which
means that there is a chance that there are customers relying on it.
>>> With your fixes, I believe, we could do not copy that statement tricky
>>> behaviour and to put the first non-ta query row-events into the cache
>>> to appear in binlog after BEGIN.
>>>
>>> Getting back to that if
>>>
>>> + if ((thd->options & OPTION_KEEP_LOG) ||
>>> + in_transaction &&
>>> + (all ||
>>> + (!trx_data->at_least_one_stmt &&
>>> + thd->transaction.stmt.modified_non_trans_table)) ||
>>> + !in_transaction && !all)
>>>
>>> Why should not it be just as it was before:
>>>
>>> - if (all || !in_transaction)
>>>
>>> ?
>>>
>>>
>> Because that would break the case I give above.
>>
>
> I think we should not try to mimic statement format binloging of mixed
> transaction. To repeat, it was a consequence of the design of
> accumulating and flushing the trans cache. It happens to be viewed
> as an "optimization", but it is not less a confusion nor a remedy
> against bug#28976.
>
> But simplyfying the if condition we will gain less troubles in future as
> well as will remove pieces of your patch - at_least_one_stmt.
>
I don't think we should do that change of the semantics now, since we
are about to release 5.1 as GA. We need to solve BUG#28976 in some way,
however, the simplification you propose will not solve it, but introduce
an incompatibility with previous versions that we do not want to
introduce at this time.
>
>>> You principal use case of ac=ON query on a not-ta table with row-based
>>> format satisfy.
>>> All can be false for
>>>
> ROW
>
>>> format.
>>> Then in_transaction == false and the rows binlog.
>>>
>>> Yes, there will be "difference" in the order of binlogging comparing
>>> with stmt format. However, indeed that would be correction of the order of
>>> binlogging.
>>>
>>>
>
>
>> The AUTO_COMMIT=ON case has to do with writing BEGIN/COMMIT for each
>> transactional unit.
>>
>>
>
> In above paragraph I meant that the original condition
>
> if (all || !in_transaction)
>
> satisfy to your changes. There will be only the difference with statement
> format mixed trasaction binlogging wrt to the first (next to BEGIN)
> query(s) of the mixed transaction (if it is on a non-ta table).
>
Yes, but I don't think we should change that semantics at this time. We
do that when we have a good solution for BUG#28976.
Best wishes,
Mats Kindahl
> regards,
>
> Andrei
>
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com