Hi Dmitry,
Thanks for your comments on the patch.
First of all, I will follow most of your advices(eg. fixing typo or
commenting the code)
Some comments which need to be clarified are below, please check them.
[snip]
> > @@ -667,10 +661,6 @@ Note 1051 Unknown table 'test.tt_xx_1'
> > ROLLBACK;
> > -b-b-b-b-b-b-b-b-b-b-b- >> B T Drop-Temp-If-Xe-Temp R <<
> -b-b-b-b-b-b-b-b-b-b-b-
> > Log_name Pos Event_type Server_id End_log_pos Info
> > -master-bin.000001 # Query # # BEGIN
> > -master-bin.000001 # Query # # use `test`; INSERT INTO tt_xx_1() VALUES (1)
> > -master-bin.000001 # Query # # use `test`; DROP TEMPORARY TABLE IF EXISTS
> `tt_xx_1` /* generated by server */
> > -master-bin.000001 # Query # # ROLLBACK
> > -e-e-e-e-e-e-e-e-e-e-e- >> B T Drop-Temp-If-Xe-Temp R <<
> -e-e-e-e-e-e-e-e-e-e-e-
> >
> > SET @commands= 'B T Drop-Temp-If-Xe-Temp N Drop-Temp-If-Xe-Temp R';
> > @@ -685,17 +675,12 @@ Warnings:
> > Note 1051 Unknown table 'test.tt_xx_1'
> > ROLLBACK;
> > Warnings:
> > -Warning 1196 Some non-transactional changed tables couldn't be rolled back
> > +Warning # Some non-transactional changed tables couldn't be rolled back
> > -b-b-b-b-b-b-b-b-b-b-b- >> B T Drop-Temp-If-Xe-Temp N
> Drop-Temp-If-Xe-Temp R << -b-b-b-b-b-b-b-b-b-b-b-
> > Log_name Pos Event_type Server_id End_log_pos Info
> > master-bin.000001 # Query # # BEGIN
> > master-bin.000001 # Query # # use `test`; INSERT INTO nt_xx_1() VALUES (1)
> > master-bin.000001 # Query # # COMMIT
> > -master-bin.000001 # Query # # BEGIN
> > -master-bin.000001 # Query # # use `test`; INSERT INTO tt_xx_1() VALUES (1)
> > -master-bin.000001 # Query # # use `test`; DROP TEMPORARY TABLE IF EXISTS
> `tt_xx_1` /* generated by server */
> > -master-bin.000001 # Query # # use `test`; DROP TEMPORARY TABLE IF EXISTS
> `tt_xx_1` /* generated by server */
> > -master-bin.000001 # Query # # ROLLBACK
> > -e-e-e-e-e-e-e-e-e-e-e- >> B T Drop-Temp-If-Xe-Temp N
> Drop-Temp-If-Xe-Temp R << -e-e-e-e-e-e-e-e-e-e-e-
> >
> > SET @commands= 'B T Drop-Temp-TXe-Temp R';
>
> Hmmm... So after your patch successful DROP TEMPORARY TABLE IF EXISTS
> that didn't drop table will no longer stay in binary log in case of ROLLBACK.
> OTOH it will present in binary log in case of COMMIT.
> Isn't such behavior inconsistent?
>
> I don't think this change will have any practical consequences and,
> ultimately, it is you, replication guys, who decide what is the correct/
> acceptable behavior in such cases...
>
I talked to others. It seems they agree with you. But we didn't decided
yet.
> ...
>
[snip]
> > === modified file 'sql/binlog.cc'
> > --- a/sql/binlog.cc 2010-12-17 02:01:32 +0000
> > +++ b/sql/binlog.cc 2011-01-05 05:21:07 +0000
> > @@ -775,55 +806,23 @@ static int binlog_rollback(handlerton *h
> > a cache and need to be rolled back.
> > */
> > error |= binlog_truncate_trx_cache(thd, cache_mngr, all);
> > + DBUG_RETURN(error);
> > }
> > - else if (!error)
> > - {
> > - /*
> > - We flush the cache wrapped in a beging/rollback if:
> > - . aborting a single or multi-statement transaction and;
> > - . the format is STMT and non-trans engines were updated or;
> > - . the OPTION_KEEP_LOG is activate.
> > - . the OPTION_KEEP_LOG is active or;
> > - . the format is STMT and a non-trans table was updated or;
> > - . the format is MIXED and a temporary non-trans table was
> > - updated or;
> > - . the format is MIXED, non-trans table was updated and
> > - aborting a single statement transaction;
> > - */
> > - if (ending_trans(thd, all) &&
> > - ((thd->variables.option_bits & OPTION_KEEP_LOG) ||
> > - (trans_has_updated_non_trans_table(thd) &&
> > - thd->variables.binlog_format == BINLOG_FORMAT_STMT) ||
> > - (cache_mngr->trx_cache.changes_to_non_trans_temp_table()
> &&
> > - thd->variables.binlog_format == BINLOG_FORMAT_MIXED) ||
> > - (trans_has_updated_non_trans_table(thd) &&
> > - ending_single_stmt_trans(thd,all) &&
> > - thd->variables.binlog_format == BINLOG_FORMAT_MIXED)))
> > - error= binlog_rollback_flush_trx_cache(thd, cache_mngr);
> >
> > - /*
> > - Truncate the cache if:
> > - . aborting a single or multi-statement transaction or;
> > - . the OPTION_KEEP_LOG is not activate and;
> > - . the format is not STMT or no non-trans were updated.
> > - . the OPTION_KEEP_LOG is not active and;
> > - . the format is not STMT or no non-trans was updated and;
> > - . the format is not MIXED or no temporary non-trans was
> > - updated.
> > - */
> > - else if (ending_trans(thd, all) ||
> > - (!(thd->variables.option_bits & OPTION_KEEP_LOG)
> &&
> > - (!stmt_has_updated_non_trans_table(thd) ||
> > - thd->variables.binlog_format != BINLOG_FORMAT_STMT)
> &&
> > - (!cache_mngr->trx_cache.changes_to_non_trans_temp_table() ||
> > - thd->variables.binlog_format != BINLOG_FORMAT_MIXED)))
> > - error= binlog_truncate_trx_cache(thd, cache_mngr, all);
> > + if (cache_mngr->trx_cache_cannot_rollback())
> > + {
> > + if (ending_trans(thd, all))
> > + {
> > + error= binlog_rollback_flush_trx_cache(thd, cache_mngr);
> > + }
> > + else
> > + {
> > + /* The statement should be kept in trx_cache. */
> > + cache_mngr->trx_cache.set_prev_position(MY_OFF_T_UNDEF);
> > + }
> > }
> > - /*
> > - This is part of the stmt rollback.
> > - */
> > - if (!all)
> > - cache_mngr->trx_cache.set_prev_position(MY_OFF_T_UNDEF);
> > + else
> > + error= binlog_truncate_trx_cache(thd, cache_mngr, all);
> >
> > DBUG_RETURN(error);
> > }
>
> To my shame I wasn't able to understand fully the above changes. So
> please ensure that they are properly reviewed by one of replication
> experts. Please also consider adding/keeping comments explaining
> logic of the above code.
>
I am sorry for the complex code. I've refactored the code and added
comments to make it more clear. And I will ask other reviewer to check
the new code.
> ...
>
> > @@ -3287,9 +3272,7 @@ bool MYSQL_BIN_LOG::write(Log_event *eve
> > {
> > THD *thd= event_info->thd;
> > bool error= 1;
> > - bool is_trans_cache= FALSE;
> > DBUG_ENTER("MYSQL_BIN_LOG::write(Log_event *)");
> > - binlog_cache_data *cache_data= 0;
> >
> > if (thd->binlog_evt_union.do_union)
> > {
> > @@ -3336,6 +3319,9 @@ bool MYSQL_BIN_LOG::write(Log_event *eve
> > #endif /* HAVE_REPLICATION */
> >
> > IO_CACHE *file= NULL;
> > + binlog_cache_mngr *cache_mngr= NULL;
> > + binlog_cache_data *cache_data= 0;
> > + bool is_trans_cache= FALSE;
> >
> > if (event_info->use_direct_logging())
> > {
> > @@ -3347,16 +3333,11 @@ bool MYSQL_BIN_LOG::write(Log_event *eve
> > if (thd->binlog_setup_trx_data())
> > goto err;
> >
> > - binlog_cache_mngr *const cache_mngr=
> > - (binlog_cache_mngr*) thd_get_ha_data(thd, binlog_hton);
> > -
> > + cache_mngr= (binlog_cache_mngr*) thd_get_ha_data(thd, binlog_hton);
> > is_trans_cache= use_trans_cache(thd, event_info->use_trans_cache());
> > file= cache_mngr->get_binlog_cache_log(is_trans_cache);
> > cache_data= cache_mngr->get_binlog_cache_data(is_trans_cache);
> >
> > - if (thd->lex->stmt_accessed_non_trans_temp_table())
> > - cache_data->set_changes_to_non_trans_temp_table();
> > -
> > thd->binlog_start_trans_and_stmt();
> > }
> > DBUG_PRINT("info",("event type: %d",event_info->get_type_code()));
> > @@ -3439,6 +3420,8 @@ bool MYSQL_BIN_LOG::write(Log_event *eve
> >
> > error= 0;
> >
> > + if (is_trans_cache && stmt_cannot_safely_rollback(thd))
> > + cache_mngr->set_trx_cache_cannot_rollback();
>
> The above if-statement makes an assumption that in cases when
> is_trans_cache == TRUE the cache_mngr is also non-NULL.
> This is a correct assumption at this point, but maybe it makes
> sense to move this if-statement to the block which sets
> is_trans_cache and cache_mngr instead?
> IMO this way code is going to be more future-proof/easier to support.
>
> Or do I miss something?
Yes, I intent to call set_trx_cache_cannot_rollback() just after
ev->write() be executed successfully. It means the log event has been
written into the cache.
>
> > err:
> > if (event_info->use_direct_logging())
> > {
>
> ...
>
> > === modified file 'sql/ha_ndbcluster.cc'
> > --- a/sql/ha_ndbcluster.cc 2010-11-18 16:34:56 +0000
> > +++ b/sql/ha_ndbcluster.cc 2011-01-05 05:21:07 +0000
> > @@ -4609,8 +4609,8 @@ void ha_ndbcluster::transaction_checks(T
> > {
> > m_transaction_on= FALSE;
> > /* Would be simpler if has_transactions() didn't always say "yes" */
> > - thd->transaction.all.modified_non_trans_table=
> > - thd->transaction.stmt.modified_non_trans_table= TRUE;
> > + thd->transaction.all.modified_non_trans_table();
> > + thd->transaction.stmt.modified_non_trans_table();
> > }
> > else if (!thd->transaction.on)
> > m_transaction_on= FALSE;
> >
>
> Taking into account that the above code works only in cases when
> trans_commit_stmt() is going to be called before transaction commit/
> rollback does it makes sense to do:
>
> thd->transaction.all.modified_non_trans_table();
>
> ?
>
To my shame I am not sure about it So I kept the code not removed.
I think we can remove the call if you sure trans_commit_stmt will be
called just after it.
> Maybe instead we should rely on trans_commit/rollback_stmt doing its
> job and propagating flags from statement to "normal" transaction?
>
> ...
>
[snip]
> > + /*
> > + Define the type of statemens which cannot be rolled back safely.
> > + Each type occupies one bit in unsafe_rollback_flags.
> > */
> > - bool modified_non_trans_table;
> > + enum
> > + {
> > + MODIFIED_NON_TRANS_TABLE= 0x01,
> > + CREATED_TEMP_TABLE= 0x02,
> > + DROPPED_TEMP_TABLE= 0x04
> > + };
> > +public:
> > + bool cannot_safely_rollback() const
> > + {
> > + return unsafe_rollback_flags > 0;
> > + }
> > + unsigned int get_unsafe_rollback_flags() const
> > + {
> > + return unsafe_rollback_flags;
> > + }
> > + void set_unsafe_rollback_flags(unsigned int flags)
> > + {
> > + unsafe_rollback_flags= flags;
> > + }
> > + void add_unsafe_rollback_flags(unsigned int flags)
> > + {
> > + unsafe_rollback_flags|= flags;
> > + }
> > + void reset_unsafe_rollback_flags()
> > + {
> > + unsafe_rollback_flags= 0;
> > + }
> > + void modified_non_trans_table()
> > + {
> > + unsafe_rollback_flags|= MODIFIED_NON_TRANS_TABLE;
> > + }
>
> In my opinion it makes sense to rename the above method
> in such way that it begins with a verb. For example, to
> mark_modified_non_trans_table().
Yes, I agree you. And I intent to use a verb. As they are called after
permanent change has happened, so I use the pass tense.
>
> > + bool has_modified_non_trans_table() const
> > + {
> > + return unsafe_rollback_flags & MODIFIED_NON_TRANS_TABLE;
> > + }
> > + void created_temp_table()
> > + {
> > + unsafe_rollback_flags|= CREATED_TEMP_TABLE;
> > + }
>
> Same comment here. One of possible names is - mark_created_temp_table().
>
> > + bool has_created_temp_table() const
> > + {
> > + return unsafe_rollback_flags & CREATED_TEMP_TABLE;
> > + }
> > + void dropped_temp_table()
> > + {
> > + unsafe_rollback_flags|= DROPPED_TEMP_TABLE;
> > + }
>
> Same comment here. One of possible names is - mark_dropped_temp_table().
>
> > + bool has_dropped_temp_table() const
> > + {
> > + return unsafe_rollback_flags & DROPPED_TEMP_TABLE;
> > + }
>
> OTOH, maybe it makes sense not to introduce
> mark/has_modified_non_trans_table(), mark/has_created_temp_table(),
> mark/has_dropped_temp_table() helper methods and try to get by
> making enum public and using add_unsafe_rollback_flags() and
> has_unsafe_rollback_flags() instead? What do you think?
I intent to keep the flags private. I think it is easier for
maintaining.
And Davi told me it is not a good practice to use enum as bit flags.
So I will use micros to replace them.
>
> >
> > - void reset() { no_2pc= FALSE; modified_non_trans_table= FALSE; }
> > + void reset() { no_2pc= FALSE; reset_unsafe_rollback_flags(); }
> > bool is_empty() const { return ha_list == NULL; }
> > };
>
> My biggest concern about this patch that it extends THD_TRANS class which
> is defined in handler.h header. This file is supposed to describe SE API
> and is included by handler implementation in various storage engines and
> the information/methods that you add to this class should not be part of
> SQL-layer/SE API.
>
> OTOH if we look a bit closer we will see that THD_TRANS is not really
> necessary in handler.h and can be moved to sql_class.h after moving
> definition of Ha_trx_info::register_ha() to handler.cc (it can still
> be inline).
>
> IMO we should do such a step as part of your refactoring (since it
> does significant change in this area anyway) in order to reduce
> dependencies in server and to make it more clear what is or is not
> part of SQL-layer/SE interface.
>
OK. I will move it to sql_class.h
> ...
>
> > === modified file 'sql/log_event.cc'
> > --- a/sql/log_event.cc 2010-12-29 05:35:31 +0000
> > +++ b/sql/log_event.cc 2011-01-05 05:21:07 +0000
> > @@ -8161,8 +8161,10 @@ int Rows_log_event::do_apply_event(Relay
> > m_curr_row= m_curr_row_end;
> >
> > if (error == 0 && !transactional_table)
> > - thd->transaction.all.modified_non_trans_table=
> > - thd->transaction.stmt.modified_non_trans_table= TRUE;
> > + {
> > + thd->transaction.all.modified_non_trans_table();
> > + thd->transaction.stmt.modified_non_trans_table();
> > + }
> >
> > if (m_curr_row == m_rows_end)
> > break;
>
> Hmm... Is there any reason why we can't rely on trans_commit_stmt()
> to propagate unsafe_rollback_flags from statement level to transaction
> level? If no, then probably it makes sense to remove
> thd->transaction.all.modified_non_trans_table(). If yes then it
> would be nice to have a comment explaining these reasons.
Right, It can be removed.
>
> >
[snip]
>
> > === modified file 'sql/sp_head.cc'
> > --- a/sql/sp_head.cc 2010-12-17 11:28:59 +0000
> > +++ b/sql/sp_head.cc 2011-01-05 05:21:07 +0000
> > @@ -376,8 +376,8 @@ sp_eval_expr(THD *thd, Field *result_fie
> > Item *expr_item;
> > enum_check_fields save_count_cuted_fields= thd->count_cuted_fields;
> > bool save_abort_on_warning= thd->abort_on_warning;
> > - bool save_stmt_modified_non_trans_table=
> > - thd->transaction.stmt.modified_non_trans_table;
> > + unsigned int stmt_unsafe_rollback_flags=
> > + thd->transaction.stmt.get_unsafe_rollback_flags();
> >
> > DBUG_ENTER("sp_eval_expr");
>
> How about adding short comment explaining what we are trying
> to achieve here? Particularly I am interested in why here we don't
> merge unsafe_rollback_flags accumulated during item evaluation in
> the statement's flags like we do in another place in this file.
>
Sorry, I failed to figure out the reason. I will double check it with
others. If I can figure out the reason, I will comment it.
> ...
>
> > === modified file 'sql/sql_class.h'
> > --- a/sql/sql_class.h 2010-12-29 00:38:59 +0000
> > +++ b/sql/sql_class.h 2011-01-05 05:21:07 +0000
> > @@ -1770,6 +1770,32 @@ public:
> > xid_state.xid.null();
> > init_sql_alloc(&mem_root, ALLOC_ROOT_MIN_BLOCK_SIZE, 0);
> > }
> > + void push_unsafe_rollback_warnings(THD *thd)
> > + {
> > + if (all.has_modified_non_trans_table())
> > + push_warning(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
> > + ER_WARNING_NOT_COMPLETE_ROLLBACK,
> > + ER(ER_WARNING_NOT_COMPLETE_ROLLBACK));
> > +
> > + if (all.has_created_temp_table())
> > + push_warning(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
> > + ER_WARNING_NOT_COMPLETE_ROLLBACK_WITH_CREATED_TEMP_TABLE,
> > +
> ER(ER_WARNING_NOT_COMPLETE_ROLLBACK_WITH_CREATED_TEMP_TABLE));
> > +
> > + if (all.has_dropped_temp_table())
> > + push_warning(thd, MYSQL_ERROR::WARN_LEVEL_WARN,
> > + ER_WARNING_NOT_COMPLETE_ROLLBACK_WITH_DROPPED_TEMP_TABLE,
> > +
> ER(ER_WARNING_NOT_COMPLETE_ROLLBACK_WITH_DROPPED_TEMP_TABLE));
> > + }
>
> Could you please elaborate why do you think this method should
> be part of THD::st_transactions class and not, for example,
> part of THD itself?
OK, I done what you said originally. But I moved it in st_transaction
class later. IMO, it is a function only related to transaction class.
Except thd, all other references are in the scope of transaction class.
>
[snip]
> > === modified file 'sql/sql_insert.cc'
> > --- a/sql/sql_insert.cc 2010-12-22 13:23:59 +0000
> > +++ b/sql/sql_insert.cc 2011-01-05 05:21:07 +0000
> >
> > @@ -4058,6 +4049,9 @@ void select_create::send_error(uint errc
> >
> > bool select_create::send_eof()
> > {
> > + if (create_info->options & HA_LEX_CREATE_TMP_TABLE)
> > + thd->transaction.stmt.created_temp_table();
> > +
> > bool tmp=select_insert::send_eof();
> > if (tmp)
> > abort_result_set();
>
> Should not we call thd->transaction.stmt.created_temp_table() only
> if select_insert::send_eof() has succeeded? Or do I miss something?
>
Binlogging the statement is in select_insert::send_eof(), and
the flag has to be set before the statement be binlogged.
If there is any error happens the flag will be reset in
abort_result_set();
> ...
>
> > === modified file 'sql/sql_load.cc'
> > --- a/sql/sql_load.cc 2010-12-10 16:55:50 +0000
> > +++ b/sql/sql_load.cc 2011-01-05 05:21:07 +0000
> > @@ -1263,7 +1261,8 @@ read_xml_field(THD *thd, COPY_INFO &info
> > We don't need to reset auto-increment field since we are restoring
> > its default value at the beginning of each loop iteration.
> > */
> > - thd->transaction.stmt.modified_non_trans_table= no_trans_update_stmt;
> > + if (no_trans_update_stmt)
> > + thd->transaction.stmt.modified_non_trans_table();
> > thd->warning_info->inc_current_row_for_warning();
> > continue_loop:;
> > }
>
> Is the above if-statement necessary? AFAIU code in write_record()
> should correctly update THD::transaction::stmt object, so it can
> be removed. Do I miss something?
Yes, I agree. I will remove it and add a test.
>
> ...
>
--
Your Sincerely,
Libing Song
==================================
MySQL Replication Team
Software Engineer
Email : Anders.Song@stripped
Skype : libing.song
MSN : slb_database@stripped
Phone : +86 010-6569-4500 ext. 319
Mobile: +86 138-1144-2038
==================================