Hi!
On Jan 11, Konstantin Osipov wrote:
> Hello Sergei,
>
> > > +
> > > +--echo # 5. Read-write statement: UPDATE, update 0 rows, 0 rows match
> WHERE
> > > +--echo #
> > > +update t1 set a=3 where a=1;
> > > +call p_verify_status_increment(2, 0, 1, 0);
> > > +commit;
> > > +call p_verify_status_increment(2, 0, 1, 0);
> >
> > why different results for mixed/row ?
>
> In statement mode, there is a read-only transaction in InnoDB
> and a write to the binary log. So we have two commits but no 2pc,
> since the first engine is read-only.
>
> In the row level replication mode, we only have the read-only
> transaction.
>
> I will add this as a comment, right?
Yes, please.
> > > .....
> > > +--echo # 8. Read-write statement: unqualified DELETE
> > > +--echo #
> > > +delete from t1;
> > > +call p_verify_status_increment(2, 2, 1, 0);
> > > +commit;
> > > +call p_verify_status_increment(2, 2, 1, 0);
> >
> > oh, row/mixed are even optimized differently. why ?
>
> There is a call to ha_delete_all_rows in both cases. This marks
> the innodb transaction read-write. Plus there is a read-write
> transaction in the binary log in statement mode.
> Thus we have 2 prepare + 2 commits.
>
> But no rows are deleted, so there is no write to the binary log in
> the row level replication mode. This is why in the second case we
> have no 2pc.
Add a comment about it too.
> > > - if (!trans->no_2pc && trans->nht > 1)
> > > + must_2pc= ha_check_and_coalesce_trx_read_only(thd, ha_info, all);
> >
> > do you need a call to ha_check_and_coalesce_trx_read_only() here ?
> > can you check and coalesce in the for() loop below
> > that does ht->prepare() ?
>
> If I inline the check into an existing loop, there will be a
> redundant 'prepare' for all cases when the first engine in the
> list has a read-write transaction.
> I think this optimization is unnecessary - in most cases the
> size of the list is 2, so an additional loop is very low cost.
> Keeping the code in a separate function also allows for clear
> comments and good isolation. Oh how many times did I change the
> contents of the function before I got to the final verison!
Ok. I agree that inlining it won't add any noticeable speed
improvements. But having a second loop when only one is sufficient needs
a reason, and I wanted to understand it. That's why I asked.
No need to change anything here, let's keep coalesce as a separate
function.
> > > +
> > > + if (!trans->no_2pc && must_2pc)
> > > {
> > > - for (; *ht && !error; ht++)
> > > + for (; ha_info && !error; ha_info= ha_info->next())
> > > {
> > > int err;
> > > - if ((err= (*(*ht)->prepare)(*ht, thd, all)))
> > > + handlerton *ht= ha_info->ht();
> > > + /*
> > > + Call two-phase commit even if this particular
> > > + transaction is read-only, just in case. It should not
> > > + produce any overhead, and normally should be a no-op,
> > > + but gives the engine a chance to abort the distributed
> > > + commit.
> >
> > why it could be useful (I mean "chance to abort ...") ?
>
> I have no strong opinion frankly. I just thought it's better to
> have it this way -- this makes it easy to count 'prepapre' in the
> test. Basically, if we need 2pc, we prepare all engines, so
> handler_prepare count is always equal to handler_commit count.
> Other than that it's just a gut feeling. What do you think?
Actually, I'd prefer not to call prepare() for read-only participants.
Prepare() is usually a sync(), although a storage engine can optimize it
away for read-only transactions. I don't see why a read-only participant
may want to abort a commit - it may need to do it in the middle of
transaction (in case of a deadlock), but hardly at commit.
Also, in that case handler_prepare will be more informative, not a copy
of handler_commit, but a number of storage engines where some changes
were made.
> > > +inline
> > > +void
> > > +handler::mark_trx_read_write()
> > > +{
> > > + Ha_trx_info *ha_info=
> &ha_thd()->ha_data[ht->slot].ha_info[0];
> > > + /*
> > > + When a storage engine method is called, the transaction must
> > > + have been started, unless it's a DDL call, for which the
> > > + storage engine starts the transaction internally, and commits
> > > + it internally, without registering in the ha_list.
> > > + Unfortunately here we can't know know for sure if the engine
> > > + has registered the transaction or not, so we must check.
> > > + */
> > > +
> > > + if (ha_info->is_started())
> > > + {
> > > + DBUG_ASSERT(has_transactions());
> > > + if (table_share == NULL || table_share->tmp_table == NO_TMP_TABLE)
> >
> > when can table_share be NULL ?
>
> I know of only one case - in handler::ha_delete_table.
>
> Remember the tweak with dummy_share in global ha_delete_table()?
Ah, indeed.
yuck.
> Should I add this in a comment?
yes, please
> > > + */
> > > virtual int delete_table(const char *name);
> > > private:
> > > + /* Private helpers */
> > > + inline void mark_trx_read_write();
> > > +private:
> >
> > why private: twice ?
>
> Just wanted to stress the border between two groups of
> declarations: private helpers and interface definition for
> SE writers. No strong opinion, can remove.
as you like.
(at least, not yet another magic C++ rule, that I'm not familiar with :)
> > > - if (transactional_table)
> > > - {
> > > - if (ha_autocommit_or_rollback(thd,error >= 0))
> > > - error=1;
> > > - }
> > > -
> > > - if (thd->lock)
> > > - {
> > > - mysql_unlock_tables(thd, thd->lock);
> > > - thd->lock=0;
> > > - }
> >
> > Hmm, if here (and below, and for insert/update/load) you don't call
> > ha_autocommit_or_rollback, then how can you ensure that
> > commit will succeed before sending an ok packet ?
>
> send_ok is no more *sending* ok. It's instead recording an OK message in the
> diagnostics area. I will rename it to my_ok soon.
Hmm, tricky.
Yes, my_ok would be better.
> I hope I missed no question. I will investigate the remaining
> difficult ones.
> Hope the above is of help.
Yes, thanks.
Regards / Mit vielen Grüssen,
Sergei
--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg@stripped>
/ /|_/ / // /\ \/ /_/ / /__ Principal Software Developer
/_/ /_/\_, /___/\___\_\___/ MySQL GmbH, Dachauer Str. 37, D-80335 München
<___/ Geschäftsführer: Kaj Arnö - HRB
München 162140