Hello,
Mattias Jonsson a écrit :
> Hi Guilhem,
>
> I just got bug#39407 assigned to me, and after some investigation I came
> to the same conclusions as you (but had a 'quick' fix just for testing
> my findings).
>
> The patch is good and you have done a great job explaining the changes
> and added informative comments.
Thanks :) They also helped me understand what this old patch was doing :)
> OK to push from me!
>
> Some questions though inline (started with 'MJ: ') which do not directly
> affect this patch.
Answers are below.
> Regards
> Mattias
>
>
> ChangeSet@stripped, 2007-10-22 20:52:05+02:00, guilhem@stripped +8 -0
> Fix for BUG#31612
> "Trigger fired multiple times leads to gaps in auto_increment sequence".
> Bug was never experienced by users because it occurs only with the fix
> for BUG31540 "incorrect auto_increment values used for multi-row insert
> trigger" which is not yet in the main 5.1.
> The bug was that if a trigger fired multiple times inside a top
> statement (for example top-statement is a multi-row INSERT,
> and trigger is ON INSERT), and that trigger inserted into an
> auto_increment
> column, then gaps could be observed in the auto_increment sequence,
> even if there were no other users of the database (no concurrency).
> It was wrong usage of THD::auto_inc_intervals_in_cur_stmt_for_binlog.
> sql/handler.cc@stripped, 2007-10-22 20:52:00+02:00, guilhem@stripped
> +16 -11
> See revision comment of handler.h.
> As THD::auto_inc_intervals_in_cur_stmt_for_binlog is cumulative
> over all trigger invokations by the top statement, the
> second invokation of the trigger arrived in
> handler::update_auto_increment()
> with already one interval in
> THD::auto_inc_intervals_in_cur_stmt_for_binlog. The method thus
> believed it had already reserved one interval for that invokation,
> thus reserved a twice larger interval (heuristic when we don't know
> how large the interval should be: we grow by powers of two). InnoDB
> thus increased its internal per-table auto_increment counter by 2
> while only one row was to be inserted. Hence a gap in the sequence.
> The fix is to use the new handler::auto_inc_intervals_count.
> Note that the trigger's statement knows how many rows it is going
> to insert, but provides estimation_rows_to_insert == 0 (see comments
> in sql_insert.cc why triggers don't call
> handler::ha_start_bulk_insert()).
> * removing white space at end of line
> * we don't need to maintain
> THD::auto_inc_intervals_in_cur_stmt_for_binlog
> if no binlogging or if row-based binlogging.
>
> MJ: Can THD::auto_inc_intervals_in_cur_stmt_for_binlog be used at all
> when triggers are used? (since it might be two different tables that
> have different auto_increments).
Yes, it can. In fact, the case of two different tables with
autoincrement is http://bugs.mysql.com/bug.php?id=19630
So, if two different tables with autoincrement, statement-based
binlogging will break, and one should use mixed or row-based binlogging.
> sql/handler.h@stripped, 2007-10-22 20:52:00+02:00, guilhem@stripped +9 -1
> THD::auto_inc_intervals_in_cur_stmt_for_binlog served
> - for binlogging
> - as a heuristic when we have no estimation of how many records the
> statement will insert.
> But the first goal needs to be cumulative over all statements which
> form a binlog event, while the second one needs to be attached to each
> statement. THD::auto_inc_intervals_in_cur_stmt_for_binlog is
> cumulative,
> leading to BUG#31612. So we introduce handler::auto_inc_intervals_count
> for the second goal. See the revision comment of handler.cc.
> A smaller issue was that, even when the binlog event was only one
> statement (no triggers, no stored functions),
> THD::auto_inc_intervals_in_cur_stmt.nb_elements() could be lower than
> the number of reserved intervals (fooling the heuristic), because its
> append() method collapses two contiguous intervals in one.
> Note that as auto_inc_intervals_count is in class 'handler' and not
> in class 'THD', it does not need to be handled in
> THD::reset|restore_sub_statement_state()
>
> MJ: Is really auto_inc_intervals_in_cur_stmt_for_binlog used for
> binlogging? (I only found that it is only used for binlogging the first
> auto_increment value, not all the other used auto_increment values.)
Yes it's used. In statement-based replication, slave expects:
- to be told the first used value (N)
- that all other used values were consecutive (N+1, N+2 etc)
So in binlog we store only the first value.
This means that in fact there has to always be one single interval
reserved, if using statement-based. This constraint put a burden on the
engine: it has to use a table-level write lock on the table, if using
statement-based binlogging, kept for the duration of the statement.
Kind of breaks the entire idea of reserving intervals (which was to
decrease locking on tables btw)? Yes :(
Ah, if only http://forge.mysql.com/worklog/task.php?id=3404
was implemented.
> Why are not first_successful_insert_id_in_cur_stmt used instead?
This one is reset at end of each sub-statement.
So if you have top-statement which inserts into auto-inc, and invokes
trigger which doesn't, at end of trigger
first_successful_insert_id_in_cur_stmt is set to 0, so cannot be used
for binlogging the value used by the top-statement.