Hello,
> #At file:///home/nirbhay/Project/mysql/repo/wl/mysql-5.1-bugteam-39828/ based on
> revid:georgi.kodinov@stripped
>
> 3520 Nirbhay Choubey 2010-10-03
> Bug#39828 : Autoinc wraps around when offset and increment > 1
>
> Auto increment value wraps when performing a bulk insert with
> auto_increment_increment and auto_increment_offset greater than
> one. Howerer, the initial investigation showed that same bug can
> be reproduced even if just auto_increment_increment is greater
> than one. The bug report also reports of an assertion failure on
> a debug server.
>
> The wrapping of autoinc column happend in update_auto_increment
> function as it failed to check for overflow of next autoinc value
> to be inserted into the column in case of bulk insert.
>
> Fixed by placing checks for overflow.
> @ mysql-test/suite/innodb/r/innodb-autoinc.result
> Bug #39828 : Autoinc wraps around when offset and increment > 1
> @ mysql-test/suite/innodb/t/innodb-autoinc.test
> Bug #39828 : Autoinc wraps around when offset and increment > 1
> @ mysql-test/suite/innodb_plugin/r/innodb-autoinc.result
> Bug #39828 : Autoinc wraps around when offset and increment > 1
> @ mysql-test/suite/innodb_plugin/t/innodb-autoinc.test
> Bug #39828 : Autoinc wraps around when offset and increment > 1
> @ sql/handler.cc
> Bug #39828 : Autoinc wraps around when offset and increment > 1
>
> Added a condition to check for overflow of 'nr' returned from
> compute_next_insert_id. Apart from that, added one more condition
> towards the beginning of handler::update_auto_increment to check
> if next_insert_id is lesser than insert_id_for_cur_row, which
> inturn will avoid overflowing of autoinc value in case of
> bulk (multi-valued) insert.
> === modified file 'sql/handler.cc'
> --- a/sql/handler.cc 2010-04-14 09:53:59 +0000
> +++ b/sql/handler.cc 2010-10-02 18:44:54 +0000
> @@ -2317,11 +2317,17 @@ prev_insert_id(ulonglong nr, struct syst
>
> int handler::update_auto_increment()
> {
> - ulonglong nr, nb_reserved_values;
> + ulonglong nr, nb_reserved_values, nr_old;
> bool append= FALSE;
> THD *thd= table->in_use;
> struct system_variables *variables= &thd->variables;
> DBUG_ENTER("handler::update_auto_increment");
> +
> + /*
> + Perform an overflow check on next_insert_id.
> + */
> + if ((insert_id_for_cur_row) && (insert_id_for_cur_row >
> next_insert_id))
> + DBUG_RETURN(HA_ERR_AUTOINC_READ_FAILED);
>
> /*
> next_insert_id is a "cursor" into the reserved interval, it may go greater
> @@ -2341,6 +2347,12 @@ int handler::update_auto_increment()
> */
> adjust_next_insert_id_after_explicit_value(nr);
> insert_id_for_cur_row= 0; // didn't generate anything
> +
> + /*
> + Check for the overflow of next_insert_id.
> + */
> + if (next_insert_id < nr)
> + auto_inc_interval_for_cur_row.replace(0, 0, 0);
> DBUG_RETURN(0);
> }
>
> @@ -2396,6 +2408,12 @@ int handler::update_auto_increment()
> DBUG_RETURN(HA_ERR_AUTOINC_READ_FAILED); // Mark failure
>
> /*
> + nr_old will be used to store nr's value, which inturn will be
> + compared with nr's value returned from compute_next_insert_id().
> + */
> + nr_old= nr;
> +
> + /*
> That rounding below should not be needed when all engines actually
> respect offset and increment in get_auto_increment(). But they don't
> so we still do it. Wonder if for the not-first-in-index we should do
> @@ -2404,6 +2422,9 @@ int handler::update_auto_increment()
> will not help as we inserted no row).
> */
> nr= compute_next_insert_id(nr-1, variables);
> +
> + if (nr < nr_old)
> + DBUG_RETURN(HA_ERR_AUTOINC_READ_FAILED);
> }
>
> if (table->s->next_number_keypart == 0)
The patch adds three if()s; but in test innodb.innodb-autoinc (and I
assume also in test innodb_plugin.innodb-autoinc, as they look similar),
only the first and second if() are triggered; the last one (nr<nr_old)
is never triggered. So we have a test coverage problem.
HA_ERR_AUTOINC_READ_FAILED is not the best error code: it means that we
couldn't read a value from the engine; in the bug's case, the problem is
not in the engine but in the server, it's rather an 'out of range'
problem; so I would return HA_ERR_AUTOINC_ERANGE (like we already do in
the same function in the case of "strict mode constraints").
I would like to analyze the problem from scratch here.
What is the root cause of this bug: it is that function
compute_next_insert_id(), which promises to generate a number strictly
greater than its input (as promised in the function's introductory
comment), wraps around and generates a smaller number. It fails to its
promise.
So we just have to catch when it fails, and issue an error. If we read
the update_auto_increment() function from first line to last line, here
are the three problematic places (marked with ^^^).
int handler::update_auto_increment()
{
ulonglong nr, nb_reserved_values;
bool append= FALSE;
THD *thd= table->in_use;
struct system_variables *variables= &thd->variables;
DBUG_ENTER("handler::update_auto_increment");
/*
next_insert_id is a "cursor" into the reserved interval, it may go
greater
than the interval, but not smaller.
*/
DBUG_ASSERT(next_insert_id >= auto_inc_interval_for_cur_row.minimum());
if ((nr= table->next_number_field->val_int()) != 0 ||
(table->auto_increment_field_not_null &&
thd->variables.sql_mode & MODE_NO_AUTO_VALUE_ON_ZERO))
{
/*
Update next_insert_id if we had already generated a value in this
statement (case of INSERT VALUES(null),(3763),(null):
the last NULL needs to insert 3764, not the value of the first
NULL plus
1).
*/
adjust_next_insert_id_after_explicit_value(nr);
^^^^^^^ here is the first problem, we need to fix function
adjust_next_insert_id_after_explicit_value() because it calls
compute_next_insert_id() internally, which may overflow
Let's continue with the rest of code:
insert_id_for_cur_row= 0; // didn't generate anything
DBUG_RETURN(0);
}
if ((nr= next_insert_id) >= auto_inc_interval_for_cur_row.maximum())
{
/* next_insert_id is beyond what is reserved, so we reserve more. */
const Discrete_interval *forced=
thd->auto_inc_intervals_forced.get_next();
if (forced != NULL)
{
nr= forced->minimum();
nb_reserved_values= forced->values();
}
else
{
/*
handler::estimation_rows_to_insert was set by
handler::ha_start_bulk_insert(); if 0 it means "unknown".
*/
ulonglong nb_desired_values;
/*
If an estimation was given to the engine:
- use it.
- if we already reserved numbers, it means the estimation was
not accurate, then we'll reserve 2*AUTO_INC_DEFAULT_NB_ROWS the 2nd
time, twice that the 3rd time etc.
If no estimation was given, use those increasing defaults from the
start, starting from AUTO_INC_DEFAULT_NB_ROWS.
Don't go beyond a max to not reserve "way too much" (because
reservation means potentially losing unused values).
Note that in prelocked mode no estimation is given.
*/
if ((auto_inc_intervals_count == 0) && (estimation_rows_to_insert
> 0))
nb_desired_values= estimation_rows_to_insert;
else /* go with the increasing defaults */
{
/* avoid overflow in formula, with this if() */
if (auto_inc_intervals_count <= AUTO_INC_DEFAULT_NB_MAX_BITS)
{
nb_desired_values= AUTO_INC_DEFAULT_NB_ROWS *
(1 << auto_inc_intervals_count);
set_if_smaller(nb_desired_values, AUTO_INC_DEFAULT_NB_MAX);
}
else
nb_desired_values= AUTO_INC_DEFAULT_NB_MAX;
}
/* This call ignores all its parameters but nr, currently */
get_auto_increment(variables->auto_increment_offset,
variables->auto_increment_increment,
nb_desired_values, &nr,
&nb_reserved_values);
if (nr == ~(ulonglong) 0)
DBUG_RETURN(HA_ERR_AUTOINC_READ_FAILED); // Mark failure
/*
That rounding below should not be needed when all engines actually
respect offset and increment in get_auto_increment(). But they
don't
so we still do it. Wonder if for the not-first-in-index we
should do
it. Hope that this rounding didn't push us out of the interval;
even
if it did we cannot do anything about it (calling the engine again
will not help as we inserted no row).
*/
nr= compute_next_insert_id(nr-1, variables);
^^^^^^^ this is a second problem
}
if (table->s->next_number_keypart == 0)
{
/* We must defer the appending until "nr" has been possibly
truncated */
append= TRUE;
}
else
{
/*
For such auto_increment there is no notion of interval, just a
singleton. The interval is not even stored in
thd->auto_inc_interval_for_cur_row, so we are sure to call the
engine
for next row.
*/
DBUG_PRINT("info",("auto_increment: special not-first-in-index"));
}
}
DBUG_PRINT("info",("auto_increment: %lu", (ulong) nr));
if (unlikely(table->next_number_field->store((longlong) nr, TRUE)))
{
/*
first test if the query was aborted due to strict mode constraints
*/
if (thd->killed == THD::KILL_BAD_DATA)
DBUG_RETURN(HA_ERR_AUTOINC_ERANGE);
/*
field refused this value (overflow) and truncated it, use the
result of
the truncation (which is going to be inserted); however we try to
decrease it to honour auto_increment_* variables.
That will shift the left bound of the reserved interval, we don't
bother shifting the right bound (anyway any other value from this
interval will cause a duplicate key).
*/
nr= prev_insert_id(table->next_number_field->val_int(), variables);
if (unlikely(table->next_number_field->store((longlong) nr, TRUE)))
nr= table->next_number_field->val_int();
}
if (append)
{
auto_inc_interval_for_cur_row.replace(nr, nb_reserved_values,
variables->auto_increment_increment);
auto_inc_intervals_count++;
/* Row-based replication does not need to store intervals in binlog */
if (mysql_bin_log.is_open() && !thd->current_stmt_binlog_row_based)
thd->auto_inc_intervals_in_cur_stmt_for_binlog.append(auto_inc_interval_for_cur_row.minimum(),
auto_inc_interval_for_cur_row.values(),
variables->auto_increment_increment);
}
/*
Record this autogenerated value. If the caller then
succeeds to insert this value, it will call
record_first_successful_insert_id_in_cur_stmt()
which will set first_successful_insert_id_in_cur_stmt if it's not
already set.
*/
insert_id_for_cur_row= nr;
/*
Set next insert id to point to next auto-increment value to be able to
handle multi-row statements.
*/
set_next_insert_id(compute_next_insert_id(nr, variables));
^^^^^^^^^ this is the third problem
DBUG_RETURN(0);
}
We need to catch each time compute_next_insert_id() is used, in all
functions which use it, and handle the possibility of an overflow in it.
This is wider than handler.cc: compute_next_insert_id() is used by
adjust_next_insert_id_after_explicit_value() which is used in
sql_insert.cc; so that's more places to fix.
And for all places we fix, we need to have tests which cover the changed
lines.
Regarding the call to adjust_next_insert_id_after_explicit_value() in
update_auto_increment(): it's in the context of an explicitely specified
value after a generated value, like this:
INSERT ...VALUES (NULL),(123).
When NULL was inserted, a generated value was inserted (not NULL) and
handler::next_insert_id was set to the next value to generate. For
example, if the table was empty, 1 was inserted, and next_insert_id was
set to 2. Then we insert 123: at this moment, we need to adjust
next_insert_id to 124, because for this query:
INSERT ... (NULL),(123),(NULL);
we don't want to insert 1,123,2: values must always increase.
Computing 124 is the job of
adjust_next_insert_id_after_explicit_value(). But this computation
happens when inserting 123, so if we notice that computation overflows,
what should we do? if we just return with error, then this statement
INSERT ... (NULL),(123).
will fail, which may surprise the user (123 is in the range, it's 124
which is not).
Should we rather remember, when we insert 123, than the generated value
next_insert_id is invalid and should not be used in a next insertion?
Maybe a simple thing to do is to have a new member in class "handler":
bool next_insert_id_invalid
and compute_next_insert_id() would set it in case of overflow (which is
when its output value is <= its input value). And this bool would be
tested each time we test next_insert_id. This way, when we insert 123,
we set next_insert_id_invalid to "true", and when we later insert NULL,
which is when we want to use next_insert_id, we see that
next_insert_id_invalid is true, so we give an error.
Just an idea, there may be much better ones. Maybe Alexey will have
ideas too.