On 2008-03-06 Thu 12:08 +0200,Andrei Elkin wrote:
> Zhenxing, hello.
>
> I am glad you are confirming the set of issues the big represents as
> well for a good finding on a documentation artifact.
>
> Getting straight to the point:
>
> > To sum up, things we may need to do:
> > 1) Let the 5.1 ignore the INSERT_ID value as 5.0 does when it is
> > replicating from 5.0 masters that have this problem
> > 2) Make 5.0 follow the document when using SP, do not write the
> > INSERT_ID value
> > 3) Make 5.0 behave the same as 5.1 on this issue
> >
> > item 1) is necessay for upgrade purpose, and is fixed by this patch.
>
> agreed. But remember to fix the minor version we talked about.
>
Do you mean warning on 5.1 slaves about the errornous INSERT_ID value
from 5.0 masters?
> >
> > Items 2) and 3) are exclusive, only one should be chosen, I would prefer
> > 2), because it's simple, can be fixed in an hour.
>
> agreed
>
>
> >
> > Item 3) would takes weeks to fix and can bring new problems if not
> > properly implemented, and I think we should not introduce new features
> > to or change behave of this version.
>
> hard to disagree.
>
> cheers,
>
> Andrei
>
>
> > Hi Andrei
> >
> > Thank you for your scrutiny and discussion over skype, it helped me to
> > clarify some vague points of the problem. And here is my latest
> > understanding of this problem after more investigation.
> >
> > For 5.0 masters, when a statement invokes a trigger/function that INSERT
> > into an AUTO_INCREMENT column, the generated AUTO_INCREMENT value is not
> > logged in the binary log, at the same time, when executing such a
> > statement, the INSERT_ID values associated with the statement is ignored
> > by the trigger/function invoked by the statement. This is documented in:
> >
> > http://dev.mysql.com/doc/refman/5.0/en/replication-features-autoincid.html
> >
> >
> > For 5.1, this was fixed, the master will write the AUTO_INCREMENT value
> > and the slave will use the value in this case.
> >
> > http://dev.mysql.com/doc/refman/5.1/en/replication-features-autoincid.html
> >
> > (NOTE: the document says that for triggers, the AUTO_INCREMENT value is
> > not written into binary log, but this is not correct, the AUTO_INCREMENT
> > value is written for both functions and triggers. I will report a bug on
> > this once convinced.)
> >
> > However, for 5.0 masters, when executing such a statement within a SP,
> > an Intvar event *IS* written into the binary log, with a wrong value.
> > According the document there should not happen. This does not cause any
> > problem when replicating to 5.0 slaves because the wrong AUTO_INCREMENT
> > value is not used for trigger/function when 5.0 slaves executing such a
> > statement. But this causes problem when replicating to a 5.1 slave,
> > which uses the AUTO_INCREMENT value for trigger/function invoked with
> > the statement.
> >
> > Let's use an example to make it clearer, first, the definitions of
> > tables t1, t2 and trigger tr1:
> >
> > CREATE TABLE t1 (id INT AUTO_INCREMENT PRIMARY KEY);
> > CREATE TABLE t2 (id INT AUTO_INCREMENT PRIMARY KEY);
> > DELIMITEDELIMITER //;R //;
> > CREATE TRIGGER tr1 BEFORE DELETE
> > ON t1 FOR EACH ROW
> > BEGIN
> > INSERT INTO t2 VALUES(NULL);
> > END//
> > DELIMITER ;//
> >
> > and now we run the following statements:
> >
> > INSERT INTO t1 VALUES(NULL);
> > INSERT INTO t1 VALUES(NULL);
> > DELETE FROM t1 WHERE id=1;
> >
> > If we run the above statements on 5.0 master, the last DELETE statement
> > will not have a SET INSERT_ID associated with it, which is correct
> > according to the document.
> >
> > INSERT_ID=1
> > use `test`; insert into t1 values (NULL)
> > INSERT_ID=2
> > use `test`; insert into t1 values (NULL)
> > use `test`; delete from t1 where id = 1
> >
> > However, if we wrap the above tree statements into a SP and them call
> > the SP, it would act the same because SP is logged by unrolling. But
> > this is not the case, if we call the SP, there will be a SET INSERT_ID
> > associated with the last DELETE statement. This is inconsistent with the
> > document and is the cause of this bug.
> >
> > INSERT_ID=1
> > use `test`; INSERT INTO t1 VALUES (NULL)
> > INSERT_ID=2
> > use `test`; INSERT INTO t1 VALUES (NULL)
> > INSERT_ID=2 <- this is incorrect
> > use `test`; DELETE FROM t1 WHERE id = 1
> >
> > when replicating to a 5.0 slave, this will not cause problem because the
> > incorrect INSERT_ID value is ignored by the trigger/function. But this
> > will cause problem when replicating to a 5.1 slave, which uses the
> > incorrect INSERT_ID value when calling trigger/function.
> >
>
> > To sum up, things we may need to do:
> > 1) Let the 5.1 ignore the INSERT_ID value as 5.0 does when it is
> > replicating from 5.0 masters that have this problem
> > 2) Make 5.0 follow the document when using SP, do not write the
> > INSERT_ID value
> > 3) Make 5.0 behave the same as 5.1 on this issue
> >
> > item 1) is necessay for upgrade purpose, and is fixed by this patch.
> >
> > Items 2) and 3) are exclusive, only one should be chosen, I would prefer
> > 2), because it's simple, can be fixed in an hour.
> >
> > Item 3) would takes weeks to fix and can bring new problems if not
> > properly implemented, and I think we should not introduce new features
> > to or change behave of this version.
>
> >
> >
> > On 2008-03-03 Mon 12:52 +0200,Andrei Elkin Wrote:
> >> Zhenxing, good morning.
> >>
> >> > Hi Andrei
> >> >
> >> > Thank you for your review! See below.
> >> >
> >> > On 2008-02-29 Fri 14:11 +0200,Andrei Elkin wrote:
> >> >> ZhenXing, hello.
> >> >>
> >> >> The bug you are dealing with is certainly very compicated.
> >> >> Please read on about my findings and a solution how we can solve
> the issues.
> >> >>
> >> >> > Below is the list of changes that have just been committed
> into a local
> >> >> > 5.1 repository of hezx. When hezx does a push these changes
> >> >> > will be propagated to the main repository and, within 24 hours
> after the
> >> >> > push, to the public repository.
> >> >> > For information on how to access the public repository
> >> >> > see
> http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
> >> >> >
> >> >> > ChangeSet@stripped, 2008-02-26 22:43:27+08:00,
> hezx@stripped +7 -0
> >> >> > BUG#33029 5.0 to 5.1 replication fails on dup key when
> inserting
> >> >> > using a trig in SP
> >> >> >
> >> >> > When replicating from 5.0 to 5.1, using trigger or function
> in
> >> >> > stored procedure will cause the replication break.
> >> >> >
> >> >>
> >> >>
> >> >> You needed to be more specific and explain what happended in that
> >> >> case.
> >> >> Examining a regression test i found two issues which are left as
> >> >> comments on the bug page, please read them too:
> >> >>
> >> >> 1. when 5.0 master executes
> >> >>
> >> >> delete from te
> >> >>
> >> >> where the explicit table te has a trigger inserting into an
> >> >> implicit table ti
> >> >>
> >> >> it does not generate set insert_id intvar event altogether.
> >> >>
> >> >> 2. 5.1 master does generate the intvar event in that case.
> >> >>
> >> >> 3. Given 5.0 master and the stored procedure of your test the
> binlog of its
> >> >> execution has the intvar with an incorrect auto-inc value
> >> >> (you can check that's 5 not 1)
> >> >>
> >> >> 4. I guess - you need to check it - 5.1 master makes the same
> >> >> mistake at p.3. Btw, use set
> @@session.binlog_format=statement.
> >> >
> >> > 5.1 master is correct on at p.3, it has the correct value 1 not 5.
> >> >
> >>
> >> Thanks for comforting there is no issue with 5.1!
> >>
> >> >>
> >> >> > The reason of the bug is because when run a statement that
> calls
> >> >> > function or trigger in a SP, For 5.1 masters, if the
> statement or
> >> >> > functions/triggers used generate auto_increment values, the
> first
> >> >> > such used value will be logged in binlog, if the statement
> do not
> >> >> > used such values, the auto_increment value used by
> >> >> > functions/triggers will be logged, and the value will later
> be
> >> >> > used by 5.1 slaves. However this is not the case for 5.0
> masters,
> >> >> > it's always the next auto_increment value of the statement
> that
> >> >> > will be logged, and at the same time, 5.0 slaves will not
> use the
> >> >> > value for functions/triggers. If we replicate from 5.0 to
> 5.1,
> >> >> > the auto_increment value of the statement will be use as
> the
> >> >> > value for functions/triggers, and then cause the break.
> >> >> >
> >> >>
> >> >> Sorry, I did not understand the writing. What i can guess from the
> >> >> patch you tried to protect the current statement from the most
> recent
> >> >> set insert_id which carries the wrong value.
> >> >> Should not we correct the value instead on master?
> >> >>
> >> >
> >> > I think the point of this bug is compatible with the behavour of 5.0,
> so
> >> > 5.0 servers can be upgraded to 5.1 or newer without breaking the
> >> > replication.
> >>
> >> I assume you have agreed with p.1 + p.3 are the reason for the failure of
> >> the reported hitting the prim key on the slave.
> >> You must notice that such artifact happens regardless of the master's
> version.
> >> In other words, 5.0 -> 5.0 replication is broken currently as well.
> >>
> >> Please consult with Lars, #support if you doubt we need to fix this.
> >> I personly don't doubt that it should be mended.
> >>
> >> What I would offer for you to do at first is to find out what are changes
> that
> >> made 5.1 master to binlog correctly and to merge them with 5.0.
> >>
> >> > it doesn't matter if the issue is fixed or not in the new
> >> > 5.0 relesse, there are already releases of 5.0 (and versions before
> >> > 5.1.12) suffer this problem, even if we fixed the problem, we still
> have
> >> > to cope with this bug so that these buggy old versions can be upgraded
> >> > to new versions.
> >>
> >> Wrt to protecting 5.0 -> 5.1 upgrade scenario the way you made it does
> >> seem secure:
> >>
> >> "hiding" set_insert_id for the subsequent query, and thus
> >> asking innodb engine for evaluating the auto_inc value, that your
> >> patch implements leaves number of possibilities to reach an
> >> inconsistent state on slave.
> >> Look at comments on the bug page I refer as the parent for yours
> >> how you can do that: start a transaction on master or slave
> >> that insert auto_inc values to the implicitly involved table
> >> to delete statement (tab_b) and roll it back.
> >> Since this moment, innodb engines on master and slave side views
> >> differently on how to calculate the auto_inc.
> >>
> >> Considering the upgrade problem I wonder if users are better to
> >> upgrade first on a fixed 5.0 first. Perhaps for that many could afford
> >> quick changing and restarting of their 5.0 master.
> >>
> >>
> >> >> That sounds to me the major thing to do.
> >> >>
> >> >> please mail me your opinion,
> >> >>
> >> >> cheers,
> >> >>
> >> >> Andrei
> >> >>
> >> >> > This patch fixes this bug by making 5.1 slaves checking for
> 5.0
> >> >> > masters, and if it is, don't use the saved auto_increment
> value
> >> >> > for functions/triggers to be compatible with the 5.0
> master.
> >> >> >
> >> >> >
> mysql-test/suite/rpl/r/rpl_auto_increment_bug33029.result@stripped, 2008-02-26 22:43:24+08:00,
> hezx@stripped +109 -0
> >> >> > Add test for BUG#33029
> >> >> >
> >> >> >
> mysql-test/suite/rpl/r/rpl_auto_increment_bug33029.result@stripped, 2008-02-26 22:43:24+08:00,
> hezx@stripped +0 -0
> >> >> >
> >> >> > mysql-test/suite/rpl/t/rpl_auto_increment_bug33029.test@stripped,
> 2008-02-26 22:43:24+08:00, hezx@stripped +80 -0
> >> >> > Add test for BUG#33029
> >> >> >
> >> >> > mysql-test/suite/rpl/t/rpl_auto_increment_bug33029.test@stripped,
> 2008-02-26 22:43:24+08:00, hezx@stripped +0 -0
> >> >> >
> >> >> > sql/slave.cc@stripped, 2008-02-26 22:43:23+08:00,
> hezx@stripped +22 -0
> >> >> > Add function to check for bug#33029
> >> >> >
> >> >> > sql/slave.h@stripped, 2008-02-26 22:43:24+08:00,
> hezx@stripped +1 -0
> >> >> > Add function to check for bug#33029
> >> >> >
> >> >> > sql/sql_class.cc@stripped, 2008-02-26 22:43:24+08:00,
> hezx@stripped +11 -0
> >> >> > if master has bug#33029, reset auto_inc_intervals_forced
> for sub statements
> >> >> >
> >> >> > sql/sql_class.h@stripped, 2008-02-26 22:43:24+08:00,
> hezx@stripped +1 -0
> >> >> > Add member to save and restore auto_inc_intervals_forced
> >> >> >
> >> >> > sql/structs.h@stripped, 2008-02-26 22:43:24+08:00,
> hezx@stripped +22 -0
> >> >> > Add functions to save and restore Discrete_intervals_list
> object
> ...
>