List:Commits« Previous MessageNext Message »
From:Andrei Elkin Date:March 3 2008 10:52am
Subject:Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029
View as plain text  
Zhenxing, good morning.

> Hi Andrei
>
> Thank you for your review! See below.
>
> On 2008-02-29 Fri 14:11 +0200,Andrei Elkin wrote:
>> ZhenXing, hello.
>> 
>> The bug you are dealing with is certainly very compicated.
>> Please read on about my findings and a solution how we can solve the issues.
>> 
>> > Below is the list of changes that have just been committed into a local
>> > 5.1 repository of hezx.  When hezx does a push these changes
>> > will be propagated to the main repository and, within 24 hours after the
>> > push, to the public repository.
>> > For information on how to access the public repository
>> > see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>> >
>> > ChangeSet@stripped, 2008-02-26 22:43:27+08:00, hezx@stripped +7 -0
>> >   BUG#33029 5.0 to 5.1 replication fails on dup key when inserting
>> >   using a trig in SP
>> >   
>> >   When replicating from 5.0 to 5.1, using trigger or function in
>> >   stored procedure will cause the replication break.
>> >   
>> 
>> 
>> You needed to be more specific and explain what happended in that
>> case.
>> Examining a regression test i found two issues which are left as
>> comments on the bug page, please read them too:
>> 
>> 1.  when 5.0 master executes
>> 
>>     delete from te 
>>     
>>     where the explicit table te has a trigger inserting into an
>>     implicit table ti
>> 
>>     it does not generate set insert_id intvar event altogether.
>> 
>> 2.  5.1 master does generate the intvar event in that case.
>> 
>> 3.  Given 5.0 master and the stored procedure of your test  the binlog of its
>>     execution has the intvar with an incorrect auto-inc value
>>     (you can check that's 5 not 1)
>> 
>> 4.  I guess  - you need to check it - 5.1 master makes the same
>>     mistake at p.3. Btw, use set @@session.binlog_format=statement.
>
> 5.1 master is correct on at p.3, it has the correct value 1 not 5.
>

Thanks for comforting there is no issue with 5.1!

>> 
>> >   The reason of the bug is because when run a statement that calls
>> >   function or trigger in a SP, For 5.1 masters, if the statement or
>> >   functions/triggers used generate auto_increment values, the first
>> >   such used value will be logged in binlog, if the statement do not
>> >   used such values, the auto_increment value used by
>> >   functions/triggers will be logged, and the value will later be
>> >   used by 5.1 slaves. However this is not the case for 5.0 masters,
>> >   it's always the next auto_increment value of the statement that
>> >   will be logged, and at the same time, 5.0 slaves will not use the
>> >   value for functions/triggers. If we replicate from 5.0 to 5.1,
>> >   the auto_increment value of the statement will be use as the
>> >   value for functions/triggers, and then cause the break.
>> >   
>> 
>> Sorry, I did not understand the writing. What i can guess from the
>> patch you tried to protect the current statement from the most recent
>> set insert_id which carries the wrong value.
>> Should not we correct the value instead on master?
>> 
>
> I think the point of this bug is compatible with the behavour of 5.0, so
> 5.0 servers can be upgraded to 5.1 or newer without breaking the
> replication.

I assume you have agreed with p.1 + p.3 are the reason for the failure of
the reported hitting the prim key on the slave.
You must notice that such artifact happens regardless of the master's version.
In other words, 5.0 -> 5.0 replication is broken currently as well.

Please consult with Lars, #support if you doubt we need to fix this.
I personly don't doubt that it should be mended.

What I would offer for you to do at first is to find out what are changes that
made 5.1 master to binlog correctly and to merge them with 5.0.

> it doesn't matter if the issue is fixed or not in the new
> 5.0 relesse, there are already releases of 5.0 (and versions before
> 5.1.12) suffer this problem, even if we fixed the problem, we still have
> to cope with this bug so that these buggy old versions can be upgraded
> to new versions.

Wrt to protecting 5.0 -> 5.1 upgrade scenario the way you made it does
seem secure:

  "hiding" set_insert_id for the subsequent query, and thus
   asking innodb engine for evaluating the auto_inc value, that your
   patch implements leaves number of possibilities to reach an
   inconsistent state on slave.
   Look at comments on the bug page I refer as the parent for yours
   how you can do that: start a transaction on master or slave
   that insert auto_inc values to the implicitly involved table
   to delete statement (tab_b) and roll it back.
   Since this moment, innodb engines on master and slave side views
   differently on how to calculate the auto_inc.

Considering the upgrade problem I wonder if users are better to
upgrade first on a fixed 5.0 first. Perhaps for that many could afford
quick changing and restarting of their 5.0 master.


>> That sounds to me the major thing to do.
>> 
>> please mail me your opinion,
>> 
>> cheers,
>> 
>> Andrei
>> 
>> >   This patch fixes this bug by making 5.1 slaves checking for 5.0
>> >   masters, and if it is, don't use the saved auto_increment value
>> >   for functions/triggers to be compatible with the 5.0 master.
>> >
>> >   mysql-test/suite/rpl/r/rpl_auto_increment_bug33029.result@stripped, 2008-02-26
> 22:43:24+08:00, hezx@stripped +109 -0
>> >     Add test for BUG#33029
>> >
>> >   mysql-test/suite/rpl/r/rpl_auto_increment_bug33029.result@stripped, 2008-02-26
> 22:43:24+08:00, hezx@stripped +0 -0
>> >
>> >   mysql-test/suite/rpl/t/rpl_auto_increment_bug33029.test@stripped, 2008-02-26
> 22:43:24+08:00, hezx@stripped +80 -0
>> >     Add test for BUG#33029
>> >
>> >   mysql-test/suite/rpl/t/rpl_auto_increment_bug33029.test@stripped, 2008-02-26
> 22:43:24+08:00, hezx@stripped +0 -0
>> >
>> >   sql/slave.cc@stripped, 2008-02-26 22:43:23+08:00, hezx@stripped +22 -0
>> >     Add function to check for bug#33029
>> >
>> >   sql/slave.h@stripped, 2008-02-26 22:43:24+08:00, hezx@stripped +1 -0
>> >     Add function to check for bug#33029
>> >
>> >   sql/sql_class.cc@stripped, 2008-02-26 22:43:24+08:00, hezx@stripped +11
> -0
>> >     if master has bug#33029, reset auto_inc_intervals_forced for sub
> statements
>> >
>> >   sql/sql_class.h@stripped, 2008-02-26 22:43:24+08:00, hezx@stripped +1
> -0
>> >     Add member to save and restore auto_inc_intervals_forced
>> >
>> >   sql/structs.h@stripped, 2008-02-26 22:43:24+08:00, hezx@stripped +22 -0
>> >     Add functions to save and restore Discrete_intervals_list object
>> >
>> > diff -Nrup a/mysql-test/suite/rpl/r/rpl_auto_increment_bug33029.result
> b/mysql-test/suite/rpl/r/rpl_auto_increment_bug33029.result
>> > --- /dev/null	Wed Dec 31 16:00:00 196900
>> > +++ b/mysql-test/suite/rpl/r/rpl_auto_increment_bug33029.result	2008-02-26
> 22:43:24 +08:00
>> > @@ -0,0 +1,109 @@
>> > +stop slave;
>> > +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
>> > +reset master;
>> > +reset slave;
>> > +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
>> > +start slave;
>> > +drop table if exists t1, t2;
>> > +drop procedure if exists p1;
>> > +drop procedure if exists p2;
>> > +drop trigger if exists tr1;
>> > +drop function if exists f1;
>> > +create table t1 (id int auto_increment primary key);
>> > +create table t2 (id int auto_increment primary key);
>> > +CREATE PROCEDURE p1()
>> > +BEGIN
>> > +DECLARE ins_count INT DEFAULT 10; 
>> > +WHILE ins_count > 5 DO
>> > +INSERT INTO t1 VALUES (NULL);
>> > +SET ins_count = ins_count - 1;
>> > +END WHILE;
>> > +DELETE FROM t1 WHERE id = 1;
>> > +WHILE ins_count > 0 DO
>> > +INSERT INTO t1 VALUES (NULL);
>> > +SET ins_count = ins_count - 1;
>> > +END WHILE;
>> > +DELETE FROM t1 WHERE id = 2;
>> > +DELETE FROM t1 WHERE id = 3;
>> > +END//
>> > +CREATE PROCEDURE p2()
>> > +BEGIN
>> > +INSERT INTO t1 VALUES (NULL);
>> > +DELETE FROM t1 WHERE id = f1(4);
>> > +DELETE FROM t1 WHERE id = f1(5);
>> > +END//
>> > +CREATE TRIGGER tr1 BEFORE DELETE
>> > +ON t1 FOR EACH ROW 
>> > +BEGIN
>> > +INSERT INTO t2  VALUES (NULL);
>> > +END//
>> > +CREATE FUNCTION f1 (i int) RETURNS int
>> > +BEGIN
>> > +INSERT INTO t2 VALUES (NULL);
>> > +RETURN i;
>> > +END//
>> > +call p1();
>> > +DROP TRIGGER tr1;
>> > +call p2();
>> > +# Result on master
>> > +SELECT * FROM t1;
>> > +id
>> > +6
>> > +7
>> > +8
>> > +9
>> > +10
>> > +11
>> > +SELECT * FROM t2;
>> > +id
>> > +1
>> > +2
>> > +3
>> > +4
>> > +5
>> > +6
>> > +7
>> > +8
>> > +9
>> > +10
>> > +11
>> > +12
>> > +13
>> > +14
>> > +15
>> > +16
>> > +17
>> > +18
>> > +# Result on slave
>> > +SELECT * FROM t1;
>> > +id
>> > +6
>> > +7
>> > +8
>> > +9
>> > +10
>> > +11
>> > +SELECT * FROM t2;
>> > +id
>> > +1
>> > +2
>> > +3
>> > +4
>> > +5
>> > +6
>> > +7
>> > +8
>> > +9
>> > +10
>> > +11
>> > +12
>> > +13
>> > +14
>> > +15
>> > +16
>> > +17
>> > +18
>> > +drop table if exists t1, t2;
>> > +drop procedure if exists p1;
>> > +drop procedure if exists p2;
>> > +drop function if exists f1;
>> > diff -Nrup a/mysql-test/suite/rpl/t/rpl_auto_increment_bug33029.test
> b/mysql-test/suite/rpl/t/rpl_auto_increment_bug33029.test
>> > --- /dev/null	Wed Dec 31 16:00:00 196900
>> > +++ b/mysql-test/suite/rpl/t/rpl_auto_increment_bug33029.test	2008-02-26
> 22:43:24 +08:00
>> > @@ -0,0 +1,80 @@
>> > +# BUG#33029 5.0 to 5.1 replication fails on dup key when inserting using a
> trig in SP
>> > +
>> > +source include/master-slave.inc;
>> > +#source include/have_innodb.inc;
>> > +
>> > +--disable_warnings
>> > +drop table if exists t1, t2;
>> > +drop procedure if exists p1;
>> > +drop procedure if exists p2;
>> > +drop trigger if exists tr1;
>> > +drop function if exists f1;
>> > +--enable_warnings
>> > +
>> > +create table t1 (id int auto_increment primary key);
>> > +create table t2 (id int auto_increment primary key);
>> > +
>> > +delimiter //;
>> > +
>> > +CREATE PROCEDURE p1()
>> > +BEGIN
>> > +   DECLARE ins_count INT DEFAULT 10; 
>> > +
>> > +   WHILE ins_count > 5 DO
>> > +       INSERT INTO t1 VALUES (NULL);
>> > +       SET ins_count = ins_count - 1;
>> > +   END WHILE;
>> > +
>> > +   DELETE FROM t1 WHERE id = 1;
>> > +
>> > +   WHILE ins_count > 0 DO
>> > +       INSERT INTO t1 VALUES (NULL);
>> > +       SET ins_count = ins_count - 1;
>> > +   END WHILE;
>> > +
>> > +   DELETE FROM t1 WHERE id = 2;
>> > +   DELETE FROM t1 WHERE id = 3;
>> > +
>> > +END//
>> > +
>> > +CREATE PROCEDURE p2()
>> > +BEGIN
>> > +   INSERT INTO t1 VALUES (NULL);
>> > +   DELETE FROM t1 WHERE id = f1(4);
>> > +   DELETE FROM t1 WHERE id = f1(5);
>> > +END//
>> > +
>> > +CREATE TRIGGER tr1 BEFORE DELETE
>> > +    ON t1 FOR EACH ROW 
>> > +    BEGIN
>> > +        INSERT INTO t2  VALUES (NULL);
>> > +    END//
>> > +
>> > +CREATE FUNCTION f1 (i int) RETURNS int
>> > +    BEGIN
>> > +	INSERT INTO t2 VALUES (NULL);
>> > +	RETURN i;
>> > +    END//
>> > +
>> > +delimiter ;//
>> > +
>> > +call p1();
>> > +
>> > +DROP TRIGGER tr1;
>> > +
>> > +call p2();
>> > +
>> > +echo # Result on master;
>> > +SELECT * FROM t1;
>> > +SELECT * FROM t2;
>> > +sync_slave_with_master;
>> > +
>> > +echo # Result on slave;
>> > +SELECT * FROM t1;
>> > +SELECT * FROM t2;
>> > +
>> > +# clean up
>> > +drop table if exists t1, t2;
>> > +drop procedure if exists p1;
>> > +drop procedure if exists p2;
>> > +drop function if exists f1;
>> > diff -Nrup a/sql/slave.cc b/sql/slave.cc
>> > --- a/sql/slave.cc	2008-02-21 05:24:58 +08:00
>> > +++ b/sql/slave.cc	2008-02-26 22:43:23 +08:00
>> > @@ -4085,6 +4085,28 @@ bool rpl_master_has_bug(Relay_log_info *
>> >    return FALSE;
>> >  }
>> >  
>> > +/**
>> > +   Detect if the master affected by BUG#33029, which does not support
>> > +   using SET INSERT to set auto_increment field value for
>> > +   sub-statements.
>> > + */
>> > +bool rpl_master_bug33029(THD *thd)
>> > +{
>> > +  if (active_mi && active_mi->rli.sql_thd == thd)
>> > +  {
>> > +    Relay_log_info *rli= &active_mi->rli;
>> > +    const uchar *master_ver=
>> > +     
> rli->relay_log.description_event_for_exec->server_version_split;
>> > +    
>> > +    /* TODO: currently for all 5.0 versions, there may be other
>> > +       affected versions that should be checked here.
>> > +     */
>> > +    if (master_ver[0] == 5 && master_ver[1] == 0)
>> > +      return TRUE;
>> > +  }
>> > +  return FALSE;
>> > +}
>> > +
>> >  #ifdef HAVE_EXPLICIT_TEMPLATE_INSTANTIATION
>> >  template class I_List_iterator<i_string>;
>> >  template class I_List_iterator<i_string_pair>;
>> > diff -Nrup a/sql/slave.h b/sql/slave.h
>> > --- a/sql/slave.h	2008-02-07 04:07:43 +08:00
>> > +++ b/sql/slave.h	2008-02-26 22:43:24 +08:00
>> > @@ -166,6 +166,7 @@ int fetch_master_table(THD* thd, const c
>> >  bool show_master_info(THD* thd, Master_info* mi);
>> >  bool show_binlog_info(THD* thd);
>> >  bool rpl_master_has_bug(Relay_log_info *rli, uint bug_id);
>> > +bool rpl_master_bug33029(THD *thd);
>> >  
>> >  const char *print_slave_db_safe(const char *db);
>> >  int check_expected_error(THD* thd, Relay_log_info const *rli, int
> error_code);
>> > diff -Nrup a/sql/sql_class.cc b/sql/sql_class.cc
>> > --- a/sql/sql_class.cc	2008-02-13 01:21:07 +08:00
>> > +++ b/sql/sql_class.cc	2008-02-26 22:43:24 +08:00
>> > @@ -28,6 +28,7 @@
>> >  #include "mysql_priv.h"
>> >  #include "rpl_rli.h"
>> >  #include "rpl_record.h"
>> > +#include "slave.h"
>> >  #include <my_bitmap.h>
>> >  #include "log_event.h"
>> >  #include <m_ctype.h>
>> > @@ -2827,6 +2828,11 @@ extern "C" void thd_mark_transaction_to_
>> >  void THD::reset_sub_statement_state(Sub_statement_state *backup,
>> >                                      uint new_state)
>> >  {
>> > +#ifndef EMBEDDED_LIBRARY
>> > +  if (rpl_master_bug33029(this))
>> > +   
> auto_inc_intervals_forced.save(&backup->auto_inc_intervals_forced);
>> > +#endif
>> > +  
>> >    backup->options=         options;
>> >    backup->in_sub_stmt=     in_sub_stmt;
>> >    backup->enable_slow_log= enable_slow_log;
>> > @@ -2864,6 +2870,11 @@ void THD::reset_sub_statement_state(Sub_
>> >  
>> >  void THD::restore_sub_statement_state(Sub_statement_state *backup)
>> >  {
>> > +#ifndef EMBEDDED_LIBRARY
>> > +  if (rpl_master_bug33029(this))
>> > +   
> auto_inc_intervals_forced.restore(&backup->auto_inc_intervals_forced);
>> > +#endif
>> > +
>> >    /*
>> >      To save resources we want to release savepoints which were created
>> >      during execution of function or trigger before leaving their savepoint
>> > diff -Nrup a/sql/sql_class.h b/sql/sql_class.h
>> > --- a/sql/sql_class.h	2008-02-20 22:22:13 +08:00
>> > +++ b/sql/sql_class.h	2008-02-26 22:43:24 +08:00
>> > @@ -911,6 +911,7 @@ public:
>> >    ulonglong first_successful_insert_id_in_prev_stmt;
>> >    ulonglong first_successful_insert_id_in_cur_stmt, insert_id_for_cur_row;
>> >    Discrete_interval auto_inc_interval_for_cur_row;
>> > +  Discrete_intervals_list auto_inc_intervals_forced;
>> >    ulonglong limit_found_rows;
>> >    ha_rows    cuted_fields, sent_row_count, examined_row_count;
>> >    ulong client_capabilities;
>> > diff -Nrup a/sql/structs.h b/sql/structs.h
>> > --- a/sql/structs.h	2007-12-15 01:01:47 +08:00
>> > +++ b/sql/structs.h	2008-02-26 22:43:24 +08:00
>> > @@ -331,6 +331,28 @@ public:
>> >      }
>> >      empty_no_free();
>> >    }
>> > +
>> > +  /* Transfer elements from one list to an empty list */
>> > +  static void transfer(Discrete_intervals_list *from,
> Discrete_intervals_list *to)
>> > +  {
>> > +    DBUG_ASSERT(to->elements == 0);
>> > +    to->head = from->head;
>> > +    to->tail = from->tail;
>> > +    to->current = from->current;
>> > +    to->elements = from->elements;
>> > +    from->empty_no_free();
>> > +  }
>> > +
>> > +  void save(Discrete_intervals_list *to)
>> > +  {
>> > +    transfer(this, to);
>> > +  }
>> > +
>> > +  void restore(Discrete_intervals_list *from)
>> > +  {
>> > +    transfer(from, this);
>> > +  }
>> > +
>> >    const Discrete_interval* get_next()
>> >    {
>> >      Discrete_interval *tmp= current;
>> >
>> > -- 
>> > MySQL Code Commits Mailing List
>> > For list archives: http://lists.mysql.com/commits
>> > To unsubscribe:    http://lists.mysql.com/commits?unsub=1


regards,

Andrei
Thread
bk commit into 5.1 tree (hezx:1.2534) BUG#33029hezx26 Feb
  • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029Andrei Elkin29 Feb
    • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029He Zhenxing3 Mar
      • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029Andrei Elkin3 Mar
        • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029He Zhenxing6 Mar
          • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029Andrei Elkin6 Mar
            • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029He Zhenxing7 Mar
              • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029Andrei Elkin7 Mar
                • Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029He Zhenxing10 Mar
Re: bk commit into 5.1 tree (hezx:1.2534) BUG#33029He Zhenxing29 Feb