List:General Discussion« Previous MessageNext Message »
From:Hank Date:June 15 2011 8:38pm
Subject:Found serious replication data-corruption bug in 5.5.8/5.5.11
(auto-increment in primary key)
View as plain text  
This is a follow-up to my previous post.  I have been narrowing down what is
causing this bug.  It is a timing issue of a replication ignored table with
an auto-increment primary key values leaking over into a non-ignored table
with inserts immediately after the ignore table has had rows inserted.

Basically, data from the ignored table is corrupting a non-ignored table on
the slave upon immediate inserts.

Here is how to repeat:

On a master issue:

use db;
drop table test;
CREATE TABLE test (id int NOT NULL,   cnt int unsigned  NOT NULL
 AUTO_INCREMENT,  PRIMARY KEY (id,cnt) ) ENGINE=MyISAM ;
drop table log;
CREATE TABLE log (id int unsigned NOT NULL AUTO_INCREMENT, log varchar(20),
 PRIMARY KEY (id) ) ENGINE=MyISAM AUTO_INCREMENT = 44444;

make sure those two tables are created on the slave through regular
replication.

on slave>

desc test;
desc log;

Once replicated, on the master, add the following line to the [mysqld]
section of my.cnf, and then restart mysql.

         replicate-ignore-table=db.log

The on the master, issue the following statements as a copy/paste of all of
them at once.
It's critical that the statements are executed in immediate succession (no
delays)

insert into log values (null,"info1");
insert into log values (null,"info2");
insert into log values (null,"info3");
insert into log values (null,"info4");
insert into test values (1,null);
insert into log values (null,"info5");
insert into test values (1,null);
insert into log values (null,"info6");
insert into test values (2,null);
insert into log values (null,"info7");
insert into test values (2,null);
insert into log values (null,"info8");

Here are the results from the master (all correct):

master>select * from log;
+-------+-------+
| id    | log   |
+-------+-------+
| 44444 | info1 |
| 44445 | info2 |
| 44446 | info3 |
| 44447 | info4 |
| 44448 | info5 |
| 44449 | info6 |
| 44450 | info7 |
| 44451 | info8 |
+-------+-------+
master>select * from test;
+----+-----+
| id | cnt |
+----+-----+
|  1 |   1 |
|  1 |   2 |
|  2 |   1 |
|  2 |   2 |
+----+-----+
Here are the results from the slave:

slave>select * from log;

Empty set (0.00 sec)  <--- as expected, since it is ignored

slave>select * from test;
+----+-------+
| id | cnt   |
+----+-------+
|  1 | 44447 |   <-- should be "1", but has values from "log" on the master
|  1 | 44448 |   <-- should be "2"
|  2 | 44449 |   <-- should be "1"
|  2 | 44450 |   <-- should be "2"
+----+-------+

If there is the slightest delay between the inserts into "log" and "test",
the replication happens correctly.

Thoughts?

-Hank Eskin

Thread
Found serious replication data-corruption bug in 5.5.8/5.5.11(auto-increment in primary key)Hank15 Jun
  • Re: Found serious replication data-corruption bug in 5.5.8/5.5.11(auto-increment in primary key)Hank15 Jun
    • Re: Found serious replication data-corruption bug in 5.5.8/5.5.11(auto-increment in primary key)Claudio Nanni15 Jun
      • Re: Found serious replication data-corruption bug in 5.5.8/5.5.11(auto-increment in primary key)Hank15 Jun
        • Re: Found serious replication data-corruption bug in 5.5.8/5.5.11(auto-increment in primary key)Claudio Nanni15 Jun
          • Re: Found serious replication data-corruption bug in 5.5.8/5.5.11(auto-increment in primary key)Hank17 Jun