On Sun, Dec 06, 2009 at 08:43:52AM -0800, MARK CALLAGHAN wrote:
> On Sun, Dec 6, 2009 at 8:34 AM, Simon J Mudd <sjmudd@stripped> wrote:
> > On Sun, Dec 06, 2009 at 08:27:28AM -0800, MARK CALLAGHAN wrote:
> >> On Sun, Dec 6, 2009 at 7:15 AM, Simon J Mudd <sjmudd@stripped>
> >> > song.voong@stripped (Song Voong) writes:
> >> >
> >> >> I been getting this error and my replication stoppped.
> >> >>
> >> >> ERROR 1062: Duplicate entry
> >> >> Last_Error: Error 'Duplicate entry '1010769-50-65% Cttn, 35%
> Rayon' for key
> >> >> 2' on query. Default database: 'm3'. Query: 'Update item_attribute
> >> >> lupdateby='TINAA',item_id='1010769',value='65% Cttn, 35%
> >> >> Rayon',source=3,attribute_definition_id='50' WHERE
> item_attribute_ID =
> >> >> 118838555'
> >> >>
> >> >> I know I can do a slave-skip-errors 1062 and by pass it, I
> been reading the
> >> >> forum and it is a common issues with replication, But what is
> causing this?
> >> >> I mean it is updating stuff on the master, and slave should get
> updated too
> >> >> . NO?
> >> Simon listed many good things to check. There are two other things to
> >> 1) confirm the slave did not crash/restart (either mysqld restart or
> >> server reboot). Slave replication state is not crashproof and on a
> >> crash transactions can get replayed. This is easy to check.
> > Note: I think even with the 5.0 version being used if you don't configure the
> > binlogs to be synced with the commits you can find them up to 30 seconds
> > behind, as mysql won't force them to disk. If the host crashes these changes
> > will be lost and on restart mysql will try to replay these statements.
> > Depending on what they are this can make the content of the different
> > from what it should be. So watch this. Default MySQL behaviour is NOT safe.
> With sync_binlog=1 on the master, crash recovery keeps InnoDB and the
> binlog in sync. A future release of MySQL will support an option to
> sync relay-log.info, but even that won't fix the problem on the slave.
I wasn't talking about the master but the slave.
While it won't completely fix the problem it will reduce the "damage" signficantly as
now if the server crashes the master.info and relay-log.info files don't get synced
and thus can be up to 30 seconds behind the actual synced position of the db files.
You can do a lot of work in 30 seconds all of which will generate duplicate
actions. If you're lucky and DO NOT do updates you'll probably get a replication error,
otherwise you'll get silent "corruption" which you'll never even know about.
> The problem on the slave is that the commit to InnoDB by the slave SQL
> thread and the update of relay-log.info that follows the commit are
> not atomic. If the slave crashes after the commit and before
> relay-log.info is updated (or flushed to disk), then the slave will
> replay transactions on restart. MySQL has work in progress to fix this
> in a future release.
Exactly and currently there's a 30-second window which can cause a lot
of corruption. In my case this basically means rebuilding the slave if
the server dies. We have a few so recloning the box is not a great
problem but it does take time and the larger the database sizes the
worse this gets. Downtime means moving the applications to another
box and that's also inconvenient.
> I assume that the slave and master use InnoDB. If you use MyISAM then
> you cannot expect to avoid problems like this.
Well since the problem you actually get are to do with the synchronisation
position of the db vs the position indicated in the master.info and relay-log.info
files it doesn't matter which engine you are using. Certainly Innodb will
recover from the crash cleanly and MyISAM won't if a change is half done,
but the damage would be significantly reduced if the sync position of the
relay-log file is done at the same time as the db sync/write.