List: Replication
From: Jesse
Date: October 2 2007 3:16pm
Subject: Re: Slave just stops....
This is a Windows 2000 Server machine, so we're not running cron jobs.
There are no Scheduled Tasks configured on it at all, so that's not it.
This has happened every day for the last three days, so I know it's not
power outages or anything. Even if it were, we have battery backups to
prevent problems such as this, and I would think the machine would have
rebooted and re-started the replication process.

I'm at a loss.  We really need this to work.  The only backup we have right 
now is the nightly backup that MySQL Administrator makes.  I would hate to 
have a failure, and have to tell people, "sorry, but you're going to be 
missing a whole day's worth of data".  So, it's critical that we get this 
going.  It's just frustrating.

Jesse


----- Original Message ----- 
From: "Steve Musumeche" <steve@stripped>
To: "Jesse" <jlc@stripped>
Cc: <replication@stripped>
Sent: Tuesday, October 02, 2007 9:49 AM
Subject: Re: Slave just stops....


> Jesse,
>
> Your errors are strange and I really can't tell you what's going on.  Your 
> suspicion is correct - when replication is running successfully both 
> Slave_IO_Running and Slave_SQL_Running should be "Yes".  Your error log 
> suggests that there are network issues preventing the slave from 
> connecting to the master.  However, replication should not break in this 
> case, it will "catch itself up" when the connection is restored.  Are you 
> sure there is nothing else happening on the slave server which could be 
> stopping mysql unexpectedly (cron jobs, power outage, etc)?
>
> Steve Musumeche
> CIO, Internet Retail Connection
> steve@stripped
>
>
>
> Jesse wrote:
>>> The reason replication is stopped is because of your foreign key 
>>> constraint error.  Once you fix this, it should work (you may have to 
>>> re-sync).
>>
>> Yes, I caught that.  However, the reason that the foreign key error 
>> occurred was because replication simply stopped at some point for a 
>> (currently) unexplained reason, causing certain records not to be added, 
>> so when I re-started replication, it failed because it tried to add 
>> records whose foreign keys were missing.
>>
>> I almost decided not to post the results of show slave status after my 
>> re-start, because I figured it would cause confusion, and it has.
>>
>> However, here's the situation.  The same exact thing has happened again 
>> this morning.  I reset the slave last night.  I restored all of the data, 
>> and re-started the slave.  Everything appeared to be fine. I did a few 
>> things, which seemed to be working fine.  This morning, I checked, and I 
>> find the following error in the Windows Event Log:
>>
>> Slave SQL thread exiting, replication stopped in log 'mysql-bin.000001' 
>> at position 199
>>
>> This occurred at about 7:51 this morning.  I did a Show Slave Status, and 
>> here it is:
>> mysql> show slave status\G
>> *************************** 1. row ***************************
>> Slave_IO_State: Waiting for master to send event
>> Master_Host: webserver
>> Master_User: Replication
>> Master_Port: 3306
>> Connect_Retry: 60
>> Master_Log_File: mysql-bin.000001
>> Read_Master_Log_Pos: 1376264
>> Relay_Log_File: dlgsrv-relay-bin.000012
>> Relay_Log_Pos: 434
>> Relay_Master_Log_File: mysql-bin.000001
>> Slave_IO_Running: Yes
>> Slave_SQL_Running: No
>> Replicate_Do_DB:
>> Replicate_Ignore_DB:
>> Replicate_Do_Table:
>> Replicate_Ignore_Table:
>> Replicate_Wild_Do_Table:
>> Replicate_Wild_Ignore_Table:
>> Last_Errno: 0
>> Last_Error:
>> Skip_Counter: 0
>> Exec_Master_Log_Pos: 199
>> Relay_Log_Space: 69323
>> Until_Condition: None
>> Until_Log_File:
>> Until_Log_Pos: 0
>> Master_SSL_Allowed: No
>> Master_SSL_CA_File:
>> Master_SSL_CA_Path:
>> Master_SSL_Cert:
>> Master_SSL_Cipher:
>> Master_SSL_Key:
>> Seconds_Behind_Master: NULL
>> 1 row in set (0.00 sec)
>>
>> The only thing I question is "Slave_SQL_Running: No".  Should this be 
>> this way, or should the SQL be running?  If it should, then what stopped 
>> it?
>>
>> I checked the .err file, and the last error in there was the one I listed 
>> above.
>>
>> Another confusing thing about this is that it says that it exited at 
>> position 199.  When I started replication last night, I started it at 
>> position 1272859.
>>
>> Here are a few more entries from the log that indicate errors, but it 
>> didn't seem to stop it until this morning:
>> 071001 21:18:53 [ERROR] Error reading packet from server: Lost connection 
>> to MySQL server during query ( server_errno=2013)
>> 071001 21:18:53 [Note] Slave I/O thread killed while reading event
>> 071001 21:18:53 [Note] Slave I/O thread exiting, read up to log 
>> 'mysql-bin.000001', position 1272859
>> 071001 21:21:47 [Note] Slave SQL thread initialized, starting replication 
>> in log 'mysql-bin.000001' at position 1272859, relay log 'C:\Program 
>> Files\MySQL\MySQL Server 5.0\Data\dlgsrv-relay-bin.000001' position: 4
>> 071001 21:21:47 [Note] Slave I/O thread: connected to master 
>> 'Replication@webserver:3306',  replication started in log 
>> 'mysql-bin.000001' at position 1272859
>> 071001 21:30:47 [ERROR] Error reading packet from server: Lost connection 
>> to MySQL server during query ( server_errno=2013)
>> 071001 21:30:47 [Note] Slave I/O thread killed while reading event
>> 071001 21:30:47 [Note] Slave I/O thread exiting, read up to log 
>> 'mysql-bin.000001', position 1273886
>> 071001 21:30:47 [Note] Error reading relay log event: slave SQL thread 
>> was killed
>> 071001 21:32:46 [Note] Slave SQL thread initialized, starting replication 
>> in log 'mysql-bin.000001' at position 1273886, relay log 'C:\Program 
>> Files\MySQL\MySQL Server 5.0\Data\dlgsrv-relay-bin.000002' position: 1262
>> 071001 21:32:46 [Note] Slave I/O thread: connected to master 
>> 'Replication@webserver:3306',  replication started in log 
>> 'mysql-bin.000001' at position 1273886
>>
>> These could have occurred at about the time that I re-started the slave, 
>> however, I checked this file immediately after re-starting the slave last 
>> night, and nothing was wrong, so this happened shortly after re-starting 
>> it. Could these be the problem?  And if so, what are the errors, and how 
>> do I fix them?
>>
>> Thanks,
>> Jesse
>>
> 
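
The check Steve describes above (both Slave_IO_Running and
Slave_SQL_Running should be "Yes") could be automated so that a stopped
thread is noticed the same day. What follows is only a minimal sketch in
Python, not something from the thread: the function names are made up for
illustration, and it assumes the text of SHOW SLAVE STATUS\G has already
been captured (for example from the mysql command-line client). It also
reports how far the SQL thread is behind the I/O thread, using the
Read_Master_Log_Pos and Exec_Master_Log_Pos fields from the output above.

```python
def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict of strings."""
    status = {}
    for line in text.splitlines():
        # Drop mail-quote markers (">> ") and surrounding whitespace.
        line = line.lstrip("> ").strip()
        # Skip the "**** 1. row ****" banner and anything without "key: value".
        if line.startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def replication_problems(status):
    """Return human-readable problems found in a parsed status (hypothetical helper)."""
    problems = []
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            problems.append(f"{thread} is {status.get(thread)!r}, expected 'Yes'")
    # The two positions are only comparable when they refer to the same master log.
    if status.get("Master_Log_File") == status.get("Relay_Master_Log_File"):
        read_pos = int(status.get("Read_Master_Log_Pos", 0))
        exec_pos = int(status.get("Exec_Master_Log_Pos", 0))
        if read_pos > exec_pos:
            problems.append(f"{read_pos - exec_pos} bytes read but not yet executed")
    return problems


# Fields copied from the status output quoted in this message.
SAMPLE = """\
Slave_IO_State: Waiting for master to send event
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 1376264
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 0
Exec_Master_Log_Pos: 199
"""

if __name__ == "__main__":
    for problem in replication_problems(parse_slave_status(SAMPLE)):
        print(problem)
```

On this machine such a script would have to be run from a Scheduled Task,
since there is no cron; how the result is reported (mail, Event Log) is
left open here.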

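To see when the "Lost connection" errors quoted above occurred relative to
a slave re-start, the [ERROR] entries can be pulled out of the .err file
together with their timestamps. This is a rough sketch in Python under two
assumptions: each entry sits on a single line in the real file (the lines
above are wrapped by the mail client), and the timestamp has the
"YYMMDD HH:MM:SS" form shown in the excerpt. The function name is
hypothetical.

```python
import re
from datetime import datetime

# Matches entries such as: 071001 21:18:53 [ERROR] message text
ENTRY_RE = re.compile(r"^(\d{6}\s+\d{1,2}:\d{2}:\d{2}) \[(\w+)\] (.*)$")


def error_entries(log_text):
    """Return (timestamp, message) pairs for every [ERROR] entry in the log text."""
    entries = []
    for line in log_text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue  # not a timestamped entry, e.g. a continuation line
        stamp, level, message = match.groups()
        if level == "ERROR":
            when = datetime.strptime(stamp, "%y%m%d %H:%M:%S")
            entries.append((when, message))
    return entries


# Entries copied from the log excerpt in this message, joined onto single lines.
SAMPLE_LOG = """\
071001 21:18:53 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
071001 21:18:53 [Note] Slave I/O thread killed while reading event
071001 21:30:47 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
071001 21:30:47 [Note] Error reading relay log event: slave SQL thread was killed
"""

if __name__ == "__main__":
    for when, message in error_entries(SAMPLE_LOG):
        print(when.isoformat(sep=" "), message)
```

Comparing these timestamps with the "Slave SQL thread initialized" notes
would show whether each error coincides with a manual re-start or happened
on its own.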