Eric Bergen wrote:
> TCP checksums aren't as strong as encryption. It's rare but corruption
> can happen.
>
But it happens every other day? That would mean at least one undetected
error per 4GB of data (I have around 2GB of binlogs per day)?
Statistically, every DVD ISO you download would then be corrupt?
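As a rough sanity check of that arithmetic, here is a small sketch. The 4.7 GB single-layer DVD image size and the assumption that errors are independent and spread evenly over the transferred bytes are mine, and the function names are made up for illustration:

```python
import math

GB = 10**9  # decimal gigabytes; close enough for an estimate


def implied_bytes_per_error(bytes_per_day, days_between_errors):
    """Average number of bytes transferred per observed corruption."""
    return bytes_per_day * days_between_errors


def expected_errors(transfer_bytes, bytes_per_error):
    """Expected number of corruptions in a transfer of the given size."""
    return transfer_bytes / bytes_per_error


def prob_at_least_one_error(transfer_bytes, bytes_per_error):
    """Poisson estimate of P(at least one corruption), assuming independent errors."""
    return 1.0 - math.exp(-expected_errors(transfer_bytes, bytes_per_error))


# Roughly 2 GB of binlogs per day, one corruption every other day.
bytes_per_error = implied_bytes_per_error(2 * GB, 2)
dvd_iso = 4.7 * GB  # assumed single-layer DVD image size

print(f"one error per {bytes_per_error / GB:.1f} GB")
print(f"expected errors per DVD ISO: {expected_errors(dvd_iso, bytes_per_error):.2f}")
print(f"P(corrupt DVD ISO): {prob_at_least_one_error(dvd_iso, bytes_per_error):.0%}")
```

Under these assumptions a DVD-sized download would be more likely corrupt than not, which does not match experience with ordinary TCP transfers.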
> Where are you reading the positions from and how are you taking the
> snapshot to restore the slave?
>
From the log file:

    080415 6:39:20 [ERROR] Error running query, slave SQL thread aborted.
    Fix the problem, and restart the slave SQL thread with "SLAVE START".
    We stopped at log 'mysql-bin.045709' position 172

I use rsync to set up the slave...
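
To check whether a reported position such as 172 actually lands on an event boundary, one could walk the event headers of the master's binlog file. This is a minimal sketch, assuming the v4 binlog layout: a 4-byte magic header followed by events whose 19-byte common header stores the total event length at byte offset 9. The function name is hypothetical, and the layout should be verified against the server version in use:

```python
import struct

BINLOG_MAGIC = b"\xfebin"  # first 4 bytes of a binlog file
HEADER_LEN = 19            # assumed size of the v4 common event header
EVENT_LEN_OFFSET = 9       # after timestamp(4) + type(1) + server_id(4)


def event_boundaries(data):
    """Return the offsets where events start, plus the offset just past
    the last complete event, for a binlog image held in memory."""
    if data[:4] != BINLOG_MAGIC:
        raise ValueError("not a binlog file (bad magic header)")
    pos = 4  # the first event follows the magic header
    boundaries = [pos]
    while pos + HEADER_LEN <= len(data):
        # Total event length (header + body), little-endian 32-bit.
        (event_len,) = struct.unpack_from("<I", data, pos + EVENT_LEN_OFFSET)
        if event_len < HEADER_LEN:
            raise ValueError(f"implausible event length {event_len} at position {pos}")
        if pos + event_len > len(data):
            break  # truncated final event; stop at the last good boundary
        pos += event_len
        boundaries.append(pos)
    return boundaries


# Example use against the file named in the error log (reads the whole
# file into memory, so only suitable for a quick check):
#
# with open("mysql-bin.045709", "rb") as f:
#     print(172 in event_boundaries(f.read()))
```

If the reported position is not among the boundaries, the position itself is suspect rather than the data at that position.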
>
> On Mon, Apr 21, 2008 at 12:30 AM, Jan Kirchhoff <kirchy@stripped> wrote:
>
>> Eric Bergen wrote:
>>
>>
>>> Hi Jan,
>>>
>>> You have two separate issues here. First the issue with the link
>>> between the external slave and the master. Running mysql through
>>> something like stunnel may help with the connection and data loss
>>> issues.
>>>
>> I wonder how any corruption could happen on a TCP connection as TCP has
>> its own checksums and a connection would break down in case of a missing
>> packet?
>>
>>
>>> The second problem is that your slave is corrupt. Duplicate key errors
>>> are sometimes caused by a corrupt table but more often by restarting
>>> replication from an incorrect binlog location. Try recloning the slave
>>> and starting replication again through stunnel.
>>>
>> The duplicate key errors happen after I start at the beginning of a
>> logfile (master_log_pos=0) because the position that mysql reports as
>> its last position does not work.
>>
>> I think I have 2 issues:
>> #1: how can this kind of binlog corruption happen on a TCP link although
>> TCP has its checksums and resends lost packets?
>>
>> #2: why does mysql report a master log position that is obviously wrong?
>> mysql reports log position 172, which does not work at all in a "change
>> master to" command. My only option is to start with master_log_pos=0, and
>> the number of duplicate key errors and the like that I have to skip after
>> starting from master_log_pos=0 shows me that the real position at which
>> mysql stopped processing the binlog must be somewhere in the
>> thousands or tens of thousands and not 172?!
>>
>> Jan
>>
>>