After several long days trying to fix this I'm running out of ideas.
Master: RedHat 7.3 kernel 2.4, MySQL 4.0.20 32 bit (mysql.com rpm) ->
Slave: Fedora Core 2 64 bit kernel 2.6.5, MySQL-Max-4.0.20-0 64 bit
After a varying amount of time, and a few hundred thousand queries,
replication dies with:
040625 16:19:12 Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
040625 16:19:12 Error reading relay log event: slave SQL thread aborted because of I/O error
Using instructions from Sasha Pachev, I've looked at the binlog on the
slave and can indeed verify that it contains a large chunk of empty
space, and that the query in question is indeed logged on the master.
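For anyone who wants to check their own logs for the same symptom, here is a rough sketch of how such an empty chunk could be located. It assumes "empty space" means a long run of zero bytes, which matches the data_len: 0 and event_type: 0 in the error above; the function name find_zero_runs and the 512-byte threshold are made up for illustration and are not part of any MySQL tool.

```python
def find_zero_runs(path, min_len=512, chunk_size=1 << 20):
    """Return (offset, length) pairs for runs of NUL bytes >= min_len.

    Hypothetical helper: it treats the log as an opaque binary file and
    does not parse MySQL log events at all.
    """
    runs = []
    run_start = None  # file offset where the current zero run began
    pos = 0           # file offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for i, byte in enumerate(chunk):
                if byte == 0:
                    if run_start is None:
                        run_start = pos + i
                elif run_start is not None:
                    length = pos + i - run_start
                    if length >= min_len:
                        runs.append((run_start, length))
                    run_start = None
            pos += len(chunk)
    # A run that extends to the end of the file is closed here.
    if run_start is not None and pos - run_start >= min_len:
        runs.append((run_start, pos - run_start))
    return runs
```

Run against a copy of the relay log, the reported offsets can be compared with the position at which the slave SQL thread stopped. The byte-by-byte loop is slow on large files, so this is only meant for a one-off check.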
The fun part is that it does work when I point our 32-bit master at a
different 32-bit slave. So I know it's not a problem with our old
servers, just this fancy new one.
So far I've
- Tried a different master (we have a pool of 5 similar servers to use
as a master).
- Tried 32-bit server instead of 64-bit Max on the slave (couldn't get
64 bit non-Max to start at all, would just dump).
- Tried swapping the NIC for a different brand.
- Used tcpdump to attempt to spot any network level issues.
- Tried pointing the binlogs on the master to another local disk
separate from the data.
- Examined the changelogs for the NIC drivers.
- Googled this to no end.
All with no luck.
I'm open to suggestions.
I suppose the next step is to install Fedora Core 2 32-bit and try again.
Matthew Kent \ SA \ bravenet.com \ 1-250-954-3203 ext 108