List: General Discussion
From: asm
Date: June 19 2002 4:12pm
Subject: replication interrupted in read_event and fd leak
>Description:
After upgrading MySQL from 3.23.46 to 3.23.51, I found the following messages
in the mysqld logs:
...
020617 15:19:50  Error reading packet from server:  (server_errno=1159)
020617 15:20:50  Slave: Failed reading log event, reconnecting to retry, log 'zzz'
position 582879
020617 15:20:50  Slave: reconnected to master 'xxx',replication resumed in log 'zzz' at
position 582879
020617 15:21:20  Error reading packet from server:  (server_errno=1159)
020617 15:21:20  Slave: Failed reading log event, reconnecting to retry, log 'zzz'
position 583636
020617 15:21:20  Slave: reconnected to master 'xxx',replication resumed in log 'zzz' at
position 583636
020617 15:21:50  Error reading packet from server:  (server_errno=1159)
020617 15:22:50  Slave: Failed reading log event, reconnecting to retry, log 'zzz'
position 583636
020617 15:22:50  Slave: reconnected to master 'xxx',replication resumed in log 'zzz' at
position 583636
020617 15:23:20  Error reading packet from server:  (server_errno=1159)
020617 15:23:20  Slave: Failed reading log event, reconnecting to retry, log 'zzz'
position 584964
020617 15:23:20  Slave: reconnected to master 'xxx',replication resumed in log 'zzz' at
position 584964
...

These messages repeat every few seconds.

I have 6 replication slaves and 1 master server, and these errors occur on
all slaves.

After the weekend, our tech support reported to me that all slave servers had
rebooted twice, on Friday evening and Sunday noon.
Kernel error: the kernel reported no resources.

I found that mysqld had an abnormal number of open file descriptors (command:
ls -1 /proc/`cat /var/lib/mysql/mysqld.pid`/fd | wc -l ).
Normally, with my configuration, mysqld has ~270 open fds; this time it had > 5000.
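
For anyone who wants to watch the descriptor count from a program rather than
from the shell, here is a minimal sketch of the same check in C. It assumes a
Linux /proc filesystem; the function name count_open_fds is made up for this
illustration and is not part of MySQL.

```c
#include <dirent.h>
#include <string.h>

/* Hypothetical helper: count the entries of a /proc/<pid>/fd style directory.
 * Returns the number of entries (excluding "." and ".."), or -1 if the
 * directory cannot be opened. */
int count_open_fds(const char *fd_dir)
{
    DIR *dir = opendir(fd_dir);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip the two directory self-references. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);
    return count;
}
```

Called with the path /proc/<mysqld pid>/fd, this should give the same number
as the ls | wc pipeline above. When a process inspects its own /proc/self/fd,
the descriptor that opendir() itself holds is included in the count.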

Looking at the code, I found that server_errno=1159 is ER_NET_READ_INTERRUPTED
and that the handling of interrupted reads was removed (maybe with the addition
of the slave-skip-errors option in 3.23.47), so I tried setting
slave-skip-errors=1159, but this doesn't help.

>How-To-Repeat:
Maybe try replication on the same platform?

>Fix:

Reintroduce interrupted read handling in read_event:
--- mysql-3.23.51.orig/sql/slave.cc     Mon Jun  3 14:39:05 2002
+++ mysql-3.23.51/sql/slave.cc  Mon Jun 17 14:49:42 2002
@@ -850,8 +850,10 @@
   if (disconnect_slave_event_count && !(events_till_disconnect--))
     return packet_error;
 #endif
-
-  len = mc_net_safe_read(mysql);
+
+  do {
+    len = mc_net_safe_read(mysql);
+  } while (len == packet_error && errno == EINTR);

   if (len == packet_error || (long) len < 1)
   {

Fix for the fd leak on slave reconnect: unknown.
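
The retry loop in the patch above follows the usual "retry while errno is
EINTR" pattern. As a standalone illustration, here is a minimal sketch of that
pattern around plain read(2). It does not use mc_net_safe_read, and the name
read_retry_eintr is made up for this example; whether errno is still intact
after mc_net_safe_read returns is something the patch assumes and this sketch
does not verify.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: call read() again whenever it fails only because a
 * signal interrupted it (errno == EINTR). Any other error, a short read, or
 * end-of-file is returned to the caller unchanged. */
ssize_t read_retry_eintr(int fd, void *buf, size_t len)
{
    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

The loop only hides interruptions; real failures such as EBADF still come back
as -1 with errno set, so the caller's existing error handling keeps working.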

>Submitter-Id:	<submitter ID>
>Originator:	
>Organization:
Alexey S. Mazurov
Yandex
http://www.yandex.ru/
>MySQL support: none
>Synopsis:	replication interrupted in read_event and fd leak
>Severity:	serious
>Priority:	medium
>Category:	mysql
>Class:		sw-bug
>Release:	mysql-3.23.51 (Yandex MySQL RPM)

>Environment:
2 x Intel P3
1 GB RAM

RedHat 6.2 with updates

System: Linux bs1.yandex.ru 2.2.19-6.2.16enterprise #1 SMP Wed Mar 13 13:46:30 EST 2002
i686 unknown
Architecture: i686

Some paths:  /usr/bin/perl /usr/bin/make /usr/bin/gmake /usr/bin/gcc /usr/bin/cc
GCC: Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/specs
gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
Compilation info: CC='gcc'  CFLAGS=' -O3'  CXX='gcc'  CXXFLAGS=' -O3 	         
-felide-constructors -fno-exceptions -fno-rtti 		  '  LDFLAGS=''
LIBC: 
lrwxrwxrwx    1 root     root           13 Mar 29  2001 /lib/libc.so.6 -> libc-2.1.3.so
-rwxr-xr-x    1 root     root      4106828 Mar 29  2001 /lib/libc-2.1.3.so
-rw-r--r--    1 root     root     20320984 Mar 29  2001 /usr/lib/libc.a
-rw-r--r--    1 root     root          178 Mar 29  2001 /usr/lib/libc.so
Configure command: ./configure --disable-shared --with-mysqld-ldflags=-all-static
--with-client-ldflags=-all-static --enable-assembler --enable-local-infile
--with-mysqld-user=mysql --with-unix-socket-path=/var/lib/mysql/mysql.sock --prefix=/
--with-extra-charsets=all --exec-prefix=/usr --libexecdir=/usr/sbin --sysconfdir=/etc
--datadir=/usr/share --localstatedir=/var/lib/mysql --infodir=/usr/info
--includedir=/usr/include --mandir=/usr/man --with-raid --with-berkeley-db
--without-innodb '--with-comment=Yandex MySQL RPM' CC=gcc 'CFLAGS= -O3' 'CXXFLAGS= -O3 	  
       -felide-constructors -fno-exceptions -fno-rtti 		  ' CXX=gcc
Perl: This is perl, version 5.005_03 built for i386-linux