List: Commits
From: hezx  Date: January 29 2008 3:57am
Subject: bk commit into 5.0 tree (hezx:1.2565) BUG#26489
Below is the list of changes that have just been committed into a local
5.0 repository of hezx. When hezx does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2008-01-29 11:56:48+08:00, hezx@stripped +1 -0
  BUG#26489 Corruption in relay logs
  
  Here is the scenario that causes the failure (by Mats).
  
  1. The to-be-corrupted log event (let's call it X) is split into two
     packets B and C on the network level (net_write_buff()). The parts
     are X = (x',x''). The part x' ends up in packet B and part x''
     ends up in packet C. Prior to the corrupt event X, the event Y has
     been written successfully, but has been split into two packets as
     well, which we call (y',y'').
  2. The master sends packet A = (y'',x') to the slave, increases the
     packet sequence number, the slave receives the packet, but fails
     to reply before the master gets a timeout.
  3. Since the master got a timeout, it reports failure, and aborts
     sending the binary log by exiting mysql_binlog_send(). However, it
     leaves the buffer intact, still holding y'' (but not x', since the
     write_pos is not increased).
  4. After exiting mysql_binlog_send(), the master does a
     disconnection of the client thread, which involves sending an
     error message e to the client (i.e., the slave).
  5. In this case, net_write_buff() is used again, but this time the
     old contents of the packet are used so that the new packet is
     D = (y'',e). Note that this will use a new packet sequence number,
     since the packet number was increased in step 2.
  6. The slave receives the tail y'' of the Y log event, concatenates
     this with x' (which it already received), and writes the event
     (x',y'') to the relay log since it hasn't noticed anything is
     amiss.
  7. It then tries to read more bytes, which are either e (if the length
     given for X just happened to match the length given for Y) or just
     plain garbage, because the slave is out of sync with what is
     actually sent.
  8. After a while, the SQL thread tries to execute the event (x',y''),
     which is very likely to be just nonsense.
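
  Steps 2 to 6 can be replayed with a toy model. This is only a sketch:
  ToyNet, send() and relay_log_event() are hypothetical names, not the
  server's NET struct or net_write_buff(), and the two-byte strings
  stand in for the event fragments y'', x' and the error message e.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the master's network buffer. This is NOT the real NET
// struct; all names here are hypothetical.
struct ToyNet {
    std::string pending;            // bytes already committed to the buffer
    std::vector<std::string> wire;  // packets as the slave receives them

    // Send the committed bytes plus `extra` as one packet. On a timeout the
    // bytes have still reached the slave, but `pending` is left untouched
    // and `extra` is forgotten (the write position was never advanced).
    bool send(const std::string& extra, bool timeout) {
        wire.push_back(pending + extra);
        if (timeout) return false;
        pending.clear();
        return true;
    }
};

// Replays steps 2-6 and returns the event the slave writes to its relay log.
std::string relay_log_event() {
    ToyNet net;
    net.pending = "Y2";              // y'': tail of the previous event Y
    bool ok = net.send("X1", true);  // packet A = (y'', x'), master times out
    assert(!ok);
    (void)ok;
    net.send("ERR", false);          // packet D = (y'', e), stale y'' reused

    std::string stream = net.wire[0] + net.wire[1];  // "Y2X1" + "Y2ERR"
    // The slave consumes y'' to finish Y, then reads 4 bytes for X
    // (it expects x' followed by x'').
    return stream.substr(2, 4);
}
```

  In this model the slave ends up with x' followed by y'' instead of
  x' followed by x'', which is the corrupt event of step 6; the bytes
  left over in the stream are what it reads in step 7.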
  
  The problem can be fixed by not resetting net->error after the call to
  mysql_binlog_send(), so that the error message is not sent and the
  connection is simply closed.
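
  The effect of keeping the error flag can be sketched as follows. This
  is an illustration under assumptions, not the server code: ToyConn,
  send_error_and_close() and packets_after_failure() are hypothetical,
  and the check on the flag is a simplified stand-in for whatever the
  real teardown path does when net->error is set.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical connection state; only the error flag matters here.
struct ToyConn {
    int error = 0;                  // 0 = healthy, 2 = do not reuse (assumed)
    std::vector<std::string> wire;  // packets written after the failed dump
};

// Toy teardown: the error packet is written only if the connection still
// looks healthy. With a nonzero flag nothing is written before closing.
void send_error_and_close(ToyConn& conn, const std::string& msg) {
    if (conn.error == 0) conn.wire.push_back(msg);
}

// Returns how many packets are written after a failed binlog dump.
std::size_t packets_after_failure(bool reset_error_flag) {
    ToyConn conn;
    assert(conn.wire.empty());
    conn.error = 2;                        // the dump failed with a timeout
    if (reset_error_flag) conn.error = 0;  // the line removed by this change
    send_error_and_close(conn, "ERR");
    return conn.wire.size();
}
```

  With the reset in place the model writes one more packet on the broken
  connection; without it, nothing further is written.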

  sql/sql_parse.cc@stripped, 2008-01-29 11:56:46+08:00, hezx@stripped +0 -1
    Do not reset net->error; if net->error == 2, we should not try to use the connection again.

diff -Nrup a/sql/sql_parse.cc b/sql/sql_parse.cc
--- a/sql/sql_parse.cc	2007-10-30 16:20:29 +08:00
+++ b/sql/sql_parse.cc	2008-01-29 11:56:46 +08:00
@@ -1999,7 +1999,6 @@ bool dispatch_command(enum enum_server_c
       unregister_slave(thd,1,1);
       /*  fake COM_QUIT -- if we get here, the thread needs to terminate */
       error = TRUE;
-      net->error = 0;
       break;
     }
 #endif