List:Commits« Previous MessageNext Message »
From:Luis Soares Date:September 24 2010 9:44am
Subject:bzr commit into mysql-5.1-bugteam branch (luis.soares:3515) Bug#55263
View as plain text  
#At file:///home/lsoares/Workspace/bzr/work/bugfixing/55263/mysql-5.1-bugteam/ based on revid:ingo.struewing@stripped

 3515 Luis Soares	2010-09-24
      BUG#55263: assert in check_binlog_magic
      
      The procedure for setting up a fake binary log, by changing the
      relay log files manually, is run twice when we issue mtr with
      --repeat=2. However, when setting it up for the second time,
      neither the sql thread is reset nor the server is restarted. This
      means that previous stale relay log IO_CACHE metadata - from 
      first run - is left around. As a consequence, when the test is 
      run for the second time, the IO_CACHE for the relay log has 
      position in file different than zero, triggering the assertion.
                  
      We fix this by deploying a call to my_b_seek before calling
      check_binlog_magic in next_event. This prevents the server 
      from asserting, in the cases that the SQL thread was reads
      from a hot log (relay.NNNNN), then is stopped, then is restarted 
      from a previous cold log (relay.MMMMM), and then it reaches 
      the hot log relay.NNNNN again, in which case, the read 
      coordinates are not set to zero, but to the old values.
      
      Additionally, some comments to the source code were added.

    modified:
      sql/rpl_rli.h
      sql/slave.cc
=== modified file 'sql/rpl_rli.h'
--- a/sql/rpl_rli.h	2009-12-14 16:32:22 +0000
+++ b/sql/rpl_rli.h	2010-09-24 09:44:53 +0000
@@ -94,6 +94,16 @@ public:
   */
   MYSQL_BIN_LOG relay_log;
   LOG_INFO linfo;
+
+  /*
+   cur_log
+     Pointer that either points at relay_log.get_log_file() or
+     &rli->cache_buf, depending on whether the log is hot or there was
+     the need to open a cold relay_log.
+
+   cache_buf 
+     IO_CACHE used when opening cold relay logs.
+   */
   IO_CACHE cache_buf,*cur_log;
 
   /* The following variables are safe to read any time */

=== modified file 'sql/slave.cc'
--- a/sql/slave.cc	2010-07-20 18:07:36 +0000
+++ b/sql/slave.cc	2010-09-24 09:44:53 +0000
@@ -4349,12 +4349,66 @@ static Log_event* next_event(Relay_log_i
         DBUG_ASSERT(rli->cur_log_fd == -1);
 
         /*
-          Read pointer has to be at the start since we are the only
-          reader.
-          We must keep the LOCK_log to read the 4 first bytes, as this is a hot
-          log (same as when we call read_log_event() above: for a hot log we
-          take the mutex).
+           When the SQL thread is [stopped and] (re)started the
+           following may happen:
+
+           1. Log was hot at stop time and remains hot at restart
+
+              SQL thread reads again from hot_log (SQL thread was
+              reading from the active log when it was stopped and the
+              very same log is still active on SQL thread restart).
+
+              In this case, my_b_seek is performed on cur_log, while
+              cur_log points to relay_log.get_log_file();
+
+           2. Log was hot at stop time but got cold before restart
+
+              The log was hot when SQL thread stopped, but it is not
+              anymore when the SQL thread restarts.
+
+              In this case, the SQL thread reopens the log, using
+              cache_buf, ie, cur_log points to &cache_buf, and thence
+              its coordinates are reset.
+
+           3. Log was already cold at stop time
+
+              The log was not hot when the SQL thread stopped, and, of
+              course, it will not be hot when it restarts.
+
+              In this case, the SQL thread opens the cold log again,
+              using cache_buf, ie, cur_log points to &cache_buf, and
+              thence its coordinates are reset.
+
+           4. Log was hot at stop time, DBA changes to previous cold
+              log and restarts SQL thread
+
+              The log was hot when the SQL thread was stopped, but the
+              user changed the coordinates of the SQL thread to
+              restart from a previous cold log.
+
+              In this case, at start time, cur_log points to a cold
+              log, opened using &cache_buf as cache, and coordinates
+              are reset. However, as it moves on to the next logs, it
+              will eventually reach the hot log. If the hot log is the
+              same at the time the SQL thread was stopped, then
+              coordinates were not reset - the cur_log will point to
+              relay_log.get_log_file(), and not a freshly opened
+              IO_CACHE through cache_buf. For this reason we need to
+              deploy a my_b_seek before calling check_binlog_magic at
+              this point of the code (see: BUG#55263 for more
+              details).
+          
+          NOTES: 
+            - We must keep the LOCK_log to read the 4 first bytes, as
+              this is a hot log (same as when we call read_log_event()
+              above: for a hot log we take the mutex).
+
+            - Because of scenario #4 above, we need to have a
+              my_b_seek here. Otherwise, we might hit the assertion
+              inside check_binlog_magic.
         */
+
+        my_b_seek(cur_log, (my_off_t) 0);
         if (check_binlog_magic(cur_log,&errmsg))
         {
           if (!hot_log) pthread_mutex_unlock(log_lock);


Attachment: [text/bzr-bundle] bzr/luis.soares@oracle.com-20100924094453-qj3qfgzo1ifk8fid.bundle
Thread
bzr commit into mysql-5.1-bugteam branch (luis.soares:3515) Bug#55263Luis Soares24 Sep
  • Re: bzr commit into mysql-5.1-bugteam branch (luis.soares:3515) Bug#55263Sven Sandberg24 Sep