List:Commits« Previous MessageNext Message »
From:rsomla Date:February 3 2007 7:14pm
Subject:bk commit into 5.0 tree (rafal:1.2392) BUG#25306
View as plain text  
Below is the list of changes that have just been committed into a local
5.0 repository of rafal. When rafal does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2007-02-03 20:14:16+01:00, rafal@quant.(none) +2 -0
  BUG#25306 (Race conditions during replication slave shutdown (valgrind stacks)):
  The possibility of the race is removed by changing sequence of calls
  
    pthread_mutex_unlock(&mi->run_lock);
    pthread_cond_broadcast(&mi->stop_cond);  
  
  into
  
    pthread_cond_broadcast(&mi->stop_cond);
    pthread_mutex_unlock(&mi->run_lock);
  
  at the end of I/O thread (similar change at the end of SQL thread). This ensures 
  that no thread waiting on the condition executes between the broadcast and the 
  unlock and thus can't delete the mi structure which caused the bug.

  sql/slave.cc@stripped, 2007-02-03 20:14:12+01:00, rafal@quant.(none) +14 -5
    Change order of condition broadcast and mutex unlock at the end of slave's 
    I/O thread and SQL thread.

  sql/sql_repl.cc@stripped, 2007-02-03 20:14:13+01:00, rafal@quant.(none) +5 -3
    Add DBUG_ENTER() call

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	rafal
# Host:	quant.(none)
# Root:	/ext/mysql/bk/mysql-5.0-bug25306

--- 1.287/sql/slave.cc	2007-02-03 20:14:21 +01:00
+++ 1.288/sql/slave.cc	2007-02-03 20:14:21 +01:00
@@ -3758,8 +3758,13 @@
   mi->abort_slave= 0;
   mi->slave_running= 0;
   mi->io_thd= 0;
-  pthread_mutex_unlock(&mi->run_lock);
+  /*
+    Note: the order of the two following calls (first broadcast, then unlock)
+    is important. Otherwise a killer_thread can execute between the calls and
+    delete the mi structure leading to a crash! (see BUG#25306 for details)
+   */ 
   pthread_cond_broadcast(&mi->stop_cond);       // tell the world we are done
+  pthread_mutex_unlock(&mi->run_lock);
 #ifndef DBUG_OFF
   if (abort_slave_event_count && !events_till_abort)
     goto slave_begin;
@@ -3977,8 +3982,13 @@
   THD_CHECK_SENTRY(thd);
   delete thd;
   pthread_mutex_unlock(&LOCK_thread_count);
-  pthread_cond_broadcast(&rli->stop_cond);
 
+ /*
+  Note: the order of the broadcast and unlock calls below (first broadcast, then unlock)
+  is important. Otherwise a killer_thread can execute between the calls and
+  delete the mi structure leading to a crash! (see BUG#25306 for details)
+ */ 
+  pthread_cond_broadcast(&rli->stop_cond);
 #ifndef DBUG_OFF
   /*
     Bug #19938 Valgrind error (race) in handle_slave_sql()
@@ -3986,9 +3996,8 @@
   */
   const int eta= rli->events_till_abort;
 #endif
-
-  // tell the world we are done
-  pthread_mutex_unlock(&rli->run_lock);
+  pthread_mutex_unlock(&rli->run_lock);  // tell the world we are done
+  
 #ifndef DBUG_OFF // TODO: reconsider the code below
   if (abort_slave_event_count && !eta)
     goto slave_begin;

--- 1.155/sql/sql_repl.cc	2007-02-03 20:14:21 +01:00
+++ 1.156/sql/sql_repl.cc	2007-02-03 20:14:21 +01:00
@@ -882,12 +882,14 @@
 
 int stop_slave(THD* thd, MASTER_INFO* mi, bool net_report )
 {
+  DBUG_ENTER("stop_slave");
+  
   int slave_errno;
   if (!thd)
     thd = current_thd;
 
   if (check_access(thd, SUPER_ACL, any_db,0,0,0,0))
-    return 1;
+    DBUG_RETURN(1);
   thd->proc_info = "Killing slave";
   int thread_mask;
   lock_slave_threads(mi);
@@ -921,12 +923,12 @@
   {
     if (net_report)
       my_message(slave_errno, ER(slave_errno), MYF(0));
-    return 1;
+    DBUG_RETURN(1);
   }
   else if (net_report)
     send_ok(thd);
 
-  return 0;
+  DBUG_RETURN(0);
 }
 
 
Thread
bk commit into 5.0 tree (rafal:1.2392) BUG#25306rsomla8 Feb