List:Commits« Previous MessageNext Message »
From:Marc Alff Date:September 9 2010 9:33pm
Subject:bzr commit into mysql-5.5-bugfixing branch (marc.alff:3204) Bug#56324
View as plain text  
#At file:///home/malff/BZR_TREE/mysql-5.5-bugfixing-56324-B/ based on revid:mats.kindahl@stripped

 3204 Marc Alff	2010-09-09
      Bug#56324 Race Condition while shutting down MySQL: "PSI_server"
      
      Before this fix, the server could crash during shutdown,
      due to race conditions, that occured when killing the server.
      
      In particular, the performance schema instrumentation handle,
      PSI_server, and the performance schema itself would be cleaned up
      too soon, causing race conditions with a running kill server thread.
      
      The specifics of the race condition found are that:
      the main thread executing "PSI_server= NULL" can cause crashes in
      other threads still running, which are executing
      "if (PSI_server != NULL) PSI_server->xxx()"
      as part of the performance schema instrumentation.
      
      While the bug was reported for the kill server thread,
      in theory the same crash could happen with the signal thread,
      as found by code analysis.
      
      The correct fix would be to only shutdown the performance schema
      and set PSI_server to NULL after every other thread is guaranteed
      to be completed, including the kill_server_thread.
      
      However, due to the existing mysqld server design, this is not the case.
      See in particular bug number 56666.
      
      The work around used to fix this race condition is to simply not
      perform the call to shutdown_performance_schema() when the server exits,
      and to keep the PSI_server pointer unchanged.
      
      This will cause memory leaks to be reported by tools like valgrind,
      but no memory leak actually happen because the process is about to exit().
      
      As a result, the file mysql-test/valgrind.supp has been updated
      to filter out these false positive messages.
      
      This code has been tested with running in a loop the following
      tests in parallel, which have been known to fail with race conditions
      in the past:
      - rpl_change_master
      - binlog_max_extension
      - events_restart
      - rpl_heartbeat_basic
      and no crash of test failure has been seen with the changed code.

    modified:
      mysql-test/valgrind.supp
      sql/mysqld.cc
=== modified file 'mysql-test/valgrind.supp'
--- a/mysql-test/valgrind.supp	2010-06-07 08:47:04 +0000
+++ b/mysql-test/valgrind.supp	2010-09-09 21:33:35 +0000
@@ -745,3 +745,28 @@
    Memcheck:Addr1
    fun:buf_buddy_relocate
 }
+
+#
+# See related Bug#56666
+# Race condition between the server main thread and the kill server thread.
+#
+# Because of this race condition, the call to shutdown_performance_schema()
+# was commented in sql/mysqld.cc, causing the reported leaks.
+#
+
+{
+   missing shutdown_performance_schema 1
+   Memcheck:Leak
+   fun:malloc
+   fun:_Z10pfs_mallocmi
+}
+
+{
+   missing shutdown_performance_schema 2
+   Memcheck:Leak
+   fun:malloc
+   fun:my_malloc
+   fun:_lf_alloc_new
+   fun:lf_hash_insert
+}
+

=== modified file 'sql/mysqld.cc'
--- a/sql/mysqld.cc	2010-08-30 14:07:40 +0000
+++ b/sql/mysqld.cc	2010-09-09 21:33:35 +0000
@@ -1395,6 +1395,12 @@ static void mysqld_exit(int exit_code)
   mysql_audit_finalize();
   clean_up_mutexes();
   clean_up_error_log_mutex();
+#ifdef WITH_PERFSCHEMA_STORAGE_ENGINE
+  /*
+    Bug#56666 needs to be fixed before calling:
+    shutdown_performance_schema();
+  */
+#endif
   my_end(opt_endinfo ? MY_CHECK_ERROR | MY_GIVE_INFO : 0);
   exit(exit_code); /* purecov: inspected */
 }
@@ -2732,6 +2738,11 @@ pthread_handler_t signal_hand(void *arg
       if (!abort_loop)
       {
 	abort_loop=1;				// mark abort for threads
+#ifdef HAVE_PSI_INTERFACE
+        /* Delete the instrumentation for the signal thread */
+        if (likely(PSI_server != NULL))
+          PSI_server->delete_current_thread();
+#endif
 #ifdef USE_ONE_SIGNAL_HAND
 	pthread_t tmp;
         if (mysql_thread_create(0, /* Not instrumented */
@@ -4587,6 +4598,15 @@ int mysqld_main(int argc, char **argv)
 #endif
 #endif /* __WIN__ */
 
+#ifdef HAVE_PSI_INTERFACE
+  /*
+    Disable the main thread instrumentation,
+    to avoid recording events during the shutdown.
+  */
+  if (PSI_server)
+    PSI_server->delete_current_thread();
+#endif
+
   /* Wait until cleanup is done */
   mysql_mutex_lock(&LOCK_thread_count);
   while (!ready_to_exit)
@@ -4604,18 +4624,6 @@ int mysqld_main(int argc, char **argv)
   }
 #endif
   clean_up(1);
-#ifdef HAVE_PSI_INTERFACE
-  /*
-    Disable the instrumentation, to avoid recording events
-    during the shutdown.
-  */
-  if (PSI_server)
-  {
-    PSI_server->delete_current_thread();
-    PSI_server= NULL;
-  }
-  shutdown_performance_schema();
-#endif
   mysqld_exit(0);
 }
 


Attachment: [text/bzr-bundle] bzr/marc.alff@oracle.com-20100909213335-azexyau10kbdg1jx.bundle
Thread
bzr commit into mysql-5.5-bugfixing branch (marc.alff:3204) Bug#56324Marc Alff9 Sep