List: Maria Storage Engine
From: Guilhem Bichot
Date: December 5 2008 3:41pm
Subject: bzr commit into MySQL/Maria:mysql-maria branch (guilhem:2709) Bug#41159
#At bzr+ssh://bk-internal.mysql.com/bzrroot/server/mysql-maria/ based on revid:guilhem@stripped

 2709 Guilhem Bichot	2008-12-05
      Fix for BUG#41159 "Maria: deadlock between checkpoint and maria_write() when extending data file".
      No testcase (concurrency, tested by pushbuild2).
modified:
  storage/maria/ma_bitmap.c
  storage/maria/ma_checkpoint.c

per-file messages:
  storage/maria/ma_bitmap.c
    Checkpoint now calls _ma_bitmap_flush_all() without holding intern_lock, so a maria_close() may be
    in progress and destroying the mutex/cond which _ma_bitmap_flush_all() needs. Prevent that by
    letting checkpoint destroy the cond/mutex.
  storage/maria/ma_checkpoint.c
    The first comment in this file takes into account Sanja's upcoming patch for BUG 40981.
    In checkpoint, release intern_lock before waiting for the bitmap to become flushable (avoids deadlock).
    This opens the possibility for maria_close() to run while we're flushing, so maria_close() may
    leave it to us to destroy the bitmap cond/mutex.
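
For illustration only, here is a minimal, self-contained sketch of the two ideas described above: release the outer lock before waiting on the bitmap condition, and let whichever side finishes last destroy the bitmap mutex/cond. It is not the Maria code. All toy_* names and TOY_* flags are hypothetical stand-ins, and the assumption that the closer sets a "should free me" flag while the checkpoint is still looking at the share is inferred from the comments in the diff, not shown in it.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins for the MARIA_CHECKPOINT_* flags in the patch. */
#define TOY_LOOKS_AT_ME    1
#define TOY_SHOULD_FREE_ME 2

typedef struct {
  pthread_mutex_t intern_lock;  /* protects in_checkpoint */
  pthread_mutex_t bitmap_lock;  /* protects non_flushable */
  pthread_cond_t  bitmap_cond;
  int non_flushable;            /* non-zero while a writer blocks flushing */
  int in_checkpoint;
  int bitmap_destroyed;
} toy_share;

/* Caller holds intern_lock. Skips destruction while checkpoint looks at us. */
static void toy_bitmap_end(toy_share *s)
{
  if (s->in_checkpoint & TOY_LOOKS_AT_ME)
    return;                     /* keep mutex/cond usable for the checkpoint */
  pthread_mutex_destroy(&s->bitmap_lock);
  pthread_cond_destroy(&s->bitmap_cond);
  s->bitmap_destroyed= 1;
}

/* Closer: hands destruction over if a checkpoint is looking at the share. */
static void toy_close(toy_share *s)
{
  pthread_mutex_lock(&s->intern_lock);
  toy_bitmap_end(s);
  if (s->in_checkpoint & TOY_LOOKS_AT_ME)
    s->in_checkpoint|= TOY_SHOULD_FREE_ME;
  pthread_mutex_unlock(&s->intern_lock);
}

/* Writer: needs intern_lock before it can make the bitmap flushable again. */
static void toy_writer(toy_share *s)
{
  pthread_mutex_lock(&s->intern_lock);
  pthread_mutex_unlock(&s->intern_lock);
  pthread_mutex_lock(&s->bitmap_lock);
  s->non_flushable= 0;
  pthread_cond_broadcast(&s->bitmap_cond);
  pthread_mutex_unlock(&s->bitmap_lock);
}

static void *toy_checkpoint(void *arg)
{
  toy_share *s= arg;
  pthread_mutex_lock(&s->intern_lock);
  s->in_checkpoint|= TOY_LOOKS_AT_ME;
  pthread_mutex_unlock(&s->intern_lock);  /* release BEFORE waiting */

  pthread_mutex_lock(&s->bitmap_lock);
  while (s->non_flushable)                /* wait until flushable */
    pthread_cond_wait(&s->bitmap_cond, &s->bitmap_lock);
  pthread_mutex_unlock(&s->bitmap_lock);

  pthread_mutex_lock(&s->intern_lock);
  if (s->in_checkpoint & TOY_SHOULD_FREE_ME)
  {
    s->in_checkpoint= 0;                  /* so toy_bitmap_end() destroys */
    toy_bitmap_end(s);
  }
  else
    s->in_checkpoint= 0;
  pthread_mutex_unlock(&s->intern_lock);
  return NULL;
}

/* Returns 1 if close skipped destruction and checkpoint performed it later. */
int toy_run_scenario(void)
{
  toy_share s;
  pthread_t cp;
  int seen= 0, destroyed_by_close;

  pthread_mutex_init(&s.intern_lock, NULL);
  pthread_mutex_init(&s.bitmap_lock, NULL);
  pthread_cond_init(&s.bitmap_cond, NULL);
  s.non_flushable= 1; s.in_checkpoint= 0; s.bitmap_destroyed= 0;

  pthread_create(&cp, NULL, toy_checkpoint, &s);
  while (!seen)                           /* wait until checkpoint looks at us */
  {
    pthread_mutex_lock(&s.intern_lock);
    seen= s.in_checkpoint & TOY_LOOKS_AT_ME;
    pthread_mutex_unlock(&s.intern_lock);
    if (!seen)
      sched_yield();
  }
  toy_close(&s);                          /* runs while checkpoint is waiting */
  destroyed_by_close= s.bitmap_destroyed;
  toy_writer(&s);                         /* would hang if checkpoint kept intern_lock */
  pthread_join(cp, NULL);
  pthread_mutex_destroy(&s.intern_lock);
  return !destroyed_by_close && s.bitmap_destroyed;
}
```

Calling toy_run_scenario() from a main() linked with the pthread library exercises the interleaving described in the commit message. If toy_checkpoint() kept intern_lock across the pthread_cond_wait() loop, toy_writer() would block on intern_lock and never clear non_flushable, which is the shape of the deadlock this commit addresses.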
=== modified file 'storage/maria/ma_bitmap.c'
--- a/storage/maria/ma_bitmap.c	2008-10-17 13:37:07 +0000
+++ b/storage/maria/ma_bitmap.c	2008-12-05 15:41:09 +0000
@@ -260,6 +260,14 @@ my_bool _ma_bitmap_init(MARIA_SHARE *sha
 my_bool _ma_bitmap_end(MARIA_SHARE *share)
 {
   my_bool res= _ma_bitmap_flush(share);
+  if (share->in_checkpoint & MARIA_CHECKPOINT_LOOKS_AT_ME)
+  {
+    /*
+      Keep mutex/cond usable, checkpoint may be soon using them in
+      _ma_bitmap_flush_all()
+     */
+    return res;
+  }
   pthread_mutex_destroy(&share->bitmap.bitmap_lock);
   pthread_cond_destroy(&share->bitmap.bitmap_cond);
   delete_dynamic(&share->bitmap.pinned_pages);

=== modified file 'storage/maria/ma_checkpoint.c'
--- a/storage/maria/ma_checkpoint.c	2008-08-28 18:52:23 +0000
+++ b/storage/maria/ma_checkpoint.c	2008-12-05 15:41:09 +0000
@@ -973,6 +973,9 @@ static int collect_tables(LEX_STRING *st
       sync; if we flush their state now we may be flushing an obsolete state
       onto a newer one (assuming the table has been reopened with a different
       share but of course same physical index file).
+      Note that we may come here when maria_close() is in progress (it unlocks
+      intern_lock in its middle but then share->id is 0 see
+      _ma_once_end_block_record()).
     */
     ignore_share= (share->id == 0) | (share->last_version == 0);
     DBUG_PRINT("info", ("ignore_share: %d", ignore_share));
@@ -1045,6 +1048,12 @@ static int collect_tables(LEX_STRING *st
           each checkpoint if the table was once written and then not anymore.
         */
       }
+      /*
+        _ma_bitmap_flush_all() may wait, so don't keep intern_lock as
+        otherwise this would deadlock with allocate_and_write_block_record()
+        calling _ma_set_share_data_file_length()
+      */
+      pthread_mutex_unlock(&share->intern_lock);
       if (_ma_bitmap_flush_all(share))
       {
         sync_error= 1;
@@ -1053,18 +1062,22 @@ static int collect_tables(LEX_STRING *st
       }
       DBUG_ASSERT(share->pagecache == maria_pagecache);
     }
+    else
+      pthread_mutex_unlock(&share->intern_lock);
     /*
       Clean up any unused states.
       TODO: Only do this call if there has been # (10?) ended transactions
       since last call.
+      We had to release intern_lock to respect lock order with LOCK_trn_list.
     */
-    pthread_mutex_unlock(&share->intern_lock);
     _ma_remove_not_visible_states_with_lock(share);
     pthread_mutex_lock(&share->intern_lock);
 
     if (share->in_checkpoint & MARIA_CHECKPOINT_SHOULD_FREE_ME)
     {
       /* maria_close() left us to free the share */
+      share->in_checkpoint= 0; /* this is for call below */
+      _ma_bitmap_end(share);
       pthread_mutex_unlock(&share->intern_lock);
       pthread_mutex_destroy(&share->intern_lock);
       my_free((uchar *)share, MYF(0));
