List:Commits« Previous MessageNext Message »
From:ingo Date:April 3 2007 2:35pm
Subject:bk commit into 4.1 tree (istruewing:1.2620) BUG#26379
View as plain text  
Below is the list of changes that have just been committed into a local
4.1 repository of istruewing. When istruewing does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2007-04-03 14:35:50+02:00, istruewing@stripped +10 -0
  Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
              corrupts a MERGE table
  
  This bug report revealed three problems:
  
  1. A thread trying to lock a MERGE table performs busy waiting while
     REPAIR TABLE or a similar table administration task is ongoing on
     one or more of its MyISAM tables.
  
  2. A thread trying to lock a MERGE table performs busy waiting until all
     threads that did REPAIR TABLE or similar table administration tasks
     on one or more of its MyISAM tables in LOCK TABLES segments do UNLOCK
     TABLES. The difference against problem #1 is that the busy waiting
     takes place *after* the administration task. It is terminated by
     UNLOCK TABLES only.
  
  3. Two FLUSH TABLES within a LOCK TABLES segment can invalidate the
     lock. This does *not* require a MERGE table. The first FLUSH TABLES
     can be replaced by any statement that requires other threads to
     reopen the table. In 5.0 and 5.1 a single FLUSH TABLES can provoke
     the problem.
  
  Problem #1 and #2 are fixed by moving abortion of locks behind
  the table administration operation and undoing the lock abort
  immediately after all threads have closed the table.
  
  Problem #3 is fixed by setting THD::some_tables_deleted for all
  threads with open tables before releasing LOCK_open when waiting
  for all threads to close their tables.
  
  No test case. The misbehaviour cannot be repeated reliably by a
  deterministic test without sleep()-injection in the code.
  
  Please see the bug report for more information and tests for
  manual testing.

  include/my_sys.h@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +7 -0
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Backport of my_delete_allow_opened() from 5.0.

  include/thr_lock.h@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +1 -0
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Added declaration of new function thr_unabort_locks().

  mysys/my_delete.c@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +51 -0
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Backport of my_delete_allow_opened() from 5.0. Internally the
    function is called nt_share_delete().

  mysys/my_redel.c@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +1 -1
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Backport of my_delete_allow_opened() from 5.0.

  mysys/thr_lock.c@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +29 -0
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Added definition of new function thr_unabort_locks().

  sql/lock.cc@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +30 -0
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Added definition of new function mysql_lock_unabort().

  sql/log.cc@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +2 -2
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Backport of my_delete_allow_opened() from 5.0.

  sql/mysql_priv.h@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +1 -0
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Added declaration of new function mysql_lock_unabort().

  sql/sql_base.cc@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +36 -0
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Set table->in_use->some_tables_deleted= 1 for all open tables
    in close_cached_tables() when all tables are flushed.
    This prevents other threads to use their locks for work on
    the tables of the FLUSH TABLES thread while it is waiting
    for other threads to close their tables. This is an issue
    if we have LOCK TABLES in effect.

  sql/sql_table.cc@stripped, 2007-04-03 14:35:48+02:00, istruewing@stripped +16 -2
    Bug#26379 - Combination of FLUSH TABLE and REPAIR TABLE
                corrupts a MERGE table
    Move the branch with mysql_lock_abort() and
    remove_table_from_cache() behind the table administration
    operation. This prevents busy waiting for a MERGE table of
    other threads when the administration operation is running
    on one of the MERGE tables MyISAM tables.
    Added a call to the new function mysql_lock_unabort() behind
    remove_table_from_cache(). This prevents busy waiting after
    releasing LOCK_open. This could especially be a problem when
    the table is part of LOCK TABLES.

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	istruewing
# Host:	chilla.local
# Root:	/home/mydev/mysql-4.1-bug26379

--- 1.146/include/my_sys.h	2007-04-03 14:35:55 +02:00
+++ 1.147/include/my_sys.h	2007-04-03 14:35:55 +02:00
@@ -583,6 +583,13 @@ extern int my_access(const char *path, i
 #endif
 extern int check_if_legal_filename(const char *path);
 
+#if defined(__WIN__) && defined(__NT__)
+extern int nt_share_delete(const char *name,myf MyFlags);
+#define my_delete_allow_opened(fname,flags)  nt_share_delete((fname),(flags))
+#else
+#define my_delete_allow_opened(fname,flags)  my_delete((fname),(flags))
+#endif
+
 #ifndef TERMINATE
 extern void TERMINATE(FILE *file);
 #endif

--- 1.15/include/thr_lock.h	2007-04-03 14:35:55 +02:00
+++ 1.16/include/thr_lock.h	2007-04-03 14:35:55 +02:00
@@ -111,6 +111,7 @@ void thr_unlock(THR_LOCK_DATA *data);
 int thr_multi_lock(THR_LOCK_DATA **data,uint count);
 void thr_multi_unlock(THR_LOCK_DATA **data,uint count);
 void thr_abort_locks(THR_LOCK *lock);
+void thr_unabort_locks(THR_LOCK *lock);
 my_bool thr_abort_locks_for_thread(THR_LOCK *lock, pthread_t thread);
 void thr_print_locks(void);		/* For debugging */
 my_bool thr_upgrade_write_delay_lock(THR_LOCK_DATA *data);

--- 1.5/mysys/my_delete.c	2007-04-03 14:35:55 +02:00
+++ 1.6/mysys/my_delete.c	2007-04-03 14:35:55 +02:00
@@ -32,3 +32,54 @@ int my_delete(const char *name, myf MyFl
   }
   DBUG_RETURN(err);
 } /* my_delete */
+
+#if defined(__WIN__) && defined(__NT__)
+/*
+  Delete file which is possibly not closed.
+
+  This function is intended to be used exclusively as a temporal solution
+  for Win NT in case when it is needed to delete a not closed file (note
+  that the file must be opened everywhere with FILE_SHARE_DELETE mode).
+  Deleting not-closed files can not be supported on Win 98|ME (and because
+  of that is considered harmful).
+  
+  The function deletes the file with its preliminary renaming. This is
+  because when not-closed share-delete file is deleted it still lives on
+  a disk until it will not be closed everwhere. This may conflict with an
+  attempt to create a new file with the same name. The deleted file is
+  renamed to <name>.<num>.deleted where <name> - the initial name of
the
+  file, <num> - a hexadecimal number chosen to make the temporal name to
+  be unique.
+*/
+int nt_share_delete(const char *name, myf MyFlags)
+{
+  char buf[MAX_PATH + 20];
+  ulong cnt;
+  DBUG_ENTER("nt_share_delete");
+  DBUG_PRINT("my",("name %s MyFlags %d", name, MyFlags));
+  
+  for (cnt= GetTickCount(); cnt; cnt--)
+  {
+    sprintf(buf, "%s.%08X.deleted", name, cnt);
+    if (MoveFile(name, buf))
+      break;
+      
+    if ((errno= GetLastError()) == ERROR_ALREADY_EXISTS)
+      continue;
+      
+    DBUG_PRINT("warning", ("Failed to rename %s to %s, errno: %d",
+                           name, buf, errno));
+    break;
+  }
+
+  if (DeleteFile(buf))
+    DBUG_RETURN(0);
+
+  my_errno= GetLastError();
+  if (MyFlags & (MY_FAE+MY_WME))
+    my_error(EE_DELETE, MYF(ME_BELL + ME_WAITTANG + (MyFlags & ME_NOINPUT)),
+	       name, my_errno);
+
+  DBUG_RETURN(-1);
+}
+#endif

--- 1.10/mysys/my_redel.c	2007-04-03 14:35:55 +02:00
+++ 1.11/mysys/my_redel.c	2007-04-03 14:35:55 +02:00
@@ -61,7 +61,7 @@ int my_redel(const char *org_name, const
 		  MyFlags))
       goto end;
   }
-  else if (my_delete(org_name,MyFlags))
+  else if (my_delete_allow_opened(org_name,MyFlags))
       goto end;
   if (my_rename(tmp_name,org_name,MyFlags))
     goto end;

--- 1.42/mysys/thr_lock.c	2007-04-03 14:35:55 +02:00
+++ 1.43/mysys/thr_lock.c	2007-04-03 14:35:55 +02:00
@@ -960,6 +960,35 @@ void thr_abort_locks(THR_LOCK *lock)
 }
 
 
+/**
+  @brief Undo thr_abort_locks() by changing TL_WRITE_ONLY to TL_WRITE.
+
+  @details If thr_abort_locks() has been called on this lock, its lock
+  type has been set to TL_WRITE_ONLY. This caused all lock attempts for
+  this lock to be rejected with an error instead of waiting for the
+  lock. If the lock is part of a LOCK TABLE, its lock type must be reset
+  to TL_WRITE. Otherwise other threads will run amok trying to lock a
+  seemingly unused table. They won't wait for the table as nobody seems
+  to use it currently. They won't wait for the lock as the attempt fails
+  on TL_WRITE_ONLY.
+
+  @param[in,out]  lock  The lock to reset.
+*/
+
+void thr_unabort_locks(THR_LOCK *lock)
+{
+  THR_LOCK_DATA *data;
+  DBUG_ENTER("thr_unabort_locks");
+  pthread_mutex_lock(&lock->mutex);
+
+  if (lock->write.data && (lock->write.data->type == TL_WRITE_ONLY))
+    lock->write.data->type= TL_WRITE;
+
+  pthread_mutex_unlock(&lock->mutex);
+  DBUG_VOID_RETURN;
+}
+
+
 /*
   Abort all locks for specific table/thread combination
 

--- 1.68/sql/lock.cc	2007-04-03 14:35:55 +02:00
+++ 1.69/sql/lock.cc	2007-04-03 14:35:55 +02:00
@@ -393,6 +393,36 @@ void mysql_lock_abort(THD *thd, TABLE *t
 }
 
 
+/**
+  @brief Undo mysql_lock_abort().
+
+  @details If mysql_lock_abort() has been called for this table, its
+  locks are still set so that all new lock attempts for this table are
+  rejected with an error instead of waiting for the locks. If a LOCK
+  TABLE is in effect, the tables locks must be reset to normal.
+  Otherwise other threads will run amok trying to lock a seemingly
+  unused table. They won't wait for the table as nobody seems to use it
+  currently. They won't wait for the lock as the attempt fails with an
+  error.
+
+  @param[in]      thd       Thread handle.
+  @param[in,out]  table     The locked table.
+*/
+
+void mysql_lock_unabort(THD *thd, TABLE *table)
+{
+  MYSQL_LOCK *locked;
+  TABLE *write_lock_used;
+  if ((locked= get_lock_data(thd, &table, 1, GET_LOCK_UNLOCK,
+                             &write_lock_used)))
+  {
+    for (uint i=0; i < locked->lock_count; i++)
+      thr_unabort_locks(locked->locks[i]->lock);
+    my_free((gptr) locked,MYF(0));
+  }
+}
+
+
 /*
   Abort one thread / table combination
 

--- 1.167/sql/log.cc	2007-04-03 14:35:55 +02:00
+++ 1.168/sql/log.cc	2007-04-03 14:35:55 +02:00
@@ -651,14 +651,14 @@ bool MYSQL_LOG::reset_logs(THD* thd)
   
   for (;;)
   {
-    my_delete(linfo.log_file_name, MYF(MY_WME));
+    my_delete_allow_opened(linfo.log_file_name, MYF(MY_WME));
     if (find_next_log(&linfo, 0))
       break;
   }
 
   /* Start logging with a new file */
   close(LOG_CLOSE_INDEX);
-  my_delete(index_file_name, MYF(MY_WME));	// Reset (open will update)
+  my_delete_allow_opened(index_file_name, MYF(MY_WME));	// Reset (open will update)
   if (!thd->slave_thread)
     need_start_event=1;
   open(save_name, save_log_type, 0, index_file_name,

--- 1.388/sql/mysql_priv.h	2007-04-03 14:35:55 +02:00
+++ 1.389/sql/mysql_priv.h	2007-04-03 14:35:55 +02:00
@@ -1041,6 +1041,7 @@ void mysql_unlock_read_tables(THD *thd, 
 void mysql_unlock_some_tables(THD *thd, TABLE **table,uint count);
 void mysql_lock_remove(THD *thd, MYSQL_LOCK *locked,TABLE *table);
 void mysql_lock_abort(THD *thd, TABLE *table);
+void mysql_lock_unabort(THD *thd, TABLE *table);
 bool mysql_lock_abort_for_thread(THD *thd, TABLE *table);
 MYSQL_LOCK *mysql_lock_merge(MYSQL_LOCK *a,MYSQL_LOCK *b);
 int mysql_lock_have_duplicate(THD *thd, TABLE *table, TABLE_LIST *tables);

--- 1.276/sql/sql_base.cc	2007-04-03 14:35:55 +02:00
+++ 1.277/sql/sql_base.cc	2007-04-03 14:35:55 +02:00
@@ -276,6 +276,42 @@ bool close_cached_tables(THD *thd, bool 
 #endif
     }
     refresh_version++;				// Force close of open tables
+    DBUG_PRINT("note", ("incremented global refresh_version to: %lu",
+                        refresh_version));
+    if (if_wait_for_refresh)
+    {
+      /*
+        Other threads could wait in a loop in mysql_lock_table() trying
+        to lock one or more of our tables.
+
+        If they wait for the locks in thr_multi_lock(), their lock
+        request is aborted. If they wait for the table to become unused,
+        they see table->locked_by_flush=1. But if they passed the loop
+        in wait_for_tables(), released LOCK_open, we (the FLUSH TABLES
+        thread) are scheduled at this moment and enter
+        close_cached_tables(), they could awake while we sleep below,
+        waiting for others threads to close their open tables. If this
+        happens, the other threads would find the tables unlocked. They
+        would get the locks, one after the other, and could do their
+        destructive work. This is an issue if we have LOCK TABLES in
+        effect.
+
+        The fix for this problem is to set some_tables_deleted for all
+        threads with open tables. These threads can still get their locks,
+        but will immediately release them again after checking this
+        variable. They will then reenter wait_for_tables(). There they
+        will wait until we update all tables version below.
+
+        Setting some_tables_deleted is done by remove_table_from_cache()
+        in the other branch.
+      */
+      for (uint idx=0 ; idx < open_cache.records ; idx++)
+      {
+        TABLE *table=(TABLE*) hash_element(&open_cache,idx);
+        if (table->in_use)
+          table->in_use->some_tables_deleted= 1;
+      }
+    }
   }
   else
   {

--- 1.312/sql/sql_table.cc	2007-04-03 14:35:55 +02:00
+++ 1.313/sql/sql_table.cc	2007-04-03 14:35:55 +02:00
@@ -2075,7 +2075,9 @@ static int mysql_admin_table(THD* thd, T
       continue;
     }
 
-    /* Close all instances of the table to allow repair to rename files */
+    int result_code = (table->table->file->*operator_func)(thd, check_opt);
+
+    /* Close all instances of the table. They must be reopened. */
     if (lock_type == TL_WRITE && table->table->version)
     {
       pthread_mutex_lock(&LOCK_open);
@@ -2086,6 +2088,19 @@ static int mysql_admin_table(THD* thd, T
                               table->table->real_name,
                               RTFC_WAIT_OTHER_THREAD_FLAG |
                               RTFC_CHECK_KILLED_FLAG);
+      /*
+        We aborted all locks on this table. This is done so that all new
+        attempts to lock the table are rejected with an error. We wanted
+        everyone to close the table. We must reset the lock of this
+        table to normal, especially if LOCK TABLES is in effect. When we
+        release LOCK_open, other threads should wait for the lock
+        instead of failing with an error. If another thread tries to
+        lock a MERGE table which includes this table, it would start
+        busy waiting until the table is finally unlocked by this thread.
+        It would fail on the lock, but cannot detect that a base table
+        of the MERGE table is in use. (Bug #26379)
+      */
+      mysql_lock_unabort(thd, table->table);
       thd->exit_cond(old_message);
       if (thd->killed)
 	goto err;
@@ -2094,7 +2109,6 @@ static int mysql_admin_table(THD* thd, T
       open_for_modify= 0;
     }
 
-    int result_code = (table->table->file->*operator_func)(thd, check_opt);
 #ifdef EMBEDDED_LIBRARY
     thd->net.last_errno= 0;  // these errors shouldn't get client
 #endif
Thread
bk commit into 4.1 tree (istruewing:1.2620) BUG#26379ingo3 Apr