MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:konstantin Date:July 18 2007 11:13pm
Subject:bk commit into 5.0 tree (kostja:1.2533) BUG#29431
View as plain text  
Below is the list of changes that have just been committed into a local
5.0 repository of kostja. When kostja does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2007-07-19 03:13:10+04:00, kostja@bodhi.(none) +1 -0
  A fix for Bug#29431 killing an insert delayed thread causes crash
  No test case, since the bug requires a stress case with 30 INSERT DELAYED
  threads and 1 killer thread to repeat. The patch is verified
  manually.
  
  The server that is running DELAYED inserts would deadlock itself
  or crash under high load if some of the delayed threads were KILLed
  in the meanwhile.
  
  The fix is to change internal lock acquisition order of delayed inserts
  subsystem and to ensure that
  Delayed_insert::table_list::db does not point to volatile memory in some 
  cases.
  For details, please see a comment for sql_insert.cc.

  sql/sql_insert.cc@stripped, 2007-07-19 03:13:08+04:00, kostja@bodhi.(none) +5 -6
    A fix for Bug#29431 killing an insert delayed thread causes crash
    
    1) The deadlock was caused by different lock acquisition order
    between delayed_get_table and handle_delayed_insert.
    
    delayed_get_table would:
    - acquire LOCK_delayed_create
    - create a new Delayed_insert instance
    - acquire instance mutex (di->mutex)
    - "lock the instance in memory" by 
    increasing di->locked_in_memory variable
    - start the delayed thread
    - release di->mutex
    - let the delayed thread open the delayed table
    - discover that the delayed thread was killed
    - try to unlock() the delayed insert instance in memory and in 
    Delayed_insert::unlock()
     * lock LOCK_delayed_insert
     * decrease locks_in_memory and discover it's 0
     * attempt to lock di->mutex to broadcast di->cond condition <--- deadlock
     here
    
    Meanwhile, the delayed thread would
     * lock di->mutex
     * open the table
     * get killed
     * not notice that and attempt to lock LOCK_delayed_insert
    to register itself in the delayed insert list <-- deadlock here.
    
    Simply put, delayed_get_table used to lock LOCK_delayed_insert and then 
    di->mutex, and handle_delayed_insert would lock di->mutex and then 
    LOCK_delayed_insert.
    
    Fixed by moving registration in the list of delayed insert threads from 
    handle_delayed_insert to delayed_get_table - so that now
    handle_delayed_insert doesn't need to acquire LOCK_delayed_insert mutex.
    
    As a side note, the whole concept of two locks - LOCK_delayed_insert
    and LOCK_delayed_create is redundant, and we only need one of them
    to protect the global delayed threads list.
    As is redundant the concept of locks_in_memory, since we already
    have another counter with similar semantics - tables_in_use, both
    of them are devoted to counting the number of producers for a given
    consumer (delayed insert thread), only at different stages of
    producer-consumer relationship.
    'dead' and 'status' variables in Delayed_insert are redundant too,
    since there is already 'di->thd.killed' and di->stacked_inserts.
    This redundancy made it hard to locate the bug, however the fix itself
    luckily is simple.
    
    2) di->table_list.db was copied by-pointer from table_list.db of the first
    producer -- the one who initiated creation of the delayed insert thread.
    This producer might be long gone when the member is needed
    (handle_delayed_insert:open_ltable),
    e.g. by having been killed with KILL CONNECTION statement.
    Fixed by using a pointer to the consumer's deep copy of THD::db.

diff -Nrup a/sql/sql_insert.cc b/sql/sql_insert.cc
--- a/sql/sql_insert.cc	2007-07-08 23:02:59 +04:00
+++ b/sql/sql_insert.cc	2007-07-19 03:13:08 +04:00
@@ -1706,7 +1706,7 @@ Delayed_insert *find_handler(THD *thd, T
   while ((tmp=it++))
   {
     if (!strcmp(tmp->thd.db,table_list->db) &&
-	!strcmp(table_list->table_name,tmp->table->s->table_name))
+	!strcmp(table_list->table_name, tmp->table_list.table_name))
     {
       tmp->lock();
       break;
@@ -1789,6 +1789,7 @@ bool delayed_get_table(THD *thd, TABLE_L
       }
       tmp->table_list= *table_list;			// Needed to open table
       tmp->table_list.alias= tmp->table_list.table_name= tmp->thd.query;
+      tmp->table_list.db= tmp->thd.db;
       tmp->lock();
       pthread_mutex_lock(&tmp->mutex);
       if ((error=pthread_create(&tmp->thd.real_id,&connection_attrib,
@@ -1834,6 +1835,9 @@ bool delayed_get_table(THD *thd, TABLE_L
 	tmp->unlock();
         goto end_create;
       }
+      pthread_mutex_lock(&LOCK_delayed_insert);
+      delayed_threads.append(tmp);
+      pthread_mutex_unlock(&LOCK_delayed_insert);
     }
     pthread_mutex_unlock(&LOCK_delayed_create);
   }
@@ -2175,11 +2179,6 @@ pthread_handler_t handle_delayed_insert(
     goto end;
   }
   di->table->copy_blobs=1;
-
-  /* One can now use this */
-  pthread_mutex_lock(&LOCK_delayed_insert);
-  delayed_threads.append(di);
-  pthread_mutex_unlock(&LOCK_delayed_insert);
 
   /* Tell client that the thread is initialized */
   pthread_cond_signal(&di->cond_client);
Thread
bk commit into 5.0 tree (kostja:1.2533) BUG#29431konstantin19 Jul