List:Commits« Previous MessageNext Message »
From:Dmitry Lenev Date:July 22 2010 6:21am
Subject:Re: bzr commit into mysql-trunk-runtime branch (jon.hauglid:3081)
Bug#55223
View as plain text  
Hello Jon Olav!

* Jon Olav Hauglid <jon.hauglid@stripped> [10/07/21 19:02]:
> #At file:///export/home/x/mysql-trunk-runtime-bug55223/ based on
> revid:jon.hauglid@stripped
> 
>  3081 Jon Olav Hauglid	2010-07-21
>       Bug #55223 assert in Protocol::end_statement during CREATE DATABASE
>       
>       The problem was that a statement could cause an assert if it was aborted by
>       KILL QUERY while it waited on a metadata lock. This assert checks that a
>       statement either sends OK or an error to the client. If the bug was triggered
>       on release builds, it caused OK to be sent to the client instead of
>       ER_QUERY_INTERRUPTED.
>       
>       The root cause of the problem is that there are two separate ways to tell if a
>       statement is killed: thd->killed and mysys_var->abort. KILL QUERY causes
> both
>       to be set, thd->killed before mysys_var->abort. The latter was used to
> check
>       if a wait for a metadata lock should be aborted. If waiting is aborted, no OK
>       message is sent and thd->killed is checked to see if ER_QUERY_INTERRUPTED
>       should be sent to the client. The assert could be triggered if the compiler
>       reordered the KILL QUERY code so that mysys_var->abort was set before
>       thd->killed, and the aborted statement reached the thd->killed check
> before
>       thd->killed was set by KILL QUERY.

After a bit debugging I think what actually has happened was that
resetting of thd->killed/thd->mysys_var->abort values which happens at
the end of dispatch_command() sneaked in between the moment when 
THD::awake() set thd->killed to THD::KILL_QUERY and the moment when
thd->mysys_var->abort was set to 1. As result next statement started
its execution with thd->killed equal to NOT_KILLED and mysys_var->abort
set to 1.

IMO it makes sense to reflect this fact in changeset comment.

Also, given the nature of the problem, may be we should try to
prevent problem from re-appearing in future and consider implementing
a follow-up patch with a more generic solution to it, e.g., which will
involve taking THD::LOCK_thd_data when thd->killed and
thd->mysys_var->abort are being reset? What do you think?

>       This patch fixes the problem by changing the metadata lock waiting code to
>       check thd->killed.
>       
>       No test case added as reproducing the assert is dependent on very exact timing
>       of two (or more) threads and compiler reording of instructions. The patch has
>       been checked using RQG and the grammar posted on the bug report.
> 
>     modified:
>       sql/mdl.cc
>       sql/mdl.h
> === modified file 'sql/mdl.cc'
> --- a/sql/mdl.cc	2010-07-13 18:01:54 +0000
> +++ b/sql/mdl.cc	2010-07-21 15:00:01 +0000
> @@ -881,67 +881,6 @@ uint MDL_ticket::get_deadlock_weight() c
>  }
>  
>  
> -/**
> -  Helper functions and macros to be used for killable waiting in metadata
> -  locking subsystem.
> -
> -  @sa THD::enter_cond()/exit_cond()/killed.
> -
> -  @note We can't use THD::enter_cond()/exit_cond()/killed directly here
> -        since this will make metadata subsystem dependent on THD class
> -        and thus prevent us from writing unit tests for it. And usage of
> -        wrapper functions to access THD::killed/enter_cond()/exit_cond()
> -        will probably introduce too much overhead.
> -*/
> -
> -#define MDL_ENTER_COND(A, B, C, D) \
> -        mdl_enter_cond(A, B, C, D, __func__, __FILE__, __LINE__)
> -
> -static inline const char *mdl_enter_cond(THD *thd,
> -                                         st_my_thread_var *mysys_var,
> -                                         mysql_cond_t *cond,
> -                                         mysql_mutex_t *mutex,
> -                                         const char *calling_func,
> -                                         const char *calling_file,
> -                                         const unsigned int calling_line)
> -{
> -  mysql_mutex_assert_owner(mutex);
> -
> -  mysys_var->current_mutex= mutex;
> -  mysys_var->current_cond= cond;
> -
> -  DEBUG_SYNC(thd, "mdl_enter_cond");
> -
> -  return set_thd_proc_info(thd, "Waiting for table",
> -                           calling_func, calling_file, calling_line);
> -}
> -
> -#define MDL_EXIT_COND(A, B, C, D) \
> -        mdl_exit_cond(A, B, C, D, __func__, __FILE__, __LINE__)
> -
> -static inline void mdl_exit_cond(THD *thd,
> -                                 st_my_thread_var *mysys_var,
> -                                 mysql_mutex_t *mutex,
> -                                 const char* old_msg,
> -                                 const char *calling_func,
> -                                 const char *calling_file,
> -                                 const unsigned int calling_line)
> -{
> -  DBUG_ASSERT(mutex == mysys_var->current_mutex);
> -
> -  mysql_mutex_unlock(mutex);
> -  mysql_mutex_lock(&mysys_var->mutex);
> -  mysys_var->current_mutex= NULL;
> -  mysys_var->current_cond= NULL;
> -  mysql_mutex_unlock(&mysys_var->mutex);
> -
> -  DEBUG_SYNC(thd, "mdl_exit_cond");
> -
> -  (void) set_thd_proc_info(thd, old_msg, calling_func,
> -                           calling_file, calling_line);
> -}
> -
> -
>  /** Construct an empty wait slot. */
>  
>  MDL_wait::MDL_wait()
> @@ -1021,15 +960,14 @@ MDL_wait::timed_wait(THD *thd, struct ti
>  {
>    const char *old_msg;
>    enum_wait_status result;
> -  st_my_thread_var *mysys_var= my_thread_var;
>    int wait_result= 0;
>  
>    mysql_mutex_lock(&m_LOCK_wait_status);
>  
> -  old_msg= MDL_ENTER_COND(thd, mysys_var, &m_COND_wait_status,
> -                          &m_LOCK_wait_status);
> +  old_msg= thd_enter_cond(thd, &m_COND_wait_status, &m_LOCK_wait_status,
> +                          "Waiting for table");
>  
> -  while (!m_wait_status && !mysys_var->abort &&
> +  while (!m_wait_status && !thd_killed(thd) &&
>           wait_result != ETIMEDOUT && wait_result != ETIME)
>      wait_result= mysql_cond_timedwait(&m_COND_wait_status,
> &m_LOCK_wait_status,
>                                        abs_timeout);
> @@ -1048,14 +986,14 @@ MDL_wait::timed_wait(THD *thd, struct ti
>        false, which means that the caller intends to restart the
>        wait.
>      */
> -    if (mysys_var->abort)
> +    if (thd_killed(thd))
>        m_wait_status= KILLED;
>      else if (set_status_on_timeout)
>        m_wait_status= TIMEOUT;
>    }
>    result= m_wait_status;
>  
> -  MDL_EXIT_COND(thd, mysys_var, &m_LOCK_wait_status, old_msg);
> +  thd_exit_cond(thd, old_msg);
>  
>    return result;
>  }
> 
> === modified file 'sql/mdl.h'
> --- a/sql/mdl.h	2010-07-13 18:01:54 +0000
> +++ b/sql/mdl.h	2010-07-21 15:00:01 +0000
> @@ -725,6 +725,10 @@ extern "C" const char *set_thd_proc_info
>                                           const char *calling_function,
>                                           const char *calling_file,
>                                           const unsigned int calling_line);
> +extern "C" const char* thd_enter_cond(MYSQL_THD thd, mysql_cond_t *cond,
> +                                      mysql_mutex_t *mutex, const char *msg);
> +extern "C" void thd_exit_cond(MYSQL_THD thd, const char *old_msg);
> +
>  #ifndef DBUG_OFF
>  extern mysql_mutex_t LOCK_open;
>  #endif

Probably it makes sense to remove declaration of set_thd_proc_info
from this header since after your changes it became unused.

Otherwise I am OK with your patch and think that it can be pushed
after addressing issues with ChangeSet comment and set_thd_proc_info.

-- 
Dmitry Lenev, Software Developer
MySQL AB, www.mysql.com

Are you MySQL certified?  http://www.mysql.com/certification
Thread
bzr commit into mysql-trunk-runtime branch (jon.hauglid:3081) Bug#55223Jon Olav Hauglid21 Jul
  • Re: bzr commit into mysql-trunk-runtime branch (jon.hauglid:3081)Bug#55223Dmitry Lenev22 Jul