Hi Chuck,
I didn't do a full review, just some points.
On Fri, Oct 26, 2007 at 08:05:06PM +0200, cbell@stripped wrote:
> ChangeSet@stripped, 2007-10-26 14:04:34-04:00, cbell@mysql_cab_desk. +15 -0
> BUG#31383 : Consistent Snapshot driver not consistent
>
> This patch changes the default driver to use a separate thread to open and lock
> tables.
> The default driver opens tables on the prelock() call from the kernel. The
> snapshot
> driver initiates the CS read on the lock() call from the kernel and opens tables in
> the first
> call to get_data() after the lock is taken. This is due to the fact that the
> default driver's
> validity point is at open_and_lock_tables() while the snapshot driver's validity
> point
> is at the start of the transaction.
> diff -Nrup a/sql/backup/be_default.cc b/sql/backup/be_default.cc
> --- a/sql/backup/be_default.cc 2007-07-02 13:42:58 -04:00
> +++ b/sql/backup/be_default.cc 2007-10-26 14:04:25 -04:00
> @@ -126,6 +127,37 @@ Backup::Backup(const Table_list &tables,
> }
>
> /**
> + * @brief Prelock call to setup locking.
> + *
> + * Launches a separate thread ("locking thread") which will lock
> + * tables. Locking in a separate thread is needed to have a non-blocking
> + * prelock() (given that thr_lock() is blocking).
> + */
> +result_t Backup::prelock()
> +{
> + DBUG_ENTER("Default_backup::open_tables");
> +
> + pthread_t th;
> + if (pthread_create(&th, &connection_attrib,
> + backup_thread_for_locking, this))
> + SET_STATE_TO_ERROR_AND_DBUG_RETURN;
> +
> + /*
> + Block until lock thread is complete.
> + */
This lock may take one minute to be acquired. While it is waiting, the
backup kernel is entirely blocked. While it could use this minute to
do get_data() on other drivers.
http://forge.mysql.com/source/OnlineBackup/classbackup_1_1_backup__driver.html#a3
suggests:
"It can do the preparations inside the prelock() method if it doesn't
require too long time. In that case it should return READY. If the
preparations require longer time (waiting for ongoing operations to
finish) or sending additional data to the kernel then prelock() should
return OK. Later on, the kernel will call get_data() and driver can
signal that it has finished the preparations by returning READY
result."
> + pthread_mutex_lock(&THR_LOCK_driver_thread);
> + m_thd->enter_cond(&COND_driver_lock, &THR_LOCK_driver_thread,
> + "Default driver: waiting for table locks");
> + while (!backup_thread_locked())
> + pthread_cond_wait(&COND_driver_lock, &THR_LOCK_driver_thread);
> + m_thd->exit_cond("Default driver: got table locks");
> +
> + lock_called= TRUE;
> +
> + DBUG_RETURN(backup::OK);
> +}
> +
> +/**
> * @brief Start table read.
> *
> * This method saves the handler for the table and initializes the
> @@ -304,6 +336,21 @@ result_t Backup::get_data(Buffer &buf)
> buf.size= 0;
> buf.last= TRUE;
> mode= GET_NEXT_TABLE;
> +
> + /*
> + Optimization: If this is the last table to read, close the tables and
> + kill the lock thread. This only applies iff we are using the thread.
> + */
> + if (tables_in_backup->next_global == NULL)
> + {
> + if (lock_thd)
after that line, backup_thread_for_logging might go into "delete thd"
thus the awake() below will operate on freed memory and cause
problems.
> + {
> + pthread_cond_broadcast(&COND_driver_thread_wait);
> + lock_thd->awake(THD::KILL_CONNECTION);
> + }
> + cur_table= NULL;
> + lock_thd= 0;
> + }
> }
> else if (last_read_res != 0)
> DBUG_RETURN(ERROR);
> diff -Nrup a/sql/backup/be_thread.cc b/sql/backup/be_thread.cc
> --- /dev/null Wed Dec 31 16:00:00 196900
> +++ b/sql/backup/be_thread.cc 2007-10-26 14:04:29 -04:00
> @@ -0,0 +1,176 @@
> +my_bool backup_thread_is_locked= FALSE;
static?
> +
> +/**
> + @brief Lock tables in driver.
> +
> + This method creates a new THD for use in the new thread. It calls
> + the method to open and lock the tables.
> + */
> +pthread_handler_t backup_thread_for_locking(void *arg)
> +{
> + TABLE_LIST *tables_in_backup= 0;
> +
> + DBUG_ENTER("Default_backup - lock_tables_in_separate_thread");
> +
> + /*
> + Turn off condition variable check for lock.
> + */
> + pthread_mutex_lock(&THR_LOCK_driver_thread);
> + backup_thread_is_locked= FALSE;
> + pthread_mutex_unlock(&THR_LOCK_driver_thread);
> +
> + my_thread_init();
> + pthread_detach_this_thread();
> +
> + /*
> + First, create a new THD object.
> + */
> + DBUG_PRINT("info",("Online backup creating THD struct for thread"));
> + THD *thd= new THD;
> + if (unlikely(!thd))
> + {
> + delete thd;
> + DBUG_RETURN(0);
> + }
> + (static_cast<Backup_driver *>(arg))->lock_thd= thd;
> +
> + thd->thread_stack = (char*)&thd; // remember where our stack is
> + pthread_mutex_lock(&LOCK_thread_count);
> + thd->thread_id= thread_id++;
> + pthread_mutex_unlock(&LOCK_thread_count);
> + if (unlikely(thd->store_globals())) // for a proper MEM_ROOT
> + {
> + delete thd;
> + DBUG_RETURN(0);
> + }
> +
> + thd->init_for_queries(); // opening tables needs a proper LEX
> + thd->command= COM_DAEMON;
> + thd->system_thread= SYSTEM_THREAD_BACKUP;
There is an issue with the system_thread above (present in the MyISAM
driver too).
The thread is maybe going to wait for locks in
thr_lock.c:wait_for_lock(). Waiting is a loop:
while (!thread_var->abort || in_wait_list)
{
int rc= (can_deadlock ?
pthread_cond_timedwait(cond, &data->lock->mutex,
&wait_timeout) :
pthread_cond_wait(cond, &data->lock->mutex));
...
THD::awake() which we use to tell the thread to terminate, does:
if (!system_thread) // Don't abort locks
mysys_var->abort=1;
So, as our thread has system_thread!=0, mysys_var->abort won't be set
to 1, so the thread will restart waiting for locks.
That means that if the backup job fails, or is KILLed by the user, it
may not terminate.
If this is a "mysqladmin shutdown" the problem will not occur, because
this does a KILL on all threads, so the thread which was holding locks
will die and so our thread will get its locks and then die.
I think we should replace the if() in THD::awake() by
'if (system_thread != SYSTEM_THREAD_DELAYED_INSERT)' because:
- that if() dates from a time where the only possible system threads
were the delayed insert system thread and the slave threads
- for slave threads it's ok to abort locks because anyway we set
THD::killed in THD::awake(), and slave threads test THD::killed so
they already are aborted in the middle of their work, if one does a
KILL on them
- the comments in THD::awake() explaining the mysys_var->abort logic
talk only about the delayed insert system thread.
Before doing that though, we should ask the creators of system threads
(those who created:
SYSTEM_THREAD_NDBCLUSTER_BINLOG= 8, // Tomas Ulin
SYSTEM_THREAD_EVENT_SCHEDULER= 16, // Andrey Hristov
SYSTEM_THREAD_EVENT_WORKER= 32, // same
) if they see a problem with aborting locks of those threads.
Could you please do it? Do you prefer me to do it?
> + thd->version= refresh_version;
> + thd->set_time();
> + thd->main_security_ctx.host_or_ip= "";
> + thd->client_capabilities= 0;
> + my_net_init(&thd->net, 0);
> + thd->main_security_ctx.master_access= ~0;
> + thd->main_security_ctx.priv_user= 0;
> + thd->real_id= pthread_self();
> + /*
> + Making this thread visible to SHOW PROCESSLIST is useful for
> + troubleshooting a backup job (why does it stall etc).
> + */
> + pthread_mutex_lock(&LOCK_thread_count);
> + threads.append(thd);
> + pthread_mutex_unlock(&LOCK_thread_count);
> + lex_start(thd);
> + mysql_reset_thd_for_next_command(thd);
All the logic above is more or less duplicated between the MyISAM
backup driver, the NDB binlog thread (from which I copied it). We
should once for all make a generic function create_system_thread().
> +
> + /*
> + Now open and lock the tables.
> + */
> + DBUG_PRINT("info",("Online backup open tables in thread"));
> + tables_in_backup= (static_cast<Backup_driver *>(arg))->tables_in_backup;
> + /*
> + Make sure we have tables to open.
> + */
> + if (!tables_in_backup)
> + {
> + DBUG_PRINT("info",("Online backup locking error no tables to lock"));
> + DBUG_RETURN(0);
> + }
> +
> + /*
> + As locking tables can be a long operation, we need to support
> + cancellability during that time. So we publish our THD to the thread which
> + created us.
> + */
> + if (open_and_lock_tables(thd, tables_in_backup))
> + {
> + DBUG_PRINT("info",("Online backup locking thread dying"));
> + close_thread_tables(thd);
> + DBUG_RETURN(0);
> + }
> +
> + /*
> + Turn on condition variable check for lock.
> + */
> + pthread_mutex_lock(&THR_LOCK_driver_thread);
> + backup_thread_is_locked= TRUE;
> + pthread_mutex_unlock(&THR_LOCK_driver_thread);
> +
> + /*
> + Signal waiting driver that condition for lock is obtained.
> + */
> + pthread_cond_broadcast(&COND_driver_lock);
>
> + /*
> + Part of work is done. Rest until woken up.
> + */
> + pthread_mutex_lock(&THR_LOCK_driver_thread);
> + thd->enter_cond(&COND_driver_thread_wait, &THR_LOCK_driver_thread,
> + "Online backup driver thread: holding table locks");
> + while (!thd->killed)
> + pthread_cond_wait(&COND_driver_thread_wait, &THR_LOCK_driver_thread);
> + thd->exit_cond("Online backup driver thread: terminating");
> +
> + DBUG_PRINT("info",("Online backup driver thread locking thread terminating"));
There is duplication between this file and the MyISAM backup
driver. So when you have pushed, I should change the latter to use the
former.
> + /*
> + Cleanup and return.
> + */
> + close_thread_tables(thd);
> + net_end(&thd->net);
> + my_thread_end();
> + delete thd;
"delete thd" should be before my_thread_end(), as it uses
thread-related functions, see below.
Are you sure that at this moment, the main backup thread cannot be
looking at lock_thd?
> +
> + /*
> + Turn off condition variable check for lock.
> + */
> + pthread_mutex_lock(&THR_LOCK_driver_thread);
after my_thread_end() you cannot use any thread-related functions in
this thread anymore. You should move the my_thread_end() down.
> + backup_thread_is_locked= FALSE;
> + pthread_mutex_unlock(&THR_LOCK_driver_thread);
> +
> + pthread_exit(0);
> + DBUG_RETURN(0);
DBUG_RETURN(0) will never be reached. So the
"< function_name"
will be missing from the debug trace. Other problems may happen:
DBUG_RETURN calls
void _db_return_(uint _line_, const char **_sfunc_,
const char **_sfile_, uint *_slevel_)
(dbug/dbug.c) which takes locks, decrements counters etc.
I realize the MyISAM driver has the same bug, and threads in slave.cc
too.
So, we should either not use DBUG_ENTER/RETURN in the function
(DBUG_PRINT is ok on the other hand), or not use pthread_exit(0) in
the function (knowing that normally, returning from the function
causes the thread to exit, pthread_exit() is redundant).
In the MyISAM driver I am now removing DBUG_ENTER/RETURN.