Hi Chuck,
I have some questions and suggestions for improving this patch, see below.
Rafal
cbell@stripped wrote:
> Below is the list of changes that have just been committed into a local
> 6.0 repository of cbell. When cbell does a push these changes
> will be propagated to the main repository and, within 24 hours after the
> push, to the public repository.
> For information on how to access the public repository
> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>
> ChangeSet@stripped, 2008-05-07 11:30:42-04:00, cbell@mysql_cab_desk. +4 -0
> BUG#36323 : The build-in backup/restore drivers should support cancel() API call
>
> Patch changes default drivers to allow cancel operation in middle of
> operation. Also allows for better cleanup in event of an error in the
> driver.
>
> sql/backup/be_default.cc@stripped, 2008-05-07 11:30:36-04:00, cbell@mysql_cab_desk. +56
> -0
> BUG#36323 : The build-in backup/restore drivers should support cancel() API call
>
> Added ability to cancel backup and restore.
> Added code to cleanup if driver gets an error.
>
> sql/backup/be_default.h@stripped, 2008-05-07 11:30:37-04:00, cbell@mysql_cab_desk. +32
> -6
> BUG#36323 : The build-in backup/restore drivers should support cancel() API call
>
> Added cleanup method and cancel capability for backup and restore.
>
> sql/backup/be_snapshot.cc@stripped, 2008-05-07 11:30:38-04:00, cbell@mysql_cab_desk. +36
> -1
> BUG#36323 : The build-in backup/restore drivers should support cancel() API call
>
> Added ability to cancel backup.
> Added code to cleanup if driver gets an error.
>
> sql/backup/be_snapshot.h@stripped, 2008-05-07 11:30:38-04:00, cbell@mysql_cab_desk. +18
> -2
> BUG#36323 : The build-in backup/restore drivers should support cancel() API call
>
> Added cleanup method and cancel capability for backup.
>
=================================================================================
> diff -Nrup a/sql/backup/be_default.cc b/sql/backup/be_default.cc
> --- a/sql/backup/be_default.cc 2008-03-04 11:08:40 -05:00
> +++ b/sql/backup/be_default.cc 2008-05-07 11:30:36 -04:00
> @@ -127,6 +127,28 @@ Backup::Backup(const Table_list &tables,
> all_tables= locking_thd->tables_in_backup;
> init_phase_complete= FALSE;
> locks_acquired= FALSE;
> + hdl= NULL;
> + m_cleanup= TRUE;
> +}
> +
> +/**
> + Cleanup backup
> +
> + This method provides a means to stop a current backup by allowing
> + the driver to shutdown gracefully. The method call ends the current
> + table read then attempts to kill the locking thread if it is still
> + running.
> +*/
> +result_t Backup::cleanup()
> +{
> + DBUG_ENTER("Default_backup::cleanup()");
> + DBUG_PRINT("backup",("Default driver - stop backup"));
> + mode= CANCEL;
Why mode is set to CANCEL here? According to the documentation:
CANCEL, ///< Indicates time to cancel operation
Is it the case that after a call to cleanup() it is time to cancel operation? To
me it seems that cleanup() may be called as part of a cleanup operation (so mode
should be set to CANCEL already) but also during normal driver shutdown or after
error.
> + if (hdl)
> + end_tbl_read();
> + if (locking_thd)
> + locking_thd->kill_locking_thread();
> + DBUG_RETURN(OK);
Why close_thread_tables() is not called here as it is in the CS driver?
> }
>
There is m_cleanup flag indicating if we need to call celanup() or not. I
suggest to maintain this flag inside cleanup() without a need to remember about
it by the caller. That is:
1. if m_clenaup is FALSE then cleanup() will do nothing,
2. otherwise cleanup will do its job and set m_cleanup to FALSE.
This way it would be safe to call cleanup() at any moment and several times in a
row.
> /**
> @@ -160,7 +182,10 @@ result_t Backup::start_tbl_read(TABLE *t
> hdl= tbl->file;
> last_read_res= hdl->ha_rnd_init(1);
> if (last_read_res != 0)
> + {
> + hdl= NULL;
> DBUG_RETURN(ERROR);
> + }
> DBUG_RETURN(OK);
> }
>
> @@ -177,7 +202,12 @@ result_t Backup::end_tbl_read()
> int last_read_res;
>
> DBUG_ENTER("Default_backup::end_tbl_read)");
> +
> + if (!hdl)
> + DBUG_RETURN(OK);
> +
> last_read_res= hdl->ha_rnd_end();
> + hdl= NULL;
> if (last_read_res != 0)
> DBUG_RETURN(ERROR);
> DBUG_RETURN(OK);
> @@ -316,6 +346,11 @@ result_t Backup::get_data(Buffer &buf)
> buf.last= FALSE;
>
> /*
> + get_data() should not be called after cancel has been called.
> + */
> + DBUG_ASSERT(mode != CANCEL);
> +
> + /*
> Determine mode of operation and execute mode.
> */
> switch (mode) {
> @@ -581,6 +616,22 @@ Restore::Restore(const Table_list &table
> }
>
> /**
> + Cleanup restore
> +
> + This method provides a means to stop a current restore by allowing
> + the driver to shutdown gracefully. The method call closes the
> + table list by calling end() method.
> +*/
> +result_t Restore::cleanup()
> +{
> + DBUG_ENTER("Default_backup::cleanup()");
> + DBUG_PRINT("backup",("Default driver - stop restore"));
> + end();
> + mode= CANCEL;
Again, I think calls to cleanup() and cancel() are two independent things and
mode should not be changed inside cleanup().
> + DBUG_RETURN(OK);
> +}
> +
> +/**
> * @brief Truncate table.
> *
> * This method saves the handler for the table and deletes all rows in
> @@ -721,6 +772,11 @@ result_t Restore::send_data(Buffer &buf)
> DBUG_PRINT("default_restore",("Got packet with %lu bytes from stream %u",
> (unsigned long)buf.size, buf.table_num));
>
> + /*
> + get_data() should not be called after cancel has been called.
> + */
> + DBUG_ASSERT(mode != CANCEL);
> +
> /*
> Determine mode of operation and execute mode.
> */
=================================================================================
> diff -Nrup a/sql/backup/be_default.h b/sql/backup/be_default.h
> --- a/sql/backup/be_default.h 2008-03-04 11:06:22 -05:00
> +++ b/sql/backup/be_default.h 2008-05-07 11:30:37 -04:00
> @@ -84,7 +84,12 @@ class Backup: public Backup_thread_drive
> public:
> enum has_data_info { YES, WAIT, EOD };
> Backup(const Table_list &tables, THD *t_thd, thr_lock_type lock_type);
> - virtual ~Backup() { backup::free_table_list(all_tables); };
> + virtual ~Backup()
> + {
> + if (m_cleanup)
> + cleanup();
> + backup::free_table_list(all_tables);
> + };
> size_t size() { return UNKNOWN_SIZE; };
> size_t init_size() { return 0; };
> result_t begin(const size_t) { return backup::OK; };
> @@ -92,7 +97,12 @@ class Backup: public Backup_thread_drive
> result_t get_data(Buffer &buf);
> result_t lock() { return backup::OK; };
> result_t unlock() { return backup::OK; };
> - result_t cancel() { return backup::OK; };
> + result_t cancel()
> + {
> + m_cleanup= FALSE;
> + cleanup();
If m_cleanup is set to FALSE inside cleanup() you don't need to set it here.
I think mode should be set to CANCEL here indicating that the current operation
of the driver has been cancelled.
> + return backup::OK;
> + }
> TABLE_LIST *get_table_list() { return all_tables; }
> void free() { delete this; };
> result_t prelock();
> @@ -101,6 +111,9 @@ class Backup: public Backup_thread_drive
> TABLE *cur_table; ///< The table currently being read.
> my_bool init_phase_complete; ///< Used to identify end of init phase.
> my_bool locks_acquired; ///< Used to help kernel synchronize drivers.
> + handler *hdl; ///< Pointer to table handler.
> + my_bool m_cleanup; ///< Is call to cleanup() needed?
If managed inside cleanup() method, this flag can be made private perhaps.
> + result_t end_tbl_read();
>
> private:
> /*
> @@ -110,6 +123,7 @@ class Backup: public Backup_thread_drive
> */
> typedef enum {
> INITIALIZE, ///< Indicates time to initialize read
> + CANCEL, ///< Indicates time to cancel operation
> GET_NEXT_TABLE, ///< Open next table in the list
> READ_RCD, ///< Reading rows from table mode
> READ_RCD_BUFFER, ///< Buffer records mode
> @@ -119,11 +133,9 @@ class Backup: public Backup_thread_drive
> } BACKUP_MODE;
>
> result_t start_tbl_read(TABLE *tbl);
> - result_t end_tbl_read();
> int next_table();
> BACKUP_MODE mode; ///< Indicates which mode the code is in
> int tbl_num; ///< The index of the current table.
> - handler *hdl; ///< Pointer to table handler.
> uint *cur_blob; ///< The current blob field.
> uint *last_blob_ptr; ///< Position of last blob field.
> MY_BITMAP *read_set; ///< The file read set.
> @@ -132,6 +144,7 @@ class Backup: public Backup_thread_drive
> byte *ptr; ///< Pointer to blob data from record.
> TABLE_LIST *all_tables; ///< Reference to list of tables used.
>
> + result_t cleanup();
> uint pack(byte *rcd, byte *packed_row);
> };
>
> @@ -151,11 +164,21 @@ class Restore: public Restore_driver
> public:
> enum has_data_info { YES, WAIT, EOD };
> Restore(const Table_list &tables, THD *t_thd);
> - virtual ~Restore() { backup::free_table_list(all_tables); };
> + virtual ~Restore()
> + {
> + if (m_cleanup)
> + cleanup();
> + backup::free_table_list(all_tables);
> + };
> result_t begin(const size_t) { return backup::OK; };
> result_t end();
> result_t send_data(Buffer &buf);
> - result_t cancel() { return backup::OK; };
> + result_t cancel()
> + {
> + m_cleanup= FALSE;
> + cleanup();
As above, I think that:
- m_cleanup should be maitained by cleanup() method, not cancel()
- mode= CANCEL should be done here, not in cleanup().
> + return backup::OK;
> + }
> TABLE_LIST *get_table_list() { return all_tables; }
> void free() { delete this; };
>
> @@ -167,6 +190,7 @@ class Restore: public Restore_driver
> */
> typedef enum {
> INITIALIZE, ///< Indicates time to initialize read
> + CANCEL, ///< Indicates time to cancel operation
> GET_NEXT_TABLE, ///< Open next table in the list
> WRITE_RCD, ///< Writing rows from table mode
> CHECK_BLOBS, ///< See if record has blobs
> @@ -191,7 +215,9 @@ class Restore: public Restore_driver
> THD *m_thd; ///< Pointer to current thread struct.
> TABLE_LIST *all_tables; ///< Reference to list of tables used.
> ulonglong num_rows; ///< Number of rows in table
> + my_bool m_cleanup; ///< Is call to cleanup() needed?
>
> + result_t cleanup();
> uint unpack(byte *packed_row);
> };
> } // default_backup namespace
=================================================================================
> diff -Nrup a/sql/backup/be_snapshot.cc b/sql/backup/be_snapshot.cc
> --- a/sql/backup/be_snapshot.cc 2008-03-21 04:55:41 -04:00
> +++ b/sql/backup/be_snapshot.cc 2008-05-07 11:30:38 -04:00
> @@ -77,6 +77,41 @@ result_t Engine::get_backup(const uint32
> DBUG_RETURN(OK);
> }
>
> +/**
> + Cleanup backup
> +
> + This method provides a means to stop a current backup by allowing
> + the driver to shutdown gracefully. The method call ends the current
> + transaction and closes the tables.
> +*/
> +result_t Backup::cleanup()
> +{
> + DBUG_ENTER("Default_backup::cleanup()");
> + DBUG_PRINT("backup",("Snapshot driver - stop backup"));
> + m_cancel= TRUE;
Cancel state maintained inside cleanup() - this is confusing and may lead to
errors since cleanup() is called not only when operation is cancelled.
> + locking_thd->lock_state= LOCK_DONE; // set lock done so destructor won't wait
Why you don't use kill_locking_thread() method as it is done in the default
driver? This method is carefully written to avoid race conditions and other
synchronisation issues and I don't think it can be replaced by such simple
statement.
Is it correct to kill the table locking thread before the operations below are
performed, for example closing the consistent read transaction?
> + if (m_trans_start)
> + {
> + ha_autocommit_or_rollback(locking_thd->m_thd, 0);
> + end_active_trans(locking_thd->m_thd);
> + m_trans_start= FALSE;
> + }
> + if (tables_open)
> + {
> + if (hdl)
> + default_backup::Backup::end_tbl_read();
> + close_thread_tables(locking_thd->m_thd);
Why close_thrad_tables() is called here but not in the default driver?
> + tables_open= FALSE;
> + }
> + DBUG_RETURN(OK);
> +}
Class snapshot_backup::Backup inherits from default_backup::Backup. Thus it
would make sense if snapshot_backup::Backup::cleanup() calls
default_backup::Backup::cleanup(). This way, whenever we add something to
cleanup procedure of the default backup driver we won't forget to do it also in
the CS driver. Of course, the CS driver must do some additional cleanup actions
like ending the transactions and then call the inherited cleanup() method.
> +
> +/**
> + Lock the tables
> +
> + This method creates the consistent read transaction and acquires the read
> + lock.
> +*/
> result_t Backup::lock()
> {
> DBUG_ENTER("Snapshot_backup::lock()");
> @@ -119,7 +154,7 @@ result_t Backup::get_data(Buffer &buf)
> being set to LOCK_SIGNAL from parent::get_data(). This is set
> after the last table is finished reading.
> */
> - if (locking_thd->lock_state == LOCK_SIGNAL)
> + if ((locking_thd->lock_state == LOCK_SIGNAL) || m_cancel)
Can m_cancel be replaced by mode != CANCEL or something like this?
> {
> locking_thd->lock_state= LOCK_DONE; // set lock done so destructor won't
> wait
> ha_autocommit_or_rollback(locking_thd->m_thd, 0);
This code is executed when all tables have been processed in get_data(). Why not
use cleanup() here to avoid code repetition. Probably the same should be done in
default driver's get_data() method.
> diff -Nrup a/sql/backup/be_snapshot.h b/sql/backup/be_snapshot.h
> --- a/sql/backup/be_snapshot.h 2008-03-04 11:06:23 -05:00
> +++ b/sql/backup/be_snapshot.h 2008-05-07 11:30:38 -04:00
> @@ -59,9 +59,16 @@ class Backup: public default_backup::Bac
> {
> public:
> Backup(const Table_list &tables, THD *t_thd)
> - :default_backup::Backup(tables, t_thd, TL_READ) { tables_open= FALSE; };
> + :default_backup::Backup(tables, t_thd, TL_READ)
> + {
> + tables_open= FALSE;
> + m_cancel= FALSE;
> + m_trans_start= FALSE;
> + };
> virtual ~Backup()
> {
> + if (m_cleanup)
> + cleanup();
> if (locking_thd->lock_state == LOCK_ACQUIRED)
> {
> end_active_trans(locking_thd->m_thd);
Why you do anything more if you have called cleanup(). I understood that
cleanup() would do all the clean up actions required.
> @@ -74,9 +81,18 @@ class Backup: public default_backup::Bac
> result_t prelock() { return backup::READY; }
> result_t lock();
> result_t unlock() { return backup::OK; };
> - result_t cancel() { return backup::OK; };
> + result_t cancel()
> + {
> + m_cleanup= FALSE;
> + cleanup();
> + return backup::OK;
> + }
> private:
> my_bool tables_open; ///< Indicates if tables are open
> + my_bool m_cancel; ///< Cancel backup
Why m_cancel flag if we have CANCEL mode?
> + my_bool m_trans_start; ///< Is transaction stated?
> +
> + result_t cleanup();
> };
>
> /**
>
>