List:Commits« Previous MessageNext Message »
From:Chuck Bell Date:December 5 2008 6:42pm
Subject:Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)
Bug#35079 WL#4538
View as plain text  
STATUS
------
Patch not approved - changes requested.

REQUESTS
--------
  1. The test backup.backup_intr_errors fails on Windows.
  2. Spelling error in comments.
  3. Long line in mtr_report.pl.
  4. Spacing issues in several locations (See below).
  5. Fix patch warning.
  6. Answer question about race condition.
  7. Remove extra line change in logger.cc.
  8. Insert bug numbers for several FIX ME's.

SUGGESTIONS
-----------
  9. Create a bug report for si_* issue reported in comments.
10. Consider reconstructing test to eliminate huge result file.
11. I don't see the need to run the tests through all storage engines.
     I suggest removing this feature and run only once.

DETAILS
-------
>  2740 Rafal Somla	2008-12-05
>       BUG#35079/WL#4538 - Make BACKUP/RESTORE statements interruptible.
>       
>       The server detects interruptions in statement execution such as when client 
>       connection is closed or if user hits Ctrl+C. However, the code inside the
> server 
>       must actively check if an interruption has happened and abort execution in that
> 
>       case.
>       
>       This patch adds such checks to the backup code, as described in WL#4538. After
> 
>       this patch it should be possible to interrupt on-going BACKUP/RESTORE command.
> 
>       It also fixes few problems in the code so that the new tests can pass through.

...

>   mysql-test/lib/mtr_report.pl
>     Suppres detection by mtr expected errors during backup_intr_errors test.

[2] Spelling error 'Suppress'.

...

>   sql/backup/data_backup.cc
>     - Fix error in write_table_data() logic which caused the commit blocker to not be
> 
>       deactivated if error branch was executed.
>     - Add checks for interruptions in various stages of backup and restore protocol,
> 
>       including  Sheduler::step() which is the main worker method in
> write_table_data().
>     - Add explicit log_level::ERROR to report_error() where needed.
>     - Add extra debug sync points needed for testing.

[2] Spelling error 'Scheduler'.

...

>   sql/backup/stream.h
>     Move the temprorary cleanup code to the wrapper function read_meta_data()
>     so that we don't have to depend on the caller for executing it.

[2] Spelling error 'temporary'.

>   sql/share/errmsg.txt
>     Add messge used to notify about an interruption of BACKUP/RESTORE operations.

[2] Spelling error 'message'.

> === modified file 'mysql-test/lib/mtr_report.pl'
> --- a/mysql-test/lib/mtr_report.pl	2008-11-21 15:02:34 +0000
> +++ b/mysql-test/lib/mtr_report.pl	2008-12-05 07:38:01 +0000
> @@ -392,6 +392,16 @@ sub mtr_report_stats ($) {
>  		
>                  ($testname eq 'backup.backup_myisam1') and
>                  (/Backup: Can't initialize MyISAM backup driver/) or
> +
> +                ($testname eq 'backup.backup_intr_errors') and
> +		(
> +                  # ignore errors triggered on purpose during BACKUP/RESTORE
> operation shutdown

[3] This line is too long. Please break up into 2 lines.

...

> === added file 'mysql-test/suite/backup/t/backup_intr_errors.test'
> --- a/mysql-test/suite/backup/t/backup_intr_errors.test	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/backup/t/backup_intr_errors.test	2008-12-05 07:38:01 +0000
> @@ -0,0 +1,122 @@

[1] This test fails on Windows:

backup.backup_intr_errors      [ fail ]

mysqltest: At line 86: query 'BACKUP DATABASE bup_intr TO 
'bup_intr.bkp'' failed
: 1638: Can't write to backup location 'bup_intr.bkp' (file already exists?)

The result from queries just before the failure was:
< snip >
#
# Examine backup logs.
#
# FIXME: Until BUG#39924 is fixed, change to BUP_CANCEL state will not be
# seen in backup_progress table and backup_history table will be empty.
# When the bug is fixed the output below will change and the result file
# should be modifed accordingly.
#
SELECT object, error_num, notes FROM mysql.backup_progress;
object  error_num       notes
backup kernel   0       starting
backup kernel   0       running
backup kernel   0       validity point
SELECT * FROM mysql.backup_history;
SET SESSION DEBUG="-d";
#
# Prepare bup_intr.bkp for RESTORE testing. Note that above BACKUP
# command should not create the file because it was interrupted.
#
BACKUP DATABASE bup_intr TO 'bup_intr.bkp';

I tried for about 2 hours to track this down but wasn't able to. One 
thing that complicated the diagnosis was the execution against all 
storage engines. While not related to this problem (I think), the test 
would be much easier to deal with it if it weren't run against all 
storage engines.

My gut feeling is this is a problem with how Linux and Windows deal with 
file open -- I think it may be that Windows creates the file earlier 
than you expected. But I don't know that to be true.

...

> +while (`SELECT count(*) > 0 FROM bup_intr.t1`)
> +{
> +
> +# read the next engine and remove it from t1
> +let $engine=`SELECT engine FROM bup_intr.t1 LIMIT 1`;
> +eval DELETE FROM bup_intr.t1 WHERE engine='$engine';

[4] While body is not indented.

> +
> +--echo
> +--echo ########################################
> +--echo ## Testing with $engine engine.
> +--echo ########################################
> +--echo

[10, 11] I don't think we do to this -- I don't think we need to run 
against all storage engines. By doing so, the result file is really 
large. Given that this test doesn't work on Windows diagnosing the 
problem is all that much more difficult. Unless there is a compelling 
reason to run against all storage engines, I would change this to a 
single pass and place in suite/backup.

...

> === added file
> 'mysql-test/suite/backup_engines/include/backup_restore_interrupt.inc'
> --- a/mysql-test/suite/backup_engines/include/backup_restore_interrupt.inc	1970-01-01
> 00:00:00 +0000
> +++ b/mysql-test/suite/backup_engines/include/backup_restore_interrupt.inc	2008-12-05
> 07:38:01 +0000

...

> +  --connection killer
> +
> +  --echo #
> +  --echo # Wait for the command to reach its synchronization point,
> +  --echo # then kill it.
> +  --echo #
> +  SET DEBUG_SYNC='now WAIT_FOR here';
> +  --replace_regex /id=[0-9]+/id=<query id>/
> +  eval SELECT state FROM INFORMATION_SCHEMA.PROCESSLIST WHERE id=$id;
> +  --replace_regex /QUERY [0-9]+/QUERY <query id>/
> +  eval KILL QUERY $id;
> +  SET DEBUG_SYNC='now SIGNAL go';
> +

[4] Why is this indented? It shouldn't be.

...

> +--connection default
> +
> +--echo #
> +--echo # Reap the command and show results.
> +--echo #
> +--error ER_QUERY_INTERRUPTED
> +reap;
> +--replace_column 2 <error-code>
> +# One error message contains file path - mask it out.
> +--replace_regex /Error on delete of '.*'/Error on delete of <backup image
> path>/
> +SHOW WARNINGS;
> +
> +--echo #
> +--echo # Examine backup logs.
> +--echo #
> +--echo # FIXME: Until BUG#39924 is fixed, change to BUP_CANCEL state will not be
> +--echo # seen in backup_progress table and backup_history table will be empty.
> +--echo # When the bug is fixed the output below will change and the result file
> +--echo # should be modifed accordingly.
> +--echo #
> +SELECT object, error_num, notes FROM mysql.backup_progress;
> +query_vertical SELECT * FROM mysql.backup_history;
> +
> +# check that backup image file was removed
> +if (!$do_restore)
> +{
> +  --error 1
> +  --remove_file $bdir/bup_intr.bkp
> +}
> \ No newline at end of file

[5] Add newline to remove patch warning.

...

> === added file 'mysql-test/suite/backup_engines/r/backup_interruption.result'
> --- a/mysql-test/suite/backup_engines/r/backup_interruption.result	1970-01-01
> 00:00:00 +0000
> +++ b/mysql-test/suite/backup_engines/r/backup_interruption.result	2008-12-05
> 07:38:01 +0000

I don't think this test needs to be in backup_engines. Please explain 
why you think it should be there.

...

> === added file 'mysql-test/suite/backup_engines/t/backup_interruption.test'
> --- a/mysql-test/suite/backup_engines/t/backup_interruption.test	1970-01-01 00:00:00
> +0000
> +++ b/mysql-test/suite/backup_engines/t/backup_interruption.test	2008-12-05 07:38:01
> +0000
> @@ -0,0 +1,261 @@

...

> +  
> +  # before common preparations
> +  let $sync_point= before_backup_common_prepare;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # during common preparations
> +    # before checking privileges
> +    let $sync_point= before_backup_privileges;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +  
> +    # before checking that no other BACKUP/RESTORE is running
> +    let $sync_point= before_backup_single_op;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +    # before blocking DDLs
> +    let $sync_point= before_backup_ddl_block;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before openning the stream
> +  let $sync_point= before_backup_stream_open;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before creating catalogue
> +  let $sync_point= before_backup_catalog;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc

[4] Why is this indented?

> +
> +# before populating backup catalogue
> +let $sync_point= after_backup_start_backup;
> +--source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +# before do_backup
> +let $sync_point= before_do_backup;
> +--source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +# inside do_backup
> +  # before preamble is written
> +  let $sync_point= before_backup_meta;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +  let $sync_point= backup_before_write_preamble;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before write_table_data
> +  let $sync_point= before_backup_data;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # inside write_table_data
> +    # before initial phase
> +    let $sync_point=before_backup_data_init;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +    # before prepare phase
> +    let $sync_point=before_backup_data_prepare;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +    # before sync phase
> +    let $sync_point=before_backup_data_lock;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +    # inside sync phase
> +    let $sync_point=before_backup_data_unlock;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +    # after sync phase
> +    let $sync_point=after_backup_binlog;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +    # before final phase
> +    let $sync_point=before_backup_data_finish;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before writing summary section
> +  let $sync_point= before_backup_summary;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc

[4] Why is this indented?

> +
> +# Note: after the call to do_backup() the operation is completed
> +# and terminates successfully, even if interruption has happened 
> +# after its completion.
> +
> +#
> +# Test RESTORE interruptions.
> +#
> +
> +--replace_column 1 #
> +BACKUP DATABASE bup_intr TO 'bup_intr.bkp';
> +DROP DATABASE bup_intr;
> +
> +let $do_restore=1;
> +
> +# before preparations
> +let $sync_point= before_restore_prepare;
> +--source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +# during preparations
> +  # before logger initialization
> +  let $sync_point= before_restore_logger_init;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +  
> +  # before common preparations
> +  let $sync_point= before_restore_common_prepare;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +  
> +  # before openning the stream
> +  let $sync_point= before_restore_stream_open;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before creating catalogue
> +  let $sync_point= before_restore_catalog;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before reading header
> +  let $sync_point= before_restore_read_header;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before reading catalogue
> +  let $sync_point= before_restore_read_catalog;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before writing incident event
> +  let $sync_point= before_restore_binlog;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +

[4] Why is this indented?

> +# before do_restore
> +let $sync_point= after_backup_start_restore;
> +--source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +# inside do_restore
> +
> +  # before disabling fkey constraints
> +  let $sync_point= before_restore_fkey_disable;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # before reading metadata and creating objects
> +  let $sync_point= before_restore_read_metadata;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +
> +  # Note: now we pass the point when database is created during RESTORE,
> +  # thus we need to DROP it before executing RESTORE another time.
> +  DROP DATABASE bup_intr;
> +
> +  # before locking tables
> +  let $sync_point= before_restore_lock_tables;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +  DROP DATABASE bup_intr;
> +
> +  # before restoring table data
> +  let $sync_point= before_restore_table_data;
> +  --source suite/backup_engines/include/backup_restore_interrupt.inc
> +  DROP DATABASE bup_intr;
> +
> +  # inside restore_table_data
> +    # before creating restore drivers
> +    let $sync_point= restore_before_drivers_create;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +    DROP DATABASE bup_intr;
> +
> +    # before initializing the drivers
> +    let $sync_point= restore_before_drivers_init;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +    DROP DATABASE bup_intr;
> +
> +    # before reading table data chunk
> +    let $sync_point= restore_before_read_data_chunk;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +    DROP DATABASE bup_intr;
> +
> +    # before sending the data to a driver
> +    let $sync_point= restore_before_sending_data;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +    DROP DATABASE bup_intr;
> +
> +    # before shutting down the drivers
> +    let $sync_point= restore_table_data_before_end;
> +    --source suite/backup_engines/include/backup_restore_interrupt.inc
> +    DROP DATABASE bup_intr;

[4] Why is this indented?

...

> === modified file 'sql/backup/be_thread.cc'
> --- a/sql/backup/be_thread.cc	2008-10-30 12:29:54 +0000
> +++ b/sql/backup/be_thread.cc	2008-12-05 07:38:01 +0000

...

>  void Locking_thread_st::wait_until_locking_thread_dies()
>  {
> +  // Nothing to do if the locking thread has not been started.
> +  if (!m_thread_started)
> +    return;
> +
>    pthread_mutex_lock(&THR_LOCK_caller);
>    if (lock_state != LOCK_DONE)
>    {
> @@ -350,4 +363,6 @@ void Locking_thread_st::wait_until_locki
>    }
>    else
>      pthread_mutex_unlock(&THR_LOCK_caller);
> +
> +  m_thread_started= FALSE;

[6] Isn't it possible to have a race condition here with writing to 
m_thread_started? I wonder...but if you are convinced there isn't a 
change then I'm ok with that. I think it is possible.

...

> === modified file 'sql/backup/kernel.cc'
> --- a/sql/backup/kernel.cc	2008-11-28 10:10:39 +0000
> +++ b/sql/backup/kernel.cc	2008-12-05 07:38:01 +0000

...

> -  DBUG_RETURN(0);
> +  /* 
> +    FIXME: this call is here because object services doesn't clean the
> +    statement execution context properly, which leads to assertion failure.
> +    It should be fixed inside object services implementation and then the
> +    following line should be removed.
> +   */

[8] Please annotate the code here making it (more) clear that this needs 
to be fixed with that bug. I wasn't sure which parts you were referring to.

[9] Please add a new bug for this problem if it doesn't already exist 
then add that bug# to this comment.

...

> @@ -1231,24 +1387,23 @@ int Backup_restore_ctx::do_restore(bool 
>  
>    DBUG_PRINT("restore",("Restoring table data"));
>  
> -  /* 
> -    FIXME: this call is here because object services doesn't clean the
> -    statement execution context properly, which leads to assertion failure.
> -    It should be fixed inside object services implementation and then the
> -    following line should be removed.
> -   */

[8] Please add a new bug for this problem and annotate the code here 
making it (more) clear that this needs to be fixed with that bug.

...

> @@ -1259,22 +1414,17 @@ int Backup_restore_ctx::do_restore(bool 
>     creation of these objects will fail.
>    */
>  
> -  err= restore_triggers_and_events();           // logs errors
> +  DEBUG_SYNC(m_thd, "before_restore_triggers");
> +  err= restore_triggers_and_events();   // logs errors and detects interruptions
>    if (err)
>       DBUG_RETURN(fatal_error(err));
>  
> -  /* 
> -    FIXME: this call is here because object services doesn't clean the
> -    statement execution context properly, which leads to assertion failure.
> -    It should be fixed inside object services implementation and then the
> -    following line should be removed.
> -   */

[8] Please add a new bug for this problem and annotate the code here 
making it (more) clear that this needs to be fixed with that bug.

...

> === modified file 'sql/backup/stream.h'
> --- a/sql/backup/stream.h	2008-11-13 13:02:36 +0000
> +++ b/sql/backup/stream.h	2008-12-05 07:38:01 +0000
> @@ -192,11 +192,29 @@ read_catalog(Image_info &info, Input_str
>    return ret == BSTREAM_ERROR ? ERROR : OK;
>  }
>  
> +/*
> +  FIXME: the thd parameter for read_meta_data() is here only because the
> +  temporary code within the function needs it. It should be removed once the
> +  issue is fixed.
> +*/ 

[8] Please add the corresponding bug here (or create one if not already) 
and make it clear that this needs to be fixed with that bug.

>  inline
>  result_t
> -read_meta_data(Image_info &info, Input_stream &s)
> +read_meta_data(THD *thd, Image_info &info, Input_stream &s)
>  {
>    int ret= bstream_rd_meta_data(&s,
> static_cast<st_bstream_image_header*>(&info));
> +
> +  /* 
> +    FIXME: the following code is here because object services doesn't clean the
> +    statement execution context properly, which leads to assertion failure.
> +    It should be fixed inside object services implementation and then the
> +    following line should be removed.
> +   */

[8] Please add a new bug for this problem and annotate the code here 
making it (more) clear that this needs to be fixed with that bug.

Chuck
Thread
bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740) Bug#35079 WL#4538Rafal Somla5 Dec
  • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)Bug#35079 WL#4538Chuck Bell5 Dec
    • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)Bug#35079 WL#4538Rafal Somla8 Dec
      • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)Bug#35079 WL#4538Chuck Bell16 Dec
        • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)Bug#35079 WL#4538Rafal Somla19 Dec
          • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)Bug#35079 WL#4538Chuck Bell19 Dec
    • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)Bug#35079 WL#4538Rafal Somla12 Dec
      • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2740)Bug#35079 WL#4538Rafal Somla12 Dec