List:Commits« Previous MessageNext Message »
From:Chuck Bell Date:December 1 2008 9:22pm
Subject:RE: bzr commit into mysql-6.0-backup branch (jorgen.loland:2737)
Bug#40970
View as plain text  
Jorgen,

Sorry to be long winded (as it were), but I am documenting a problem I see
in our backup code. Your patch is just a victim I'm afraid. No harm
intended! :)

STATUS
------
Changes requested

REQUESTS
--------
* Please add a test for the new error message.

Note: If problem identified below cannot (or should not) be fixed in this
patch, open a new bug report for the problem and include commentary like
what is described below.

COMMENTARY
----------
The deficiencies this commentary illuminates isn't ranting about the quality
of this particular patch, rather it is describing a failure in the backup
code to sufficiently test all error conditions. I think this has been
brought on by collective bargaining of what can and cannot be tested
resulting in the introduction of and suppression of unintentional defects.

While it is unlikely that a proper test could be constructed that forces the
server into a state to generate the accused error, I think it is still
useful if we at least make the attempt to initiate the error via debug
insertion.

It is true and I will not debate the futility of thinking debug insertion is
a testing method that satisfies even the lowliest of quality assurance
standards, however it is useful in testing what the rest of the code does
should this error occur. That is, I am concerned when we create a new error
condition and don't test it -- either in coding the solution or via an
automated test -- that we create an error condition that is in itself a
source of bugs or worse system failures. Failing to handle errors properly
is a large part of what is wrong with most software.

That being said, I have a suggestion for how to test this (see below).

Once you implement my suggestion (or some variant thereof), you will see
exactly what I feared -- the code does not handle the error properly
resulting in this error instead:

backup.backup_errors           [ fail ]

mysqltest: At line 384: query 'DROP DATABASE db1' failed: 1223: Can't
execute the query because you have a conflicting read lock

What this suggests to me in thinking about it is that it is not sufficient
to report the error as this patch does if block_commits() fails. It is more
likely prudent that the code needs to know how to clean up depending on
which error occurred. Perhaps this is a commit blocker bug...regardless, a
simple test like this has revealed an unforeseen problem.

SUGGESTIONS
-----------
Use debug insertion to do a partial test of the new error condition.

=== modified file 'mysql-test/suite/backup/t/backup_errors.test'
--- mysql-test/suite/backup/t/backup_errors.test        2008-11-25 17:44:19
+000
0
+++ mysql-test/suite/backup/t/backup_errors.test        2008-12-01 21:06:51
+000
0
@@ -367,4 +367,18 @@ RESTORE FROM 'overwrite.bak';
 --echo Show that inserted value 2 is not there
 SELECT * FROM table1;

+--echo Test ER_BACKUP_BLOCK_COMMITS error handling by backup code.
+
+SET SESSION DEBUG='+d,global_read_lock_block_fail';
+--replace_column 1 #
+--error ER_BACKUP_BLOCK_COMMITS
+BACKUP DATABASE db1 TO 'overwrite1.bak';
+SET SESSION DEBUG='-d';
+
+SET SESSION DEBUG='+d,global_read_lock_fail';
+--replace_column 1 #
+--error ER_BACKUP_BLOCK_COMMITS
+BACKUP DATABASE db1 TO 'overwrite1.bak';
+SET SESSION DEBUG='-d';
+
 DROP DATABASE db1;

=== modified file 'sql/backup/data_backup.cc'
--- sql/backup/data_backup.cc   2008-11-26 10:05:19 +0000
+++ sql/backup/data_backup.cc   2008-12-01 21:01:24 +0000
@@ -361,6 +361,12 @@ int block_commits(THD *thd, TABLE_LIST *
   DBUG_ENTER("block_commits()");

   /*
+    Simulate an error condition for block commit failure
+    of the lock_global_read_lock call.
+  */
+  DBUG_EXECUTE_IF("global_read_lock_fail", DBUG_RETURN(1););
+
+  /*
     Step 1 - global read lock.
   */
   DEBUG_SYNC(thd, "before_commit_block");
@@ -383,6 +389,12 @@ int block_commits(THD *thd, TABLE_LIST *
   */

   /*
+    Simulate an error condition for block commit failure
+    of the lock_global_read_lock call.
+  */
+  DBUG_EXECUTE_IF("global_read_lock_block_fail", DBUG_RETURN(1););
+
+  /*
     Step 3 - make the global read lock to block commits.
   */
   if (make_global_read_lock_block_commit(thd))

Note: Simply returning an error here of '1' is not sufficient to know how to
properly cleanup. Whether this code handles it or the caller handles it --
it's still missing and its omission introduces unexpected results.

DETAILS
------- 
>  2737 Jorgen Loland	2008-12-01
>       Bug#40970 - Errors from commit blocker are not logged.
>       
>       If block_commits fails, an error is now logged. 
> unblock_commits cannot fail.
> modified:
>   sql/backup/data_backup.cc
>   sql/share/errmsg.txt
> 
> per-file messages:
>   sql/backup/data_backup.cc
>     Log error if block_commits fails.
>   sql/share/errmsg.txt
>     Added error message for when backup fails to block 

...

Chuck

Thread
bzr commit into mysql-6.0-backup branch (jorgen.loland:2737) Bug#40970Jorgen Loland1 Dec
  • Re: bzr commit into mysql-6.0-backup branch (jorgen.loland:2737)Bug#40970Rafal Somla1 Dec
    • Re: bzr commit into mysql-6.0-backup branch (jorgen.loland:2737)Bug#40970Jørgen Løland1 Dec
  • RE: bzr commit into mysql-6.0-backup branch (jorgen.loland:2737)Bug#40970Chuck Bell1 Dec
    • Re: bzr commit into mysql-6.0-backup branch (jorgen.loland:2737)Bug#40970Jørgen Løland3 Dec
  • Re: bzr commit into mysql-6.0-backup branch (jorgen.loland:2737)Bug#40970Øystein Grøvlen5 Dec