Chuck Bell wrote:
> Oystein,
>
> Great job on the patch! I have a few comments and some requests.
Chuck,
Thanks for the review. Some responses below.
>> +--echo First Backup
>> +connect (backup,localhost,root,,);
>> +USE backup_concurrent;
>> +send BACKUP DATABASE backup_concurrent TO 'backup1';
>> +
>> +--echo Starting second backup in another connection.
>> +--echo (Should fail because another backup is running.)
>> +connection default;
>> +--error ER_BACKUP_RUNNING
>> +BACKUP DATABASE backup_concurrent TO 'backup2';
>
> This test creates a potentially non-deterministic condition. What happens if
> the backup in the send completes before this one?
I discovered this yesterday, and have now rewritten the test to be based
on the sync mechanism.
>
> Request: We need to add a breakpoint here to stop the first backup in the
> middle then attempt the next one, get the error, then release the first
> backup. This will ensure that even our most horribly unpredictable PB
> machine (usually valgrind) will not run so slow that the test fails because
> the second backup succeeds.
>
> I would use the "before_backup_data" breakpoint -- see kernel.cc @916. This
> will ensure backup is in the middle (after metadata and before data). That
> should be safe. The following is a suggestion for how to set this up (see
> other tests that use breakpoints for ideas).
I used the after_backup_start_backup breakpoint. That provoked a
segfault withot the fix. Any reason I should change to before_backup_data?
>
> <at top of test>
> SET DEBUG_SYNC= 'RESET';
>
> ...
>
> SET DEBUG_SYNC= 'before_backup_data WAIT_FOR finish';
> --echo backup: Send the backup command.
> send BACKUP DATABASE ...
>
> <after second backup has failed>
>
> --echo breakpoints: Sending finish signal to wake BACKUP.
> SET DEBUG_SYNC= 'now SIGNAL finish';
>
> Request: For completeness we should test the following:
>
> 2 backups (done)
> backup in progress then restore
> restore in progress then backup
> 2 restores
I will do this. Should be pretty straight-forward.
>
> For the devil's sake, what happens if a third backup/restore attempts to
> run? That is, a backup is running, another tries and fails, then another...?
I could change the test to run the third backup without waiting for the
first one.
>
>
> ...
>
>> - /** Indicates if a backup/restore operation is in progress. */
>> - static bool is_running;
>> - static pthread_mutex_t run_lock; ///< To guard @c is_running flag.
>> + /** Indicates if and which backup/restore operation is in
>> progress. */
>> + static Backup_restore_ctx *current;
>
> I am not thrilled with this variable name change. What was wrong with
> is_running? That made more contextual sense. Please let me know your
> thoughts on this.
Note that it is not just a name change. I have changed the type, too.
I feel that is_running is a good name for a boolean, but for a pointer.
It should describe what it points to. I agree that current is a bit
generic. My first idea was current_backup, but then I realized it could
be a restore to. Maybe current_op or running_op would work? Any
suggestions?
>
> ...
>
>> - if (!is_running)
>> - is_running= TRUE;
>> + if (!current)
>> + current= this;
>> else
>> fatal_error(ER_BACKUP_RUNNING);
>
> Wouldn't the original if(!is_running) make more logical sense? If you
> disagree, please let me know why. Perhaps this is a documentation thing, but
> I like is_running better. It's just my opinion, but we should refrain from
> changing variable names unless we have a good (documented) reason.
>
> Chuck
>
--
Øystein