Hi Chuck,
Here is the list of things that I think should be tested before you push
your code. For each item, I note in square brackets how well the test you
proposed covers it.
1. BACKUP/RESTORE on master
1a) BACKUP on master
- does not leave any trace in master's binlog, [ok]
- does not affect slave in any way - after BACKUP on master:
-- slave's state does not change, [missing SHOW SLAVE STATUS check]
-- slave's master position does not advance, [ok]
1b) RESTORE on master
- inserts incident event into master's binary log and nothing more, [ok]
- while active, new slaves cannot connect, [not tested]
- causes slave to stop on the incident event, [missing SHOW SLAVE STATUS
(only error shown)]
- slave should contain data up to the point when RESTORE was performed on
master. [not tested]
2. BACKUP/RESTORE on slave
2a) BACKUP on slave
- master's position should be reported in backup_progress log, [the reported
value is not checked]
- slave should operate as usual after BACKUP (normal state), [missing SHOW
SLAVE STATUS check]
2b) RESTORE on slave
- should error if slave is replicating, [ok]
- should succeed if replication is disabled, [ok]
- should not be possible to enable replication while RESTORE is running. [not
tested]
See also more detailed remarks to the test inlined below.
Chuck Bell wrote:
> #At file:///C:/source/bzr/mysql-6.0-wl-4209/
>
> 2720 Chuck Bell 2008-10-29
> WL#4209 : Integrate Backup with Replication
>
> This patch makes changes to the backup system to allow backup to
> be used with replication.
> === added file 'mysql-test/suite/rpl/t/rpl_backup.test'
> --- a/mysql-test/suite/rpl/t/rpl_backup.test 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/t/rpl_backup.test 2008-10-29 19:10:00 +0000
> @@ -0,0 +1,359 @@
> +#
> +# Test backup and replication integration.
> +#
> +
> +--source include/master-slave.inc
> +--source include/not_embedded.inc
> +--source include/have_debug_sync.inc
> +--source include/have_debug.inc
> +
> +connection master;
> +
> +--echo Create some data...
> +CREATE DATABASE rpl_backup;
> +CREATE TABLE rpl_backup.t1 (a int) ENGINE=MEMORY;
> +INSERT INTO rpl_backup.t1 VALUES (1), (2), (3), (4), (5);
> +
> +#
> +# Use Case 1 - Backup performed on a master.
> +# When a backup is performed on a master, the master shall not log
> +# the backup event nor shall the master replicate any data produced
> +# (logged) by the backup.
> +
> +--echo Remove all entries in the backup logs.
> +FLUSH BACKUP LOGS;
> +PURGE BACKUP LOGS;
> +
> +connection slave;
> +
> +--echo Remove all entries in the backup logs.
> +FLUSH BACKUP LOGS;
> +PURGE BACKUP LOGS;
> +
> +--echo Get master's binlog position from the slave before backup.
> +let $slave_before_pos =
> + query_get_value("SHOW SLAVE STATUS", Read_Master_Log_Pos, 1);
> +
> +connection master;
> +
> +--echo Get master's binlog position before backup.
> +let $master_before_pos = query_get_value("SHOW MASTER STATUS", Position, 1);
> +
> +#
> +# Now test read of backupid with known id using debug insertion
> +#
> +SET SESSION debug="+d,set_backup_id";
> +
> +# We are using debug insertion to set the first backup_id to
> +# 500 so we can expect the output of this operation to be 500.
> +--echo Backup_id = 500.
> +BACKUP DATABASE rpl_backup TO 'rpl_bup_m1.bak';
> +
> +SET SESSION debug="-d";
> +
> +--echo Show any events issued as a result of backup.
> +--echo Note: There should be none!
> +--disable_query_log
> +eval SHOW BINLOG EVENTS FROM $master_before_pos;
> +--enable_query_log
> +
> +--echo Verify backup run on master does not advance binlog pos.
> +--echo Get master's binlog position after backup.
> +let $master_after_pos = query_get_value("SHOW MASTER STATUS", Position, 1);
> +
Suggestion: remove this check - seeing that binlog is empty is enough evidence
for the correct behaviour.
> +--echo Compute the difference of the binlog positions.
> +--echo Result should be 0.
Misc: Above messages seem redundant.
> +--disable_query_log
> +--echo Compare the before position of the master's binlog to
> +--echo the after position of the master's binlog. The result
> +--echo should be 0.
> +eval SELECT $master_after_pos - $master_before_pos AS Delta;
> +--enable_query_log
> +
> +#
> +# Now check slave to see if backup logs are affected.
> +# Check slave's master position.
> +# Ensure replication is still working.
> +#
> +sync_slave_with_master;
> +connection slave;
> +
> +--echo Should have count(*) = 0.
> +SELECT count(*) FROM mysql.backup_history;
> +
I think this is overkill: checking that the slave's master position has not
changed is enough, because it implies that the slave has not executed any
replication events.
> +--echo Verify backup run on master does not advance binlog pos.
> +--echo Get master's binlog position on the slave after backup.
> +let $slave_after_pos =
> + query_get_value("SHOW SLAVE STATUS", Read_Master_Log_Pos, 1);
> +
> +--echo Compute the difference of the binlog positions.
> +--echo Result should be 0.
Misc: Above messages seem redundant.
> +--disable_query_log
> +--echo Compare the before position of the master's binlog to
> +--echo the after position of the master's binlog as shown on
> +--echo on the slave. The result should be 0.
Typo: Should read "slave", not "master".
> +eval SELECT $slave_after_pos - $slave_before_pos AS Delta;
> +--enable_query_log
> +
Suggestion: Use SHOW SLAVE STATUS to check the state of replication. The output
contains information such as whether the slave threads are running etc.
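For example, something along these lines (an untested sketch; the variable
names are mine, and I am relying on the Slave_IO_Running and Slave_SQL_Running
columns of SHOW SLAVE STATUS):

```sql
connection slave;
# Read only the two thread-state columns from SHOW SLAVE STATUS.
let $io_running = query_get_value("SHOW SLAVE STATUS", Slave_IO_Running, 1);
let $sql_running = query_get_value("SHOW SLAVE STATUS", Slave_SQL_Running, 1);
--echo Both slave threads should be running (Yes/Yes).
--echo Slave_IO_Running = $io_running
--echo Slave_SQL_Running = $sql_running
```

Reading single columns avoids having to mask the fields of the full output
that vary from run to run.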
> +connection master;
> +
> +--echo Ensure replication is still working...
> +--echo Cleanup from last error on master and slave.
> +DROP TABLE rpl_backup.t1;
> +
> +CREATE TABLE rpl_backup.t1 (a int) ENGINE=MEMORY;
> +
Suggestion: Do not drop the database and the table. Just delete all rows from
the table and fill it with new data. I think this would be more interesting to
test.
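Roughly like this (a sketch only, meant to replace the DROP TABLE / CREATE
TABLE pair above):

```sql
connection master;
--echo Ensure replication is still working...
# Keep the table and replace its contents, so the change reaches the
# slave as ordinary row changes rather than as DDL.
DELETE FROM rpl_backup.t1;
INSERT INTO rpl_backup.t1 VALUES (11), (22), (33);
```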
> +INSERT INTO rpl_backup.t1 VALUES (11), (22), (33);
> +
> +SELECT count(*) FROM rpl_backup.t1;
> +
> +sync_slave_with_master;
> +connection slave;
> +
> +SELECT count(*) FROM rpl_backup.t1;
> +
> +--echo Cleanup backup logs.
> +FLUSH BACKUP LOGS;
> +PURGE BACKUP LOGS;
> +
> +#
> +# Use Case 3 - Backup performed on a slave (part 1 of 2)
> +# Test backup on slave where slave has no slaves.
> +# Also, verify master's binlog information is saved to
> +# the progress log.
> +#
> +
> +#
> +# Now test read of backupid with known id using debug insertion
> +#
> +
> +SET SESSION debug="+d,set_backup_id";
> +
> +# We are using debug insertion to set the first backup_id to
> +# 600 so we can expect the output of this operation to be 600.
> +--echo Backup_id = 600.
> +BACKUP DATABASE rpl_backup TO 'rpl_bup_s1.bak';
> +
> +SET SESSION debug="-d";
> +
> +--echo Check saving of master's binlog information.
> +--echo Should have count(*) = 1.
> +SELECT count(*) FROM mysql.backup_progress
> +WHERE backup_id = 600 AND notes LIKE 'Recording master binlog information%';
> +
Suggestion: make a better check, one which also verifies that the reported
position is correct (i.e. as given by SHOW SLAVE STATUS).
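A possible shape for such a check (untested sketch). I am assuming here that
the note text contains the position as a plain decimal number and that
Exec_Master_Log_Pos is the column to compare against; adjust both to what the
code really records. $master_pos is a name I made up.

```sql
connection slave;
# Read the position before BACKUP; the slave is in sync at this point,
# so the value should not move while the backup runs.
let $master_pos = query_get_value("SHOW SLAVE STATUS", Exec_Master_Log_Pos, 1);

# ... BACKUP DATABASE rpl_backup TO 'rpl_bup_s1.bak' runs here (backup_id = 600) ...

--echo Should have count(*) = 1.
# The position varies between runs, so keep the query text out of the result.
--disable_query_log
eval SELECT count(*) FROM mysql.backup_progress
WHERE backup_id = 600
  AND notes LIKE 'Recording master binlog information%'
  AND notes LIKE '%$master_pos%';
--enable_query_log
```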
> +--echo Should have count(*) = 1.
> +SELECT count(*) FROM mysql.backup_history;
> +
> +connection master;
> +
> +--echo Backup_id = 501.
> +BACKUP DATABASE rpl_backup TO 'rpl_bup_m2.bak';
Question: Why run this backup on master if you are testing backup on slave?
> +
> +--echo Ensure replication is still working...
> +
> +INSERT INTO rpl_backup.t1 VALUES (10), (20), (30);
> +
> +--echo Backup_id = 502.
> +BACKUP DATABASE rpl_backup TO 'rpl_bup_m3.bak';
> +
> +SELECT count(*) FROM rpl_backup.t1;
> +
> +sync_slave_with_master;
> +connection slave;
> +
> +SELECT count(*) FROM rpl_backup.t1;
> +
> +--echo Make a backup for later use.
> +--echo backup_id = 601.
> +BACKUP DATABASE rpl_backup TO 'rpl_bup_s2.bak';
> +
> +#
> +# Use Case 2 - Restore performed on a master.
> +#
> +
> +# To ensure the master does not log anything in the binary log
> +# during a restore, we first drop the database on the slave
> +# then run the restore and after slave is restarted check to
> +# see if database is still missing (it should be).
> +
> +connection slave;
> +
> +DROP DATABASE rpl_backup;
> +
> +connection master;
> +
> +--echo Get master's binlog position before restore.
> +let $master_before_pos = query_get_value("SHOW MASTER STATUS", Position, 1);
> +
> +--echo backup_id = 503.
> +RESTORE FROM 'rpl_bup_m3.bak';
> +
> +--echo Show the incident event issued as a result of restore.
> +--replace_column 2 # 5 #
> +--disable_query_log
> +eval SHOW BINLOG EVENTS FROM $master_before_pos;
> +--enable_query_log
> +
> +--echo Showing databases on master.
> +SHOW DATABASES LIKE 'rpl_backup%';
> +
> +SELECT count(*) FROM rpl_backup.t1;
> +
> +source include/wait_for_slave_sql_to_stop.inc;
Question: Is it waiting for slave to stop as the result of the incident event?
Please add a comment if I'm guessing right.
> +
> +connection slave;
> +
Suggestion: SHOW SLAVE STATUS would be nice to see the situation after the
incident event.
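Something like the following would do (untested sketch; the variable names are
mine, and the values are printed rather than asserted, since I have not
checked which error number the incident event produces):

```sql
connection slave;
# Read the thread states and the error number left by the incident event.
let $io_running = query_get_value("SHOW SLAVE STATUS", Slave_IO_Running, 1);
let $sql_running = query_get_value("SHOW SLAVE STATUS", Slave_SQL_Running, 1);
let $last_errno = query_get_value("SHOW SLAVE STATUS", Last_SQL_Errno, 1);
--echo SQL thread should be stopped (No) with an error number set.
--echo Slave_IO_Running = $io_running
--echo Slave_SQL_Running = $sql_running
--echo Last_SQL_Errno = $last_errno
```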
> +STOP SLAVE;
> +--source include/wait_for_slave_to_stop.inc
> +
Suggestion: move STOP SLAVE to just before START SLAVE below. This way, you are
sure that the error you check below is generated by the incident event, not by
STOP SLAVE. Btw, I'm not sure if this STOP SLAVE is needed at all - perhaps
START SLAVE would work without it.
> +--echo Show the slave stopped with an error.
> +LET $last_error = query_get_value("SHOW SLAVE STATUS", Last_SQL_Error, 1);
> +disable_query_log;
> +eval SELECT "$last_error" AS Last_SQL_Error;
> +enable_query_log;
> +
> +SET global sql_slave_skip_counter=1;
> +
Request: Check what data slave contains after it has stopped.
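A minimal version of that check (sketch only). It assumes the DROP DATABASE on
the slave earlier in this use case is removed, so that the slave still holds
the data replicated before RESTORE ran on master:

```sql
connection slave;
--echo Data on slave after it stopped on the incident event.
# Should match what master contained just before RESTORE was issued.
SELECT * FROM rpl_backup.t1 ORDER BY a;
```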
> +START SLAVE;
> +--source include/wait_for_slave_to_start.inc
> +
Why are you doing this? It looks like testing PTR, which is outside the scope
of this test. Once we have verified that slave has stopped because of the
incident event, we have verified that replication behaves as expected. Sure,
you can do plenty of things after that, but what is the point of trying it here?
> +# Sync with master to ensure nothing is replicated after incident event.
> +sync_with_master;
> +
> +--echo Showing databases on slave.
> +SHOW DATABASES LIKE 'rpl_backup%';
> +
> +# Now stop the slave, do the restore, then restart.
> +STOP SLAVE;
> +--source include/wait_for_slave_to_stop.inc
> +
> +--echo backup_id = 602.
> +RESTORE FROM 'rpl_bup_s2.bak';
Q: We restore on slave from a different image than on master. Is it intentional?
> +
> +--echo Showing databases on slave.
> +SHOW DATABASES LIKE 'rpl_backup%';
> +
> +SELECT count(*) FROM rpl_backup.t1;
> +
> +START SLAVE;
> +--source include/wait_for_slave_to_start.inc
> +
> +--echo Make a backup for later use.
> +--echo backup_id = 603.
> +BACKUP DATABASE rpl_backup TO 'rpl_bup_s3.bak';
> +
> +#
> +# Use Case 4 - Restore performed on a slave.
> +#
> +
> +connection slave;
> +
> +--echo Test restore on slave while replication turned on.
> +
> +--error ER_RESTORE_ON_SLAVE
> +RESTORE FROM 'rpl_bup_s1.bak';
> +
> +--echo Stop slave and restart after restore.
> +STOP SLAVE;
> +
> +--replace_column 1 #
> +RESTORE FROM 'rpl_bup_s3.bak';
> +
> +START SLAVE;
> +--source include/wait_for_slave_to_start.inc
> +
> +connection master;
> +
> +--echo Checking affect on replication.
> +INSERT INTO rpl_backup.t1 VALUES (44), (55), (66);
> +SELECT * FROM rpl_backup.t1 ORDER BY a;
> +
> +sync_slave_with_master;
> +connection slave;
> +SELECT * FROM rpl_backup.t1 ORDER BY a;
> +
Another scenario:
Start replication while RESTORE is in progress.
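A rough outline of how this could be done with debug sync (untested). The sync
point name 'restore_in_progress' and the expected error are placeholders of
mine, not things I know to exist in the code; I am also assuming the second
slave connection (slave1) set up by master-slave.inc is available.

```sql
connection slave;
STOP SLAVE;
--source include/wait_for_slave_to_stop.inc

# Placeholder sync point: pause RESTORE somewhere in the middle.
SET DEBUG_SYNC= 'restore_in_progress SIGNAL restoring WAIT_FOR finish';
send RESTORE FROM 'rpl_bup_s3.bak';

connection slave1;
# Wait until RESTORE has reached the sync point, then try to start replication.
SET DEBUG_SYNC= 'now WAIT_FOR restoring';
# Placeholder error: use whatever error START SLAVE is supposed to give here.
--error ER_RESTORE_ON_SLAVE
START SLAVE;
# Let RESTORE continue.
SET DEBUG_SYNC= 'now SIGNAL finish';

connection slave;
# Collect the result of RESTORE; column 1 is the backup_id.
--replace_column 1 #
reap;
```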
> +#
> +# Use Case 3 - Backup performed on a slave (part 2 of 2)
> +# Test backup on slave with another slave attached.
> +#
> +# Note: To be added as part of WL#4612
> +
> +#
> +# Use Case 5 - Backup run with no binary log.
> +#
> +
Suggestion: skip this case - it has nothing to do with BACKUP+replication
testing and with the code which you wrote.
> +--echo Stop replication and turn off binary log.
> +connection slave;
> +
> +STOP SLAVE;
> +--source include/wait_for_slave_to_stop.inc
> +
> +connection master;
> +
> +SET @orig_sql_log_bin= @@sql_log_bin;
> +
> +--echo Turn off binlog.
> +SET @@sql_log_bin= 0;
> +SHOW VARIABLES LIKE '%log_bin';
> +
> +--echo backup_id = 504.
> +BACKUP DATABASE rpl_backup TO 'rpl_bup_m4.bak';
> +
> +--echo Turn on binlog;
> +SET @@sql_log_bin= @orig_sql_log_bin;
> +SHOW VARIABLES LIKE '%log_bin';
> +
> +#
> +# Use Case 6 - Restore run with binary log turned on but no slaves attached.
> +#
> +
Suggestion: Skip this case - this behaviour has already been tested in Case 2.
> +RESET MASTER;
> +
> +--echo Get master's binlog position before restore.
> +let $master_before_pos = query_get_value("SHOW MASTER STATUS", Position, 1);
> +
> +--echo backup_id = 505.
> +RESTORE FROM 'rpl_bup_m4.bak';
> +
> +--echo Get master's binlog position after restore.
> +let $master_after_pos = query_get_value("SHOW MASTER STATUS", Position, 1);
> +
> +--echo Show the incident event issued as a result of restore.
> +--replace_column 2 # 5 #
> +--disable_query_log
> +eval SHOW BINLOG EVENTS FROM $master_before_pos;
> +--enable_query_log
> +
> +--echo Compute the difference of the binlog positions.
> +--echo Result should be 0.
> +--disable_query_log
> +--echo Compare the before position of the master's binlog to
> +--echo the after position of the master's binlog. The result
> +--echo should be 0.
> +eval SELECT $master_after_pos - $master_before_pos AS Delta;
> +--enable_query_log
> +
(cut)
Rafal