From: Date: September 12 2006 11:06pm Subject: bk commit into 5.0 tree (malff:1.2263) BUG#11733 List-Archive: http://lists.mysql.com/commits/11799 X-Bug: 11733 Message-Id: <20060912210634.0FBEA8E09E1@weblab.mysql.com> Below is the list of changes that have just been committed into a local 5.0 repository of marcsql. When marcsql does a push these changes will be propagated to the main repository and, within 24 hours after the push, to the public repository. For information on how to access the public repository see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html ChangeSet@stripped, 2006-09-12 14:06:28-07:00, malff@weblab.(none) +11 -0 BUG#11733 (COMMITs should not happen if read-only is set) Before this patch, 1) the COMMIT statement could write files even if the server is in --read-only mode, as reported. 2) the server could reply to the "set global read_only=1;" query and *still* perform writes to the file system for a short (?) time, as found during investigation. 3) the check to prevent statement execution in read-only mode was only a pre-check, and failed to detect cases when the server status changed *during* the statement execution. This is unsafe, and was found by code review. Depending of the timing and the duration of the statement (say, a full update on a very big table), this can be a real issue. With this patch, 1) is fixed by having the ha_commit_trans() code for the 2 phase commit pay attention to the --read-only server status. 2) is fixed by enforcing thread safety by protecting the *global* --read-only server status with a RW lock 3) is fixed by 1). The pre-check code still exists to prevent up front most of the use cases, which as value since it notifies the client quickly. The code has been better commented. Fixing 1, 2 and 3 is important to guarantee that once the server has advertised itself as --read-only (by replying to the set global read_only=1 query), files are actually safe to read when performing a file backup. Finally, a new replication test has been added to verify that: - the global variable read_only itself is not replicated, - setting read_only on the master replicates properly, - setting read_only on a slave cause the slave to fail, as expected. mysql-test/r/read_only.result@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +15 -0 Added test for Bug#11733 mysql-test/r/rpl_read_only.result@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +93 -0 New test case mysql-test/r/rpl_read_only.result@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +0 -0 mysql-test/t/read_only.test@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +34 -0 Added test for Bug#11733 mysql-test/t/rpl_read_only-slave.opt@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +1 -0 New test case mysql-test/t/rpl_read_only-slave.opt@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +0 -0 mysql-test/t/rpl_read_only.test@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +65 -0 New test case mysql-test/t/rpl_read_only.test@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +0 -0 sql/handler.cc@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +57 -1 ha_commit_trans should fail when the server is --read-only sql/mysql_priv.h@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +3 -1 LOCK_opt_readonly sql/mysqld.cc@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +3 -0 LOCK_opt_readonly sql/set_var.cc@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +12 -1 class sys_var_opt_readonly sql/set_var.h@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +10 -0 class sys_var_opt_readonly sql/sql_parse.cc@stripped, 2006-09-12 14:04:41-07:00, malff@weblab.(none) +8 -0 Comments only # This is a BitKeeper patch. What follows are the unified diffs for the # set of deltas contained in the patch. The rest of the patch, the part # that BitKeeper cares about, is below these diffs. # User: malff # Host: weblab.(none) # Root: /home/marcsql/TREE/mysql-5.0-11733 --- 1.217/sql/handler.cc 2006-09-12 14:06:33 -07:00 +++ 1.218/sql/handler.cc 2006-09-12 14:06:33 -07:00 @@ -688,11 +688,60 @@ int ha_commit_trans(THD *thd, bool all) #ifdef USING_TRANSACTIONS if (trans->nht) { + /* + Design choice: + The order of locking for : + - RW LOCK_opt_readonly + - Mutex LOCK_global_read_lock + impacts how the server behaves when a system admin + sets the server to read-only. + + 1) LOCK_opt_readonly followed by LOCK_global_read_lock: + - all the pending transactions in the rw_rdlock queue are + committed first, then the server change to a --read-only mode. + - the query that set the server to read-only may not return + immediately, as it's waiting for LOCK_opt_readonly to clear. + This will take a finite but unpredictable time under heavy load. + + 2) LOCK_global_read_lock followed by LOCK_opt_readonly: + - only 1 transaction is is the LOCK_opt_readonly queue, + all the others are serialized by LOCK_global_read_lock + - only the first transaction will commit, all the following ones + will fail, + - the query that sets the server to read-only returns after + the first commit is performed. + This will take a finite and more predictable time. + + The design choice is to implement 1), as this offers a better + quality of service (transactions succeed) compared to 2) (where + many transactions will fail). + + In both cases, by the time the "set global read_only=1;" returns, + the server is *garanteed* to have no pending commit still writing + to files, which is a good thing when these files are about to be + saved by an external backup tool. + See class sys_var_opt_readonly. + */ + + if (is_real_trans) + { + rw_rdlock(&LOCK_opt_readonly); + if (opt_readonly) + { + my_error(ER_OPTION_PREVENTS_STATEMENT, MYF(0), "--read-only"); + ha_rollback_trans(thd, all); + error= 1; + goto end_readonly; + } + } + if (is_real_trans && wait_if_global_read_lock(thd, 0, 0)) { ha_rollback_trans(thd, all); - DBUG_RETURN(1); + error= 1; + goto end; } + DBUG_EXECUTE_IF("crash_commit_before", abort();); /* Close all cursors that can not survive COMMIT */ @@ -728,7 +777,14 @@ int ha_commit_trans(THD *thd, bool all) DBUG_EXECUTE_IF("crash_commit_after", abort();); end: if (is_real_trans) + { start_waiting_global_read_lock(thd); + } +end_readonly: + if (is_real_trans) + { + rw_unlock(&LOCK_opt_readonly); + } } #endif /* USING_TRANSACTIONS */ DBUG_RETURN(error); --- 1.408/sql/mysql_priv.h 2006-09-12 14:06:33 -07:00 +++ 1.409/sql/mysql_priv.h 2006-09-12 14:06:33 -07:00 @@ -1231,7 +1231,8 @@ extern uint volatile thread_count, threa extern my_bool opt_sql_bin_update, opt_safe_user_create, opt_no_mix_types; extern my_bool opt_safe_show_db, opt_local_infile; extern my_bool opt_slave_compressed_protocol, use_temp_pool; -extern my_bool opt_readonly, lower_case_file_system; +extern my_bool opt_readonly; /* protected by LOCK_opt_readonly */ +extern my_bool lower_case_file_system; extern my_bool opt_enable_named_pipe, opt_sync_frm, opt_allow_suspicious_udfs; extern my_bool opt_secure_auth; extern my_bool opt_log_slow_admin_statements; @@ -1261,6 +1262,7 @@ extern pthread_mutex_t LOCK_mysql_create extern pthread_mutex_t LOCK_des_key_file; #endif extern rw_lock_t LOCK_grant, LOCK_sys_init_connect, LOCK_sys_init_slave; +extern rw_lock_t LOCK_opt_readonly; /* protects opt_readonly */ extern pthread_cond_t COND_refresh, COND_thread_count, COND_manager; extern pthread_cond_t COND_global_read_lock; extern pthread_attr_t connection_attrib; --- 1.570/sql/mysqld.cc 2006-09-12 14:06:33 -07:00 +++ 1.571/sql/mysqld.cc 2006-09-12 14:06:33 -07:00 @@ -508,6 +508,7 @@ pthread_mutex_t LOCK_prepared_stmt_count pthread_mutex_t LOCK_des_key_file; #endif rw_lock_t LOCK_grant, LOCK_sys_init_connect, LOCK_sys_init_slave; +rw_lock_t LOCK_opt_readonly; pthread_cond_t COND_refresh,COND_thread_count, COND_global_read_lock; pthread_t signal_thread; pthread_attr_t connection_attrib; @@ -1212,6 +1213,7 @@ static void clean_up_mutexes() (void) pthread_mutex_destroy(&LOCK_mysql_create_db); (void) pthread_mutex_destroy(&LOCK_Acl); (void) rwlock_destroy(&LOCK_grant); + (void) rwlock_destroy(&LOCK_opt_readonly); (void) pthread_mutex_destroy(&LOCK_open); (void) pthread_mutex_destroy(&LOCK_thread_count); (void) pthread_mutex_destroy(&LOCK_mapped_file); @@ -2821,6 +2823,7 @@ static int init_thread_environment() (void) my_rwlock_init(&LOCK_sys_init_connect, NULL); (void) my_rwlock_init(&LOCK_sys_init_slave, NULL); (void) my_rwlock_init(&LOCK_grant, NULL); + (void) my_rwlock_init(&LOCK_opt_readonly, NULL); (void) pthread_cond_init(&COND_thread_count,NULL); (void) pthread_cond_init(&COND_refresh,NULL); (void) pthread_cond_init(&COND_global_read_lock,NULL); --- 1.569/sql/sql_parse.cc 2006-09-12 14:06:33 -07:00 +++ 1.570/sql/sql_parse.cc 2006-09-12 14:06:33 -07:00 @@ -2490,6 +2490,9 @@ mysql_execute_command(THD *thd) /* When option readonly is set deny operations which change non-temporary tables. Except for the replication thread and the 'super' users. + This is a dirty read, not using LOCK_opt_readonly. + This check is unsecure and just used to filter most queries out, + it's ok since the real secure check is in handler.cc */ if (opt_readonly && !(thd->security_ctx->master_access & SUPER_ACL) && @@ -3319,6 +3322,11 @@ end_with_restore_list: #endif /* HAVE_REPLICATION */ if (res) break; + /* + This is a dirty read, not using LOCK_opt_readonly. + This check is unsecure and just used to filter most queries out, + it's ok since the real secure check is in handler.cc + */ if (opt_readonly && !(thd->security_ctx->master_access & SUPER_ACL) && some_non_temp_table_to_be_updated(thd, all_tables)) --- 1.3/mysql-test/r/read_only.result 2006-09-12 14:06:33 -07:00 +++ 1.4/mysql-test/r/read_only.result 2006-09-12 14:06:33 -07:00 @@ -39,6 +39,21 @@ delete t1 from t1,t3 where t1.a=t3.a; drop table t1; insert into t1 values(1); ERROR HY000: The MySQL server is running with the --read-only option so it cannot execute this statement +DROP TABLE IF EXISTS table_11733 ; +set global read_only=0; +create table table_11733 (a int) engine=InnoDb; +BEGIN; +insert into table_11733 values(11733); +set global read_only=1; +select @@global.read_only; +@@global.read_only +1 +select * from table_11733 ; +a +11733 +COMMIT; +ERROR HY000: The MySQL server is running with the --read-only option so it cannot execute this statement drop table t1,t2; +drop table table_11733 ; drop user test@localhost; set global read_only=0; --- 1.3/mysql-test/t/read_only.test 2006-09-12 14:06:33 -07:00 +++ 1.4/mysql-test/t/read_only.test 2006-09-12 14:06:33 -07:00 @@ -101,8 +101,42 @@ drop table t1; --error 1290 insert into t1 values(1); +# +# BUG#11733: COMMITs should not happen if read-only is set +# + +connection default; + +--disable_warnings +DROP TABLE IF EXISTS table_11733 ; +--enable_warnings + +set global read_only=0; + +## Any transactional engine will do +create table table_11733 (a int) engine=InnoDb; + +connection con1; + +BEGIN; + +insert into table_11733 values(11733); + +connection default; + +set global read_only=1; + +connection con1; + +select @@global.read_only; +select * from table_11733 ; + +-- error ER_OPTION_PREVENTS_STATEMENT +COMMIT; + connection default; drop table t1,t2; +drop table table_11733 ; drop user test@localhost; set global read_only=0; --- New file --- +++ mysql-test/r/rpl_read_only.result 06/09/12 14:04:41 stop slave; drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9; reset master; reset slave; drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9; start slave; create table t1(a int) engine=InnoDB; insert into t1 values(1001); set global read_only=1; select @@read_only; @@read_only 1 select * from t1; a 1001 select @@read_only; @@read_only 0 select * from t1; a 1001 set global read_only=0; BEGIN; insert into t1 values(1002); set global read_only=1; COMMIT; ERROR HY000: The MySQL server is running with the --read-only option so it cannot execute this statement set global read_only=0; insert into t1 values(1003); select * from t1; a 1001 1003 select * from t1; a 1001 1003 set global read_only=1; select @@read_only; @@read_only 1 show create table t1; Table Create Table t1 CREATE TABLE `t1` ( `a` int(11) default NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 insert into t1 values(1004); select * from t1; a 1001 1003 1004 select * from t1; a 1001 1003 SHOW SLAVE STATUS; Slave_IO_State Waiting for master to send event Master_Host 127.0.0.1 Master_User root Master_Port 9306 Connect_Retry 1 Master_Log_File master-bin.000001 Read_Master_Log_Pos 548 Relay_Log_File slave-relay-bin.000003 Relay_Log_Pos 569 Relay_Master_Log_File master-bin.000001 Slave_IO_Running Yes Slave_SQL_Running No Replicate_Do_DB Replicate_Ignore_DB Replicate_Do_Table Replicate_Ignore_Table Replicate_Wild_Do_Table Replicate_Wild_Ignore_Table Last_Errno 1290 Last_Error Error 'The MySQL server is running with the --read-only option so it cannot execute this statement' on query. Default database: 'test'. Query: 'insert into t1 values(1004)' Skip_Counter 0 Exec_Master_Log_Pos 431 Relay_Log_Space 686 Until_Condition None Until_Log_File Until_Log_Pos 0 Master_SSL_Allowed No Master_SSL_CA_File Master_SSL_CA_Path Master_SSL_Cert Master_SSL_Cipher Master_SSL_Key Seconds_Behind_Master NULL drop table t1; stop slave; drop table t1; --- New file --- +++ mysql-test/t/rpl_read_only-slave.opt 06/09/12 14:04:41 --innodb --- New file --- +++ mysql-test/t/rpl_read_only.test 06/09/12 14:04:41 # Test case for BUG #11733 -- source include/master-slave.inc -- source include/have_innodb.inc # Seting the master readonly : # - the variable @@readonly is not replicated on the slave connection master; create table t1(a int) engine=InnoDB; insert into t1 values(1001); set global read_only=1; select @@read_only; select * from t1; sync_slave_with_master; select @@read_only; select * from t1; # - only commited statements are replicated connection master; set global read_only=0; BEGIN; insert into t1 values(1002); set global read_only=1; -- error ER_OPTION_PREVENTS_STATEMENT COMMIT; set global read_only=0; insert into t1 values(1003); select * from t1; sync_slave_with_master; select * from t1; # Seting the slave readonly : replication will fail # connection slave; set global read_only=1; select @@read_only; # Make sure the replicated table is also transactional show create table t1; connection master; insert into t1 values(1004); select * from t1; connection slave; wait_for_slave_to_stop; select * from t1; -- query_vertical SHOW SLAVE STATUS ## Cleanup connection master; drop table t1; connection slave; wait_for_slave_to_stop; stop slave; drop table t1; --- 1.163/sql/set_var.cc 2006-09-12 14:06:33 -07:00 +++ 1.164/sql/set_var.cc 2006-09-12 14:06:33 -07:00 @@ -315,7 +315,7 @@ sys_var_thd_ulong sys_preload_buff &SV::preload_buff_size); sys_var_thd_ulong sys_read_buff_size("read_buffer_size", &SV::read_buff_size); -sys_var_bool_ptr sys_readonly("read_only", &opt_readonly); +sys_var_opt_readonly sys_readonly("read_only", &opt_readonly); sys_var_thd_ulong sys_read_rnd_buff_size("read_rnd_buffer_size", &SV::read_rnd_buff_size); sys_var_thd_ulong sys_div_precincrement("div_precision_increment", @@ -3669,6 +3669,17 @@ bool sys_var_trust_routine_creators::upd { warn_deprecated(thd); return sys_var_bool_ptr::update(thd, var); +} + +bool sys_var_opt_readonly::update(THD *thd, set_var *var) +{ + /* + Make sure all the current transactions in other threads + are commited before changing the global server status. + */ + rw_wrlock(&LOCK_opt_readonly); + sys_var_bool_ptr::update(thd, var); + rw_unlock(&LOCK_opt_readonly); } /**************************************************************************** --- 1.78/sql/set_var.h 2006-09-12 14:06:33 -07:00 +++ 1.79/sql/set_var.h 2006-09-12 14:06:33 -07:00 @@ -815,6 +815,16 @@ public: }; +class sys_var_opt_readonly :public sys_var_bool_ptr +{ +public: + sys_var_opt_readonly(const char *name_arg, my_bool *value_arg) : + sys_var_bool_ptr(name_arg, value_arg) {}; + ~sys_var_opt_readonly() {}; + bool update(THD *thd, set_var *var); +}; + + class sys_var_thd_lc_time_names :public sys_var_thd { public: