Hi Sven,
Patch looks OK, you just need to change a little
thing in the test case. You can push once you fix
that.
On 05/23/2011 05:28 PM, Sven Sandberg wrote:
> #At file:///home/sven/bzr/b12574820-reset_logs_lock_order/5.1/ based on
> revid:vasil.dimov@stripped
>
> 3625 Sven Sandberg 2011-05-23
> BUG#12574820: binlog.binlog_tmp_table timing out in daily and weekly trunk
> run
> Problem: MYSQL_BIN_LOG::reset_logs acquires mutexes in wrong order.
> The correct order is first LOCK_thread_count and then LOCK_log. This function
> does it the other way around. This leads to deadlock when run in parallel
> with a thread that takes the two locks in correct order. For example, a
> thread
> that disconnects will take the locks in the correct order.
> Fix: change order of the locks in MYSQL_BIN_LOG::reset_logs:
> first LOCK_thread_count and then LOCK_log.
> @ mysql-test/suite/binlog/r/binlog_reset_master.result
> added result file
> @ mysql-test/suite/binlog/t/binlog_reset_master.test
> Added test case that demonstrates deadlock because of wrong mutex order.
> The deadlock is between two threads:
> - RESET MASTER acquires mutexes in wrong order.
> - client thread shutdown code acquires mutexes in right order.
> Actually, this test case does not produce deadlock in 5.1, probably
> the client thread shutdown code does not hold both mutexes at the same
> time. However, the bug existed in 5.1 (mutexes are taken in the wrong
> order) so we push the test case to 5.1 too, to prevent future
> regressions.
> @ sql/log.cc
> Change mutex acquisition to the correct order:
> first LOCK_thread_count, then LOCK_log.
> @ sql/mysqld.cc
> Add debug code to synchronize test case.
>
> added:
> mysql-test/suite/binlog/r/binlog_reset_master.result
> mysql-test/suite/binlog/t/binlog_reset_master.test
> modified:
> sql/log.cc
> sql/mysqld.cc
> === added file 'mysql-test/suite/binlog/r/binlog_reset_master.result'
> --- a/mysql-test/suite/binlog/r/binlog_reset_master.result 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/binlog/r/binlog_reset_master.result 2011-05-23 16:28:17 +0000
> @@ -0,0 +1 @@
> +RESET MASTER;
>
> === added file 'mysql-test/suite/binlog/t/binlog_reset_master.test'
> --- a/mysql-test/suite/binlog/t/binlog_reset_master.test 1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/binlog/t/binlog_reset_master.test 2011-05-23 16:28:17 +0000
> @@ -0,0 +1,26 @@
> +# ==== Purpose ====
> +#
> +# Test bugs in RESET MASTER.
> +
> +--source include/have_debug.inc
> +--source include/have_log_bin.inc
> +
> +#######################################################################
> +# BUG#12574820: binlog.binlog_tmp_table timing out in daily and weekly trunk run
> +# Problem: MYSQL_BIN_LOG::reset_logs acquired LOCK_thread_count and
> +# LOCK_log in the wrong order. This could cause a deadlock when
> +# RESET MASTER was run concurrently with a disconnecting thread.
> +#######################################################################
> +
> +# We use sleep, not debug_sync, because the sync point needs to be in
> +# the thread shut down code after the debug sync facility has been
> +# shut down.
> +--let $write_var= SET
> DEBUG="+d,sven,sleep_after_lock_thread_count_before_delete_thd"; CREATE TEMPORARY TABLE
> test.t1 (a INT);
Please, remove the ',sven,' part. ^
Regards,
Luís
> +--let $write_to_file= GENERATE
> +--disable_query_log
> +--source include/write_var_to_file.inc
> +--enable_query_log
> +
> +--exec $MYSQL< $write_to_file
> +RESET MASTER;
> +--remove_file $write_to_file
>
> === modified file 'sql/log.cc'
> --- a/sql/log.cc 2011-02-22 21:03:32 +0000
> +++ b/sql/log.cc 2011-05-23 16:28:17 +0000
> @@ -2989,12 +2989,6 @@ bool MYSQL_BIN_LOG::reset_logs(THD* thd)
> DBUG_ENTER("reset_logs");
>
> ha_reset_logs(thd);
> - /*
> - We need to get both locks to be sure that no one is trying to
> - write to the index log file.
> - */
> - pthread_mutex_lock(&LOCK_log);
> - pthread_mutex_lock(&LOCK_index);
>
> /*
> The following mutex is needed to ensure that no threads call
> @@ -3002,7 +2996,14 @@ bool MYSQL_BIN_LOG::reset_logs(THD* thd)
> thread. If the transaction involved MyISAM tables, it should go
> into binlog even on rollback.
> */
> - VOID(pthread_mutex_lock(&LOCK_thread_count));
> + pthread_mutex_lock(&LOCK_thread_count);
> +
> + /*
> + We need to get both locks to be sure that no one is trying to
> + write to the index log file.
> + */
> + pthread_mutex_lock(&LOCK_log);
> + pthread_mutex_lock(&LOCK_index);
>
> /* Save variables so that we can reopen the log */
> save_name=name;
>
> === modified file 'sql/mysqld.cc'
> --- a/sql/mysqld.cc 2011-03-25 10:57:27 +0000
> +++ b/sql/mysqld.cc 2011-05-23 16:28:17 +0000
> @@ -1903,6 +1903,12 @@ void unlink_thd(THD *thd)
> pthread_mutex_unlock(&LOCK_connection_count);
>
> (void) pthread_mutex_lock(&LOCK_thread_count);
> + /*
> + Used by binlog_reset_master. It would be cleaner to use
> + DEBUG_SYNC here, but that's not possible because the THD's debug
> + sync feature has been shut down at this point.
> + */
> + DBUG_EXECUTE_IF("sleep_after_lock_thread_count_before_delete_thd", sleep(5););
> thread_count--;
> delete thd;
> DBUG_VOID_RETURN;
>
>
>
>
>