List:Commits« Previous MessageNext Message »
From:Ingo Struewing Date:February 25 2009 3:47pm
Subject:bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780
View as plain text  
#At file:///home2/mydev/bzrroot/mysql-6.0-bug37780-4/ based on revid:ingo.struewing@stripped

 2779 Ingo Struewing	2009-02-25
      Bug#37780 - main.kill fails randomly
      
      The test case main.kill did not work reliably.
      
      The following problems have been identified:
      
      1. A kill signal could go lost if it came in, short before a thread went
      reading on the client connection.
      
      2. A kill signal could go lost if it came in, short before a thread went
      waiting on a condition variable.
      
      These problems have been solved as follows. Please see also added code
      comments for more details.
      
      1. There is no safe way to detect, when a thread enters the blocking
      state of a read(2) or recv(2) system call, where it can be interrupted
      by a signal. Hence it is not possible to wait for the right moment to
      send a kill signal. To be safe, we need to close the connection.
      
      2. Before waiting on a condition variable, we register it together with
      a synchronizating mutex in THD::mysys_var. After this, we need to test
      THD::killed again. At some places we did only test it in a loop
      condition before the registration. When THD::killed had been set between
      this test and the registration, we entered waiting without noticing the
      killed flag.
      
      In addition to the above, a re-write of the main.kill test case has been
      done. Most sleeps have been replaced by Debug Sync Facility
      synchronization. The test case run time decreased from over 30 to below
      three seconds. A couple of sync points have been added to the server
      code.
     @ mysql-test/r/kill.result
        Bug#37780 - main.kill fails randomly
        Updated test result.
     @ mysql-test/t/kill.test
        Bug#37780 - main.kill fails randomly
        Re-wrote test case to use Debug Sync points instead of sleeps.
     @ sql/event_queue.cc
        Bug#37780 - main.kill fails randomly
        Fixed kill detection in Event_queue::cond_wait().
     @ sql/lock.cc
        Bug#37780 - main.kill fails randomly
        Moved Debug Sync points behind enter_cond().
        Fixed comments.
     @ sql/mdl.cc
        Bug#37780 - main.kill fails randomly
        Fixed a compiler warning.
        Added Debug Sync points.
     @ sql/slave.cc
        Bug#37780 - main.kill fails randomly
        Fixed kill detection in start_slave_thread().
     @ sql/sql_base.cc
        Bug#37780 - main.kill fails randomly
        Fixed kill detection in close_cached_tables() and
        tdc_wait_for_old_versions().
     @ sql/sql_class.cc
        Bug#37780 - main.kill fails randomly
        Fixed and added comments.
     @ sql/sql_class.h
        Bug#37780 - main.kill fails randomly
        Unconditionally enabled SIGNAL_WITH_VIO_CLOSE with a comment.
     @ sql/sql_parse.cc
        Bug#37780 - main.kill fails randomly
        Added a sync point in do_command().
     @ sql/sql_select.cc
        Bug#37780 - main.kill fails randomly
        Added a sync point in JOIN::optimize().

    modified:
      mysql-test/r/kill.result
      mysql-test/t/kill.test
      sql/event_queue.cc
      sql/lock.cc
      sql/mdl.cc
      sql/slave.cc
      sql/sql_base.cc
      sql/sql_class.cc
      sql/sql_class.h
      sql/sql_parse.cc
      sql/sql_select.cc
=== modified file 'mysql-test/r/kill.result'
--- a/mysql-test/r/kill.result	2009-02-13 17:00:42 +0000
+++ b/mysql-test/r/kill.result	2009-02-25 15:46:59 +0000
@@ -1,141 +1,354 @@
-set @old_concurrent_insert= @@global.concurrent_insert;
-set @@global.concurrent_insert= 0;
-drop table if exists t1, t2, t3;
-create table t1 (kill_id int);
-insert into t1 values(connection_id());
-select ((@id := kill_id) - kill_id) from t1;
-((@id := kill_id) - kill_id)
-0
-kill @id;
-select ((@id := kill_id) - kill_id) from t1;
-((@id := kill_id) - kill_id)
-0
-select @id != connection_id();
-@id != connection_id()
+SET DEBUG_SYNC = 'RESET';
+USE test;
+DROP TABLE IF EXISTS t1, t2, t3;
+# Disable concurrent inserts to avoid test failures when reading the
+# connection id which was inserted into a table by another thread.
+SET @old_concurrent_insert= @@global.concurrent_insert;
+SET @@global.concurrent_insert= 0;
+# Create two connections for use throughout the test case.
+#
+# Test kill connection when blocked in read/recv().
+#
+# connection con1, store own connection id
+# connection con2, store con1 connection id
+# connection con1, request a signal from thread end
+SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+# connection con1, disable reconnect
+# connection con2, wait for con1 to reach read/recv()
+# Note that this may not be reliable. There is no way to have
+# read/recv() emit a signal which another thread can wait for.
+# We could place a sync point short before vio_read(), but
+# this would still not be safe. So sorry guys, here we must
+# use a sleep! :-( This is still not safe, but best we can do.
+# connection con2, kill con1
+KILL @id;
+# connection con2, wait for end of con1
+SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+# connection con1, show connection end
+SELECT 1;
+Got one of the listed errors
+# connection con1, enable reconnect
+# connection con1, show successful reconnect
+SELECT 1;
+1
+1
+# connection con1, restore old connection id value
+# connection con1, show that we have a new connection id
+SELECT @id != CONNECTION_ID();
+@id != CONNECTION_ID()
 1
-select 4;
+# connection con2, show that server is still alive
+SELECT 4;
 4
 4
-drop table t1;
-kill (select count(*) from mysql.user);
+# connection default, show unsupported subselect
+KILL (SELECT COUNT(*) FROM mysql.user);
 ERROR 42000: This version of MySQL doesn't yet support 'Usage of subqueries or stored function calls as part of this statement'
-create table t1 (id int primary key);
-create table t2 (id int unsigned not null);
-insert into t2 select id from t1;
-create table t3 (kill_id int);
-insert into t3 values(connection_id());
-select id from t1 where id in (select distinct a.id from t2 a, t2 b, t2 c, t2 d group by a.id, b.id, c.id, d.id having a.id between 10 and 20);
-select ((@id := kill_id) - kill_id) from t3;
-((@id := kill_id) - kill_id)
-0
-kill @id;
+SET DEBUG_SYNC = 'RESET';
+#
+# Test kill connection when not blocked in read/recv().
+#
+# connection con1, store own connection id
+# connection con2, store con1 connection id
+# connection con1, request a signal from thread end
+SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+# connection con1, disable reconnect
+# connection con1, let connection wait before reading
+# next statement, to prove that it is killed in this case too.
+SET DEBUG_SYNC= 'before_do_command_net_read SIGNAL in_sync WAIT_FOR kill';
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL @id;
+# connection con2, wait for end of con1
+SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+# connection con1, show connection end
+SELECT 1;
 Got one of the listed errors
-drop table t1, t2, t3;
-select get_lock("a", 10);
-get_lock("a", 10)
-1
-select get_lock("a", 10);
-get_lock("a", 10)
-NULL
-select 1;
-1
-1
-select RELEASE_LOCK("a");
-RELEASE_LOCK("a")
-1
-create table t1(f1 int);
-create function bug27563() returns int(11)
-deterministic
-begin
-declare continue handler for sqlstate '70100' set @a:= 'killed';
-declare continue handler for sqlexception set @a:= 'exception';
-set @a= get_lock("lock27563", 10);
-return 1;
-end|
-select get_lock("lock27563",10);
-get_lock("lock27563",10)
-1
-insert into t1 values (bug27563());
-ERROR 70100: Query execution was interrupted
-select @a;
-@a
-NULL
-select * from t1;
-f1
-insert into t1 values(0);
-update t1 set f1= bug27563();
-ERROR 70100: Query execution was interrupted
-select @a;
-@a
-NULL
-select * from t1;
-f1
-0
-insert into t1 values(1);
-delete from t1 where bug27563() is null;
-ERROR 70100: Query execution was interrupted
-select @a;
-@a
-NULL
-select * from t1;
-f1
-0
-1
-select * from t1 where f1= bug27563();
-ERROR 70100: Query execution was interrupted
-select @a;
-@a
-NULL
-create procedure proc27563()
-begin
-declare continue handler for sqlstate '70100' set @a:= 'killed';
-declare continue handler for sqlexception set @a:= 'exception';
-select get_lock("lock27563",10);
-select "shouldn't be selected";
-end|
-call proc27563();
-get_lock("lock27563",10)
-NULL
-ERROR 70100: Query execution was interrupted
-select @a;
-@a
-NULL
-create table t2 (f2 int);
-create trigger trg27563 before insert on t1 for each row 
-begin 
-declare continue handler for sqlstate '70100' set @a:= 'killed';
-declare continue handler for sqlexception set @a:= 'exception';
-set @a:= get_lock("lock27563",10);
-insert into t2 values(1);
-end|
-insert into t1 values(2),(3);
-ERROR 70100: Query execution was interrupted
-select @a;
-@a
-NULL
-select * from t1;
+# connection con1, enable reconnect
+# connection con1, show successful reconnect
+SELECT 1;
+1
+1
+# connection con1, restore old connection id value
+# connection con1, show that we have a new connection id
+SELECT @id != CONNECTION_ID();
+@id != CONNECTION_ID()
+1
+# connection con2, show that server is still alive
+SELECT 4;
+4
+4
+# connection default
+SET DEBUG_SYNC = 'RESET';
+#
+# BUG#14851: killing long running subquery
+#            processed via a temporary table.
+#
+CREATE TABLE t1 (id INT PRIMARY KEY AUTO_INCREMENT);
+CREATE TABLE t2 (id INT UNSIGNED NOT NULL);
+INSERT INTO t1 VALUES
+(0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0),
+(0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0),
+(0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0),
+(0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0);
+INSERT t1 SELECT 0 FROM t1 AS a1, t1 AS a2 LIMIT 4032;
+INSERT INTO t2 SELECT id FROM t1;
+# connection con1, store own connection id
+# connection con2, store con1 connection id
+# connection con1, request a signal from thread end
+SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+# connection con1, request a signal from ACOS() function
+SET DEBUG_SYNC= 'before_acos_function SIGNAL in_sync';
+# connection con1, run a slow query, using ACOS()
+# This is a very long running query. If this test start failing,
+# it may be necessary to change to an even longer query.
+SELECT id FROM t1 WHERE id IN
+(SELECT DISTINCT a.id FROM t2 a, t2 b, t2 c, t2 d
+GROUP BY ACOS(1/a.id), b.id, c.id, d.id
+HAVING a.id BETWEEN 10 AND 20);
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL @id;
+# connection con2, wait for end of con1
+SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+# connection con1, show connection end
+Got one of the listed errors
+# connection con1, show successful reconnect
+SELECT 1;
+1
+1
+# connection default
+SET DEBUG_SYNC = 'RESET';
+DROP TABLE t1, t2;
+# End of 4.1 tests
+#
+# Test of blocking of sending ERROR after OK or EOF
+#
+# connection con1, store own connection id
+# connection con2, store con1 connection id
+# connection con1, request a signal from ACOS() function
+# and a wait for the statement to be killed.
+SET DEBUG_SYNC= 'before_acos_function SIGNAL in_sync WAIT_FOR kill';
+# connection con1, run a blocking query
+SELECT ACOS(0);
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+ACOS(0)
+1.5707963267948966
+# connection con1, show that connection still works
+SELECT 1;
+1
+1
+# connection con1, show unchanged connection_id
+SELECT @id = CONNECTION_ID();
+@id = CONNECTION_ID()
+1
+# connection default
+SET DEBUG_SYNC = 'RESET';
+#
+# Bug#27563: Stored functions and triggers
+#            wasn't throwing an error when killed.
+#
+CREATE TABLE t1 (f1 INT);
+#
+# Create a function that signals 'in_function' and blocks until killed.
+#
+CREATE FUNCTION bug27563() RETURNS INT(11) DETERMINISTIC
+BEGIN
+DECLARE CONTINUE HANDLER FOR SQLSTATE '70100' SET @a:= 'killed';
+DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET @a:= 'exception';
+SET DEBUG_SYNC= 'now SIGNAL in_sync WAIT_FOR kill';
+RETURN 1;
+END|
+# connection con1, store own connection id
+# connection con2, store con1 connection id
+#
+# Test INSERT
+#
+# connection con1, insert from blocking function
+INSERT INTO t1 VALUES (bug27563());
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+ERROR 70100: Query execution was interrupted
+# connection con1, show contents of table.
+SELECT * FROM t1;
+f1
+# connection default
+SET DEBUG_SYNC = 'RESET';
+#
+# Test UPDATE
+#
+INSERT INTO t1 VALUES(0);
+# connection con1, update from blocking function
+UPDATE t1 SET f1= bug27563();
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+ERROR 70100: Query execution was interrupted
+# connection con1, show contents of table.
+SELECT * FROM t1;
+f1
+0
+# connection default
+SET DEBUG_SYNC = 'RESET';
+#
+# Test DELETE
+#
+INSERT INTO t1 VALUES(1);
+# connection con1, delete where blocking function
+DELETE FROM t1 WHERE bug27563() IS NULL;
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+ERROR 70100: Query execution was interrupted
+# connection con1, show contents of table.
+SELECT * FROM t1;
 f1
 0
 1
-select * from t2;
-f2
-select release_lock("lock27563");
-release_lock("lock27563")
+# connection default
+SET DEBUG_SYNC = 'RESET';
+#
+# Test SELECT
+#
+# connection con1, select where blocking function
+SELECT * FROM t1 WHERE f1= bug27563();
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+ERROR 70100: Query execution was interrupted
+# connection con1, show contents of table.
+SELECT * FROM t1;
+f1
+0
+1
+# connection default
+SET DEBUG_SYNC = 'RESET';
+DROP FUNCTION bug27563;
+#
+# Test PROCEDURE
+#
+CREATE PROCEDURE proc27563()
+BEGIN
+DECLARE CONTINUE HANDLER FOR SQLSTATE '70100' SET @a:= 'killed';
+DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET @a:= 'exception';
+SELECT "should be selected" AS procedure_activity_1;
+SET DEBUG_SYNC= 'now SIGNAL in_sync WAIT_FOR kill';
+SELECT "shouldn't be selected" AS procedure_activity_2;
+END|
+# connection con1, call blocking procedure
+CALL proc27563();
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+procedure_activity_1
+should be selected
+ERROR 70100: Query execution was interrupted
+# connection con1, show contents of table.
+SELECT * FROM t1;
+f1
+0
+1
+# connection default
+SET DEBUG_SYNC = 'RESET';
+DROP PROCEDURE proc27563;
+#
+# Test TRIGGERS
+#
+CREATE TABLE t2 (f2 INT);
+CREATE TRIGGER trg27563 BEFORE INSERT ON t1 FOR EACH ROW 
+BEGIN 
+DECLARE CONTINUE HANDLER FOR SQLSTATE '70100' SET @a:= 'killed';
+DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET @a:= 'exception';
+INSERT INTO t2 VALUES(0);
+SET DEBUG_SYNC= 'now SIGNAL in_sync WAIT_FOR kill';
+INSERT INTO t2 VALUES(1);
+END|
+# connection con1, provoke blocking trigger
+INSERT INTO t1 VALUES(2),(3);
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+ERROR 70100: Query execution was interrupted
+# connection con1, show contents of table.
+SELECT * FROM t1;
+f1
+0
 1
-drop table t1, t2;
-drop function bug27563;
-drop procedure proc27563;
+SELECT * FROM t2;
+f2
+0
+# connection default
+SET DEBUG_SYNC = 'RESET';
+DROP TABLE t1, t2;
+#
+# Bug#28598: mysqld crash when killing a long-running explain query.
+#
+# connection con1, store own connection id
+# connection con2, store con1 connection id
+# connection con1, store own connection id
+# connection con1, disable_query_log, create tables
+# connection con1, enable_query_log
+# connection con1, request a signal from JOIN::optimize
+SET DEBUG_SYNC= 'before_join_optimize SIGNAL in_sync';
+# connection con1, prepare and execute
 PREPARE stmt FROM 'EXPLAIN SELECT * FROM t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11,t12,t13,t14,t15,t16,t17,t18,t19,t20,t21,t22,t23,t24,t25,t26,t27,t28,t29,t30,t31,t32,t33,t34,t35,t36,t37,t38,t39,t40 WHERE a1=a2 AND a2=a3 AND a3=a4 AND a4=a5 AND a5=a6 AND a6=a7 AND a7=a8 AND a8=a9 AND a9=a10 AND a10=a11 AND a11=a12 AND a12=a13 AND a13=a14 AND a14=a15 AND a15=a16 AND a16=a17 AND a17=a18 AND a18=a19 AND a19=a20 AND a20=a21 AND a21=a22 AND a22=a23 AND a23=a24 AND a24=a25 AND a25=a26 AND a26=a27 AND a27=a28 AND a28=a29 AND a29=a30 AND a30=a31 AND a31=a32 AND a32=a33 AND a33=a34 AND a34=a35 AND a35=a36 AND a36=a37 AND a37=a38 AND a38=a39 AND a39=a40  ';
 EXECUTE stmt;
+# connection con2, wait for con1 to reach sync point
+SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+# connection con2, kill con1
+KILL QUERY @id;
+# connection con1, retrieve result
+ERROR 70100: Query execution was interrupted
+# connection con1, disable_query_log, drop tables
+# connection con1, enable_query_log
+# connection default
+SET DEBUG_SYNC = 'RESET';
 #
 # Bug#19723: kill of active connection yields different error code
 # depending on platform.
 #
-
-# Connection: con2.
-KILL CONNECTION_ID();
-# CR_SERVER_LOST, CR_SERVER_GONE_ERROR, depending on the timing 
-# of close of the connection socket
+# connection con1, store own connection id
+# connection con1, request a signal from thread end
+SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+# connection con1, disable reconnect
+# connection con1, kill myself
+KILL @id;
+# connection con2, wait for end of con1
+SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+# connection con1, show connection end
+# ER_SERVER_SHUTDOWN, CR_SERVER_GONE_ERROR, CR_SERVER_LOST,
+# depending on the timing of close of the connection socket
 SELECT 1;
 Got one of the listed errors
+# connection con1, enable reconnect
+# connection con1, show successful reconnect
+SELECT 1;
+1
+1
+# connection con1, restore old connection id value
+# connection con1, show that we have a new connection id
+SELECT @id != CONNECTION_ID();
+@id != CONNECTION_ID()
+1
+# connection default
+SET DEBUG_SYNC = 'RESET';
 set @@global.concurrent_insert= @old_concurrent_insert;

=== modified file 'mysql-test/t/kill.test'
--- a/mysql-test/t/kill.test	2009-02-13 17:00:42 +0000
+++ b/mysql-test/t/kill.test	2009-02-25 15:46:59 +0000
@@ -4,329 +4,594 @@
 # This would work if mysqltest run would be threaded and handle each
 # connection in a separate thread.
 #
--- source include/not_embedded.inc
+--source include/not_embedded.inc
+--source include/have_debug_sync.inc
 
-# Disable concurrent inserts to avoid test failures when reading the
-# connection id which was inserted into a table by another thread.
-set @old_concurrent_insert= @@global.concurrent_insert;
-set @@global.concurrent_insert= 0;
-
-connect (con1, localhost, root,,);
-connect (con2, localhost, root,,);
-
-#remember id of con1
-connection con1;
 --disable_warnings
-drop table if exists t1, t2, t3;
+SET DEBUG_SYNC = 'RESET';
+USE test;
+DROP TABLE IF EXISTS t1, t2, t3;
 --enable_warnings
 
---disable_reconnect
-create table t1 (kill_id int);
-insert into t1 values(connection_id());
-
-#kill con1
-connection con2;
-select ((@id := kill_id) - kill_id) from t1; 
-kill @id;
-
-connection con1;
---sleep 2
-
---disable_query_log
---disable_result_log
-# One of the following statements should fail
---error 0,2006,2013
-select 1;
---error 0,2006,2013
-select 1;
---enable_query_log
---enable_result_log
-
---enable_reconnect
-# this should work, and we should have a new connection_id()
-select ((@id := kill_id) - kill_id) from t1;
-select @id != connection_id();
-
-#make sure the server is still alive
-connection con2;
-select 4;
-drop table t1;
-connection default;
+--echo # Disable concurrent inserts to avoid test failures when reading the
+--echo # connection id which was inserted into a table by another thread.
+SET @old_concurrent_insert= @@global.concurrent_insert;
+SET @@global.concurrent_insert= 0;
+
+--echo # Create two connections for use throughout the test case.
+--connect (con1, localhost, root,,)
+--connect (con2, localhost, root,,)
+
+--echo #
+--echo # Test kill connection when blocked in read/recv().
+--echo #
 
+    --echo # connection con1, store own connection id
+    --connection con1
+    --let $ID= `SELECT @id := CONNECTION_ID()`
+
+        --echo # connection con2, store con1 connection id
+        --connection con2
+        --let $ignore= `SELECT @id := $ID`
+
+    --echo # connection con1, request a signal from thread end
+    --connection con1
+    SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+    #
+    --echo # connection con1, disable reconnect
+    --disable_reconnect
+
+        --echo # connection con2, wait for con1 to reach read/recv()
+        --echo # Note that this may not be reliable. There is no way to have
+        --echo # read/recv() emit a signal which another thread can wait for.
+        --echo # We could place a sync point short before vio_read(), but
+        --echo # this would still not be safe. So sorry guys, here we must
+        --echo # use a sleep! :-( This is still not safe, but best we can do.
+        --connection con2
+        --real_sleep 2
+        #
+        --echo # connection con2, kill con1
+        KILL @id;
+        #
+        --echo # connection con2, wait for end of con1
+        SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+
+    --echo # connection con1, show connection end
+    --connection con1
+    --error 1053,2006,2013
+    SELECT 1;
+    #
+    --echo # connection con1, enable reconnect
+    --enable_reconnect
+    #
+    --echo # connection con1, show successful reconnect
+    SELECT 1;
+    #
+    --echo # connection con1, restore old connection id value
+    --let $ignore= `SELECT @id := $ID`
+    #
+    --echo # connection con1, show that we have a new connection id
+    SELECT @id != CONNECTION_ID();
+
+        --echo # connection con2, show that server is still alive
+        --connection con2
+        SELECT 4;
+
+--echo # connection default, show unsupported subselect
+--connection default
 --error ER_NOT_SUPPORTED_YET
-kill (select count(*) from mysql.user);
+KILL (SELECT COUNT(*) FROM mysql.user);
+SET DEBUG_SYNC = 'RESET';
 
+--echo #
+--echo # Test kill connection when not blocked in read/recv().
+--echo #
+
+    --echo # connection con1, store own connection id
+    --connection con1
+    --let $ID= `SELECT @id := CONNECTION_ID()`
+
+        --echo # connection con2, store con1 connection id
+        --connection con2
+        --let $ignore= `SELECT @id := $ID`
+
+    --echo # connection con1, request a signal from thread end
+    --connection con1
+    SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+    #
+    --echo # connection con1, disable reconnect
+    --disable_reconnect
+    #
+    --echo # connection con1, let connection wait before reading
+    --echo # next statement, to prove that it is killed in this case too.
+    SET DEBUG_SYNC= 'before_do_command_net_read SIGNAL in_sync WAIT_FOR kill';
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL @id;
+        #
+        --echo # connection con2, wait for end of con1
+        SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+
+    --echo # connection con1, show connection end
+    --connection con1
+    --error 1053,2006,2013
+    SELECT 1;
+    #
+    --echo # connection con1, enable reconnect
+    --enable_reconnect
+    #
+    --echo # connection con1, show successful reconnect
+    SELECT 1;
+    #
+    --echo # connection con1, restore old connection id value
+    --let $ignore= `SELECT @id := $ID`
+    #
+    --echo # connection con1, show that we have a new connection id
+    SELECT @id != CONNECTION_ID();
+
+        --echo # connection con2, show that server is still alive
+        --connection con2
+        SELECT 4;
+
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+
+--echo #
+--echo # BUG#14851: killing long running subquery
+--echo #            processed via a temporary table.
+--echo #
+CREATE TABLE t1 (id INT PRIMARY KEY AUTO_INCREMENT);
+CREATE TABLE t2 (id INT UNSIGNED NOT NULL);
 #
-# BUG#14851: killing long running subquery processed via a temporary table.
-#
-create table t1 (id int primary key);
-create table t2 (id int unsigned not null);
+INSERT INTO t1 VALUES
+  (0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0),
+  (0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0),
+  (0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0),
+  (0),(0),(0),(0),(0),(0),(0),(0), (0),(0),(0),(0),(0),(0),(0),(0);
+INSERT t1 SELECT 0 FROM t1 AS a1, t1 AS a2 LIMIT 4032;
+#
+INSERT INTO t2 SELECT id FROM t1;
+
+    --echo # connection con1, store own connection id
+    --connection con1
+    --let $ID= `SELECT @id := CONNECTION_ID()`
+
+        --echo # connection con2, store con1 connection id
+        --connection con2
+        --let $ignore= `SELECT @id := $ID`
+
+    --echo # connection con1, request a signal from thread end
+    --connection con1
+    SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+    #
+    --echo # connection con1, request a signal from ACOS() function
+    SET DEBUG_SYNC= 'before_acos_function SIGNAL in_sync';
+    #
+    --echo # connection con1, run a slow query, using ACOS()
+    --echo # This is a very long running query. If this test start failing,
+    --echo # it may be necessary to change to an even longer query.
+    send SELECT id FROM t1 WHERE id IN
+           (SELECT DISTINCT a.id FROM t2 a, t2 b, t2 c, t2 d
+              GROUP BY ACOS(1/a.id), b.id, c.id, d.id
+              HAVING a.id BETWEEN 10 AND 20);
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL @id;
+        #
+        --echo # connection con2, wait for end of con1
+        SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+
+    --echo # connection con1, show connection end
+    --connection con1
+    --error 1053,2006,2013
+    reap;
+    #
+    --echo # connection con1, show successful reconnect
+    SELECT 1;
 
-connect (conn1, localhost, root,,);
-connection conn1;
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+DROP TABLE t1, t2;
 
--- disable_result_log
--- disable_query_log
-let $1 = 4096;
-while ($1)
-{
-  eval insert into t1 values ($1);
-  dec $1;
-}
--- enable_query_log
--- enable_result_log
-
-insert into t2 select id from t1;
-
-create table t3 (kill_id int);
-insert into t3 values(connection_id());
-connect (conn2, localhost, root,,);
-connection conn2;
-
-connection conn1;
--- disable_result_log
-# This is a very long running query. If this test start failing, it may
-# be necessary to change to an even longer query.
-send select id from t1 where id in (select distinct a.id from t2 a, t2 b, t2 c, t2 d group by a.id, b.id, c.id, d.id having a.id between 10 and 20);
--- enable_result_log
-
-connection conn2;
-select ((@id := kill_id) - kill_id) from t3;
--- sleep 1
-kill @id;
-
-connection conn1;
--- error 1053,2013
-reap;
+--echo # End of 4.1 tests
 
-connection default;
+--echo #
+--echo # Test of blocking of sending ERROR after OK or EOF
+--echo #
 
-drop table t1, t2, t3;
+    --echo # connection con1, store own connection id
+    --connection con1
+    --let $ID= `SELECT @id := CONNECTION_ID()`
+
+        --echo # connection con2, store con1 connection id
+        --connection con2
+        --let $ignore= `SELECT @id := $ID`
+
+    --echo # connection con1, request a signal from ACOS() function
+    --echo # and a wait for the statement to be killed.
+    --connection con1
+    SET DEBUG_SYNC= 'before_acos_function SIGNAL in_sync WAIT_FOR kill';
+    #
+    --echo # connection con1, run a blocking query
+    send SELECT ACOS(0);
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    reap;
+    #
+    --echo # connection con1, show that connection still works
+    SELECT 1;
+    #
+    --echo # connection con1, show unchanged connection_id
+    SELECT @id = CONNECTION_ID();
 
-# End of 4.1 tests
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
 
+--echo #
+--echo # Bug#27563: Stored functions and triggers
+--echo #            wasn't throwing an error when killed.
+--echo #
+CREATE TABLE t1 (f1 INT);
 #
-# test of blocking of sending ERROR after OK or EOF
-#
-connection con1;
-select get_lock("a", 10);
-connection con2;
-let $ID= `select connection_id()`;
-send select get_lock("a", 10);
-real_sleep 2;
-connection con1;
-disable_query_log;
-eval kill query $ID;
-enable_query_log;
-connection con2;
-reap;
-select 1;
-connection con1;
-select RELEASE_LOCK("a");
+--echo #
+--echo # Create a function that signals 'in_function' and blocks until killed.
+--echo #
+--delimiter |
+CREATE FUNCTION bug27563() RETURNS INT(11) DETERMINISTIC
+BEGIN
+  DECLARE CONTINUE HANDLER FOR SQLSTATE '70100' SET @a:= 'killed';
+  DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET @a:= 'exception';
+  SET DEBUG_SYNC= 'now SIGNAL in_sync WAIT_FOR kill';
+  RETURN 1;
+END|
+--delimiter ;
+
+    --echo # connection con1, store own connection id
+    --connection con1
+    --let $ID= `SELECT @id := CONNECTION_ID()`
+
+        --echo # connection con2, store con1 connection id
+        --connection con2
+        --let $ignore= `SELECT @id := $ID`
 
-#
-# Bug#27563: Stored functions and triggers wasn't throwing an error when killed.
-#
-create table t1(f1 int);
-delimiter |;
-create function bug27563() returns int(11)
-deterministic
-begin
-  declare continue handler for sqlstate '70100' set @a:= 'killed';
-  declare continue handler for sqlexception set @a:= 'exception';
-  set @a= get_lock("lock27563", 10);
-  return 1;
-end|
-delimiter ;|
-# Test stored functions
-# Test INSERT
-connection con1;
-select get_lock("lock27563",10);
-connection con2;
-let $ID= `select connection_id()`;
-send insert into t1 values (bug27563());
-real_sleep 2;
-connection con1;
-disable_query_log;
-eval kill query $ID;
-enable_query_log;
-connection con2;
---error 1317
-reap;
-select @a;
-connection con1;
-select * from t1;
-
-# Test UPDATE
-insert into t1 values(0);
-connection con2;
-send update t1 set f1= bug27563();
-real_sleep 2;
-connection con1;
-disable_query_log;
-eval kill query $ID;
-enable_query_log;
-connection con2;
---error 1317
-reap;
-select @a;
-connection con1;
-select * from t1;
-
-# Test DELETE
-insert into t1 values(1);
-connection con2;
-send delete from t1 where bug27563() is null;
-real_sleep 2;
-connection con1;
-disable_query_log;
-eval kill query $ID;
-enable_query_log;
-connection con2;
---error 1317
-reap;
-select @a;
-connection con1;
-select * from t1;
-
-# Test SELECT
-connection con2;
-send select * from t1 where f1= bug27563();
-real_sleep 2;
-connection con1;
-disable_query_log;
-eval kill query $ID;
-enable_query_log;
-connection con2;
---error 1317
-reap;
-select @a;
-
-# Test PROCEDURE
-connection con2;
-delimiter |;
-create procedure proc27563()
-begin
-  declare continue handler for sqlstate '70100' set @a:= 'killed';
-  declare continue handler for sqlexception set @a:= 'exception';
-  select get_lock("lock27563",10);
-  select "shouldn't be selected";
-end|
-delimiter ;|
-send call proc27563();
-real_sleep 2;
-connection con1;
-disable_query_log;
-eval kill query $ID;
-enable_query_log;
-connection con2;
---error 1317
-reap;
-select @a;
-
-# Test TRIGGERS
-connection con2;
-create table t2 (f2 int);
-delimiter |;
-create trigger trg27563 before insert on t1 for each row 
-begin 
-  declare continue handler for sqlstate '70100' set @a:= 'killed';
-  declare continue handler for sqlexception set @a:= 'exception';
-  set @a:= get_lock("lock27563",10);
-  insert into t2 values(1);
-end|
-delimiter ;|
-send insert into t1 values(2),(3);
-real_sleep 2;
-connection con1;
-disable_query_log;
-eval kill query $ID;
-enable_query_log;
-connection con2;
---error 1317
-reap;
-select @a;
-connection con1;
-select * from t1;
-select * from t2;
-
-# Cleanup
-select release_lock("lock27563");
-drop table t1, t2;
-drop function bug27563;
-drop procedure proc27563;
+--echo #
+--echo # Test INSERT
+--echo #
 
-#
-# Bug#28598: mysqld crash when killing a long-running explain query.
-#
---disable_query_log
-connection con1;
-let $ID= `select connection_id()`;
-let $tab_count= 40;
-
-let $i= $tab_count;
-while ($i)
-{
-  eval CREATE TABLE t$i (a$i int, KEY(a$i));
-  eval INSERT INTO t$i VALUES (1),(2),(3),(4),(5),(6),(7);
-  dec $i ;
-}
-set session optimizer_search_depth=0;
-
-let $i=$tab_count;
-while ($i)
-{
- let $a= a$i;
- let $t= t$i;
- dec $i;
- if ($i)
- {
-   let $comma=,;
-   let $from=$comma$t$from;
-   let $where=a$i=$a $and $where;
- }
- if (!$i) 
- {
-   let $from=FROM $t$from;
-   let $where=WHERE $where;
- }
- let $and=AND;
-}
-
---enable_query_log
-eval PREPARE stmt FROM 'EXPLAIN SELECT * $from $where';
-send EXECUTE stmt;
---disable_query_log
-
-connection con2;
-real_sleep 2;
-eval kill query $ID;
-let $i= $tab_count;
-while ($i)
-{
-  eval DROP TABLE t$i;
-  dec $i ;
-}
---enable_query_log
+    --echo # connection con1, insert from blocking function
+    --connection con1
+    send INSERT INTO t1 VALUES (bug27563());
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    --error 1317
+    reap;
+    #
+    --echo # connection con1, show contents of table.
+    SELECT * FROM t1;
 
-###########################################################################
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+
+--echo #
+--echo # Test UPDATE
+--echo #
+INSERT INTO t1 VALUES(0);
+
+    --echo # connection con1, update from blocking function
+    connection con1;
+    send UPDATE t1 SET f1= bug27563();
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    --error 1317
+    reap;
+    #
+    --echo # connection con1, show contents of table.
+    SELECT * FROM t1;
+
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+
+--echo #
+--echo # Test DELETE
+--echo #
+INSERT INTO t1 VALUES(1);
+
+    --echo # connection con1, delete where blocking function
+    --connection con1
+    send DELETE FROM t1 WHERE bug27563() IS NULL;
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    --error 1317
+    reap;
+    #
+    --echo # connection con1, show contents of table.
+    SELECT * FROM t1;
+
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+
+--echo #
+--echo # Test SELECT
+--echo #
+
+    --echo # connection con1, select where blocking function
+    --connection con1
+    send SELECT * FROM t1 WHERE f1= bug27563();
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    --error 1317
+    reap;
+    #
+    --echo # connection con1, show contents of table.
+    SELECT * FROM t1;
+
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+DROP FUNCTION bug27563;
+
+--echo #
+--echo # Test PROCEDURE
+--echo #
+
+--delimiter |
+CREATE PROCEDURE proc27563()
+BEGIN
+  DECLARE CONTINUE HANDLER FOR SQLSTATE '70100' SET @a:= 'killed';
+  DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET @a:= 'exception';
+  SELECT "should be selected" AS procedure_activity_1;
+  SET DEBUG_SYNC= 'now SIGNAL in_sync WAIT_FOR kill';
+  SELECT "shouldn't be selected" AS procedure_activity_2;
+END|
+--delimiter ;
+
+    --echo # connection con1, call blocking procedure
+    --connection con1
+    send CALL proc27563();
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    --error 1317
+    reap;
+    #
+    --echo # connection con1, show contents of table.
+    SELECT * FROM t1;
+
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+DROP PROCEDURE proc27563;
+
+--echo #
+--echo # Test TRIGGERS
+--echo #
+CREATE TABLE t2 (f2 INT);
+
+--delimiter |
+CREATE TRIGGER trg27563 BEFORE INSERT ON t1 FOR EACH ROW 
+BEGIN 
+  DECLARE CONTINUE HANDLER FOR SQLSTATE '70100' SET @a:= 'killed';
+  DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET @a:= 'exception';
+  INSERT INTO t2 VALUES(0);
+  SET DEBUG_SYNC= 'now SIGNAL in_sync WAIT_FOR kill';
+  INSERT INTO t2 VALUES(1);
+END|
+--delimiter ;
+
+    --echo # connection con1, provoke blocking trigger
+    --connection con1
+    send INSERT INTO t1 VALUES(2),(3);
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    --error 1317
+    reap;
+    #
+    --echo # connection con1, show contents of table.
+    SELECT * FROM t1;
+    SELECT * FROM t2;
+
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
+DROP TABLE t1, t2;
+
+--echo #
+--echo # Bug#28598: mysqld crash when killing a long-running explain query.
+--echo #
+
+    --echo # connection con1, store own connection id
+    --connection con1
+    --let $ID= `SELECT @id := CONNECTION_ID()`
+
+        --echo # connection con2, store con1 connection id
+        --connection con2
+        --let $ignore= `SELECT @id := $ID`
+
+    --echo # connection con1, store own connection id
+    --connection con1
+    --echo # connection con1, disable_query_log, create tables
+    --disable_query_log
+    let $tab_count= 40;
+    #
+    let $i= $tab_count;
+    while ($i)
+    {
+      eval CREATE TABLE t$i (a$i INT, KEY(a$i));
+      eval INSERT INTO t$i VALUES (1),(2),(3),(4),(5),(6),(7);
+      dec $i ;
+    }
+    SET SESSION optimizer_search_depth=0;
+    #
+    let $i=$tab_count;
+    while ($i)
+    {
+     let $a= a$i;
+     let $t= t$i;
+     dec $i;
+     if ($i)
+     {
+       let $comma=,;
+       let $from=$comma$t$from;
+       let $where=a$i=$a $and $where;
+     }
+     if (!$i) 
+     {
+       let $from=FROM $t$from;
+       let $where=WHERE $where;
+     }
+     let $and=AND;
+    }
+    #
+    --echo # connection con1, enable_query_log
+    --enable_query_log
+    #
+    --echo # connection con1, request a signal from JOIN::optimize
+    SET DEBUG_SYNC= 'before_join_optimize SIGNAL in_sync';
+    #
+    --echo # connection con1, prepare and execute
+    eval PREPARE stmt FROM 'EXPLAIN SELECT * $from $where';
+    send EXECUTE stmt;
+
+        --echo # connection con2, wait for con1 to reach sync point
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR in_sync';
+        #
+        --echo # connection con2, kill con1
+        KILL QUERY @id;
+
+    --echo # connection con1, retrieve result
+    --connection con1
+    --error 1317
+    reap;
+    #
+    --echo # connection con1, disable_query_log, drop tables
+    --disable_query_log
+    let $i= $tab_count;
+    while ($i)
+    {
+      eval DROP TABLE t$i;
+      dec $i ;
+    }
+    --echo # connection con1, enable_query_log
+    --enable_query_log
+
+--echo # connection default
+--connection default
+SET DEBUG_SYNC = 'RESET';
 
 --echo #
 --echo # Bug#19723: kill of active connection yields different error code
 --echo # depending on platform.
 --echo #
 
---echo
---echo # Connection: con2.
---connection con2
-
-KILL CONNECTION_ID();
-
---echo # CR_SERVER_LOST, CR_SERVER_GONE_ERROR, depending on the timing 
---echo # of close of the connection socket
---error 2013, 2006 
-SELECT 1;
+    --echo # connection con1, store own connection id
+    --connection con1
+    --let $ID= `SELECT @id := CONNECTION_ID()`
+    #
+    --echo # connection con1, request a signal from thread end
+    SET DEBUG_SYNC= 'thread_end SIGNAL con1_end';
+    #
+    --echo # connection con1, disable reconnect
+    --disable_reconnect
+    #
+    --echo # connection con1, kill myself
+    KILL @id;
+
+        --echo # connection con2, wait for end of con1
+        --connection con2
+        SET DEBUG_SYNC= 'now WAIT_FOR con1_end';
+
+    --echo # connection con1, show connection end
+    --connection con1
+    --echo # ER_SERVER_SHUTDOWN, CR_SERVER_GONE_ERROR, CR_SERVER_LOST,
+    --echo # depending on the timing of close of the connection socket
+    --error 1053,2006,2013
+    SELECT 1;
+    #
+    --echo # connection con1, enable reconnect
+    --enable_reconnect
+    #
+    --echo # connection con1, show successful reconnect
+    SELECT 1;
+    #
+    --echo # connection con1, restore old connection id value
+    --let $ignore= `SELECT @id := $ID`
+    #
+    --echo # connection con1, show that we have a new connection id
+    SELECT @id != CONNECTION_ID();
+
+--echo # connection default
 --connection default
+SET DEBUG_SYNC = 'RESET';
 
 ###########################################################################
 #

=== modified file 'sql/event_queue.cc'
--- a/sql/event_queue.cc	2008-12-04 21:02:09 +0000
+++ b/sql/event_queue.cc	2009-02-25 15:46:59 +0000
@@ -728,6 +728,18 @@ Event_queue::unlock_data(const char *fun
       abstime If not null then call pthread_cond_timedwait()
       func    Which function is requesting cond_wait
       line    On which line cond_wait is requested
+
+  @note 'killed': It is not sufficient to test thd->killed at the top of
+  the loop (from which this function is called). If the thread is killed
+  after it tested thd->killed and before it registered the mutex (here:
+  LOCK_event_queue) in its mysys_var (in enter_cond()), the killing
+  thread won't do anything but set thd->killed. If we don't check it
+  again after registering the mutex, we would start waiting, though
+  thd->killed is set. The solution is to check thd->killed again, after
+  registering the mutex and before start waiting. An additional
+  advantage is the memory barrier, included in the mutex handling. This
+  prevents out-of-order execution in the processor. Thus we cannot read
+  an old value.
 */
 
 void
@@ -742,11 +754,15 @@ Event_queue::cond_wait(THD *thd, struct 
 
   thd->enter_cond(&COND_queue_state, &LOCK_event_queue, msg);
 
-  DBUG_PRINT("info", ("pthread_cond_%swait", abstime? "timed":""));
-  if (!abstime)
-    pthread_cond_wait(&COND_queue_state, &LOCK_event_queue);
-  else
-    pthread_cond_timedwait(&COND_queue_state, &LOCK_event_queue, abstime);
+  /* See function comment for 'killed'. */
+  if (!thd->killed)
+  {
+    DBUG_PRINT("info", ("pthread_cond_%swait", abstime? "timed":""));
+    if (!abstime)
+      pthread_cond_wait(&COND_queue_state, &LOCK_event_queue);
+    else
+      pthread_cond_timedwait(&COND_queue_state, &LOCK_event_queue, abstime);
+  }
 
   mutex_last_locked_in_func= func;
   mutex_last_locked_at_line= line;

=== modified file 'sql/lock.cc'
--- a/sql/lock.cc	2009-02-16 21:18:45 +0000
+++ b/sql/lock.cc	2009-02-25 15:46:59 +0000
@@ -1128,6 +1128,14 @@ bool lock_global_read_lock(THD *thd)
     const char *new_message= "Waiting to get readlock";
     (void) pthread_mutex_lock(&LOCK_global_read_lock);
 
+    old_message=thd->enter_cond(&COND_global_read_lock, &LOCK_global_read_lock,
+                                new_message);
+    DBUG_PRINT("info",
+               ("waiting_for: %d  protect_against: %d",
+                waiting_for_read_lock, protect_against_global_read_lock));
+
+    waiting_for_read_lock++;
+
 #if defined(ENABLED_DEBUG_SYNC)
     /*
       The below sync point fires if we have to wait for
@@ -1136,27 +1144,18 @@ bool lock_global_read_lock(THD *thd)
       WARNING: Beware to use WAIT_FOR with this sync point. We hold
       LOCK_global_read_lock here.
 
-      Call the sync point before calling enter_cond() as it does use
-      enter_cond() and exit_cond() itself if a WAIT_FOR action is
-      executed in spite of the above warning.
+      Call the sync point after enter_cond() so that the proc_info is
+      available immediately after the sync point sends a SIGNAL. This
+      makes tests more reliable.
 
-      Pre-set proc_info so that it is available immediately after the
-      sync point sends a SIGNAL. This makes tests more reliable.
+      Call the sync point before the loop so that it is executed once only.
     */
-    if (protect_against_global_read_lock)
+    if (protect_against_global_read_lock && !thd->killed)
     {
-      thd_proc_info(thd, new_message);
       DEBUG_SYNC(thd, "wait_lock_global_read_lock");
     }
 #endif /* defined(ENABLED_DEBUG_SYNC) */
 
-    old_message=thd->enter_cond(&COND_global_read_lock, &LOCK_global_read_lock,
-                                new_message);
-    DBUG_PRINT("info",
-	       ("waiting_for: %d  protect_against: %d",
-		waiting_for_read_lock, protect_against_global_read_lock));
-
-    waiting_for_read_lock++;
     while (protect_against_global_read_lock && !thd->killed)
       pthread_cond_wait(&COND_global_read_lock, &LOCK_global_read_lock);
     waiting_for_read_lock--;
@@ -1272,6 +1271,9 @@ bool wait_if_global_read_lock(THD *thd, 
       DBUG_RETURN(is_not_commit);
     }
 
+    old_message=thd->enter_cond(&COND_global_read_lock, &LOCK_global_read_lock,
+                                new_message);
+
 #if defined(ENABLED_DEBUG_SYNC)
     /*
       The below sync point fires if we have to wait for
@@ -1280,22 +1282,19 @@ bool wait_if_global_read_lock(THD *thd, 
       WARNING: Beware to use WAIT_FOR with this sync point. We hold
       LOCK_global_read_lock here.
 
-      Call the sync point before calling enter_cond() as it does use
-      enter_cond() and exit_cond() itself if a WAIT_FOR action is
-      executed in spite of the above warning.
+      Call the sync point after enter_cond() so that the proc_info is
+      available immediately after the sync point sends a SIGNAL. This
+      makes tests more reliable.
 
-      Pre-set proc_info so that it is available immediately after the
-      sync point sends a SIGNAL. This makes tests more reliable.
+      Call the sync point before the loop so that it is executed once only.
     */
-    if (must_wait)
+    if (must_wait && ! thd->killed &&
+        (!abort_on_refresh || thd->version == refresh_version))
     {
-      thd_proc_info(thd, new_message);
       DEBUG_SYNC(thd, "wait_if_global_read_lock");
     }
 #endif /* defined(ENABLED_DEBUG_SYNC) */
 
-    old_message=thd->enter_cond(&COND_global_read_lock, &LOCK_global_read_lock,
-                                new_message);
     while (must_wait && ! thd->killed &&
 	   (!abort_on_refresh || thd->version == refresh_version))
     {

=== modified file 'sql/mdl.cc'
--- a/sql/mdl.cc	2009-01-27 13:41:58 +0000
+++ b/sql/mdl.cc	2009-02-25 15:46:59 +0000
@@ -631,8 +631,8 @@ static bool can_grant_lock(MDL_LOCK *loc
   case MDL_SHARED_HIGH_PRIO:
     if ((lock->active_exclusive.is_empty() &&
          (lock_data->type == MDL_SHARED_HIGH_PRIO ||
-          lock->waiting_exclusive.is_empty() &&
-          lock->active_shared_waiting_upgrade.is_empty())) ||
+          (lock->waiting_exclusive.is_empty() &&
+           lock->active_shared_waiting_upgrade.is_empty()))) ||
         (!lock->active_exclusive.is_empty() &&
          lock->active_exclusive.head()->ctx == lock_data->ctx))
     {
@@ -894,6 +894,7 @@ bool mdl_acquire_exclusive_locks(MDL_CON
     }
     if (!lock_data)
       break;
+    DEBUG_SYNC(context->thd, "before_wait_excl_mdl_lock");
     if (signalled)
       pthread_cond_wait(&COND_mdl, &LOCK_mdl);
     else
@@ -1025,6 +1026,7 @@ bool mdl_upgrade_shared_lock_to_exclusiv
       }
     }
 
+    DEBUG_SYNC(context->thd, "before_wait_upgrade_mdl_lock");
     if (signalled)
       pthread_cond_wait(&COND_mdl, &LOCK_mdl);
     else
@@ -1158,6 +1160,25 @@ bool mdl_acquire_global_shared_lock(MDL_
   global_lock.waiting_shared++;
   old_msg= MDL_ENTER_COND(context, mysys_var);
 
+#if defined(ENABLED_DEBUG_SYNC)
+  /*
+    The below sync point fires if we have to wait for a table lock.
+
+    WARNING: Beware to use WAIT_FOR with this sync point. We hold
+    LOCK_mdl here.
+
+    Call the sync point after enter_cond() so that the proc_info is
+    available immediately after the sync point sends a SIGNAL. This
+    makes tests more reliable.
+
+    Call the sync point before the loop so that it is executed once only.
+  */
+  if (!mysys_var->abort && global_lock.active_intention_exclusive)
+  {
+    DEBUG_SYNC(context->thd, "wait_mdl_global_shared");
+  }
+#endif /* defined(ENABLED_DEBUG_SYNC) */
+
   while (!mysys_var->abort && global_lock.active_intention_exclusive)
     pthread_cond_wait(&COND_mdl, &LOCK_mdl);
 
@@ -1237,6 +1258,7 @@ bool mdl_wait_for_locks(MDL_CONTEXT *con
       pthread_mutex_unlock(&LOCK_mdl);
       break;
     }
+    DEBUG_SYNC(context->thd, "before_wait_mdl_locks");
     pthread_cond_wait(&COND_mdl, &LOCK_mdl);
     /* As a side-effect MDL_EXIT_COND() unlocks LOCK_mdl. */
     MDL_EXIT_COND(context, mysys_var, old_msg);

=== modified file 'sql/slave.cc'
--- a/sql/slave.cc	2009-02-13 16:30:54 +0000
+++ b/sql/slave.cc	2009-02-25 15:46:59 +0000
@@ -557,9 +557,17 @@ int start_slave_thread(pthread_handler h
     while (start_id == *slave_run_id)
     {
       DBUG_PRINT("sleep",("Waiting for slave thread to start"));
-      const char* old_msg = thd->enter_cond(start_cond,cond_lock,
-                                            "Waiting for slave thread to start");
-      pthread_cond_wait(start_cond,cond_lock);
+      const char* old_msg= thd->enter_cond(start_cond,cond_lock,
+                                           "Waiting for slave thread to start");
+      /*
+        It is not sufficient to test this at loop bottom. We must test
+        it after registering the mutex in enter_cond(). If the kill
+        happens after testing of thd->killed and before the mutex is
+        registered, we would otherwise go waiting though thd->killed is
+        set.
+      */
+      if (!thd->killed)
+        pthread_cond_wait(start_cond,cond_lock);
       thd->exit_cond(old_msg);
       pthread_mutex_lock(cond_lock); // re-acquire it as exit_cond() released
       if (thd->killed)

=== modified file 'sql/sql_base.cc'
--- a/sql/sql_base.cc	2009-02-16 21:18:45 +0000
+++ b/sql/sql_base.cc	2009-02-25 15:46:59 +0000
@@ -888,6 +888,18 @@ static void kill_delayed_threads_for_tab
         when some write-locked table is being reopened (by FLUSH TABLES or
         ALTER TABLE) we have to rely on additional global shared metadata
         lock taken by thread trying to obtain global read lock.
+
+  @note 'killed': It is not sufficient to test thd->killed at the top of
+  the loop. If the thread is killed after it tested thd->killed and
+  before it registered the mutex (here: LOCK_open) in its mysys_var (in
+  enter_cond()), the killing thread won't do anything but set
+  thd->killed. If we don't check it again after registering the mutex,
+  we would start waiting, though thd->killed is set. The solution is to
+  check thd->killed again, after registering the mutex and before start
+  waiting. Since we cannot 'break' between enter_cond() and exit_cond(),
+  we just omit the wait. An additional advantage is the memory barrier,
+  included in the mutex handling. This prevents out-of-order execution
+  in the processor. Thus we cannot read an old value.
 */
 
 bool close_cached_tables(THD *thd, TABLE_LIST *tables, bool have_lock,
@@ -1018,7 +1030,8 @@ bool close_cached_tables(THD *thd, TABLE
       }
     }
 
-    if (found)
+    /* See function comment for 'killed'. */
+    if (found && ! thd->killed)
     {
       DBUG_PRINT("signal", ("Waiting for COND_refresh"));
       pthread_cond_wait(&COND_refresh,&LOCK_open);
@@ -7686,11 +7699,25 @@ void tdc_remove_table(THD *thd, enum_tdc
 
 
 /**
-   Wait until there are no old versions of tables in the table
-   definition cache for the metadata locks that we try to acquire.
+  Wait until there are no old versions of tables in the table
+  definition cache for the metadata locks that we try to acquire.
+
+  @param[in]    thd             thread context
+  @param[in]    context         metadata locking context with locks
+
+  @return       killed          if the thread has been killed
 
-   @param thd      Thread context
-   @param context  Metadata locking context with locks.
+  @note 'killed': It is not sufficient to test thd->killed at the top of
+  the loop. If the thread is killed after it tested thd->killed and
+  before it registered the mutex (here: LOCK_open) in its mysys_var (in
+  enter_cond()), the killing thread won't do anything but set
+  thd->killed. If we don't check it again after registering the mutex,
+  we would start waiting, though thd->killed is set. The solution is to
+  check thd->killed again, after registering the mutex and before start
+  waiting. Since we cannot 'break' between enter_cond() and exit_cond(),
+  we just omit the wait. An additional advantage is the memory barrier,
+  included in the mutex handling. This prevents out-of-order execution
+  in the processor. Thus we cannot read an old value.
 */
 
 static bool tdc_wait_for_old_versions(THD *thd, MDL_CONTEXT *context)
@@ -7728,7 +7755,17 @@ static bool tdc_wait_for_old_versions(TH
       break;
     }
     old_msg= thd->enter_cond(&COND_refresh, &LOCK_open, "Waiting for table");
-    pthread_cond_wait(&COND_refresh, &LOCK_open);
+    /* See function comment for 'killed'. */
+    if (!thd->killed)
+    {
+      /*
+        Call the sync point after enter_cond() so that the proc_info is
+        available immediately after the sync point sends a SIGNAL. This
+        makes tests more reliable.
+      */
+      DEBUG_SYNC(thd, "tdc_wait_for_table");
+      pthread_cond_wait(&COND_refresh, &LOCK_open);
+    }
     /* LOCK_open mutex is unlocked by THD::exit_cond() as side-effect. */
     thd->exit_cond(old_msg);
   }

=== modified file 'sql/sql_class.cc'
--- a/sql/sql_class.cc	2009-02-20 16:40:19 +0000
+++ b/sql/sql_class.cc	2009-02-25 15:46:59 +0000
@@ -907,36 +907,64 @@ void add_diff_to_status(STATUS_VAR *to_v
 #endif
 
 
+/**
+  Awake a thread.
+
+  @param[in]    state_to_set            value for THD::killed
+
+  This is normally called from another thread's THD object.
+
+  @note Do not call this within LOCK_thread_count.
+
+  @note Do always call this within LOCK_delete.
+*/
+
 void THD::awake(THD::killed_state state_to_set)
 {
   DBUG_ENTER("THD::awake");
   DBUG_PRINT("enter", ("this: %p", this));
   THD_CHECK_SENTRY(this);
-  safe_mutex_assert_owner(&LOCK_delete); 
+  safe_mutex_assert_owner(&LOCK_delete);
 
+  /*
+    Set the 'killed' flag of 'this', which is the target THD object.
+  */
   killed= state_to_set;
+
   if (state_to_set != THD::KILL_QUERY)
   {
+    /* Mark the target thread's alarm request expired, and signal alarm. */
     thr_alarm_kill(thread_id);
+
+    /* Send an event to "libevent" that a thread should be killed. */
     if (!slave_thread)
       thread_scheduler.post_kill_notification(this);
+
 #ifdef SIGNAL_WITH_VIO_CLOSE
     if (this != current_thd)
     {
       /*
-        In addition to a signal, let's close the socket of the thread that
-        is being killed. This is to make sure it does not block if the
-        signal is lost. This needs to be done only on platforms where
-        signals are not a reliable interruption mechanism.
-
-        If we're killing ourselves, we know that we're not blocked, so this
-        hack is not used.
+        In addition to a signal, let's close the socket of the thread
+        that is being killed ("this", which is not the current thread).
+        This is to make sure it does not block if the signal is lost.
+        Another reason is that the target thread may not have been
+        blocked on the socket at the moment when we sent the signal. It
+        may come later to the blocking point. The closed connection
+        prevents it from blocking.
+
+        If we are killing ourselves, we know that we are not blocked.
+        We also know that we will check thd->killed before we go for
+        reading the next statement.
       */
 
       close_active_vio();
     }
-#endif    
+#endif
   }
+
+  /*
+    Broadcast a condition to kick the target if it is waiting on it.
+  */
   if (mysys_var)
   {
     pthread_mutex_lock(&mysys_var->mutex);
@@ -962,6 +990,9 @@ void THD::awake(THD::killed_state state_
       It's true that we have set its thd->killed but it may not
       see it immediately and so may have time to reach the cond_wait().
 
+      However, where possible, we test for killed once again after
+      enter_cond(). This should make the signalling safe.
+
       We have to do the loop with trylock, because if we would use
       pthread_mutex_lock(), we can cause a deadlock as we are here locking
       the mysys_var->mutex and mysys_var->current_mutex in a different order

=== modified file 'sql/sql_class.h'
--- a/sql/sql_class.h	2009-02-20 16:40:19 +0000
+++ b/sql/sql_class.h	2009-02-25 15:46:59 +0000
@@ -1187,6 +1187,35 @@ struct Ha_data
 };
 
 
+/*
+  The initial intent of SIGNAL_WITH_VIO_CLOSE was to avoid closing of
+  the connection if the operating system has reliable signals. The
+  assumption was that setting THD::killed, followed by sending a signal
+  and broadcasting a condition can make the other thread aware of the
+  kill request in any situation.
+
+  Unfortunately, it was forgotton that there is always some code between
+  a test of thd->killed and the blocked operation. With
+  pthread_cond_wait() this is defeated by taking a mutex, registering
+  the condition, and then checking thd->killed again before calling the
+  blocking function.
+
+  For blocking system services like read() or recv(), we cannot do this.
+  They won't release a mutex or similar facility when blocked. When the
+  signal comes in while the code executes between the test of
+  thd->killed and read/recv() being blocked, the signal handler will
+  run, but when it returns, the code will continue to run into
+  read/recv().
+
+  The only reliable workaround is to close the connection. This will
+  prevent read/recv() to block, and even awake them if they are blocked
+  already.
+*/
+#ifndef SIGNAL_WITH_VIO_CLOSE
+#define SIGNAL_WITH_VIO_CLOSE 1
+#endif
+
+
 /**
   @class THD
   For each client connection we create a separate thread with THD serving as

=== modified file 'sql/sql_parse.cc'
--- a/sql/sql_parse.cc	2009-02-20 16:40:19 +0000
+++ b/sql/sql_parse.cc	2009-02-25 15:46:59 +0000
@@ -682,6 +682,22 @@ bool do_command(THD *thd)
 
   net_new_transaction(net);
 
+  /*
+    Synchronization point for testing of KILL_CONNECTION.
+    This sync point can wait here, to simulate slow code execution
+    between the last test of thd->killed and blocking in read().
+
+    The goal of this test is to verify that a connection does not
+    hang, if it is killed at this point of execution.
+    (Bug#37780 - main.kill fails randomly)
+
+    Note that the sync point wait itself will be terminated by a
+    kill. In this case it consumes a condition broadcast, but does
+    not change anything else. The consumed broadcast should not
+    matter here, because the read/recv() below doesn't use it.
+  */
+  DEBUG_SYNC(thd, "before_do_command_net_read");
+
   if ((packet_length= my_net_read(net)) == packet_error)
   {
     DBUG_PRINT("info",("Got error %d reading command from socket %s",

=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc	2009-02-16 21:18:45 +0000
+++ b/sql/sql_select.cc	2009-02-25 15:46:59 +0000
@@ -29,6 +29,7 @@
 #endif
 
 #include "mysql_priv.h"
+#include "debug_sync.h"
 #include "sql_select.h"
 #include "sql_cursor.h"
 
@@ -1420,6 +1421,7 @@ JOIN::optimize()
   if (optimized)
     DBUG_RETURN(0);
   optimized= 1;
+  DEBUG_SYNC(thd, "before_join_optimize");
 
   thd_proc_info(thd, "optimizing");
   


Attachment: [text/bzr-bundle] bzr/ingo.struewing@sun.com-20090225154659-kot4es2of2qrndbq.bundle
Thread
bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Struewing25 Feb
  • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Davi Arnaut5 May
    • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Strüwing5 May
      • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Konstantin Osipov5 May
        • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Strüwing6 May
          • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Konstantin Osipov6 May
            • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Strüwing6 May
              • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Konstantin Osipov6 May
                • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Strüwing7 May
                  • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Strüwing12 May
              • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Davi Arnaut6 May
      • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Davi Arnaut6 May
        • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Strüwing6 May
          • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Konstantin Osipov6 May
            • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Ingo Strüwing7 May
          • Re: bzr commit into mysql-6.0 branch (ingo.struewing:2779) Bug#37780Davi Arnaut7 May