From: Jon Olav Hauglid    Date: June 1 2011 9:14am
Subject: bzr push into mysql-trunk branch (jon.hauglid:3134 to 3135)
 3135 Jon Olav Hauglid	2011-06-01 [merge]
      Merge from mysql-5.5 to mysql-trunk
      Text conflict in mysql-test/r/innodb_mysql_lock.result
      Text conflict in storage/innobase/handler/handler0alter.cc

    modified:
      mysql-test/r/innodb_mysql_lock.result
      mysql-test/r/innodb_mysql_sync.result
      mysql-test/t/innodb_mysql_lock.test
      mysql-test/t/innodb_mysql_sync.test
      sql/ha_partition.cc
      sql/ha_partition.h
      sql/handler.h
      sql/sql_table.cc
      storage/innobase/handler/ha_innodb.cc
      storage/innobase/handler/ha_innodb.h
      storage/innobase/handler/handler0alter.cc
 3134 Davi Arnaut	2011-05-31
      Bug#11762221 - 54790: Use of non-blocking mode for sockets limits performance
      Bug#11758972 - 51244: wait_timeout fails on OpenSolaris
      
      The problem was that an optimization for the case when the server
      uses alarms for timeouts could cause a slowdown when socket
      timeouts are used instead. When alarms are used for timeouts,
      a non-blocking read is attempted first in order to avoid the
      cost of setting up an alarm; if this non-blocking read fails,
      the socket mode is changed to blocking and an alarm is armed.
      
      If a socket timeout is used, there is no point in attempting a
      non-blocking read first, as the timeout will be automatically
      armed by the OS. Yet the server would attempt a non-blocking
      read first and later switch the socket to blocking mode. This
      could inadvertently impact performance, as switching the blocking
      mode of a socket requires at least two calls into the kernel
      on Linux, apart from the problems inherent in the scalability
      of fcntl(2).
      
      The solution is to remove alarm-based timeouts from the
      protocol layer and push timeout handling down to the virtual
      I/O layer. This approach allows the handling of socket timeouts
      on a platform-specific basis. The blocking mode of the socket
      is no longer exported, and VIO read and write operations either
      complete or fail with an error or a timeout.
      
      On Linux, the MSG_DONTWAIT flag is used to enable non-blocking
      send and receive operations. If the operation would block,
      poll() is used to wait for readiness or until a timeout occurs.
      This strategy avoids the need to set the socket timeout and
      blocking mode twice per query.
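
A minimal sketch of this strategy, assuming a plain POSIX socket: attempt the receive with MSG_DONTWAIT first, and only when it would block, poll() for readiness up to the timeout. The function name `timed_recv` is illustrative, not the actual VIO symbol.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to len bytes, waiting at most timeout_ms for data.
   Returns bytes read, or -1 with errno set (ETIMEDOUT on timeout). */
ssize_t timed_recv(int fd, char *buf, size_t len, int timeout_ms)
{
  for (;;)
  {
    ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);
    if (n >= 0)
      return n;                          /* data, or orderly shutdown */
    if (errno != EAGAIN && errno != EWOULDBLOCK)
      return -1;                         /* hard error */
    struct pollfd pfd = { fd, POLLIN, 0 };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready == 0)                      /* emulate a socket timeout */
    {
      errno = ETIMEDOUT;
      return -1;
    }
    if (ready < 0 && errno != EINTR)
      return -1;
    /* readable, or interrupted by a signal: retry the recv() */
  }
}
```

Note that the blocking mode of the socket is never toggled: the flag applies per call, which is exactly what avoids the two fcntl(2) round-trips per query.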
      
      On Windows, as before, the timeout is set on a per-socket
      basis. On all remaining operating systems, the socket is set
      to non-blocking mode and poll() is used to wait for readiness
      or until a timeout occurs.
      
      In order to clean up the code after the removal of alarm-based
      timeouts, the low level packet reading loop is unrolled into
      two specific sequences: reading the packet header and the
      payload. This makes error handling easier down the road.
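
The two-step structure can be sketched as follows, assuming the MySQL wire format in which every packet starts with a 4-byte header: a 3-byte little-endian payload length followed by a 1-byte sequence number. Helper names are illustrative, not the actual net_serv.cc symbols.

```c
#include <assert.h>
#include <stdint.h>

#define PACKET_HEADER_SIZE 4

/* Step 1: decode the fixed-size header read off the wire. */
void parse_packet_header(const unsigned char hdr[PACKET_HEADER_SIZE],
                         uint32_t *payload_len, uint8_t *seq)
{
  *payload_len = (uint32_t) hdr[0]
               | ((uint32_t) hdr[1] << 8)
               | ((uint32_t) hdr[2] << 16);
  *seq = hdr[3];
}

/* Step 2 then reads exactly *payload_len bytes of payload. Under the
   new behavior, an oversized packet is grounds for disconnecting the
   session rather than skipping the excess bytes. */
int payload_acceptable(uint32_t payload_len, uint32_t max_allowed_packet)
{
  return payload_len <= max_allowed_packet;
}
```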
      
      In conclusion, benchmarks have shown that these changes do not
      introduce any performance hits and actually slightly improve
      the server throughput for higher numbers of threads.
      
      - Incompatible changes:
      
      A timeout is now always applied to an individual receive or
      send I/O operation. In contrast, an alarm-based timeout was
      applied to an entire send or receive packet operation. That
      is, before this patch the timeout was really a time limit
      for sending or reading one packet.
      
      Building and running MySQL on POSIX systems now requires
      support for poll() and O_NONBLOCK. These should be available
      in any modern POSIX system. In other words, except for Windows,
      legacy (non-POSIX) systems which only support O_NDELAY and
      select() are no longer supported.
      
      On Windows, the default value for MYSQL_OPT_CONNECT_TIMEOUT
      is no longer 20 seconds. The default value is now no timeout
      (infinite), the same as on all other platforms.
      
      Packets bigger than the maximum allowed packet size are no
      longer skipped. Before this patch, if an application sent a
      packet bigger than the maximum allowed packet size, or if
      the server failed to allocate a buffer sufficiently large
      to hold the packet, the server would keep reading the packet
      until its end. Now the session is simply disconnected if the
      server cannot handle such large packets.
      
      The client socket buffer is no longer cleared (drained)
      before sending commands to the server. Before this patch,
      any data left in the socket buffer would be drained (removed)
      before a command was sent to the server, in order to work
      around bugs where the server would violate the protocol and
      send more data. The only check left is a debug-only assertion
      to ensure that the socket buffer is empty.
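
The remaining debug-only check can be sketched as follows, assuming a POSIX socket: MSG_PEEK | MSG_DONTWAIT inspects pending data without consuming it or blocking, so an empty receive buffer surfaces as EAGAIN/EWOULDBLOCK. The helper name is illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if no unread data is pending on the socket, 0 otherwise.
   Peeking leaves any protocol-violating data in place for diagnosis
   instead of silently draining it. */
int socket_buffer_is_empty(int fd)
{
  char byte;
  ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
  return n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```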
     @ cmake/os/Windows.cmake
        Move to global configure.cmake
     @ configure.cmake
        Socket timeout detection is no longer needed, as socket
        timeouts are only used on Windows, where they are always
        present.
     @ extra/yassl/include/openssl/ssl.h
        Export a set of hooks for yaSSL to use in order to send and
        receive data.
     @ extra/yassl/include/openssl/transport_types.h
        Introduce header that defines the type of the transport
        functions used for sending and receiving data. Header is
        necessary to expose the types externally and internally,
        without the need to include ssl.h internally.
     @ extra/yassl/include/socket_wrapper.hpp
        Add variables to hold the callbacks (and argument to callback)
        used for sending and receiving data.
     @ extra/yassl/src/handshake.cpp
        Wait for input in the actual receive data operation. If the
        amount of data available in the socket buffer cannot be pre-
        determined, use a minimum value of 64 bytes.
     @ extra/yassl/src/socket_wrapper.cpp
        Use transport callbacks for sending and receiving data. Use
        the system-provided send() and recv() by default.
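
The callback indirection can be sketched as below, with the hooks defaulting to the system send()/recv(). The typedefs and struct are illustrative; yaSSL's actual names differ.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

typedef long (*transport_send_t)(void *arg, const char *buf, long len);
typedef long (*transport_recv_t)(void *arg, char *buf, long len);

struct transport
{
  transport_send_t send_cb;
  transport_recv_t recv_cb;
  void *arg;                     /* opaque callback argument, e.g. a Vio* */
};

/* Default hooks: here arg carries a pointer to the file descriptor. */
static long default_send(void *arg, const char *buf, long len)
{
  return (long) send(*(int *) arg, buf, (size_t) len, 0);
}

static long default_recv(void *arg, char *buf, long len)
{
  return (long) recv(*(int *) arg, buf, (size_t) len, 0);
}

void transport_init(struct transport *t, int *fd)
{
  t->send_cb = default_send;
  t->recv_cb = default_recv;
  t->arg = fd;
}
```

A caller (such as the server's VIO layer) can then override the callbacks so that SSL traffic flows through the same timeout-aware I/O path as plain traffic.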
     @ extra/yassl/src/ssl.cpp
        Implement transport set functions.
     @ include/my_global.h
        A socket timeout is now signalled with an ETIMEDOUT error,
        which in some cases is emulated (that is, set explicitly).
        
        Add a wrapper for ECONNRESET so that it can be used to map
        SSL-specific errors.
     @ include/my_sys.h
        Remove as big packets are no longer skipped.
     @ include/mysql_com.h
        Remove as big packets are no longer skipped.
     @ include/violite.h
        Introduce a vio_io_wait method which can be used to wait
        for I/O events on a VIO. The supported I/O events are read
        and write. The notification of error conditions is implicit.
        
        Extend vio_reset, which was SSL-specific, to be able to
        rebind a new socket to an already initialized Vio object.
        
        Remove vio_is_blocking. The blocking mode is no longer
        exported. Also, remove fcntl_mode, as the blocking mode is
        no longer cached: it is now a product of the timeout.
        
        Add prototype for a wrapper vio_timeout() function, which
        invokes the underlying timeout method of the VIO. Add to
        VIO two member variables to hold the read and write timeout
        values.
        
        Rename vio_was_interrupted to vio_was_timeout.
        
        Remove vio_poll_read, which is now superseded by vio_io_wait.
        
        Introduce vio_socket_connect to avoid code duplication.
     @ mysql-test/include/mysqlhotcopy.inc
        Perl's die() exits with errno ($!) if it is set, carrying
        over the errno value set by one of the internal functions
        for whatever reason.
     @ mysql-test/t/myisam_debug.test
        Reap the statement result before disconnecting; otherwise it
        might trigger an assertion or crash for embedded runs.
     @ mysql-test/t/ssl.test
        Add test case to ensure that timeouts work over SSL.
     @ mysql-test/t/wait_timeout.test
        Add test case for Bug#11762221 - 54790.
     @ mysql-test/t/xa.test
        Reap the statement result before disconnecting; otherwise it
        might trigger an assertion or crash for embedded runs.
     @ sql-common/client.c
        Look into last_errno to check whether the socket was
        interrupted due to a timeout.
        
        Add a couple of wrappers to retrieve the connect_timeout
        option value.
        
        Use vio_io_wait instead of vio_poll_read.
        
        The result of a failed socket operation is in socket_errno,
        which is WSAGetLastError on Windows.
        
        Replace my_connect() and wait_for_data() with vio_socket_connect()
        to avoid code duplication. The polling inside wait_for_data()
        is somewhat equivalent to vio_io_wait(). Also, this can be
        considered a bug fix as wait_for_data() would incorrectly
        use POLLIN (there is data to read) to wait (poll) for the
        connection to be established. This problem probably never
        surfaced because the OS network layer timeout is usually
        lower than our default.
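
The corrected wait can be sketched as follows, assuming a non-blocking connect(): completion is signalled by writability (POLLOUT), not by readable data (POLLIN), and the final status must be fetched via SO_ERROR. Names are illustrative, not the actual vio_socket_connect code.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Wait for a pending non-blocking connect() to finish.
   Returns 0 on success, otherwise an errno value. */
int wait_for_connect(int fd, int timeout_ms)
{
  struct pollfd pfd = { fd, POLLOUT, 0 };   /* POLLOUT, not POLLIN */
  int ready = poll(&pfd, 1, timeout_ms);
  if (ready == 0)
    return ETIMEDOUT;
  if (ready < 0)
    return errno;
  /* Writability alone is not proof of success: read the result. */
  int err = 0;
  socklen_t len = sizeof err;
  if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0)
    return errno;
  return err;                               /* 0 if connect succeeded */
}
```

Waiting on POLLIN instead would only wake up once the peer sent data, which for most protocols never happens before the client speaks first.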
     @ sql/net_serv.cc
        Remove net_data_is_ready; it is no longer necessary, as its
        functionality is now implemented by vio_io_wait.
        
        Remove the use of signal based alarms for timeouts. Instead,
        rely on the underlying VIO layer to provide socket timeouts
        and control the blocking mode. The error handling is changed
        to use vio_was_timeout() to recognize a timeout occurrence
        and vio_should_retry() to detect interrupted I/O operations.
        
        The loop in the packet reading path (my_real_read) is unrolled
        into two steps: reading the packet header and payload. Each
        step is now documented and should be easier to understand.
        
        net_clear() no longer drains the socket buffer. Unread data
        on the socket constitutes a protocol violation and shouldn't
        be skipped. The previous behavior was race-prone anyway,
        as it only drained data already in the socket buffer, missing
        any data in transit. Instead, net_clear() now just asserts
        that there is no data left in the read and socket buffers.
        
        Remove use of the nonexistent MYSQL_INSTANCE_MANAGER macro.
     @ sql/sql_acl.cc
        Remove as big packets are no longer skipped.
     @ sql/sql_cache.cc
        Rename net_real_write to net_write_packet.
     @ sql/sql_class.cc
        Remove as big packets are no longer skipped.
     @ sql/sql_class.h
        Remove as big packets are no longer skipped.
     @ tests/mysql_client_test.c
        Add test case to ensure that the MYSQL_OPT_READ_TIMEOUT option
        works as expected.
     @ vio/CMakeLists.txt
        Add new files to build.
     @ vio/vio.c
        Rename no_poll_read to no_io_wait. The stub method is used
        for the transport types named pipe and shared memory.
        
        Remove the hPipe parameter from vio_init; it is now set in
        the vio_new_win32pipe function.
        
        Initialize timeout values to -1 (infinite) in vio_init.
        
        Improve vio_reset to also reinitialize timeouts and assert
        that rebinding is only supported for socket-based transport
        types.
        
        Move event object creation to its own initialization function
        and perform error checking.
        
        Remove caching of the socket status flags, including blocking
        mode. It is no longer necessary.
        
        Add a vio_timeout wrapper which adjusts the timeout value
        from seconds to milliseconds and invokes the underlying
        VIO-specific timeout handler. It also helps the underlying
        handler detect the previous state of the timeouts (activated
        or deactivated).
     @ vio/vio_priv.h
        Remove prototypes for functions that are being removed.
        Introduce prototypes for functions that are added.
     @ vio/viopipe.c
        Pipe transport code moved from viosocket.c. Also, simplify
        it and adjust to handle the new timeout interface.
     @ vio/vioshm.c
        Move shared memory related code over from viosocket.c and
        adjust to handle the new timeout interface.
     @ vio/viosocket.c
        Use vio_ssl_errno to retrieve the error number if the VIO
        type is SSL.
        
        Add the vio_socket_io_wait() helper which wraps vio_io_wait()
        in order to avoid code duplication in vio_write()/vio_read().
        The function selects an appropriate timeout (based on the passed
        event) and translates the return value.
        
        Re-implement vio_read()/vio_write() as simple event loops. The
        loop consists of retrying the I/O operation until it succeeds
        or fails with a non-recoverable error. If it fails with error
        set to EAGAIN or EWOULDBLOCK, vio_io_wait() is used to wait
        for the socket to become ready. On Linux, the MSG_DONTWAIT
        flag is used to get the same effect as if the socket is in
        non-blocking mode.
        
        Add vio_set_blocking() which tweaks the blocking mode of
        a socket.
        
        Add vio_socket_timeout() which implements the logic around
        when a socket should be switched to non-blocking mode. Except
        on Windows, the socket is put into non-blocking mode if a
        timeout is set.
        
        vio_should_retry now indicates whether an I/O operation was
        interrupted by a temporary failure (interrupted by a signal)
        and should be retried later. vio_was_timeout indicates
        whether an I/O operation timed out.
        
        Remove socket_poll_read, now superseded by vio_io_wait.
        
        Implement vio_io_wait() using select() on Windows. Otherwise,
        use poll(). vio_io_wait() will loop until either a timeout
        occurs, or the socket becomes ready. In the presence of
        spurious wake-ups, the wait is resumed if possible.
        
        Move code related to the transport types named pipe and
        shared memory to their own files.
        
        Introduce vio_socket_connect to avoid code duplication. The
        function implements a timed connect by setting the socket
        to non-blocking and polling until the connection is either
        established or fails.
     @ vio/viossl.c
        Re-implement vio_ssl_read() and vio_ssl_write() as simple
        event loops. Use vio_socket_io_wait() to handle polling, as
        the SSL transport can only be used with sockets.
        
        Obtain the error status from SSL. Map the obtained error
        code to a platform equivalent one.
        
        Emulate blocking I/O when yaSSL is being used.

    added:
      extra/yassl/include/openssl/transport_types.h
      vio/viopipe.c
      vio/vioshm.c
    modified:
      cmake/os/Windows.cmake
      configure.cmake
      extra/yassl/include/openssl/ssl.h
      extra/yassl/include/socket_wrapper.hpp
      extra/yassl/src/handshake.cpp
      extra/yassl/src/socket_wrapper.cpp
      extra/yassl/src/ssl.cpp
      include/my_global.h
      include/my_sys.h
      include/mysql.h.pp
      include/mysql_com.h
      include/violite.h
      mysql-test/include/mysqlhotcopy.inc
      mysql-test/r/myisam_debug.result
      mysql-test/r/ssl.result
      mysql-test/r/wait_timeout.result
      mysql-test/t/myisam_debug.test
      mysql-test/t/ssl.test
      mysql-test/t/wait_timeout.test
      mysql-test/t/xa.test
      sql-common/client.c
      sql/net_serv.cc
      sql/sql_acl.cc
      sql/sql_cache.cc
      sql/sql_class.cc
      sql/sql_class.h
      tests/mysql_client_test.c
      vio/CMakeLists.txt
      vio/vio.c
      vio/vio_priv.h
      vio/viosocket.c
      vio/viossl.c
=== modified file 'mysql-test/r/innodb_mysql_lock.result'
--- a/mysql-test/r/innodb_mysql_lock.result	2011-05-10 10:55:34 +0000
+++ b/mysql-test/r/innodb_mysql_lock.result	2011-06-01 09:11:28 +0000
@@ -153,6 +153,7 @@ DROP VIEW v1;
 #              KEY NO 0 FOR TABLE IN ERROR LOG 
 #
 DROP TABLE IF EXISTS t1;
+# Test 1: Secondary index
 # Connection default
 CREATE TABLE t1 (id INT PRIMARY KEY, value INT) ENGINE = InnoDB;
 INSERT INTO t1 VALUES (1, 12345);
@@ -164,12 +165,31 @@ id	value
 SET lock_wait_timeout=1;
 ALTER TABLE t1 ADD INDEX idx(value);
 ERROR HY000: Lock wait timeout exceeded; try restarting transaction
+ALTER TABLE t1 ADD INDEX idx(value);
+ERROR HY000: Lock wait timeout exceeded; try restarting transaction
 # Connection default
 SELECT * FROM t1;
 id	value
 1	12345
 COMMIT;
 DROP TABLE t1;
+# Test 2: Primary index
+CREATE TABLE t1(a INT NOT NULL, b INT NOT NULL) engine=innodb;
+INSERT INTO t1 VALUES (1, 12345), (2, 23456);
+# Connection con1
+SET SESSION debug= "+d,alter_table_rollback_new_index";
+ALTER TABLE t1 ADD PRIMARY KEY(a);
+ERROR HY000: Unknown error
+SELECT * FROM t1;
+a	b
+1	12345
+2	23456
+# Connection default
+SELECT * FROM t1;
+a	b
+1	12345
+2	23456
+DROP TABLE t1;
 #
 # Bug#11747690 33650: MYSQL_ALTER_TABLE() UNNECESSARILY DOES
 #              FULL TABLE COPY 

=== modified file 'mysql-test/r/innodb_mysql_sync.result'
--- a/mysql-test/r/innodb_mysql_sync.result	2011-04-13 18:43:08 +0000
+++ b/mysql-test/r/innodb_mysql_sync.result	2011-06-01 09:11:28 +0000
@@ -94,6 +94,74 @@ SET DEBUG_SYNC= 'RESET';
 # Bug#42230 during add index, cannot do queries on storage engines
 #           that implement add_index
 #
-#
-# DISABLED due to Bug#11815600
-#
+DROP DATABASE IF EXISTS db1;
+DROP TABLE IF EXISTS t1;
+# Test 1: Secondary index, should not block reads (original test case).
+# Connection default
+CREATE DATABASE db1;
+CREATE TABLE db1.t1(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, value INT) engine=innodb;
+INSERT INTO db1.t1(value) VALUES (1), (2);
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+# Sending:
+ALTER TABLE db1.t1 ADD INDEX(value);
+# Connection con1
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+USE db1;
+SELECT * FROM t1;
+id	value
+1	1
+2	2
+SET DEBUG_SYNC= "now SIGNAL query";
+# Connection default
+# Reaping: ALTER TABLE db1.t1 ADD INDEX(value)
+DROP DATABASE db1;
+# Test 2: Primary index (implicit), should block reads.
+CREATE TABLE t1(a INT NOT NULL, b INT NOT NULL) engine=innodb;
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+# Sending:
+ALTER TABLE t1 ADD UNIQUE INDEX(a);
+# Connection con1
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+USE test;
+# Sending:
+SELECT * FROM t1;
+# Connection con2
+# Waiting for SELECT to be blocked by the metadata lock on t1
+SET DEBUG_SYNC= "now SIGNAL query";
+# Connection default
+# Reaping: ALTER TABLE t1 ADD UNIQUE INDEX(a)
+# Connection con1
+# Reaping: SELECT * FROM t1
+a	b
+# Test 3: Primary index (explicit), should block reads.
+# Connection default
+ALTER TABLE t1 DROP INDEX a;
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+# Sending:
+ALTER TABLE t1 ADD PRIMARY KEY (a);
+# Connection con1
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+# Sending:
+SELECT * FROM t1;
+# Connection con2
+# Waiting for SELECT to be blocked by the metadata lock on t1
+SET DEBUG_SYNC= "now SIGNAL query";
+# Connection default
+# Reaping: ALTER TABLE t1 ADD PRIMARY KEY (a)
+# Connection con1
+# Reaping: SELECT * FROM t1
+a	b
+# Test 4: Secondary unique index, should not block reads.
+# Connection default
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+# Sending:
+ALTER TABLE t1 ADD UNIQUE (b);
+# Connection con1
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+SELECT * FROM t1;
+a	b
+SET DEBUG_SYNC= "now SIGNAL query";
+# Connection default
+# Reaping: ALTER TABLE t1 ADD UNIQUE (b)
+SET DEBUG_SYNC= "RESET";
+DROP TABLE t1;

=== modified file 'mysql-test/t/innodb_mysql_lock.test'
--- a/mysql-test/t/innodb_mysql_lock.test	2011-05-10 10:55:34 +0000
+++ b/mysql-test/t/innodb_mysql_lock.test	2011-06-01 09:11:28 +0000
@@ -290,6 +290,8 @@ DROP TABLE IF EXISTS t1;
 
 --connect (con1,localhost,root)
 
+--echo # Test 1: Secondary index
+
 --echo # Connection default
 connection default;
 CREATE TABLE t1 (id INT PRIMARY KEY, value INT) ENGINE = InnoDB;
@@ -300,6 +302,10 @@ SELECT * FROM t1;
 --echo # Connection con1
 --connection con1
 SET lock_wait_timeout=1;
+# Test with two timeouts, as the first version of this patch
+# only worked with one timeout.
+--error ER_LOCK_WAIT_TIMEOUT
+ALTER TABLE t1 ADD INDEX idx(value);
 --error ER_LOCK_WAIT_TIMEOUT
 ALTER TABLE t1 ADD INDEX idx(value);
 
@@ -308,6 +314,22 @@ ALTER TABLE t1 ADD INDEX idx(value);
 SELECT * FROM t1;
 COMMIT;
 DROP TABLE t1;
+
+--echo # Test 2: Primary index
+CREATE TABLE t1(a INT NOT NULL, b INT NOT NULL) engine=innodb;
+INSERT INTO t1 VALUES (1, 12345), (2, 23456);
+
+--echo # Connection con1
+--connection con1
+SET SESSION debug= "+d,alter_table_rollback_new_index";
+--error ER_UNKNOWN_ERROR
+ALTER TABLE t1 ADD PRIMARY KEY(a);
+SELECT * FROM t1;
+
+--echo # Connection default
+--connection default
+SELECT * FROM t1;
+DROP TABLE t1;
 disconnect con1;
 
 

=== modified file 'mysql-test/t/innodb_mysql_sync.test'
--- a/mysql-test/t/innodb_mysql_sync.test	2011-04-13 18:43:08 +0000
+++ b/mysql-test/t/innodb_mysql_sync.test	2011-06-01 09:11:28 +0000
@@ -152,133 +152,129 @@ disconnect con1;
 --echo #           that implement add_index
 --echo #
 
---echo #
---echo # DISABLED due to Bug#11815600
---echo #
+--disable_warnings
+DROP DATABASE IF EXISTS db1;
+DROP TABLE IF EXISTS t1;
+--enable_warnings
+
+connect(con1,localhost,root);
+connect(con2,localhost,root);
+
+--echo # Test 1: Secondary index, should not block reads (original test case).
+
+--echo # Connection default
+connection default;
+CREATE DATABASE db1;
+CREATE TABLE db1.t1(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, value INT) engine=innodb;
+INSERT INTO db1.t1(value) VALUES (1), (2);
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+--echo # Sending:
+--send ALTER TABLE db1.t1 ADD INDEX(value)
+
+--echo # Connection con1
+connection con1;
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+# Neither of these two statements should be blocked
+USE db1;
+SELECT * FROM t1;
+SET DEBUG_SYNC= "now SIGNAL query";
+
+--echo # Connection default
+connection default;
+--echo # Reaping: ALTER TABLE db1.t1 ADD INDEX(value)
+--reap
+DROP DATABASE db1;
+
+--echo # Test 2: Primary index (implicit), should block reads.
+
+CREATE TABLE t1(a INT NOT NULL, b INT NOT NULL) engine=innodb;
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+--echo # Sending:
+--send ALTER TABLE t1 ADD UNIQUE INDEX(a)
+
+--echo # Connection con1
+connection con1;
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+USE test;
+--echo # Sending:
+--send SELECT * FROM t1
+
+--echo # Connection con2
+connection con2;
+--echo # Waiting for SELECT to be blocked by the metadata lock on t1
+let $wait_condition= SELECT COUNT(*)= 1 FROM information_schema.processlist
+  WHERE state= 'Waiting for table metadata lock'
+  AND info='SELECT * FROM t1';
+--source include/wait_condition.inc
+SET DEBUG_SYNC= "now SIGNAL query";
+
+--echo # Connection default
+connection default;
+--echo # Reaping: ALTER TABLE t1 ADD UNIQUE INDEX(a)
+--reap
+
+--echo # Connection con1
+connection con1;
+--echo # Reaping: SELECT * FROM t1
+--reap
+
+--echo # Test 3: Primary index (explicit), should block reads.
+
+--echo # Connection default
+connection default;
+ALTER TABLE t1 DROP INDEX a;
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+--echo # Sending:
+--send ALTER TABLE t1 ADD PRIMARY KEY (a)
+
+--echo # Connection con1
+connection con1;
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+--echo # Sending:
+--send SELECT * FROM t1
+
+--echo # Connection con2
+connection con2;
+--echo # Waiting for SELECT to be blocked by the metadata lock on t1
+let $wait_condition= SELECT COUNT(*)= 1 FROM information_schema.processlist
+  WHERE state= 'Waiting for table metadata lock'
+  AND info='SELECT * FROM t1';
+--source include/wait_condition.inc
+SET DEBUG_SYNC= "now SIGNAL query";
+
+--echo # Connection default
+connection default;
+--echo # Reaping: ALTER TABLE t1 ADD PRIMARY KEY (a)
+--reap
+
+--echo # Connection con1
+connection con1;
+--echo # Reaping: SELECT * FROM t1
+--reap
+
+--echo # Test 4: Secondary unique index, should not block reads.
+
+--echo # Connection default
+connection default;
+SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
+--echo # Sending:
+--send ALTER TABLE t1 ADD UNIQUE (b)
+
+--echo # Connection con1
+connection con1;
+SET DEBUG_SYNC= "now WAIT_FOR manage";
+SELECT * FROM t1;
+SET DEBUG_SYNC= "now SIGNAL query";
+
+--echo # Connection default
+connection default;
+--echo # Reaping: ALTER TABLE t1 ADD UNIQUE (b)
+--reap
 
-#--disable_warnings
-#DROP DATABASE IF EXISTS db1;
-#DROP TABLE IF EXISTS t1;
-#--enable_warnings
-#
-#connect(con1,localhost,root);
-#connect(con2,localhost,root);
-#
-#--echo # Test 1: Secondary index, should not block reads (original test case).
-#
-#--echo # Connection default
-#connection default;
-#CREATE DATABASE db1;
-#CREATE TABLE db1.t1(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, value INT) engine=innodb;
-#INSERT INTO db1.t1(value) VALUES (1), (2);
-#SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
-#--echo # Sending:
-#--send ALTER TABLE db1.t1 ADD INDEX(value)
-#
-#--echo # Connection con1
-#connection con1;
-#SET DEBUG_SYNC= "now WAIT_FOR manage";
-# # Neither of these two statements should be blocked
-#USE db1;
-#SELECT * FROM t1;
-#SET DEBUG_SYNC= "now SIGNAL query";
-#
-#--echo # Connection default
-#connection default;
-#--echo # Reaping: ALTER TABLE db1.t1 ADD INDEX(value)
-#--reap
-#DROP DATABASE db1;
-#
-#--echo # Test 2: Primary index (implicit), should block reads.
-#
-#CREATE TABLE t1(a INT NOT NULL, b INT NOT NULL) engine=innodb;
-#SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
-#--echo # Sending:
-#--send ALTER TABLE t1 ADD UNIQUE INDEX(a)
-#
-#--echo # Connection con1
-#connection con1;
-#SET DEBUG_SYNC= "now WAIT_FOR manage";
-#USE test;
-#--echo # Sending:
-#--send SELECT * FROM t1
-#
-#--echo # Connection con2
-#connection con2;
-#--echo # Waiting for SELECT to be blocked by the metadata lock on t1
-#let $wait_condition= SELECT COUNT(*)= 1 FROM information_schema.processlist
-#  WHERE state= 'Waiting for table metadata lock'
-#  AND info='SELECT * FROM t1';
-#--source include/wait_condition.inc
-#SET DEBUG_SYNC= "now SIGNAL query";
-#
-#--echo # Connection default
-#connection default;
-#--echo # Reaping: ALTER TABLE t1 ADD UNIQUE INDEX(a)
-#--reap
-#
-#--echo # Connection con1
-#connection con1;
-#--echo # Reaping: SELECT * FROM t1
-#--reap
-#
-#--echo # Test 3: Primary index (explicit), should block reads.
-#
-#--echo # Connection default
-#connection default;
-#ALTER TABLE t1 DROP INDEX a;
-#SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
-#--echo # Sending:
-#--send ALTER TABLE t1 ADD PRIMARY KEY (a)
-#
-#--echo # Connection con1
-#connection con1;
-#SET DEBUG_SYNC= "now WAIT_FOR manage";
-#--echo # Sending:
-#--send SELECT * FROM t1
-#
-#--echo # Connection con2
-#connection con2;
-#--echo # Waiting for SELECT to be blocked by the metadata lock on t1
-#let $wait_condition= SELECT COUNT(*)= 1 FROM information_schema.processlist
-#  WHERE state= 'Waiting for table metadata lock'
-#  AND info='SELECT * FROM t1';
-#--source include/wait_condition.inc
-#SET DEBUG_SYNC= "now SIGNAL query";
-#
-#--echo # Connection default
-#connection default;
-#--echo # Reaping: ALTER TABLE t1 ADD PRIMARY KEY (a)
-#--reap
-#
-#--echo # Connection con1
-#connection con1;
-#--echo # Reaping: SELECT * FROM t1
-#--reap
-#
-#--echo # Test 4: Secondary unique index, should not block reads.
-#
-#--echo # Connection default
-#connection default;
-#SET DEBUG_SYNC= "alter_table_manage_keys SIGNAL manage WAIT_FOR query";
-#--echo # Sending:
-#--send ALTER TABLE t1 ADD UNIQUE (b)
-#
-#--echo # Connection con1
-#connection con1;
-#SET DEBUG_SYNC= "now WAIT_FOR manage";
-#SELECT * FROM t1;
-#SET DEBUG_SYNC= "now SIGNAL query";
-#
-#--echo # Connection default
-#connection default;
-#--echo # Reaping: ALTER TABLE t1 ADD UNIQUE (b)
-#--reap
-#
-#disconnect con1;
-#disconnect con2;
-#SET DEBUG_SYNC= "RESET";
-#DROP TABLE t1;
+disconnect con1;
+disconnect con2;
+SET DEBUG_SYNC= "RESET";
+DROP TABLE t1;
 
 
 # Check that all connections opened by test cases in this file are really

=== modified file 'sql/ha_partition.cc'
--- a/sql/ha_partition.cc	2011-05-26 15:20:09 +0000
+++ b/sql/ha_partition.cc	2011-06-01 09:11:28 +0000
@@ -6875,22 +6875,29 @@ bool ha_partition::check_if_incompatible
 
 
 /**
-  Support of fast or online add/drop index
+  Support of in-place add/drop index
 */
-int ha_partition::add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys)
+int ha_partition::add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys,
+                            handler_add_index **add)
 {
   handler **file;
   int ret= 0;
 
   DBUG_ENTER("ha_partition::add_index");
+  *add= new handler_add_index(table, key_info, num_of_keys);
   /*
     There has already been a check in fix_partition_func in mysql_alter_table
     before this call, which checks for unique/primary key violations of the
     partitioning function. So no need for extra check here.
   */
   for (file= m_file; *file; file++)
-    if ((ret=  (*file)->add_index(table_arg, key_info, num_of_keys)))
+  {
+    handler_add_index *add_index;
+    if ((ret= (*file)->add_index(table_arg, key_info, num_of_keys, &add_index)))
+      goto err;
+    if ((ret= (*file)->final_add_index(add_index, true)))
       goto err;
+  }
   DBUG_RETURN(ret);
 err:
   if (file > m_file)
@@ -6915,6 +6922,42 @@ err:
 }
 
 
+/**
+   Second phase of in-place add index.
+
+   @note If commit is false, index changes are rolled back by dropping the
+         added indexes. If commit is true, nothing is done as the indexes
+         were already made active in ::add_index()
+ */
+
+int ha_partition::final_add_index(handler_add_index *add, bool commit)
+{
+  DBUG_ENTER("ha_partition::final_add_index");
+  // Rollback by dropping indexes.
+  if (!commit)
+  {
+    TABLE *table_arg= add->table;
+    uint num_of_keys= add->num_of_keys;
+    handler **file;
+    uint *key_numbers= (uint*) ha_thd()->alloc(sizeof(uint) * num_of_keys);
+    uint old_num_of_keys= table_arg->s->keys;
+    uint i;
+    /* The newly created keys have the last id's */
+    for (i= 0; i < num_of_keys; i++)
+      key_numbers[i]= i + old_num_of_keys;
+    if (!table_arg->key_info)
+      table_arg->key_info= add->key_info;
+    for (file= m_file; *file; file++)
+    {
+      (void) (*file)->prepare_drop_index(table_arg, key_numbers, num_of_keys);
+      (void) (*file)->final_drop_index(table_arg);
+    }
+    if (table_arg->key_info == add->key_info)
+      table_arg->key_info= NULL;
+  }
+  DBUG_RETURN(0);
+}
+
 int ha_partition::prepare_drop_index(TABLE *table_arg, uint *key_num,
                                  uint num_of_keys)
 {

=== modified file 'sql/ha_partition.h'
--- a/sql/ha_partition.h	2011-05-16 11:32:07 +0000
+++ b/sql/ha_partition.h	2011-06-01 09:11:28 +0000
@@ -1052,7 +1052,9 @@ public:
     They are used for on-line/fast alter table add/drop index:
     -------------------------------------------------------------------------
   */
-  virtual int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys);
+  virtual int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys,
+                        handler_add_index **add);
+  virtual int final_add_index(handler_add_index *add, bool commit);
   virtual int prepare_drop_index(TABLE *table_arg, uint *key_num,
                                  uint num_of_keys);
   virtual int final_drop_index(TABLE *table_arg);

=== modified file 'sql/handler.h'
--- a/sql/handler.h	2011-05-12 17:29:19 +0000
+++ b/sql/handler.h	2011-06-01 09:11:28 +0000
@@ -1214,6 +1214,27 @@ uint calculate_key_len(TABLE *, uint, co
 */
 #define make_prev_keypart_map(N) (((key_part_map)1 << (N)) - 1)
 
+
+/**
+  Index creation context.
+  Created by handler::add_index() and freed by handler::final_add_index().
+*/
+
+class handler_add_index
+{
+public:
+  /* Table where the indexes are added */
+  TABLE* const table;
+  /* Indexes being created */
+  KEY* const key_info;
+  /* Size of key_info[] */
+  const uint num_of_keys;
+  handler_add_index(TABLE *table_arg, KEY *key_info_arg, uint num_of_keys_arg)
+    : table (table_arg), key_info (key_info_arg), num_of_keys (num_of_keys_arg)
+  {}
+  virtual ~handler_add_index() {}
+};
+
 /**
   The handler class is the interface for dynamically loadable
   storage engines. Do not add ifdefs and take care when adding or
@@ -1909,8 +1930,36 @@ public:
 
   virtual ulong index_flags(uint idx, uint part, bool all_parts) const =0;
 
-  virtual int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys)
+/**
+   First phase of in-place add index.
+   Handlers are supposed to create new indexes here but not make them
+   visible.
+
+   @param table_arg   Table to add index to
+   @param key_info    Information about new indexes
+   @param num_of_keys Number of new indexes
+   @param[out] add    Context of handler specific information needed
+                      for final_add_index().
+
+   @note This function can be called with a metadata lock weaker than
+   exclusive, depending on which flags are returned by alter_table_flags().
+*/
+  virtual int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys,
+                        handler_add_index **add)
+  { return (HA_ERR_WRONG_COMMAND); }
+
+/**
+   Second and last phase of in-place add index.
+   Commit or rollback pending new indexes.
+
+   @param add     Context of handler specific information from add_index().
+   @param commit  If true, commit. If false, rollback index changes.
+
+   @note This function is called with exclusive metadata lock.
+*/
+  virtual int final_add_index(handler_add_index *add, bool commit)
   { return (HA_ERR_WRONG_COMMAND); }
+
   virtual int prepare_drop_index(TABLE *table_arg, uint *key_num,
                                  uint num_of_keys)
   { return (HA_ERR_WRONG_COMMAND); }

=== modified file 'sql/sql_table.cc'
--- a/sql/sql_table.cc	2011-05-26 15:20:09 +0000
+++ b/sql/sql_table.cc	2011-06-01 09:11:28 +0000
@@ -5931,6 +5931,8 @@ bool mysql_alter_table(THD *thd,char *ne
   uint index_drop_count= 0;
   uint *index_drop_buffer= NULL;
   uint index_add_count= 0;
+  handler_add_index *add= NULL;
+  bool pending_inplace_add_index= false;
   uint *index_add_buffer= NULL;
   uint candidate_key_count= 0;
   bool no_pk;
@@ -6687,6 +6689,9 @@ bool mysql_alter_table(THD *thd,char *ne
   }
   thd->count_cuted_fields= CHECK_FIELD_IGNORE;
 
+  if (error)
+    goto err_new_table_cleanup;
+
   /* If we did not need to copy, we might still need to add/drop indexes. */
   if (! new_table)
   {
@@ -6718,7 +6723,8 @@ bool mysql_alter_table(THD *thd,char *ne
           key_part->field= table->field[key_part->fieldnr];
       }
       /* Add the indexes. */
-      if ((error= table->file->add_index(table, key_info, index_add_count)))
+      if ((error= table->file->add_index(table, key_info, index_add_count,
+                                         &add)))
       {
         /*
           Exchange the key_info for the error message. If we exchange
@@ -6730,11 +6736,27 @@ bool mysql_alter_table(THD *thd,char *ne
         table->key_info= save_key_info;
         goto err_new_table_cleanup;
       }
+      pending_inplace_add_index= true;
     }
     /*end of if (index_add_count)*/
 
     if (index_drop_count)
     {
+      /* Currently we must finalize the index add before also dropping indexes */
+      if (pending_inplace_add_index)
+      {
+        /* Committing index changes needs exclusive metadata lock. */
+        DBUG_ASSERT(thd->mdl_context.is_lock_owner(MDL_key::TABLE,
+                                                   table_list->db,
+                                                   table_list->table_name,
+                                                   MDL_EXCLUSIVE));
+        if ((error= table->file->final_add_index(add, true)))
+        {
+          table->file->print_error(error, MYF(0));
+          goto err_new_table_cleanup;
+        }
+        pending_inplace_add_index= false;
+      }
       /* The prepare_drop_index() method takes an array of key numbers. */
       key_numbers= (uint*) thd->alloc(sizeof(uint) * index_drop_count);
       keyno_p= key_numbers;
@@ -6775,11 +6797,14 @@ bool mysql_alter_table(THD *thd,char *ne
   }
   /*end of if (! new_table) for add/drop index*/
 
-  if (error)
-    goto err_new_table_cleanup;
-
   if (table->s->tmp_table != NO_TMP_TABLE)
   {
+    /*
+      In-place operations are not supported for temporary tables, so
+      we don't have to call final_add_index() in this case. The assert
+      verifies that in-place add index has not been done.
+    */
+    DBUG_ASSERT(!pending_inplace_add_index);
     /* Close lock if this is a transactional table */
     if (thd->lock)
     {
@@ -6846,8 +6871,30 @@ bool mysql_alter_table(THD *thd,char *ne
     my_casedn_str(files_charset_info, old_name);
 
   if (wait_while_table_is_used(thd, table, HA_EXTRA_PREPARE_FOR_RENAME))
+  {
+    if (pending_inplace_add_index)
+    {
+      pending_inplace_add_index= false;
+      table->file->final_add_index(add, false);
+    }
+    // Mark this TABLE instance as stale to avoid out-of-sync index information.
+    table->m_needs_reopen= true;
     goto err_new_table_cleanup;
-
+  }
+  if (pending_inplace_add_index)
+  {
+    pending_inplace_add_index= false;
+    DBUG_EXECUTE_IF("alter_table_rollback_new_index", {
+      table->file->final_add_index(add, false);
+      my_error(ER_UNKNOWN_ERROR, MYF(0));
+      goto err_new_table_cleanup;
+    });
+    if ((error= table->file->final_add_index(add, true)))
+    {
+      table->file->print_error(error, MYF(0));
+      goto err_new_table_cleanup;
+    }
+  }
 
   close_all_tables_for_name(thd, table->s,
                             new_name != table_name || new_db != db);

=== modified file 'storage/innobase/handler/ha_innodb.cc'
--- a/storage/innobase/handler/ha_innodb.cc	2011-05-31 09:30:59 +0000
+++ b/storage/innobase/handler/ha_innodb.cc	2011-06-01 09:11:28 +0000
@@ -2848,8 +2848,10 @@ innobase_alter_table_flags(
 	uint	flags)
 {
 	return(HA_INPLACE_ADD_INDEX_NO_READ_WRITE
+		| HA_INPLACE_ADD_INDEX_NO_WRITE
 		| HA_INPLACE_DROP_INDEX_NO_READ_WRITE
 		| HA_INPLACE_ADD_UNIQUE_INDEX_NO_READ_WRITE
+		| HA_INPLACE_ADD_UNIQUE_INDEX_NO_WRITE
 		| HA_INPLACE_DROP_UNIQUE_INDEX_NO_READ_WRITE
 		| HA_INPLACE_ADD_PK_INDEX_NO_READ_WRITE);
 }

=== modified file 'storage/innobase/handler/ha_innodb.h'
--- a/storage/innobase/handler/ha_innodb.h	2011-04-11 14:57:47 +0000
+++ b/storage/innobase/handler/ha_innodb.h	2011-06-01 09:11:28 +0000
@@ -213,7 +213,9 @@ class ha_innobase: public handler
 	bool primary_key_is_clustered();
 	int cmp_ref(const uchar *ref1, const uchar *ref2);
 	/** Fast index creation (smart ALTER TABLE) @see handler0alter.cc @{ */
-	int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys);
+	int add_index(TABLE *table_arg, KEY *key_info, uint num_of_keys,
+		      handler_add_index **add);
+	int final_add_index(handler_add_index *add, bool commit);
 	int prepare_drop_index(TABLE *table_arg, uint *key_num,
 			       uint num_of_keys);
 	int final_drop_index(TABLE *table_arg);

=== modified file 'storage/innobase/handler/handler0alter.cc'
--- a/storage/innobase/handler/handler0alter.cc	2011-05-31 09:30:59 +0000
+++ b/storage/innobase/handler/handler0alter.cc	2011-06-01 09:11:28 +0000
@@ -642,6 +642,18 @@ innobase_create_temporary_tablename(
 	return(name);
 }
 
+class ha_innobase_add_index : public handler_add_index
+{
+public:
+	/** table where the indexes are being created */
+	dict_table_t* indexed_table;
+	ha_innobase_add_index(TABLE* table, KEY* key_info, uint num_of_keys,
+			      dict_table_t* indexed_table_arg) :
+		handler_add_index(table, key_info, num_of_keys),
+		indexed_table (indexed_table_arg) {}
+	~ha_innobase_add_index() {}
+};
+
 /*******************************************************************//**
 Clean up on ha_innobase::add_index error. */
 static
@@ -694,12 +706,15 @@ UNIV_INTERN
 int
 ha_innobase::add_index(
 /*===================*/
-	TABLE*	table,		/*!< in: Table where indexes are created */
-	KEY*	key_info,	/*!< in: Indexes to be created */
-	uint	num_of_keys)	/*!< in: Number of indexes to be created */
+	TABLE*			table,		/*!< in: Table where indexes
+						are created */
+	KEY*			key_info,	/*!< in: Indexes
+						to be created */
+	uint			num_of_keys,	/*!< in: Number of indexes
+						to be created */
+	handler_add_index**	add)		/*!< out: context */
 {
 	dict_index_t**	index;		/*!< Index to be created */
-	dict_table_t*	innodb_table;	/*!< InnoDB table in dictionary */
 	dict_table_t*	indexed_table;	/*!< Table where indexes are created */
 	merge_index_def_t* index_defs;	/*!< Index definitions */
 	mem_heap_t*     heap;		/*!< Heap for index definitions */
@@ -715,6 +730,8 @@ ha_innobase::add_index(
 	ut_a(key_info);
 	ut_a(num_of_keys);
 
+	*add = NULL;
+
 	if (srv_created_new_raw || srv_force_recovery) {
 		DBUG_RETURN(HA_ERR_WRONG_COMMAND);
 	}
@@ -732,28 +749,28 @@ ha_innobase::add_index(
 
 	indexed_table = dict_table_open_on_name(prebuilt->table->name, FALSE);
 
-	innodb_table = indexed_table;
-
-	if (UNIV_UNLIKELY(!innodb_table)) {
+	if (UNIV_UNLIKELY(!indexed_table)) {
 		DBUG_RETURN(HA_ERR_NO_SUCH_TABLE);
 	}
 
+	ut_a(indexed_table == prebuilt->table);
+
 	/* Check that index keys are sensible */
-	error = innobase_check_index_keys(key_info, num_of_keys, innodb_table);
+	error = innobase_check_index_keys(key_info, num_of_keys, prebuilt->table);
 
 	if (UNIV_UNLIKELY(error)) {
-		dict_table_close(innodb_table, FALSE);
+		dict_table_close(prebuilt->table, FALSE);
 		DBUG_RETURN(error);
 	}
 
 	/* Check each index's column length to make sure they do not
 	exceed limit */
 	for (ulint i = 0; i < num_of_keys; i++) {
-		error = innobase_check_column_length(innodb_table,
+		error = innobase_check_column_length(prebuilt->table,
 						     &key_info[i]);
 
 		if (error) {
-			dict_table_close(innodb_table, FALSE);
+			dict_table_close(prebuilt->table, FALSE);
 			DBUG_RETURN(error);
 		}
 	}
@@ -774,8 +791,8 @@ ha_innobase::add_index(
 
 	row_mysql_lock_data_dictionary(trx);
 
-	if (innodb_table->can_be_evicted) {
-		dict_table_move_from_lru_to_non_lru(innodb_table);
+	if (prebuilt->table->can_be_evicted) {
+		dict_table_move_from_lru_to_non_lru(prebuilt->table);
 	}
 
 	row_mysql_unlock_data_dictionary(trx);
@@ -787,7 +804,7 @@ ha_innobase::add_index(
 	num_of_idx = num_of_keys;
 
 	index_defs = innobase_create_key_def(
-		trx, innodb_table, heap, key_info, num_of_idx);
+		trx, prebuilt->table, heap, key_info, num_of_idx);
 
 	new_primary = DICT_CLUSTERED & index_defs[0].ind_type;
 
@@ -801,7 +818,7 @@ ha_innobase::add_index(
 	trx_set_dict_operation(trx, TRX_DICT_OP_INDEX);
 
 	/* Acquire a lock on the table before creating any indexes. */
-	error = row_merge_lock_table(prebuilt->trx, innodb_table,
+	error = row_merge_lock_table(prebuilt->trx, prebuilt->table,
 				     new_primary ? LOCK_X : LOCK_S);
 
 	if (UNIV_UNLIKELY(error != DB_SUCCESS)) {
@@ -815,7 +832,7 @@ ha_innobase::add_index(
 	row_mysql_lock_data_dictionary(trx);
 	dict_locked = TRUE;
 
-	ut_d(dict_table_check_for_dup_indexes(innodb_table, FALSE));
+	ut_d(dict_table_check_for_dup_indexes(prebuilt->table, FALSE));
 
 	/* If a new primary key is defined for the table we need
 	to drop the original table and rebuild all indexes. */
@@ -824,15 +841,15 @@ ha_innobase::add_index(
 		/* This transaction should be the only one
 		operating on the table. The table get above
 		would have incremented the ref count to 2. */
-		ut_a(innodb_table->n_ref_count == 2);
+		ut_a(prebuilt->table->n_ref_count == 2);
 
 		char*	new_table_name = innobase_create_temporary_tablename(
-			heap, '1', innodb_table->name);
+			heap, '1', prebuilt->table->name);
 
 		/* Clone the table. */
 		trx_set_dict_operation(trx, TRX_DICT_OP_TABLE);
 		indexed_table = row_merge_create_temporary_table(
-			new_table_name, index_defs, innodb_table, trx);
+			new_table_name, index_defs, prebuilt->table, trx);
 
 		if (!indexed_table) {
 
@@ -846,17 +863,18 @@ ha_innobase::add_index(
 				break;
 			default:
 				error = convert_error_code_to_mysql(
-					trx->error_state, innodb_table->flags,
+					trx->error_state,
+					prebuilt->table->flags,
 					user_thd);
 			}
 
-			ut_d(dict_table_check_for_dup_indexes(innodb_table,
+			ut_d(dict_table_check_for_dup_indexes(prebuilt->table,
 							      FALSE));
 			row_mysql_unlock_data_dictionary(trx);
 			mem_heap_free(heap);
 
 			innobase_add_index_cleanup(
-				prebuilt, trx, innodb_table);
+				prebuilt, trx, prebuilt->table);
 
 			DBUG_RETURN(error);
 		}
@@ -866,17 +884,15 @@ ha_innobase::add_index(
 
 	/* Create the indexes in SYS_INDEXES and load into dictionary. */
 
-	for (ulint i = 0; i < num_of_idx; i++) {
+	for (num_created = 0; num_created < num_of_idx; num_created++) {
 
-		index[i] = row_merge_create_index(trx, indexed_table,
-						  &index_defs[i]);
+		index[num_created] = row_merge_create_index(
+			trx, indexed_table, &index_defs[num_created]);
 
-		if (!index[i]) {
+		if (!index[num_created]) {
 			error = trx->error_state;
 			goto error_handling;
 		}
-
-		num_created++;
 	}
 
 	ut_ad(error == DB_SUCCESS);
@@ -897,7 +913,7 @@ ha_innobase::add_index(
 	if (UNIV_UNLIKELY(new_primary)) {
 		/* A primary key is to be built.  Acquire an exclusive
 		table lock also on the table that is being created. */
-		ut_ad(indexed_table != innodb_table);
+		ut_ad(indexed_table != prebuilt->table);
 
 		error = row_merge_lock_table(prebuilt->trx, indexed_table,
 					     LOCK_X);
@@ -911,7 +927,7 @@ ha_innobase::add_index(
 	/* Read the clustered index of the table and build indexes
 	based on this information using temporary files and merge sort. */
 	error = row_merge_build_indexes(prebuilt->trx,
-					innodb_table, indexed_table,
+					prebuilt->table, indexed_table,
 					index, num_of_idx, table);
 
 error_handling:
@@ -920,74 +936,17 @@ error_handling:
 	dictionary which were defined. */
 
 	switch (error) {
-		const char*	old_name;
-		char*		tmp_name;
 	case DB_SUCCESS:
 		ut_a(!dict_locked);
-		row_mysql_lock_data_dictionary(trx);
-		dict_locked = TRUE;
 
+		ut_d(mutex_enter(&dict_sys->mutex));
 		ut_d(dict_table_check_for_dup_indexes(prebuilt->table, TRUE));
+		ut_d(mutex_exit(&dict_sys->mutex));
+		*add = new ha_innobase_add_index(table, key_info, num_of_keys,
+                                                 indexed_table);
 
-		if (!new_primary) {
-			error = row_merge_rename_indexes(trx, indexed_table);
-
-			if (error != DB_SUCCESS) {
-				row_merge_drop_indexes(trx, indexed_table,
-						       index, num_created);
-			}
-
-			dict_table_close(innodb_table, dict_locked);
-
-			goto convert_error;
-		}
-
-		/* If a new primary key was defined for the table and
-		there was no error at this point, we can now rename
-		the old table as a temporary table, rename the new
-		temporary table as the old table and drop the old table. */
-		old_name = innodb_table->name;
-		tmp_name = innobase_create_temporary_tablename(heap, '2',
-							       old_name);
-
-		error = row_merge_rename_tables(innodb_table, indexed_table,
-						tmp_name, trx);
-
-		dict_table_close(innodb_table, dict_locked);
-		ut_a(innodb_table->n_ref_count == 1);
-
-		if (error != DB_SUCCESS) {
-
-			ut_a(indexed_table->n_ref_count == 0);
-
-			row_merge_drop_table(trx, indexed_table);
-
-			switch (error) {
-			case DB_TABLESPACE_ALREADY_EXISTS:
-			case DB_DUPLICATE_KEY:
-				innobase_convert_tablename(tmp_name);
-				my_error(HA_ERR_TABLE_EXIST, MYF(0), tmp_name);
-				error = HA_ERR_TABLE_EXIST;
-				break;
-			default:
-				goto convert_error;
-			}
-			break;
-		}
-
-		trx_commit_for_mysql(prebuilt->trx);
-
-		row_prebuilt_free(prebuilt, TRUE);
-
-		ut_a(innodb_table->n_ref_count == 0);
-
-		error = row_merge_drop_table(trx, innodb_table);
-
-		prebuilt = row_create_prebuilt(indexed_table);
-
-		innodb_table = indexed_table;
-
-		goto convert_error;
+		dict_table_close(prebuilt->table, dict_locked);
+		break;
 
 	case DB_TOO_BIG_RECORD:
 		my_error(HA_ERR_TO_BIG_ROW, MYF(0));
@@ -1000,12 +959,12 @@ error:
 		prebuilt->trx->error_info = NULL;
 		/* fall through */
 	default:
-		dict_table_close(innodb_table, dict_locked);
+		dict_table_close(prebuilt->table, dict_locked);
 
 		trx->error_state = DB_SUCCESS;
 
 		if (new_primary) {
-			if (indexed_table != innodb_table) {
+			if (indexed_table != prebuilt->table) {
 				dict_table_close(indexed_table, dict_locked);
 				row_merge_drop_table(trx, indexed_table);
 			}
@@ -1018,40 +977,166 @@ error:
 			row_merge_drop_indexes(trx, indexed_table,
 					       index, num_created);
 		}
-
-convert_error:
-		if (error == DB_SUCCESS) {
-			/* Build index is successful. We will need to
-			rebuild index translation table.  Reset the
-			index entry count in the translation table
-			to zero, so that translation table will be rebuilt */
-			share->idx_trans_tbl.index_count = 0;
-		}
-
-		error = convert_error_code_to_mysql(error,
-						    innodb_table->flags,
-						    user_thd);
 	}
 
-	ut_a(!new_primary || innodb_table->n_ref_count == 1);
+	ut_a(!new_primary || prebuilt->table->n_ref_count == 1);
 
-	mem_heap_free(heap);
 	trx_commit_for_mysql(trx);
 	if (prebuilt->trx) {
 		trx_commit_for_mysql(prebuilt->trx);
 	}
 
 	if (dict_locked) {
-		ut_d(dict_table_check_for_dup_indexes(innodb_table, FALSE));
 		row_mysql_unlock_data_dictionary(trx);
 	}
 
 	trx_free_for_mysql(trx);
+	mem_heap_free(heap);
+
+	/* There might be work for utility threads.*/
+	srv_active_wake_master_thread();
+
+	DBUG_RETURN(convert_error_code_to_mysql(error, prebuilt->table->flags,
+						user_thd));
+}
+
+/*******************************************************************//**
+Finalize or undo add_index().
+@return	0 or error number */
+UNIV_INTERN
+int
+ha_innobase::final_add_index(
+/*=========================*/
+	handler_add_index*	add_arg,/*!< in: context from add_index() */
+	bool			commit)	/*!< in: true=commit, false=rollback */
+{
+	ha_innobase_add_index*	add;
+	trx_t*			trx;
+	int			err	= 0;
+
+	DBUG_ENTER("ha_innobase::final_add_index");
+
+	ut_ad(add_arg);
+	add = static_cast<class ha_innobase_add_index*>(add_arg);
+
+	/* Create a background transaction for the operations on
+	the data dictionary tables. */
+	trx = innobase_trx_allocate(user_thd);
+	trx_start_if_not_started(trx);
+
+	/* Flag this transaction as a dictionary operation, so that
+	the data dictionary will be locked in crash recovery. */
+	trx_set_dict_operation(trx, TRX_DICT_OP_INDEX);
+
+	/* Latch the InnoDB data dictionary exclusively so that no deadlocks
+	or lock waits can happen in it during an index create operation. */
+	row_mysql_lock_data_dictionary(trx);
+
+	if (add->indexed_table != prebuilt->table) {
+		ulint	error;
+
+		/* We copied the table (new_primary). */
+		if (commit) {
+			mem_heap_t*	heap;
+			char*		tmp_name;
+
+			heap = mem_heap_create(1024);
+
+			/* A new primary key was defined for the table
+			and there was no error at this point. We can
+			now rename the old table as a temporary table,
+			rename the new temporary table as the old
+			table and drop the old table. */
+			tmp_name = innobase_create_temporary_tablename(
+				heap, '2', prebuilt->table->name);
+
+			error = row_merge_rename_tables(
+				prebuilt->table, add->indexed_table,
+				tmp_name, trx);
+
+			ut_a(prebuilt->table->n_ref_count == 1);
+
+			switch (error) {
+			case DB_TABLESPACE_ALREADY_EXISTS:
+			case DB_DUPLICATE_KEY:
+				ut_a(add->indexed_table->n_ref_count == 0);
+				innobase_convert_tablename(tmp_name);
+				my_error(HA_ERR_TABLE_EXIST, MYF(0), tmp_name);
+				err = HA_ERR_TABLE_EXIST;
+				break;
+			default:
+				err = convert_error_code_to_mysql(
+					error, prebuilt->table->flags,
+					user_thd);
+				break;
+			}
+
+			mem_heap_free(heap);
+		}
+
+		if (!commit || err) {
+			dict_table_close(add->indexed_table, TRUE);
+			error = row_merge_drop_table(trx, add->indexed_table);
+			trx_commit_for_mysql(prebuilt->trx);
+		} else {
+			dict_table_t*	old_table = prebuilt->table;
+			trx_commit_for_mysql(prebuilt->trx);
+			row_prebuilt_free(prebuilt, TRUE);
+			error = row_merge_drop_table(trx, old_table);
+			prebuilt = row_create_prebuilt(add->indexed_table);
+		}
+
+		err = convert_error_code_to_mysql(
+			error, prebuilt->table->flags, user_thd);
+	} else {
+		/* We created secondary indexes (!new_primary). */
+
+		if (commit) {
+			err = convert_error_code_to_mysql(
+				row_merge_rename_indexes(trx, prebuilt->table),
+				prebuilt->table->flags, user_thd);
+		}
+
+		if (!commit || err) {
+			dict_index_t*	index;
+			dict_index_t*	next_index;
+
+			for (index = dict_table_get_first_index(
+				     prebuilt->table);
+			     index; index = next_index) {
+
+				next_index = dict_table_get_next_index(index);
+
+				if (*index->name == TEMP_INDEX_PREFIX) {
+					row_merge_drop_index(
+						index, prebuilt->table, trx);
+				}
+			}
+		}
+	}
+
+	/* If index is successfully built, we will need to rebuild index
+	translation table. Set valid index entry count in the translation
+	table to zero. */
+	if (err == 0 && commit) {
+		share->idx_trans_tbl.index_count = 0;
+	}
+
+	trx_commit_for_mysql(trx);
+	if (prebuilt->trx) {
+		trx_commit_for_mysql(prebuilt->trx);
+	}
+
+	ut_d(dict_table_check_for_dup_indexes(prebuilt->table, FALSE));
+	row_mysql_unlock_data_dictionary(trx);
+
+	trx_free_for_mysql(trx);
 
 	/* There might be work for utility threads.*/
 	srv_active_wake_master_thread();
 
-	DBUG_RETURN(error);
+	delete add;
+	DBUG_RETURN(err);
 }
 
 /*******************************************************************//**

No bundle (reason: useless for push emails).