List:Commits« Previous MessageNext Message »
From:Davi Arnaut Date:May 31 2011 1:52pm
Subject:bzr commit into mysql-trunk branch (davi:3134) Bug#11758972 Bug#11762221
View as plain text  
# At a local mysql-trunk repository of davi

 3134 Davi Arnaut	2011-05-31
      Bug#11762221 - 54790: Use of non-blocking mode for sockets limits performance
      Bug#11758972 - 51244: wait_timeout fails on OpenSolaris
      
      The problem was that a optimization for the case when the server
      uses alarms for timeouts could cause a slowdown when socket
      timeouts are used instead. In case alarms are used for timeouts,
      a non-blocking read is attempted first in order to avoid the
      cost of setting up a alarm and if this non-blocking read fails,
      the socket mode is changed to blocking and a alarm is armed.
      
      If socket timeout is used, there is no point in attempting a
      non-blocking read first as the timeout will be automatically
      armed by the OS. Yet the server would attempt a non-blocking
      read first and later switch the socket to blocking mode. This
      could inadvertently impact performance as switching the blocking
      mode of a socket requires at least two calls into the kernel
      on Linux, apart from problems inherited by the scalability
      of fcntl(2).
      
      The solution is to remove alarm based timeouts from the
      protocol layer and push timeout handling down to the virtual
      I/O layer. This approach allows the handling of socket timeouts
      on a platform-specific basis. The blocking mode of the socket
      is no longer exported and VIO read and write operations either
      complete or fail with a error or timeout.
      
      On Linux, the MSG_DONTWAIT flag is used to enable non-blocking
      send and receive operations. If the operation would block,
      poll() is used to wait for readiness or until a timeout occurs.
      This strategy avoids the need to set the socket timeout and
      blocking mode twice per query.
      
      On Windows, as before, the timeout is set on a per-socket
      fashion. In all remaining operating systems, the socket is set
      to non-blocking mode and poll() is used to wait for readiness
      or until a timeout occurs.
      
      In order to cleanup the code after the removal of alarm based
      timeouts, the low level packet reading loop is unrolled into
      two specific sequences: reading the packet header and the
      payload. This makes error handling easier down the road.
      
      In conclusion, benchmarks have shown that these changes do not
      introduce any performance hits and actually slightly improves
      the server throughput for higher numbers of threads.
      
      - Incompatible changes:
      
      A timeout is now always applied to a individual receive or
      send I/O operation. In contrast, a alarm based timeout was
      applied to an entire send or receive packet operation. That
      is, before this patch the timeout was really a time limit
      for sending or reading one packet.
      
      Building and running MySQL on POSIX systems now requires
      support for poll() and O_NONBLOCK. These should be available
      in any modern POSIX system. In other words, except for Windows,
      legacy (non-POSIX) systems which only support O_NDELAY and
      select() are no longer supported.
      
      On Windows, the default value for MYSQL_OPT_CONNECT_TIMEOUT
      is no longer 20 seconds. The default value now is no timeout
      (infinite), the same as in all other platforms.
      
      Packets bigger than the maximum allowed packet size are no
      longer skipped. Before this patch, if a application sent a
      packet bigger than the maximum allowed packet size, or if
      the server failed to allocate a buffer sufficiently large
      to hold the packet, the server would keep reading the packet
      until its end. Now the session is simply disconnected if the
      server cannot handle such large packets.
      
      The client socket buffer is no longer cleared (drained)
      before sending commands to the server. Before this patch,
      any data left in the socket buffer would be drained (removed)
      before a command was sent to the server, in order to work
      around bugs where the server would violate the protocol and
      send more data. The only check left is a debug-only assertion
      to ensure that the socket buffer is empty.
     @ cmake/os/Windows.cmake
        Move to global configure.cmake
     @ configure.cmake
        No need to detect socket timeouts anymore as they are only
        used and always present on Windows.
     @ extra/yassl/include/openssl/ssl.h
        Export a set of hooks for yaSSL to use in order to send and
        receive data.
     @ extra/yassl/include/openssl/transport_types.h
        Introduce header that defines the type of the transport
        functions used for sending and receiving data. Header is
        necessary to expose the types externally and internally,
        without the need to include ssl.h internally.
     @ extra/yassl/include/socket_wrapper.hpp
        Add variables to hold the callbacks (and argument to callback)
        used for sending and receiving data.
     @ extra/yassl/src/handshake.cpp
        Wait for input in the actual receive data operation. If the
        amount of data available in the socket buffer cannot be pre-
        determined, use a minimum value of 64 bytes.
     @ extra/yassl/src/socket_wrapper.cpp
        Use transport callbacks for sending and receive data. Use the
        system provided send() and recv() by default.
     @ extra/yassl/src/ssl.cpp
        Implement transport set functions.
     @ include/my_global.h
        A socket timeout is now signalled with a ETIMEDOUT error,
        which in some cases is emulated (that is, set explicitly).
        
        Add a wrapper for ECONNRESET so that it can be used to map
        SSL-specific errors.
     @ include/my_sys.h
        Remove as big packets are no longer skipped.
     @ include/mysql_com.h
        Remove as big packets are no longer skipped.
     @ include/violite.h
        Introduce a vio_io_wait method which can be used to wait
        for I/O events on a VIO. The supported I/O events are read
        and write. The notification of error conditions is implicit.
        
        Extend vio_reset, which was SSL-specific, to be able to
        rebind a new socket to an already initialized Vio object.
        
        Remove vio_is_blocking. The blocking mode is no longer
        exported. Also, remove fcntl_mode as the blocking mode is
        no longer buffered. It is now a product of the timeout.
        
        Add prototype for a wrapper vio_timeout() function, which
        invokes the underlying timeout method of the VIO. Add to
        VIO two member variables to hold the read and write timeout
        values.
        
        Rename vio_was_interrupted to vio_was_timeout.
        
        Remove vio_poll_read, which is now superseded vio_io_wait.
        
        Introduce vio_socket_connect to avoid code duplication.
     @ mysql-test/include/mysqlhotcopy.inc
        Perl's die() exists with errno ($!) if it is set, carrying
        over the errno value set by one the internal functions for
        whatever reason.
     @ mysql-test/t/myisam_debug.test
        Reap statement result before disconnecting, otherwise it might
        trigger a assertion or crash for embedded runs.
     @ mysql-test/t/ssl.test
        Add test case to ensure that timeouts work over SSL.
     @ mysql-test/t/wait_timeout.test
        Add test case for Bug#11762221 - 54790.
     @ mysql-test/t/xa.test
        Reap statement result before disconnecting, otherwise it might
        trigger a assertion or crash for embedded runs.
     @ sql-common/client.c
        Look into last_errno to check whether the socket was
        interrupted due to a timeout.
        
        Add a couple of wrappers to retrieve the connect_timeout
        option value.
        
        Use vio_io_wait instead of vio_poll_read.
        
        The result of a failed socket operation is in socket_errno,
        which is WSAGetLastError on Windows.
        
        Replace my_connect() and wait_for_data() with vio_socket_connect()
        to avoid code duplication. The polling inside wait_for_data()
        is somewhat equivalent to wait_for_io(). Also, this can be
        considered a bug fix as wait_for_data() would incorrectly
        use POLLIN (there is data to read) to wait (poll) for the
        connection to be established. This problem probably never
        surfaced because the OS network layer timeout is usually
        lower then our default.
     @ sql/net_serv.cc
        Remove net_data_is_ready, it is no longer necessary as its
        functionality is now implemented by vio_io_wait.
        
        Remove the use of signal based alarms for timeouts. Instead,
        rely on the underlying VIO layer to provide socket timeouts
        and control the blocking mode. The error handling is changed
        to use vio_was_timeout() to recognize a timeout occurrence
        and vio_should_retry() to detect interrupted I/O operations.
        
        The loop in the packet reading path (my_real_path) is unrolled
        into two steps: reading the packet header and payload. Each
        step is now documented and should be easier to understand.
        
        net_clear() no longer drains the socket buffer. Unread data
        on the socket constitutes a protocol violation and shouldn't
        be skipped. The previous behavior was race prone anyway
        as it only drained data on the socket buffer, missing any
        data in transit. Instead, net_clear() now just asserts
        that there is no data left in the read and socket buffers.
        
        Remove use of the inexistent MYSQL_INSTANCE_MANAGER macro.
     @ sql/sql_acl.cc
        Remove as big packets are no longer skipped.
     @ sql/sql_cache.cc
        Rename net_real_write to net_write_packet.
     @ sql/sql_class.cc
        Remove as big packets are no longer skipped.
     @ sql/sql_class.h
        Remove as big packets are no longer skipped.
     @ tests/mysql_client_test.c
        Add test case to ensure that the MYSQL_OPT_READ_TIMEOUT option
        works as expected.
     @ vio/CMakeLists.txt
        Add new files to build.
     @ vio/vio.c
        Rename no_poll_read to no_io_wait. The stub method is used
        for the transport types named pipe and shared memory.
        
        Remove hPipe parameter to vio_init, it is now set in the
        vio_new_win32pipe function.
        
        Initialize timeout values to -1 (infinite) in vio_init.
        
        Improve vio_reset to also reinitialize timeouts and assert
        that rebinding is only supported for socket-based transport
        types.
        
        Move event object creation to its own initialization function
        and perform error checking.
        
        Remove caching of the socket status flags, including blocking
        mode. It is no longer necessary.
        
        Add a vio_timeout wrapper which adjusts the timeout value
        from seconds to milliseconds and invokes the underlying
        VIO-specific timeout handler. It also helps the underlying
        handler detect the previous state of the timeouts (activated
        or deactivated).
     @ vio/vio_priv.h
        Remove prototypes for functions that are being removed.
        Introduce prototypes for functions that are added.
     @ vio/viopipe.c
        Pipe transport code moved from viosocket.c. Also, simplify
        it and adjust to handle the new timeout interface.
     @ vio/vioshm.c
        Move shared memory related code over from viosocket.c and
        adjust to handle the new timeout interface.
     @ vio/viosocket.c
        Use vio_ssl_errno to retrieve the error number if the VIO
        type is SSL.
        
        Add the vio_socket_io_wait() helper which wraps vio_io_wait()
        in order to avoid code duplication in vio_write()/vio_read().
        The function selects an appropriate timeout (based on the passed
        event) and translates the return value.
        
        Re-implement vio_read()/vio_write() as simple event loops. The
        loop consists of retrying the I/O operation until it succeeds
        or fails with a non-recoverable error. If it fails with error
        set to EAGAIN or EWOULDBLOCK, vio_io_wait() is used to wait
        for the socket to become ready. On Linux, the MSG_DONTWAIT
        flag is used to get the same effect as if the socket is in
        non-blocking mode.
        
        Add vio_set_blocking() which tweaks the blocking mode of
        a socket.
        
        Add vio_socket_timeout() which implements the logic around
        when a socket should be switched to non-blocking mode. Except
        on Windows, the socket is put into non-blocking mode if a
        timeout is set.
        
        vio_should_retry now indicates whether a I/O operation was
        interrupted by a temporary failure (interrupted by a signal)
        and should be retried later. vio_was_timeout indicates
        whether a I/O operation timed out.
        
        Remove socket_poll_read, now superseded by vio_io_wait.
        
        Implement vio_io_wait() using select() on Windows. Otherwise,
        use poll(). vio_io_wait() will loop until either a timeout
        occurs, or the socket becomes ready. In the presence of
        spurious wake-ups, the wait is resumed if possible.
        
        Move code related to the transport types named pipe and
        shared memory to their own files.
        
        Introduce vio_socket_connect to avoid code duplication. The
        function implements a timed connect by setting the socket
        to non-blocking and polling until the connection is either
        established or fails.
     @ vio/viossl.c
        Re-implement vio_ssl_read() and vio_ssl_write() as simple
        event loops. Use vio_socket_io_wait() to handle polling,
        the SSL transport can only be used with sockets.
        
        Obtain the error status from SSL. Map the obtained error
        code to a platform equivalent one.
        
        Emulate blocking I/O when yaSSL is being used.

    added:
      extra/yassl/include/openssl/transport_types.h
      vio/viopipe.c
      vio/vioshm.c
    modified:
      cmake/os/Windows.cmake
      configure.cmake
      extra/yassl/include/openssl/ssl.h
      extra/yassl/include/socket_wrapper.hpp
      extra/yassl/src/handshake.cpp
      extra/yassl/src/socket_wrapper.cpp
      extra/yassl/src/ssl.cpp
      include/my_global.h
      include/my_sys.h
      include/mysql.h.pp
      include/mysql_com.h
      include/violite.h
      mysql-test/include/mysqlhotcopy.inc
      mysql-test/r/myisam_debug.result
      mysql-test/r/ssl.result
      mysql-test/r/wait_timeout.result
      mysql-test/t/myisam_debug.test
      mysql-test/t/ssl.test
      mysql-test/t/wait_timeout.test
      mysql-test/t/xa.test
      sql-common/client.c
      sql/net_serv.cc
      sql/sql_acl.cc
      sql/sql_cache.cc
      sql/sql_class.cc
      sql/sql_class.h
      tests/mysql_client_test.c
      vio/CMakeLists.txt
      vio/vio.c
      vio/vio_priv.h
      vio/viosocket.c
      vio/viossl.c
=== modified file 'cmake/os/Windows.cmake'
--- a/cmake/os/Windows.cmake	2011-04-14 08:09:49 +0000
+++ b/cmake/os/Windows.cmake	2011-05-31 13:52:09 +0000
@@ -121,9 +121,6 @@ LINK_LIBRARIES(ws2_32)
 # ..also for tests
 SET(CMAKE_REQUIRED_LIBRARIES ws2_32)
 
-# System checks
-SET(SIGNAL_WITH_VIO_CLOSE 1) # Something that runtime team needs
-
 # IPv6 constants appeared in Vista SDK first. We need to define them in any case if they are 
 # not in headers, to handle dual mode sockets correctly.
 CHECK_SYMBOL_EXISTS(IPPROTO_IPV6 "winsock2.h" HAVE_IPPROTO_IPV6)

=== modified file 'configure.cmake'
--- a/configure.cmake	2011-05-26 15:20:09 +0000
+++ b/configure.cmake	2011-05-31 13:52:09 +0000
@@ -52,6 +52,10 @@ IF(NOT SYSTEM_TYPE)
   ENDIF()
 ENDIF()
 
+# As a consequence of ALARMs no longer being used, thread
+# notification for KILL must close the socket to wake up
+# other threads.
+SET(SIGNAL_WITH_VIO_CLOSE 1)
 
 # Always enable -Wall for gnu C/C++
 IF(CMAKE_COMPILER_IS_GNUCXX)
@@ -914,52 +918,6 @@ CHECK_CXX_SOURCE_COMPILES("
   "
   HAVE_SOLARIS_STYLE_GETHOST)
 
-# Use of ALARMs to wakeup on timeout on sockets
-#
-# This feature makes use of a mutex and is a scalability hog we
-# try to avoid using. However we need support for SO_SNDTIMEO and
-# SO_RCVTIMEO socket options for this to work. So we will check
-# if this feature is supported by a simple TRY_RUN macro. However
-# on some OS's there is support for setting those variables but
-# they are silently ignored. For those OS's we will not attempt
-# to use SO_SNDTIMEO and SO_RCVTIMEO even if it is said to work.
-# See Bug#29093 for the problem with SO_SND/RCVTIMEO on HP/UX.
-# To use alarm is simple, simply avoid setting anything.
-
-IF(WIN32)
-  SET(HAVE_SOCKET_TIMEOUT 1)
-ELSEIF(CMAKE_SYSTEM MATCHES "HP-UX")
-  SET(HAVE_SOCKET_TIMEOUT 0)
-ELSEIF(CMAKE_CROSSCOMPILING)
-  SET(HAVE_SOCKET_TIMEOUT 0)
-ELSE()
-SET(CMAKE_REQUIRED_LIBRARIES ${LIBNSL} ${LIBSOCKET}) 
-CHECK_C_SOURCE_RUNS(
-"
- #include <sys/types.h>
- #include <sys/socket.h>
- #include <sys/time.h>
- 
- int main()
- {    
-   int fd = socket(AF_INET, SOCK_STREAM, 0);
-   struct timeval tv;
-   int ret= 0;
-   tv.tv_sec= 2;
-   tv.tv_usec= 0;
-   ret|= setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
-   ret|= setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
-   return !!ret;
- }
-" HAVE_SOCKET_TIMEOUT)
-ENDIF()
-
-SET(NO_ALARM "${HAVE_SOCKET_TIMEOUT}" CACHE BOOL 
-   "No need to use alarm to implement socket timeout")
-SET(SIGNAL_WITH_VIO_CLOSE "${HAVE_SOCKET_TIMEOUT}")
-MARK_AS_ADVANCED(NO_ALARM)
-
-
 IF(CMAKE_COMPILER_IS_GNUCXX)
 IF(WITH_ATOMIC_OPS STREQUAL "up")
   SET(MY_ATOMIC_MODE_DUMMY 1 CACHE BOOL "Assume single-CPU mode, no concurrency")

=== modified file 'extra/yassl/include/openssl/ssl.h'
--- a/extra/yassl/include/openssl/ssl.h	2011-03-29 08:01:07 +0000
+++ b/extra/yassl/include/openssl/ssl.h	2011-05-31 13:52:09 +0000
@@ -539,13 +539,23 @@ void MD5_Final(unsigned char*, MD5_CTX*)
 #define SSL_DEFAULT_CIPHER_LIST ""   /* default all */
 
 
-/* yaSSL adds */
+/* yaSSL extensions */
 int SSL_set_compression(SSL*);   /* turn on yaSSL zlib compression */
 char *yaSSL_ASN1_TIME_to_string(ASN1_TIME *time, char *buf, size_t len);
 
+#include "transport_types.h"
 
+/*
+  Set functions for yaSSL to use in order to send and receive data.
 
+  These hooks are offered in order to enable non-blocking I/O. If
+  not set, yaSSL defaults to using send() and recv().
 
+  @todo Remove hooks and accompanying code when yaSSL is fixed.
+*/
+void yaSSL_transport_set_ptr(SSL *, void *);
+void yaSSL_transport_set_recv_function(SSL *, yaSSL_recv_func_t);
+void yaSSL_transport_set_send_function(SSL *, yaSSL_send_func_t);
 
 #if defined(__cplusplus) && !defined(YASSL_MYSQL_COMPATIBLE)
 }      /* namespace  */

=== added file 'extra/yassl/include/openssl/transport_types.h'
--- a/extra/yassl/include/openssl/transport_types.h	1970-01-01 00:00:00 +0000
+++ b/extra/yassl/include/openssl/transport_types.h	2011-05-31 13:52:09 +0000
@@ -0,0 +1,8 @@
+#ifndef yaSSL_transport_types_h__
+#define yaSSL_transport_types_h__
+
+/* Type of transport functions used for sending and receiving data. */
+typedef long (*yaSSL_recv_func_t) (void *, void *, size_t);
+typedef long (*yaSSL_send_func_t) (void *, const void *, size_t);
+
+#endif

=== modified file 'extra/yassl/include/socket_wrapper.hpp'
--- a/extra/yassl/include/socket_wrapper.hpp	2007-03-28 16:47:51 +0000
+++ b/extra/yassl/include/socket_wrapper.hpp	2011-05-31 13:52:09 +0000
@@ -55,7 +55,9 @@ typedef unsigned int uint;
     const int SOCKET_ERROR = -1;
 #endif
 
-
+  extern "C" {
+    #include "openssl/transport_types.h"
+  }
 
 typedef unsigned char byte;
 
@@ -65,6 +67,9 @@ class Socket {
     socket_t socket_;                    // underlying socket descriptor
     bool     wouldBlock_;                // if non-blocking data, for last read 
     bool     nonBlocking_;               // is option set
+    void     *ptr_;                      // Argument to transport function
+    yaSSL_send_func_t send_func_;        // Function to send data
+    yaSSL_recv_func_t recv_func_;        // Function to receive data
 public:
     explicit Socket(socket_t s = INVALID_SOCKET);
     ~Socket();
@@ -73,10 +78,13 @@ public:
     uint     get_ready() const;
     socket_t get_fd()    const;
 
-    uint send(const byte* buf, unsigned int len, int flags = 0) const;
-    uint receive(byte* buf, unsigned int len, int flags = 0);
+    void set_transport_ptr(void *ptr);
+    void set_transport_recv_function(yaSSL_recv_func_t recv_func);
+    void set_transport_send_function(yaSSL_send_func_t send_func);
+
+    uint send(const byte* buf, unsigned int len) const;
+    uint receive(byte* buf, unsigned int len);
 
-    bool wait();
     bool WouldBlock() const;
     bool IsNonBlocking() const;
 

=== modified file 'extra/yassl/src/handshake.cpp'
--- a/extra/yassl/src/handshake.cpp	2009-06-29 14:00:47 +0000
+++ b/extra/yassl/src/handshake.cpp	2011-05-31 13:52:09 +0000
@@ -704,13 +704,9 @@ void build_certHashes(SSL& ssl, Hashes&
 // do process input requests, return 0 is done, 1 is call again to complete
 int DoProcessReply(SSL& ssl)
 {
-    // wait for input if blocking
-    if (!ssl.useSocket().wait()) {
-      ssl.SetError(receive_error);
-        return 0;
-    }
     uint ready = ssl.getSocket().get_ready();
-    if (!ready) return 1; 
+    if (!ready)
+      ready= 64;
 
     // add buffered data if its there
     input_buffer* buffered = ssl.useBuffers().TakeRawInput();

=== modified file 'extra/yassl/src/socket_wrapper.cpp'
--- a/extra/yassl/src/socket_wrapper.cpp	2010-07-15 11:13:30 +0000
+++ b/extra/yassl/src/socket_wrapper.cpp	2011-05-31 13:52:09 +0000
@@ -52,11 +52,32 @@
 #endif // _WIN32
 
 
+namespace {
+
+
+extern "C" long system_recv(void *ptr, void *buf, size_t count)
+{
+  yaSSL::socket_t *socket = (yaSSL::socket_t *) ptr;
+  return ::recv(*socket, reinterpret_cast<char *>(buf), count, 0);
+}
+
+
+extern "C" long system_send(void *ptr, const void *buf, size_t count)
+{
+  yaSSL::socket_t *socket = (yaSSL::socket_t *) ptr;
+  return ::send(*socket, reinterpret_cast<const char *>(buf), count, 0);
+}
+
+
+}
+
+
 namespace yaSSL {
 
 
 Socket::Socket(socket_t s) 
-    : socket_(s), wouldBlock_(false), nonBlocking_(false)
+    : socket_(s), wouldBlock_(false), nonBlocking_(false),
+      ptr_(&socket_), send_func_(system_send), recv_func_(system_recv)
 {}
 
 
@@ -108,15 +129,32 @@ uint Socket::get_ready() const
     return ready;
 }
 
+void Socket::set_transport_ptr(void *ptr)
+{
+  ptr_ = ptr;
+}
+
+
+void Socket::set_transport_recv_function(yaSSL_recv_func_t recv_func)
+{
+  recv_func_ = recv_func;
+}
+
+
+void Socket::set_transport_send_function(yaSSL_send_func_t send_func)
+{
+  send_func_ = send_func;
+}
+
 
-uint Socket::send(const byte* buf, unsigned int sz, int flags) const
+uint Socket::send(const byte* buf, unsigned int sz) const
 {
     const byte* pos = buf;
     const byte* end = pos + sz;
 
+    /* Remove send()/recv() hooks once non-blocking send is implemented. */
     while (pos != end) {
-        int sent = ::send(socket_, reinterpret_cast<const char *>(pos),
-                          static_cast<int>(end - pos), flags);
+        int sent = send_func_(ptr_, pos, static_cast<int>(end - pos));
 
     if (sent == -1)
         return 0;
@@ -128,11 +166,11 @@ uint Socket::send(const byte* buf, unsig
 }
 
 
-uint Socket::receive(byte* buf, unsigned int sz, int flags)
+uint Socket::receive(byte* buf, unsigned int sz)
 {
     wouldBlock_ = false;
 
-    int recvd = ::recv(socket_, reinterpret_cast<char *>(buf), sz, flags);
+    int recvd = recv_func_(ptr_, buf, sz);
 
     // idea to seperate error from would block by arnetheduck@stripped
     if (recvd == -1) {
@@ -150,14 +188,6 @@ uint Socket::receive(byte* buf, unsigned
 }
 
 
-// wait if blocking for input, return false for error
-bool Socket::wait()
-{
-    byte b;
-    return receive(&b, 1, MSG_PEEK) != static_cast<uint>(-1);
-}
-
-
 void Socket::shutDown(int how)
 {
     shutdown(socket_, how);

=== modified file 'extra/yassl/src/ssl.cpp'
--- a/extra/yassl/src/ssl.cpp	2011-03-29 12:52:02 +0000
+++ b/extra/yassl/src/ssl.cpp	2011-05-31 13:52:09 +0000
@@ -1683,8 +1683,22 @@ unsigned long ERR_get_error()
     }
 
 
+    void yaSSL_transport_set_ptr(SSL *ssl, void *ptr)
+    {
+      ssl->useSocket().set_transport_ptr(ptr);
+    }
+
 
+    void yaSSL_transport_set_recv_function(SSL *ssl, yaSSL_recv_func_t func)
+    {
+      ssl->useSocket().set_transport_recv_function(func);
+    }
 
 
+    void yaSSL_transport_set_send_function(SSL *ssl, yaSSL_send_func_t func)
+    {
+      ssl->useSocket().set_transport_send_function(func);
+    }
+
 } // extern "C"
 } // namespace

=== modified file 'include/my_global.h'
--- a/include/my_global.h	2011-04-13 19:16:45 +0000
+++ b/include/my_global.h	2011-05-31 13:52:09 +0000
@@ -965,9 +965,10 @@ typedef ulong nesting_map;  /* Used for
 #define socket_errno	WSAGetLastError()
 #define SOCKET_EINTR	WSAEINTR
 #define SOCKET_EAGAIN	WSAEINPROGRESS
-#define SOCKET_ETIMEDOUT WSAETIMEDOUT
 #define SOCKET_EWOULDBLOCK WSAEWOULDBLOCK
 #define SOCKET_EADDRINUSE WSAEADDRINUSE
+#define SOCKET_ETIMEDOUT WSAETIMEDOUT
+#define SOCKET_ECONNRESET WSAECONNRESET
 #define SOCKET_ENFILE	ENFILE
 #define SOCKET_EMFILE	EMFILE
 #else /* Unix */
@@ -975,9 +976,10 @@ typedef ulong nesting_map;  /* Used for
 #define closesocket(A)	close(A)
 #define SOCKET_EINTR	EINTR
 #define SOCKET_EAGAIN	EAGAIN
-#define SOCKET_ETIMEDOUT SOCKET_EINTR
 #define SOCKET_EWOULDBLOCK EWOULDBLOCK
 #define SOCKET_EADDRINUSE EADDRINUSE
+#define SOCKET_ETIMEDOUT ETIMEDOUT
+#define SOCKET_ECONNRESET ECONNRESET
 #define SOCKET_ENFILE	ENFILE
 #define SOCKET_EMFILE	EMFILE
 #endif

=== modified file 'include/my_sys.h'
--- a/include/my_sys.h	2011-05-21 09:30:35 +0000
+++ b/include/my_sys.h	2011-05-31 13:52:09 +0000
@@ -936,7 +936,6 @@ extern size_t escape_quotes_for_mysql(CH
 
 extern void thd_increment_bytes_sent(ulong length);
 extern void thd_increment_bytes_received(ulong length);
-extern void thd_increment_net_big_packet_count(ulong length);
 
 #ifdef __WIN__
 extern my_bool have_tcpip;		/* Is set if tcpip is used */

=== modified file 'include/mysql.h.pp'
--- a/include/mysql.h.pp	2011-05-06 13:46:57 +0000
+++ b/include/mysql.h.pp	2011-05-31 13:52:09 +0000
@@ -85,18 +85,15 @@ enum enum_mysql_set_option
 my_bool my_net_init(NET *net, Vio* vio);
 void my_net_local_init(NET *net);
 void net_end(NET *net);
-  void net_clear(NET *net, my_bool clear_buffer);
+void net_clear(NET *net, my_bool check_buffer);
 my_bool net_realloc(NET *net, size_t length);
 my_bool net_flush(NET *net);
 my_bool my_net_write(NET *net,const unsigned char *packet, size_t len);
 my_bool net_write_command(NET *net,unsigned char command,
      const unsigned char *header, size_t head_len,
      const unsigned char *packet, size_t len);
-int net_real_write(NET *net,const unsigned char *packet, size_t len);
+my_bool net_write_packet(NET *net, const unsigned char *packet, size_t length);
 unsigned long my_net_read(NET *net);
-struct sockaddr;
-int my_connect(my_socket s, const struct sockaddr *name, unsigned int namelen,
-        unsigned int timeout);
 struct rand_struct {
   unsigned long seed1,seed2,max_value;
   double max_value_dbl;

=== modified file 'include/mysql_com.h'
--- a/include/mysql_com.h	2011-03-25 13:28:19 +0000
+++ b/include/mysql_com.h	2011-05-31 13:52:09 +0000
@@ -324,16 +324,6 @@ typedef struct st_net {
   /** Client library sqlstate buffer. Set along with the error message. */
   char sqlstate[SQLSTATE_LENGTH+1];
   void *extension;
-#if defined(MYSQL_SERVER) && !defined(EMBEDDED_LIBRARY)
-  /*
-    Controls whether a big packet should be skipped.
-
-    Initially set to FALSE by default. Unauthenticated sessions must have
-    this set to FALSE so that the server can't be tricked to read packets
-    indefinitely.
-  */
-  my_bool skip_big_packet;
-#endif
 } NET;
 
 
@@ -449,16 +439,16 @@ extern "C" {
 #endif
 
 my_bool	my_net_init(NET *net, Vio* vio);
-void	my_net_local_init(NET *net);
-void	net_end(NET *net);
-  void	net_clear(NET *net, my_bool clear_buffer);
+void my_net_local_init(NET *net);
+void net_end(NET *net);
+void net_clear(NET *net, my_bool check_buffer);
 my_bool net_realloc(NET *net, size_t length);
 my_bool	net_flush(NET *net);
 my_bool	my_net_write(NET *net,const unsigned char *packet, size_t len);
 my_bool	net_write_command(NET *net,unsigned char command,
 			  const unsigned char *header, size_t head_len,
 			  const unsigned char *packet, size_t len);
-int	net_real_write(NET *net,const unsigned char *packet, size_t len);
+my_bool net_write_packet(NET *net, const unsigned char *packet, size_t length);
 unsigned long my_net_read(NET *net);
 
 #ifdef _global_h
@@ -466,10 +456,6 @@ void my_net_set_write_timeout(NET *net,
 void my_net_set_read_timeout(NET *net, uint timeout);
 #endif
 
-struct sockaddr;
-int my_connect(my_socket s, const struct sockaddr *name, unsigned int namelen,
-	       unsigned int timeout);
-
 struct rand_struct {
   unsigned long seed1,seed2,max_value;
   double max_value_dbl;

=== modified file 'include/violite.h'
--- a/include/violite.h	2011-05-19 09:47:43 +0000
+++ b/include/violite.h	2011-05-31 13:52:09 +0000
@@ -11,7 +11,7 @@
 
    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA */
 
 /*
  * Vio Lite.
@@ -40,6 +40,15 @@ enum enum_vio_type
   VIO_TYPE_SSL, VIO_TYPE_SHARED_MEMORY
 };
 
+/**
+  VIO I/O events.
+*/
+enum enum_vio_io_event
+{
+  VIO_IO_EVENT_READ,
+  VIO_IO_EVENT_WRITE,
+  VIO_IO_EVENT_CONNECT
+};
 
 #define VIO_LOCALHOST 1                         /* a localhost connection */
 #define VIO_BUFFERED_READ 2                     /* use buffered read */
@@ -61,13 +70,11 @@ Vio* vio_new_win32shared_memory(HANDLE h
 
 void	vio_delete(Vio* vio);
 int	vio_close(Vio* vio);
-void    vio_reset(Vio* vio, enum enum_vio_type type,
-                  my_socket sd, HANDLE hPipe, uint flags);
+my_bool vio_reset(Vio* vio, enum enum_vio_type type,
+                  my_socket sd, void *ssl, uint flags);
 size_t	vio_read(Vio *vio, uchar *	buf, size_t size);
 size_t  vio_read_buff(Vio *vio, uchar * buf, size_t size);
 size_t	vio_write(Vio *vio, const uchar * buf, size_t size);
-int	vio_blocking(Vio *vio, my_bool onoff, my_bool *old_mode);
-my_bool	vio_is_blocking(Vio *vio);
 /* setsockopt TCP_NODELAY at IPPROTO_TCP level, when possible */
 int	vio_fastsend(Vio *vio);
 /* setsockopt SO_KEEPALIVE at SOL_SOCKET level, when possible */
@@ -75,7 +82,7 @@ int	vio_keepalive(Vio *vio, my_bool	onof
 /* Whenever we should retry the last read/write operation. */
 my_bool	vio_should_retry(Vio *vio);
 /* Check that operation was timed out */
-my_bool	vio_was_interrupted(Vio *vio);
+my_bool vio_was_timeout(Vio *vio);
 /* Short text description of the socket for those, who are curious.. */
 const char* vio_description(Vio *vio);
 /* Return the type of the connection */
@@ -86,9 +93,17 @@ int	vio_errno(Vio*vio);
 my_socket vio_fd(Vio*vio);
 /* Remote peer's address and name in text form */
 my_bool vio_peer_addr(Vio *vio, char *buf, uint16 *port, size_t buflen);
-my_bool vio_poll_read(Vio *vio, uint timeout);
+/* Wait for an I/O event notification. */
+int vio_io_wait(Vio *vio, enum enum_vio_io_event event, int timeout);
 my_bool vio_is_connected(Vio *vio);
+#ifndef DBUG_OFF
 ssize_t vio_pending(Vio *vio);
+#endif
+/* Set timeout for a network operation. */
+int vio_timeout(Vio *vio, uint which, int timeout_sec);
+/* Connect to a peer. */
+my_bool vio_socket_connect(Vio *vio, struct sockaddr *addr, socklen_t len,
+                           int timeout);
 
 my_bool vio_get_normalized_ip_string(const struct sockaddr *addr, int addr_length,
                                      char *ip_string, size_t ip_string_size);
@@ -160,17 +175,13 @@ void vio_end(void);
 #define vio_errno(vio)	 			(vio)->vioerrno(vio)
 #define vio_read(vio, buf, size)                ((vio)->read)(vio,buf,size)
 #define vio_write(vio, buf, size)               ((vio)->write)(vio, buf, size)
-#define vio_blocking(vio, set_blocking_mode, old_mode)\
- 	(vio)->vioblocking(vio, set_blocking_mode, old_mode)
-#define vio_is_blocking(vio) 			(vio)->is_blocking(vio)
 #define vio_fastsend(vio)			(vio)->fastsend(vio)
 #define vio_keepalive(vio, set_keep_alive)	(vio)->viokeepalive(vio, set_keep_alive)
 #define vio_should_retry(vio) 			(vio)->should_retry(vio)
-#define vio_was_interrupted(vio) 		(vio)->was_interrupted(vio)
+#define vio_was_timeout(vio)                    (vio)->was_timeout(vio)
 #define vio_close(vio)				((vio)->vioclose)(vio)
 #define vio_peer_addr(vio, buf, prt, buflen)	(vio)->peer_addr(vio, buf, prt, buflen)
-#define vio_timeout(vio, which, seconds)	(vio)->timeout(vio, which, seconds)
-#define vio_poll_read(vio, timeout)             (vio)->poll_read(vio, timeout)
+#define vio_io_wait(vio, event, timeout)        (vio)->io_wait(vio, event, timeout)
 #define vio_is_connected(vio)                   (vio)->is_connected(vio)
 #endif /* !defined(DONT_MAP_VIO) */
 
@@ -190,9 +201,7 @@ enum SSL_type
 struct st_vio
 {
   my_socket		sd;		/* my_socket - real or imaginary */
-  HANDLE hPipe;
   my_bool		localhost;	/* Are we from localhost? */
-  int			fcntl_mode;	/* Buffered fcntl(sd,F_GETFL) */
   struct sockaddr_storage	local;		/* Local internet address */
   struct sockaddr_storage	remote;		/* Remote internet address */
   int addrLen;                          /* Length of remote address */
@@ -202,24 +211,29 @@ struct st_vio
   char                  *read_pos;      /* start of unfetched data in the
                                            read buffer */
   char                  *read_end;      /* end of unfetched data */
+  int                   read_timeout;   /* Timeout value (ms) for read ops. */
+  int                   write_timeout;  /* Timeout value (ms) for write ops. */
   /* function pointers. They are similar for socket/SSL/whatever */
   void    (*viodelete)(Vio*);
   int     (*vioerrno)(Vio*);
   size_t  (*read)(Vio*, uchar *, size_t);
   size_t  (*write)(Vio*, const uchar *, size_t);
-  int     (*vioblocking)(Vio*, my_bool, my_bool *);
-  my_bool (*is_blocking)(Vio*);
+  int     (*timeout)(Vio*, uint, my_bool);
   int     (*viokeepalive)(Vio*, my_bool);
   int     (*fastsend)(Vio*);
   my_bool (*peer_addr)(Vio*, char *, uint16*, size_t);
   void    (*in_addr)(Vio*, struct sockaddr_storage*);
   my_bool (*should_retry)(Vio*);
-  my_bool (*was_interrupted)(Vio*);
+  my_bool (*was_timeout)(Vio*);
   int     (*vioclose)(Vio*);
-  void	  (*timeout)(Vio*, unsigned int which, unsigned int timeout);
-  my_bool (*poll_read)(Vio *vio, uint timeout);
   my_bool (*is_connected)(Vio*);
   my_bool (*has_data) (Vio*);
+  int (*io_wait)(Vio*, enum enum_vio_io_event, int);
+  my_bool (*connect)(Vio*, struct sockaddr *, socklen_t, int);
+#ifdef _WIN32
+  OVERLAPPED overlapped;
+  HANDLE hPipe;
+#endif
 #ifdef HAVE_OPENSSL
   void	  *ssl_arg;
 #endif
@@ -234,10 +248,5 @@ struct st_vio
   size_t  shared_memory_remain;
   char    *shared_memory_pos;
 #endif /* HAVE_SMEM */
-#ifdef _WIN32
-  OVERLAPPED pipe_overlapped;
-  DWORD read_timeout_ms;
-  DWORD write_timeout_ms;
-#endif
 };
 #endif /* vio_violite_h_ */

=== modified file 'mysql-test/include/mysqlhotcopy.inc'
--- a/mysql-test/include/mysqlhotcopy.inc	2011-02-08 09:56:04 +0000
+++ b/mysql-test/include/mysqlhotcopy.inc	2011-05-31 13:52:09 +0000
@@ -107,7 +107,7 @@ DROP DATABASE hotcopy_save;
 --replace_result $MYSQLD_DATADIR MYSQLD_DATADIR
 --list_files $MYSQLD_DATADIR/hotcopy_save
 --replace_result $MASTER_MYSOCK MASTER_MYSOCK
---error 9,11,2304
+--error 9,11,110,2304
 --exec $MYSQLHOTCOPY --quiet -S $MASTER_MYSOCK -u root hotcopy_test hotcopy_save
 --replace_result $MASTER_MYSOCK MASTER_MYSOCK
 --exec $MYSQLHOTCOPY --quiet --allowold -S $MASTER_MYSOCK -u root hotcopy_test hotcopy_save

=== modified file 'mysql-test/r/myisam_debug.result'
--- a/mysql-test/r/myisam_debug.result	2009-04-30 11:03:44 +0000
+++ b/mysql-test/r/myisam_debug.result	2011-05-31 13:52:09 +0000
@@ -37,3 +37,4 @@ CHECK TABLE t1;
 Table	Op	Msg_type	Msg_text
 test.t1	check	status	OK
 DROP TABLE t1,t2;
+ERROR HY000: 137 when fixing table

=== modified file 'mysql-test/r/ssl.result'
--- a/mysql-test/r/ssl.result	2011-03-29 08:01:07 +0000
+++ b/mysql-test/r/ssl.result	2011-05-31 13:52:09 +0000
@@ -2163,3 +2163,13 @@ drop table t1;
 SHOW STATUS LIKE 'Ssl_cipher';
 Variable_name	Value
 Ssl_cipher	DHE-RSA-AES256-SHA
+#
+# Bug#54790: Use of non-blocking mode for sockets limits performance
+#
+# Open ssl_con and set a timeout.
+SET @@SESSION.wait_timeout = 2;
+# Wait for ssl_con to be disconnected.
+# Check that ssl_con has been disconnected.
+# CR_SERVER_LOST, CR_SERVER_GONE_ERROR
+SELECT 1;
+Got one of the listed errors

=== modified file 'mysql-test/r/wait_timeout.result'
--- a/mysql-test/r/wait_timeout.result	2009-01-23 17:19:09 +0000
+++ b/mysql-test/r/wait_timeout.result	2011-05-31 13:52:09 +0000
@@ -31,3 +31,26 @@ SELECT 3;
 3
 SET @@global.wait_timeout= <start_value>;
 disconnection con1;
+#
+# Bug#54790: Use of non-blocking mode for sockets limits performance
+#
+#
+# Test UNIX domain sockets timeout.
+#
+# Open con1 and set a timeout.
+SET @@SESSION.wait_timeout = 2;
+# Wait for con1 to be disconnected.
+# Check that con1 has been disconnected.
+# CR_SERVER_LOST, CR_SERVER_GONE_ERROR
+SELECT 1;
+Got one of the listed errors
+#
+# Test TCP/IP sockets timeout.
+#
+# Open con1 and set a timeout.
+SET @@SESSION.wait_timeout = 2;
+# Wait for con1 to be disconnected.
+# Check that con1 has been disconnected.
+# CR_SERVER_LOST, CR_SERVER_GONE_ERROR
+SELECT 1;
+Got one of the listed errors

=== modified file 'mysql-test/t/myisam_debug.test'
--- a/mysql-test/t/myisam_debug.test	2009-05-04 09:05:16 +0000
+++ b/mysql-test/t/myisam_debug.test	2011-05-31 13:52:09 +0000
@@ -54,4 +54,7 @@ INTO @thread_id;
 KILL QUERY @thread_id;
 CHECK TABLE t1; 
 DROP TABLE t1,t2;
+CONNECTION insertConn;
+--error ER_NOT_KEYFILE
+REAP;
 DISCONNECT insertConn;

=== modified file 'mysql-test/t/ssl.test'
--- a/mysql-test/t/ssl.test	2011-03-29 08:01:07 +0000
+++ b/mysql-test/t/ssl.test	2011-05-31 13:52:09 +0000
@@ -24,6 +24,32 @@ SHOW STATUS LIKE 'Ssl_cipher';
 connection default;
 disconnect ssl_con;
 
+--echo #
+--echo # Bug#54790: Use of non-blocking mode for sockets limits performance
+--echo #
+
+--echo # Open ssl_con and set a timeout.
+connect (ssl_con,localhost,root,,,,,SSL);
+
+LET $ID= `SELECT connection_id()`;
+SET @@SESSION.wait_timeout = 2;
+
+--echo # Wait for ssl_con to be disconnected.
+connection default;
+let $wait_condition=
+  SELECT COUNT(*) = 0 FROM INFORMATION_SCHEMA.PROCESSLIST
+  WHERE ID = $ID;
+--source include/wait_condition.inc
+
+--echo # Check that ssl_con has been disconnected.
+connection ssl_con;
+--echo # CR_SERVER_LOST, CR_SERVER_GONE_ERROR
+--error 2006,2013
+SELECT 1;
+
+connection default;
+disconnect ssl_con;
+
 # Wait till all disconnects are completed
 --source include/wait_until_count_sessions.inc
 

=== modified file 'mysql-test/t/wait_timeout.test'
--- a/mysql-test/t/wait_timeout.test	2010-10-19 11:54:28 +0000
+++ b/mysql-test/t/wait_timeout.test	2011-05-31 13:52:09 +0000
@@ -137,6 +137,62 @@ disconnect con1;
 # The last connect is to keep tools checking the current test happy.
 connect (default,localhost,root,,test,,);
 
+--echo #
+--echo # Bug#54790: Use of non-blocking mode for sockets limits performance
+--echo #
+
+--echo #
+--echo # Test UNIX domain sockets timeout.
+--echo #
+
+--echo # Open con1 and set a timeout.
+connect(con1,localhost,root,,);
+
+LET $ID= `SELECT connection_id()`;
+SET @@SESSION.wait_timeout = 2;
+
+--echo # Wait for con1 to be disconnected.
+connection default;
+let $wait_condition=
+  SELECT COUNT(*) = 0 FROM INFORMATION_SCHEMA.PROCESSLIST
+  WHERE ID = $ID;
+--source include/wait_condition.inc
+
+--echo # Check that con1 has been disconnected.
+connection con1;
+--echo # CR_SERVER_LOST, CR_SERVER_GONE_ERROR
+--error 2006,2013
+SELECT 1;
+
+disconnect con1;
+connection default;
+
+--echo #
+--echo # Test TCP/IP sockets timeout.
+--echo #
+
+--echo # Open con1 and set a timeout.
+connect(con1,127.0.0.1,root,,);
+
+LET $ID= `SELECT connection_id()`;
+SET @@SESSION.wait_timeout = 2;
+
+--echo # Wait for con1 to be disconnected.
+connection default;
+let $wait_condition=
+  SELECT COUNT(*) = 0 FROM INFORMATION_SCHEMA.PROCESSLIST
+  WHERE ID = $ID;
+--source include/wait_condition.inc
+
+--echo # Check that con1 has been disconnected.
+connection con1;
+--echo # CR_SERVER_LOST, CR_SERVER_GONE_ERROR
+--error 2006,2013
+SELECT 1;
+
+disconnect con1;
+connection default;
+
 # Wait till all disconnects are completed
 --source include/wait_until_count_sessions.inc
 

=== modified file 'mysql-test/t/xa.test'
--- a/mysql-test/t/xa.test	2011-04-14 08:47:14 +0000
+++ b/mysql-test/t/xa.test	2011-05-31 13:52:09 +0000
@@ -119,7 +119,8 @@ xa rollback 'a','c';
 connect (con3,localhost,root,,);
 --connection con3
 xa start 'a','c';
-
+--connection con1
+--reap
 --disconnect con1
 --disconnect con3
 --connection default
@@ -227,7 +228,11 @@ XA START 'xid1';
 XA END 'xid1';
 XA ROLLBACK 'xid1';
 
+connection con1;
+REAP;
 disconnect con1;
+
+connection default;
 DROP TABLE t1;
 
 

=== modified file 'sql-common/client.c'
--- a/sql-common/client.c	2011-05-26 15:20:09 +0000
+++ b/sql-common/client.c	2011-05-31 13:52:09 +0000
@@ -11,7 +11,7 @@
 
    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA */
 
 /*
   This file is included by both libmysql.c (the MySQL client C API)
@@ -25,10 +25,8 @@
   - Support for reading local file with LOAD DATA LOCAL
   - SHARED memory handling
   - Prepared statements
-  
   - Things that only works for the server
-  - Alarm handling on connect
-  
+
   In all other cases, the code should be idential for the client and
   server.
 */ 
@@ -92,19 +90,11 @@ my_bool	net_flush(NET *net);
 #  include <sys/un.h>
 #endif
 
-#if defined(__WIN__)
-#define perror(A)
-#else
+#ifndef _WIN32
 #include <errno.h>
 #define SOCKET_ERROR -1
 #endif
 
-#ifdef __WIN__
-#define CONNECT_TIMEOUT 20
-#else
-#define CONNECT_TIMEOUT 0
-#endif
-
 #include "client_settings.h"
 #include <sql_common.h>
 #include <mysql/client_plugin.h>
@@ -127,197 +117,80 @@ static void mysql_close_free_options(MYS
 static void mysql_close_free(MYSQL *mysql);
 static void mysql_prune_stmt_list(MYSQL *mysql);
 
-#if !defined(__WIN__)
-static int wait_for_data(my_socket fd, uint timeout);
-#endif
-
 CHARSET_INFO *default_client_charset_info = &my_charset_latin1;
 
 /* Server error code and message */
 unsigned int mysql_server_last_errno;
 char mysql_server_last_error[MYSQL_ERRMSG_SIZE];
 
-/****************************************************************************
-  A modified version of connect().  my_connect() allows you to specify
-  a timeout value, in seconds, that we should wait until we
-  derermine we can't connect to a particular host.  If timeout is 0,
-  my_connect() will behave exactly like connect().
-
-  Base version coded by Steve Bernacki, Jr. <steve@stripped>
-*****************************************************************************/
-
-int my_connect(my_socket fd, const struct sockaddr *name, uint namelen,
-	       uint timeout)
-{
-#if defined(__WIN__)
-  DBUG_ENTER("my_connect");
-  DBUG_RETURN(connect(fd, (struct sockaddr*) name, namelen));
-#else
-  int flags, res, s_err;
-  DBUG_ENTER("my_connect");
-  DBUG_PRINT("enter", ("fd: %d  timeout: %u", fd, timeout));
+/**
+  Convert the connect timeout option to a timeout value for VIO
+  functions (vio_socket_connect() and vio_io_wait()).
+
+  @param mysql  Connection handle (client side).
+
+  @return The timeout value in milliseconds, or -1 if no timeout.
+*/
+
+static int get_vio_connect_timeout(MYSQL *mysql)
+{
+  int timeout_ms;
+  uint timeout_sec;
 
   /*
-    If they passed us a timeout of zero, we should behave
-    exactly like the normal connect() call does.
+    A timeout of 0 means no timeout. Also, the connect_timeout
+    option value is in seconds, while VIO timeouts are measured
+    in milliseconds. Hence, check for a possible overflow. In
+    case of overflow, set to no timeout.
   */
+  timeout_sec= mysql->options.connect_timeout;
 
-  if (timeout == 0)
-    DBUG_RETURN(connect(fd, (struct sockaddr*) name, namelen));
-
-  flags = fcntl(fd, F_GETFL, 0);	  /* Set socket to not block */
-#ifdef O_NONBLOCK
-  fcntl(fd, F_SETFL, flags | O_NONBLOCK);  /* and save the flags..  */
-#endif
+  if (!timeout_sec || (timeout_sec > INT_MAX/1000))
+    timeout_ms= -1;
+  else
+    timeout_ms= (int) (timeout_sec * 1000);
 
-  DBUG_PRINT("info", ("connecting non-blocking"));
-  res= connect(fd, (struct sockaddr*) name, namelen);
-  DBUG_PRINT("info", ("connect result: %d  errno: %d", res, errno));
-  s_err= errno;			/* Save the error... */
-  fcntl(fd, F_SETFL, flags);
-  if ((res != 0) && (s_err != EINPROGRESS))
-  {
-    errno= s_err;			/* Restore it */
-    DBUG_RETURN(-1);
-  }
-  if (res == 0)				/* Connected quickly! */
-    DBUG_RETURN(0);
-  DBUG_RETURN(wait_for_data(fd, timeout));
-#endif
+  return timeout_ms;
 }
 
 
-/*
-  Wait up to timeout seconds for a connection to be established.
+#ifdef _WIN32
 
-  We prefer to do this with poll() as there is no limitations with this.
-  If not, we will use select()
-*/
+/**
+  Convert the connect timeout option to a timeout value for WIN32
+  synchronization functions.
 
-#if !defined(__WIN__)
+  @remark Specific for WIN32 connection methods shared memory and
+          named pipe.
 
-static int wait_for_data(my_socket fd, uint timeout)
+  @param mysql  Connection handle (client side).
+
+  @return The timeout value in milliseconds, or INFINITE if no timeout.
+*/
+
+static DWORD get_win32_connect_timeout(MYSQL *mysql)
 {
-#ifdef HAVE_POLL
-  struct pollfd ufds;
-  int res;
-  DBUG_ENTER("wait_for_data");
+  DWORD timeout_ms;
+  uint timeout_sec;
 
-  DBUG_PRINT("info", ("polling"));
-  ufds.fd= fd;
-  ufds.events= POLLIN | POLLPRI;
-  if (!(res= poll(&ufds, 1, (int) timeout*1000)))
-  {
-    DBUG_PRINT("info", ("poll timed out"));
-    errno= EINTR;
-    DBUG_RETURN(-1);
-  }
-  DBUG_PRINT("info",
-             ("poll result: %d  errno: %d  revents: 0x%02d  events: 0x%02d",
-              res, errno, ufds.revents, ufds.events));
-  if (res < 0 || !(ufds.revents & (POLLIN | POLLPRI)))
-    DBUG_RETURN(-1);
   /*
-    At this point, we know that something happened on the socket.
-    But this does not means that everything is alright.
-    The connect might have failed. We need to retrieve the error code
-    from the socket layer. We must return success only if we are sure
-    that it was really a success. Otherwise we might prevent the caller
-    from trying another address to connect to.
+    A timeout of 0 means no timeout. Also, the connect_timeout
+    option value is in seconds, while WIN32 timeouts are in
+    milliseconds. Hence, check for a possible overflow. In case
+    of overflow, set to no timeout.
   */
-  {
-    int         s_err;
-    socklen_t   s_len= sizeof(s_err);
-
-    DBUG_PRINT("info", ("Get SO_ERROR from non-blocked connected socket."));
-    res= getsockopt(fd, SOL_SOCKET, SO_ERROR, &s_err, &s_len);
-    DBUG_PRINT("info", ("getsockopt res: %d  s_err: %d", res, s_err));
-    if (res)
-      DBUG_RETURN(res);
-    /* getsockopt() was successful, check the retrieved status value. */
-    if (s_err)
-    {
-      errno= s_err;
-      DBUG_RETURN(-1);
-    }
-    /* Status from connect() is zero. Socket is successfully connected. */
-  }
-  DBUG_RETURN(0);
-#else
-  SOCKOPT_OPTLEN_TYPE s_err_size = sizeof(uint);
-  fd_set sfds;
-  struct timeval tv;
-  time_t start_time, now_time;
-  int res, s_err;
-  DBUG_ENTER("wait_for_data");
+  timeout_sec= mysql->options.connect_timeout;
 
-  if (fd >= FD_SETSIZE)				/* Check if wrong error */
-    DBUG_RETURN(0);					/* Can't use timeout */
+  if (!timeout_sec || (timeout_sec > INT_MAX/1000))
+    timeout_ms= INFINITE;
+  else
+    timeout_ms= (DWORD) (timeout_sec * 1000);
 
-  /*
-    Our connection is "in progress."  We can use the select() call to wait
-    up to a specified period of time for the connection to suceed.
-    If select() returns 0 (after waiting howevermany seconds), our socket
-    never became writable (host is probably unreachable.)  Otherwise, if
-    select() returns 1, then one of two conditions exist:
-   
-    1. An error occured.  We use getsockopt() to check for this.
-    2. The connection was set up sucessfully: getsockopt() will
-    return 0 as an error.
-   
-    Thanks goes to Andrew Gierth <andrew@stripped>
-    who posted this method of timing out a connect() in
-    comp.unix.programmer on August 15th, 1997.
-  */
+  return timeout_ms;
+}
 
-  FD_ZERO(&sfds);
-  FD_SET(fd, &sfds);
-  /*
-    select could be interrupted by a signal, and if it is, 
-    the timeout should be adjusted and the select restarted
-    to work around OSes that don't restart select and 
-    implementations of select that don't adjust tv upon
-    failure to reflect the time remaining
-   */
-  start_time= my_time(0);
-  for (;;)
-  {
-    tv.tv_sec = (long) timeout;
-    tv.tv_usec = 0;
-#if defined(HPUX10)
-    if ((res = select(fd+1, NULL, (int*) &sfds, NULL, &tv)) > 0)
-      break;
-#else
-    if ((res = select(fd+1, NULL, &sfds, NULL, &tv)) > 0)
-      break;
 #endif
-    if (res == 0)					/* timeout */
-      DBUG_RETURN(-1);
-    now_time= my_time(0);
-    timeout-= (uint) (now_time - start_time);
-    if (errno != EINTR || (int) timeout <= 0)
-      DBUG_RETURN(-1);
-  }
-
-  /*
-    select() returned something more interesting than zero, let's
-    see if we have any errors.  If the next two statements pass,
-    we've got an open socket!
-  */
 
-  s_err=0;
-  if (getsockopt(fd, SOL_SOCKET, SO_ERROR, (char*) &s_err, &s_err_size) != 0)
-    DBUG_RETURN(-1);
-
-  if (s_err)
-  {						/* getsockopt could succeed */
-    errno = s_err;
-    DBUG_RETURN(-1);					/* but return an error... */
-  }
-  DBUG_RETURN(0);					/* ok */
-#endif /* HAVE_POLL */
-}
-#endif /* !defined(__WIN__) */
 
 /**
   Set the internal error message to mysql handler
@@ -401,17 +274,18 @@ void set_mysql_extended_error(MYSQL *mys
   Create a named pipe connection
 */
 
-#ifdef __WIN__
+#ifdef _WIN32
 
-HANDLE create_named_pipe(MYSQL *mysql, uint connect_timeout, char **arg_host,
-			 char **arg_unix_socket)
+static HANDLE create_named_pipe(MYSQL *mysql, DWORD connect_timeout,
+                                const char **arg_host,
+                                const char **arg_unix_socket)
 {
   HANDLE hPipe=INVALID_HANDLE_VALUE;
   char pipe_name[1024];
   DWORD dwMode;
   int i;
   my_bool testing_named_pipes=0;
-  char *host= *arg_host, *unix_socket= *arg_unix_socket;
+  const char *host= *arg_host, *unix_socket= *arg_unix_socket;
 
   if ( ! unix_socket || (unix_socket)[0] == 0x00)
     unix_socket = mysql_unix_port;
@@ -442,7 +316,7 @@ HANDLE create_named_pipe(MYSQL *mysql, u
       return INVALID_HANDLE_VALUE;
     }
     /* wait for for an other instance */
-    if (! WaitNamedPipe(pipe_name, connect_timeout*1000) )
+    if (!WaitNamedPipe(pipe_name, connect_timeout))
     {
       set_mysql_extended_error(mysql, CR_NAMEDPIPEWAIT_ERROR, unknown_sqlstate,
                                ER(CR_NAMEDPIPEWAIT_ERROR),
@@ -475,15 +349,16 @@ HANDLE create_named_pipe(MYSQL *mysql, u
 /*
   Create new shared memory connection, return handler of connection
 
-  SYNOPSIS
-    create_shared_memory()
-    mysql		Pointer of mysql structure
-    net			Pointer of net structure
-    connect_timeout	Timeout of connection
+  @param mysql  Pointer of mysql structure
+  @param net    Pointer of net structure
+  @param connect_timeout  Timeout of connection (in milliseconds)
+
+  @return HANDLE to the shared memory area.
 */
 
 #ifdef HAVE_SMEM
-HANDLE create_shared_memory(MYSQL *mysql,NET *net, uint connect_timeout)
+static HANDLE create_shared_memory(MYSQL *mysql, NET *net,
+                                   DWORD connect_timeout)
 {
   ulong smem_buffer_length = shared_memory_buffer_length + 4;
   /*
@@ -588,7 +463,7 @@ HANDLE create_shared_memory(MYSQL *mysql
   }
 
   /* Wait of answer from server */
-  if (WaitForSingleObject(event_connect_answer,connect_timeout*1000) !=
+  if (WaitForSingleObject(event_connect_answer, connect_timeout) !=
       WAIT_OBJECT_0)
   {
     error_allow = CR_SHARED_MEMORY_CONNECT_ABANDONED_ERROR;
@@ -737,7 +612,7 @@ cli_safe_read(MYSQL *mysql)
     DBUG_PRINT("error",("Wrong connection or packet. fd: %s  len: %lu",
 			vio_description(net->vio),len));
 #ifdef MYSQL_SERVER
-    if (net->vio && vio_was_interrupted(net->vio))
+    if (net->vio && (net->last_errno == ER_NET_READ_INTERRUPTED))
       return (packet_error);
 #endif /*MYSQL_SERVER*/
     end_server(mysql);
@@ -830,9 +705,10 @@ cli_advanced_command(MYSQL *mysql, enum
   mysql->info=0;
   mysql->affected_rows= ~(my_ulonglong) 0;
   /*
-    We don't want to clear the protocol buffer on COM_QUIT, because if
-    the previous command was a shutdown command, we may have the
-    response for the COM_QUIT already in the communication buffer
+    Do not check the socket/protocol buffer on COM_QUIT as the
+    result of a previous command might not have been read. This
+    can happen if a client sends a query but does not reap the
+    result before attempting to close the connection.
   */
   net_clear(&mysql->net, (command != COM_QUIT));
 
@@ -1713,7 +1589,6 @@ mysql_init(MYSQL *mysql)
   }
   else
     memset(mysql, 0, sizeof(*(mysql)));
-  mysql->options.connect_timeout= CONNECT_TIMEOUT;
   mysql->charset=default_client_charset_info;
   strmov(mysql->net.sqlstate, not_error_sqlstate);
 
@@ -2961,10 +2836,6 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
   const char    *scramble_plugin;
   ulong		pkt_length;
   NET		*net= &mysql->net;
-#ifdef MYSQL_SERVER
-  thr_alarm_t   alarmed;
-  ALARM		alarm_buff;
-#endif
 #ifdef __WIN__
   HANDLE	hPipe=INVALID_HANDLE_VALUE;
 #endif
@@ -3038,9 +2909,13 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
        mysql->options.protocol == MYSQL_PROTOCOL_MEMORY) &&
       (!host || !strcmp(host,LOCAL_HOST)))
   {
+    HANDLE handle_map;
     DBUG_PRINT("info", ("Using shared memory"));
-    if ((create_shared_memory(mysql,net, mysql->options.connect_timeout)) ==
-	INVALID_HANDLE_VALUE)
+
+    handle_map= create_shared_memory(mysql, net,
+                                     get_win32_connect_timeout(mysql));
+
+    if (handle_map == INVALID_HANDLE_VALUE)
     {
       DBUG_PRINT("error",
 		 ("host: '%s'  socket: '%s'  shared memory: %s  have_tcpip: %d",
@@ -3106,8 +2981,8 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
     UNIXaddr.sun_family= AF_UNIX;
     strmake(UNIXaddr.sun_path, unix_socket, sizeof(UNIXaddr.sun_path)-1);
 
-    if (my_connect(sock, (struct sockaddr *) &UNIXaddr, sizeof(UNIXaddr),
-		   mysql->options.connect_timeout))
+    if (vio_socket_connect(net->vio, (struct sockaddr *) &UNIXaddr,
+                           sizeof(UNIXaddr), get_vio_connect_timeout(mysql)))
     {
       DBUG_PRINT("error",("Got error %d on connect to local server",
 			  socket_errno));
@@ -3121,15 +2996,16 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
     }
     mysql->options.protocol=MYSQL_PROTOCOL_SOCKET;
   }
-#elif defined(__WIN__)
+#elif defined(_WIN32)
   if (!net->vio &&
       (mysql->options.protocol == MYSQL_PROTOCOL_PIPE ||
        (host && !strcmp(host,LOCAL_HOST_NAMEDPIPE)) ||
        (! have_tcpip && (unix_socket || !host && is_NT()))))
   {
-    if ((hPipe= create_named_pipe(mysql, mysql->options.connect_timeout,
-                                  (char**) &host, (char**) &unix_socket)) ==
-	INVALID_HANDLE_VALUE)
+    hPipe= create_named_pipe(mysql, get_win32_connect_timeout(mysql),
+                             &host, &unix_socket);
+
+    if (hPipe == INVALID_HANDLE_VALUE)
     {
       DBUG_PRINT("error",
 		 ("host: '%s'  socket: '%s'  have_tcpip: %d",
@@ -3157,10 +3033,10 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
        mysql->options.protocol == MYSQL_PROTOCOL_TCP))
   {
     struct addrinfo *res_lst, *client_bind_ai_lst= NULL, hints, *t_res;
-    int gai_errno;
     char port_buf[NI_MAXSERV];
     my_socket sock= SOCKET_ERROR;
-    int saved_error= 0, status= -1, bind_result= 0;
+    int gai_errno, saved_error= 0, status= -1, bind_result= 0;
+    uint flags= VIO_BUFFERED_READ;
 
     unix_socket=0;				/* This is not used */
 
@@ -3172,16 +3048,6 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
 
     my_snprintf(host_info=buff, sizeof(buff)-1, ER(CR_TCP_CONNECTION), host);
     DBUG_PRINT("info",("Server name: '%s'.  TCP sock: %d", host, port));
-#ifdef MYSQL_SERVER
-    thr_alarm_init(&alarmed);
-    thr_alarm(&alarmed, mysql->options.connect_timeout, &alarm_buff);
-#endif
-
-    DBUG_PRINT("info",("IP '%s'", "client"));
-
-#ifdef MYSQL_SERVER
-    thr_end_alarm(&alarmed);
-#endif
 
     memset(&hints, 0, sizeof(hints));
     hints.ai_socktype= SOCK_STREAM;
@@ -3292,20 +3158,39 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
         DBUG_PRINT("info", ("Successfully bound client side of socket"));
       }
 
+      /* Create a new Vio object to abstract the socket. */
+      if (!net->vio)
+      {
+        if (!(net->vio= vio_new(sock, VIO_TYPE_TCPIP, flags)))
+        {
+          set_mysql_error(mysql, CR_OUT_OF_MEMORY, unknown_sqlstate);
+          closesocket(sock);
+          goto error;
+        }
+      }
+      /* Just reinitialize if one is already allocated. */
+      else if (vio_reset(net->vio, VIO_TYPE_TCPIP, sock, NULL, flags))
+      {
+        set_mysql_error(mysql, CR_UNKNOWN_ERROR, unknown_sqlstate);
+        closesocket(sock);
+        goto error;
+      }
+
       DBUG_PRINT("info", ("Connect socket"));
-      status= my_connect(sock, t_res->ai_addr, t_res->ai_addrlen,
-                         mysql->options.connect_timeout);
+      status= vio_socket_connect(net->vio, t_res->ai_addr, t_res->ai_addrlen,
+                                 get_vio_connect_timeout(mysql));
       /*
-        Here we rely on my_connect() to return success only if the
-        connect attempt was really successful. Otherwise we would stop
-        trying another address, believing we were successful.
+        Here we rely on vio_socket_connect() to return success only if
+        the connect attempt was really successful. Otherwise we would
+        stop trying another address, believing we were successful.
       */
       if (!status)
         break;
 
       /*
-        Save value as socket errno might be overwritten due to
-        calling a socket function below.
+        Save either the socket error status or the error code of
+        the failed vio_connection operation. It is necessary to
+        avoid having it overwritten by later operations.
       */
       saved_error= socket_errno;
 
@@ -3334,15 +3219,6 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
                                 ER(CR_CONN_HOST_ERROR), host, saved_error);
       goto error;
     }
-
-    net->vio= vio_new(sock, VIO_TYPE_TCPIP, VIO_BUFFERED_READ);
-    if (! net->vio )
-    {
-      DBUG_PRINT("error",("Unknow protocol %d ", mysql->options.protocol));
-      set_mysql_error(mysql, CR_CONN_UNKNOW_PROTOCOL, unknown_sqlstate);
-      closesocket(sock);
-      goto error;
-    }
   }
 
   DBUG_PRINT("info", ("net->vio: %p", net->vio));
@@ -3376,12 +3252,13 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
   /* Get version info */
   mysql->protocol_version= PROTOCOL_VERSION;	/* Assume this */
   if (mysql->options.connect_timeout &&
-      vio_poll_read(net->vio, mysql->options.connect_timeout))
+      (vio_io_wait(net->vio, VIO_IO_EVENT_READ,
+                   get_vio_connect_timeout(mysql)) < 1))
   {
     set_mysql_extended_error(mysql, CR_SERVER_LOST, unknown_sqlstate,
                              ER(CR_SERVER_LOST_EXTENDED),
                              "waiting for initial communication packet",
-                             errno);
+                             socket_errno);
     goto error;
   }
 
@@ -3396,7 +3273,7 @@ CLI_MYSQL_REAL_CONNECT(MYSQL *mysql,cons
       set_mysql_extended_error(mysql, CR_SERVER_LOST, unknown_sqlstate,
                                ER(CR_SERVER_LOST_EXTENDED),
                                "reading initial communication packet",
-                               errno);
+                               socket_errno);
     goto error;
   }
   pkt_end= (char*)net->read_pos + pkt_length;

=== modified file 'sql/net_serv.cc'
--- a/sql/net_serv.cc	2011-02-08 15:54:12 +0000
+++ b/sql/net_serv.cc	2011-05-31 13:52:09 +0000
@@ -10,8 +10,8 @@
    GNU General Public License for more details.
 
    You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software Foundation,
-   51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA */
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
 
 /**
   @file
@@ -61,50 +61,27 @@
   the client should have a bigger max_allowed_packet.
 */
 
-#if defined(__WIN__) || !defined(MYSQL_SERVER)
-  /* The following is because alarms doesn't work on windows. */
-#ifndef NO_ALARM
-#define NO_ALARM
-#endif
-#endif
-
-#ifndef NO_ALARM
-#include "my_pthread.h"
-void sql_print_error(const char *format,...);
-#else
-#define DONT_USE_THR_ALARM
-#endif /* NO_ALARM */
-
-#include "thr_alarm.h"
-
 #ifdef MYSQL_SERVER
 /*
   The following variables/functions should really not be declared
   extern, but as it's hard to include sql_priv.h here, we have to
   live with this for a while.
 */
-extern uint test_flags;
-extern ulong bytes_sent, bytes_received, net_big_packet_count;
-#ifndef MYSQL_INSTANCE_MANAGER
 #ifdef HAVE_QUERY_CACHE
 #define USE_QUERY_CACHE
 extern void query_cache_insert(const char *packet, ulong length,
                                unsigned pkt_nr);
-#endif // HAVE_QUERY_CACHE
+#endif /* HAVE_QUERY_CACHE */
 #define update_statistics(A) A
-#endif /* MYSQL_INSTANCE_MANGER */
-#endif /* defined(MYSQL_SERVER) && !defined(MYSQL_INSTANCE_MANAGER) */
-
-#if !defined(MYSQL_SERVER) || defined(MYSQL_INSTANCE_MANAGER)
+#else /* MYSQL_SERVER */
 #define update_statistics(A)
 #define thd_increment_bytes_sent(N)
 #endif
 
-#define TEST_BLOCKING		8
+#define VIO_SOCKET_ERROR  ((size_t) -1)
 #define MAX_PACKET_LENGTH (256L*256L*256L-1)
 
-static my_bool net_write_buff(NET *net,const uchar *packet,ulong len);
-
+static my_bool net_write_buff(NET *, const uchar *, ulong);
 
 /** Init with packet info. */
 
@@ -126,20 +103,11 @@ my_bool my_net_init(NET *net, Vio* vio)
   net->where_b = net->remain_in_buf=0;
   net->last_errno=0;
   net->unused= 0;
-#if defined(MYSQL_SERVER) && !defined(EMBEDDED_LIBRARY)
-  net->skip_big_packet= FALSE;
-#endif
 
-  if (vio != 0)					/* If real connection */
+  if (vio)
   {
-    net->fd  = vio_fd(vio);			/* For perl DBI/DBD */
-#if defined(MYSQL_SERVER) && !defined(__WIN__)
-    if (!(test_flags & TEST_BLOCKING))
-    {
-      my_bool old_mode;
-      vio_blocking(vio, FALSE, &old_mode);
-    }
-#endif
+    /* For perl DBI/DBD. */
+    net->fd= vio_fd(vio);
     vio_fastsend(vio);
   }
   DBUG_RETURN(0);
@@ -180,7 +148,7 @@ my_bool net_realloc(NET *net, size_t len
   /*
     We must allocate some extra bytes for the end 0 and to be able to
     read big compressed blocks + 1 safety byte since uint3korr() in
-    my_real_read() may actually read 4 bytes depending on build flags and
+    net_read_packet() may actually read 4 bytes depending on build flags and
     platform.
   */
   if (!(buff= (uchar*) my_realloc((char*) net->buff, pkt_length +
@@ -200,128 +168,29 @@ my_bool net_realloc(NET *net, size_t len
 
 
 /**
-  Check if there is any data to be read from the socket.
+  Clear (reinitialize) the NET structure for a new command.
 
-  @param sd   socket descriptor
+  @remark Performs debug checking of the socket buffer to
+          ensure that the protocol sequence is correct.
 
-  @retval
-    0  No data to read
-  @retval
-    1  Data or EOF to read
-  @retval
-    -1   Don't know if data is ready or not
+  @param net          NET handler
+  @param check_buffer  Whether to check the socket buffer.
 */
 
-#if !defined(EMBEDDED_LIBRARY)
-
-static int net_data_is_ready(my_socket sd)
+void net_clear(NET *net,
+               my_bool check_buffer __attribute__((unused)))
 {
-#ifdef HAVE_POLL
-  struct pollfd ufds;
-  int res;
-
-  ufds.fd= sd;
-  ufds.events= POLLIN | POLLPRI;
-  if (!(res= poll(&ufds, 1, 0)))
-    return 0;
-  if (res < 0 || !(ufds.revents & (POLLIN | POLLPRI)))
-    return 0;
-  return 1;
-#else
-  fd_set sfds;
-  struct timeval tv;
-  int res;
-
-#ifndef __WIN__
-  /* Windows uses an _array_ of 64 fd's as default, so it's safe */
-  if (sd >= FD_SETSIZE)
-    return -1;
-#define NET_DATA_IS_READY_CAN_RETURN_MINUS_ONE
-#endif
-
-  FD_ZERO(&sfds);
-  FD_SET(sd, &sfds);
-
-  tv.tv_sec= tv.tv_usec= 0;
-
-  if ((res= select((int) (sd + 1), &sfds, NULL, NULL, &tv)) < 0)
-    return 0;
-  else
-    return test(res ? FD_ISSET(sd, &sfds) : 0);
-#endif /* HAVE_POLL */
-}
-
-#endif /* EMBEDDED_LIBRARY */
-
-/**
-  Remove unwanted characters from connection
-  and check if disconnected.
-
-    Read from socket until there is nothing more to read. Discard
-    what is read.
-
-    If there is anything when to read 'net_clear' is called this
-    normally indicates an error in the protocol.
-
-    When connection is properly closed (for TCP it means with
-    a FIN packet), then select() considers a socket "ready to read",
-    in the sense that there's EOF to read, but read() returns 0.
-
-  @param net			NET handler
-  @param clear_buffer           if <> 0, then clear all data from comm buff
-*/
-
-void net_clear(NET *net, my_bool clear_buffer)
-{
-#if !defined(EMBEDDED_LIBRARY)
-  size_t count;
-  int ready;
-#endif
   DBUG_ENTER("net_clear");
 
 #if !defined(EMBEDDED_LIBRARY)
-  if (clear_buffer)
-  {
-    while ((ready= net_data_is_ready(net->vio->sd)) > 0)
-    {
-      /* The socket is ready */
-      if ((long) (count= vio_read(net->vio, net->buff,
-                                  (size_t) net->max_packet)) > 0)
-      {
-        DBUG_PRINT("info",("skipped %ld bytes from file: %s",
-                           (long) count, vio_description(net->vio)));
-#if defined(EXTRA_DEBUG)
-        fprintf(stderr,"Note: net_clear() skipped %ld bytes from file: %s\n",
-                (long) count, vio_description(net->vio));
+  /* Ensure the socket buffer is empty, except for an EOF (at least 1). */
+  DBUG_ASSERT(!check_buffer || (vio_pending(net->vio) <= 1));
 #endif
-      }
-      else
-      {
-        DBUG_PRINT("info",("socket ready but only EOF to read - disconnected"));
-        net->error= 2;
-        break;
-      }
-    }
-#ifdef NET_DATA_IS_READY_CAN_RETURN_MINUS_ONE
-    /* 'net_data_is_ready' returned "don't know" */
-    if (ready == -1)
-    {
-      /* Read unblocking to clear net */
-      my_bool old_mode;
-      if (!vio_blocking(net->vio, FALSE, &old_mode))
-      {
-        while ((long) (count= vio_read(net->vio, net->buff,
-                                       (size_t) net->max_packet)) > 0)
-          DBUG_PRINT("info",("skipped %ld bytes from file: %s",
-                             (long) count, vio_description(net->vio)));
-        vio_blocking(net->vio, TRUE, &old_mode);
-      }
-    }
-#endif /* NET_DATA_IS_READY_CAN_RETURN_MINUS_ONE */
-  }
-#endif /* EMBEDDED_LIBRARY */
-  net->pkt_nr=net->compress_pkt_nr=0;		/* Ready for new command */
-  net->write_pos=net->buff;
+
+  /* Ready for new command */
+  net->pkt_nr= net->compress_pkt_nr= 0;
+  net->write_pos= net->buff;
+
   DBUG_VOID_RETURN;
 }
 
@@ -334,9 +203,9 @@ my_bool net_flush(NET *net)
   DBUG_ENTER("net_flush");
   if (net->buff != net->write_pos)
   {
-    error=test(net_real_write(net, net->buff,
-			      (size_t) (net->write_pos - net->buff)));
-    net->write_pos=net->buff;
+    error= net_write_packet(net, net->buff,
+                            (size_t) (net->write_pos - net->buff));
+    net->write_pos= net->buff;
   }
   /* Sync packet number if using compression */
   if (net->compress)
@@ -345,6 +214,42 @@ my_bool net_flush(NET *net)
 }
 
 
+/**
+  Whether a I/O operation should be retried later.
+
+  @param net          NET handler.
+  @param retry_count  Maximum number of interrupted operations.
+
+  @retval TRUE    Operation should be retried.
+  @retval FALSE   Operation should not be retried. Fatal error.
+*/
+
+static my_bool
+net_should_retry(NET *net, uint *retry_count __attribute__((unused)))
+{
+  my_bool retry;
+
+#if !defined(MYSQL_SERVER) && defined(THREAD_SAFE_CLIENT)
+  /*
+    In the thread safe client library, interrupted I/O operations
+    are always retried.  Otherwise, its either a timeout or a
+    unrecoverable error.
+  */
+  retry= vio_should_retry(net->vio);
+#else
+  /*
+    In the non-thread safe client library, or in the server,
+    interrupted I/O operations are retried up to a limit.
+    In this scenario, pthread_kill can be used to wake up
+    (interrupt) threads waiting for I/O.
+  */
+  retry= vio_should_retry(net->vio) && ((*retry_count)++ < net->retry_count);
+#endif
+
+  return retry;
+}
+
+
 /*****************************************************************************
 ** Write something to server/client buffer
 *****************************************************************************/
@@ -352,15 +257,13 @@ my_bool net_flush(NET *net)
 /**
   Write a logical packet with packet header.
 
-  Format: Packet length (3 bytes), packet number(1 byte)
-  When compression is used a 3 byte compression length is added
+  Format: Packet length (3 bytes), packet number (1 byte)
+  When compression is used, a 3 byte compression length is added.
 
-  @note
-    If compression is used the original package is modified!
+  @note If compression is used, the original packet is modified!
 */
 
-my_bool
-my_net_write(NET *net,const uchar *packet,size_t len)
+my_bool my_net_write(NET *net, const uchar *packet, size_t len)
 {
   uchar buff[NET_HEADER_SIZE];
   int rc;
@@ -405,6 +308,7 @@ my_net_write(NET *net,const uchar *packe
   return rc;
 }
 
+
 /**
   Send a command to the server.
 
@@ -480,6 +384,7 @@ net_write_command(NET *net,uchar command
   DBUG_RETURN(rc);
 }
 
+
 /**
   Caching the data in a local buffer before sending it.
 
@@ -497,7 +402,7 @@ net_write_command(NET *net,uchar command
   @note
     The cached buffer can be sent as it is with 'net_flush()'.
     In this code we have to be careful to not send a packet longer than
-    MAX_PACKET_LENGTH to net_real_write() if we are using the compressed
+    MAX_PACKET_LENGTH to net_write_packet() if we are using the compressed
     protocol as we store the length of the compressed packet in 3 bytes.
 
   @retval
@@ -523,9 +428,9 @@ net_write_buff(NET *net, const uchar *pa
     if (net->write_pos != net->buff)
     {
       /* Fill up already used packet and write it */
-      memcpy((char*) net->write_pos,packet,left_length);
-      if (net_real_write(net, net->buff, 
-			 (size_t) (net->write_pos - net->buff) + left_length))
+      memcpy(net->write_pos, packet, left_length);
+      if (net_write_packet(net, net->buff,
+                           (size_t) (net->write_pos - net->buff) + left_length))
 	return 1;
       net->write_pos= net->buff;
       packet+= left_length;
@@ -540,461 +445,380 @@ net_write_buff(NET *net, const uchar *pa
       left_length= MAX_PACKET_LENGTH;
       while (len > left_length)
       {
-	if (net_real_write(net, packet, left_length))
+	if (net_write_packet(net, packet, left_length))
 	  return 1;
 	packet+= left_length;
 	len-= left_length;
       }
     }
     if (len > net->max_packet)
-      return net_real_write(net, packet, len) ? 1 : 0;
+      return net_write_packet(net, packet, len);
     /* Send out rest of the blocks as full sized blocks */
   }
-  memcpy((char*) net->write_pos,packet,len);
+  memcpy(net->write_pos, packet, len);
   net->write_pos+= len;
   return 0;
 }
 
 
 /**
-  Read and write one packet using timeouts.
-  If needed, the packet is compressed before sending.
+  Write a determined number of bytes to a network handler.
 
-  @todo
-    - TODO is it needed to set this variable if we have no socket
+  @param  net     NET handler.
+  @param  buf     Buffer containing the data to be written.
+  @param  count   The length, in bytes, of the buffer.
+
+  @return TRUE on error, FALSE on success.
 */
 
-int
-net_real_write(NET *net,const uchar *packet, size_t len)
+static my_bool
+net_write_raw_loop(NET *net, const uchar *buf, size_t count)
 {
-  size_t length;
-  const uchar *pos,*end;
-  thr_alarm_t alarmed;
-#ifndef NO_ALARM
-  ALARM alarm_buff;
+  unsigned int retry_count= 0;
+
+  while (count)
+  {
+    size_t sentcnt= vio_write(net->vio, buf, count);
+
+    /* VIO_SOCKET_ERROR (-1) indicates an error. */
+    if (sentcnt == VIO_SOCKET_ERROR)
+    {
+      /* A recoverable I/O error occurred? */
+      if (net_should_retry(net, &retry_count))
+        continue;
+      else
+        break;
+    }
+
+    count-= sentcnt;
+    buf+= sentcnt;
+    update_statistics(thd_increment_bytes_sent(sentcnt));
+  }
+
+  /* On failure, propagate the error code. */
+  if (count)
+  {
+    /* Socket should be closed. */
+    net->error= 2;
+
+    /* Interrupted by a timeout? */
+    if (vio_was_timeout(net->vio))
+      net->last_errno= ER_NET_WRITE_INTERRUPTED;
+    else
+      net->last_errno= ER_NET_ERROR_ON_WRITE;
+
+#ifdef MYSQL_SERVER
+    my_error(net->last_errno, MYF(0));
 #endif
-  uint retry_count=0;
-  my_bool net_blocking = vio_is_blocking(net->vio);
-  DBUG_ENTER("net_real_write");
+  }
+
+  return test(count);
+}
+
+
+/**
+  Compress and encapsulate a packet into a compressed packet.
+
+  @param          net      NET handler.
+  @param          packet   The packet to compress.
+  @param[in,out]  length   Length of the packet.
+
+  A compressed packet header is compromised of the packet
+  length (3 bytes), packet number (1 byte) and the length
+  of the original (uncompressed) packet.
+
+  @return Pointer to the (new) compressed packet.
+*/
+
+static uchar *
+compress_packet(NET *net, const uchar *packet, size_t *length)
+{
+  uchar *compr_packet;
+  size_t compr_length;
+  const uint header_length= NET_HEADER_SIZE + COMP_HEADER_SIZE;
+
+  compr_packet= (uchar *) my_malloc(*length + header_length, MYF(MY_WME));
+
+  if (compr_packet == NULL)
+    return NULL;
+
+  memcpy(compr_packet + header_length, packet, *length);
+
+  /* Compress the encapsulated packet. */
+  if (my_compress(compr_packet + header_length, length, &compr_length))
+  {
+    /*
+      If the length of the compressed packet is larger than the
+      original packet, the original packet is sent uncompressed.
+    */
+    compr_length= 0;
+  }
+
+  /* Length of the compressed (original) packet. */
+  int3store(&compr_packet[NET_HEADER_SIZE], compr_length);
+  /* Length of this packet. */
+  int3store(compr_packet, *length);
+  /* Packet number. */
+  compr_packet[3]= (uchar) (net->compress_pkt_nr++);
+
+  *length+= header_length;
+
+  return compr_packet;
+}
+
+
+/**
+  Write a MySQL protocol packet to the network handler.
+
+  @param  net     NET handler.
+  @param  packet  The packet to write.
+  @param  length  Length of the packet.
+
+  @remark The packet might be encapsulated into a compressed packet.
+
+  @return TRUE on error, FALSE on success.
+*/
+
+my_bool
+net_write_packet(NET *net, const uchar *packet, size_t length)
+{
+  my_bool res;
+  DBUG_ENTER("net_write_packet");
 
 #if defined(MYSQL_SERVER) && defined(USE_QUERY_CACHE)
-  query_cache_insert((char*) packet, len, net->pkt_nr);
+  query_cache_insert((char*) packet, length, net->pkt_nr);
 #endif
 
+  /* Socket can't be used */
   if (net->error == 2)
-    DBUG_RETURN(-1);				/* socket can't be used */
+    DBUG_RETURN(TRUE);
+
+  net->reading_or_writing= 2;
 
-  net->reading_or_writing=2;
 #ifdef HAVE_COMPRESS
   if (net->compress)
   {
-    size_t complen;
-    uchar *b;
-    uint header_length=NET_HEADER_SIZE+COMP_HEADER_SIZE;
-    if (!(b= (uchar*) my_malloc(len + NET_HEADER_SIZE +
-                                COMP_HEADER_SIZE, MYF(MY_WME))))
+    if ((packet= compress_packet(net, packet, &length)) == NULL)
     {
       net->error= 2;
       net->last_errno= ER_OUT_OF_RESOURCES;
-      /* In the server, the error is reported by MY_WME flag. */
+      /* In the server, allocation failure raises a error. */
       net->reading_or_writing= 0;
-      DBUG_RETURN(1);
+      DBUG_RETURN(TRUE);
     }
-    memcpy(b+header_length,packet,len);
-
-    if (my_compress(b+header_length, &len, &complen))
-      complen=0;
-    int3store(&b[NET_HEADER_SIZE],complen);
-    int3store(b,len);
-    b[3]=(uchar) (net->compress_pkt_nr++);
-    len+= header_length;
-    packet= b;
   }
 #endif /* HAVE_COMPRESS */
 
 #ifdef DEBUG_DATA_PACKETS
-  DBUG_DUMP("data", packet, len);
+  DBUG_DUMP("data", packet, length);
 #endif
 
-#ifndef NO_ALARM
-  thr_alarm_init(&alarmed);
-  if (net_blocking)
-    thr_alarm(&alarmed, net->write_timeout, &alarm_buff);
-#else
-  alarmed=0;
-  /* Write timeout is set in my_net_set_write_timeout */
-#endif /* NO_ALARM */
-
-  pos= packet;
-  end=pos+len;
-  while (pos != end)
-  {
-    if ((long) (length= vio_write(net->vio,pos,(size_t) (end-pos))) <= 0)
-    {
-      my_bool interrupted = vio_should_retry(net->vio);
-#if !defined(__WIN__)
-      if ((interrupted || length == 0) && !thr_alarm_in_use(&alarmed))
-      {
-        if (!thr_alarm(&alarmed, net->write_timeout, &alarm_buff))
-        {                                       /* Always true for client */
-	  my_bool old_mode;
-	  while (vio_blocking(net->vio, TRUE, &old_mode) < 0)
-	  {
-	    if (vio_should_retry(net->vio) && retry_count++ < net->retry_count)
-	      continue;
-#ifdef EXTRA_DEBUG
-	    fprintf(stderr,
-		    "%s: my_net_write: fcntl returned error %d, aborting thread\n",
-		    my_progname,vio_errno(net->vio));
-#endif /* EXTRA_DEBUG */
-	    net->error= 2;                     /* Close socket */
-            net->last_errno= ER_NET_PACKET_TOO_LARGE;
-#ifdef MYSQL_SERVER
-            my_error(ER_NET_PACKET_TOO_LARGE, MYF(0));
-#endif
-	    goto end;
-	  }
-	  retry_count=0;
-	  continue;
-	}
-      }
-      else
-#endif /* !defined(__WIN__) */
-	if (thr_alarm_in_use(&alarmed) && !thr_got_alarm(&alarmed) &&
-	    interrupted)
-      {
-	if (retry_count++ < net->retry_count)
-	    continue;
-#ifdef EXTRA_DEBUG
-	  fprintf(stderr, "%s: write looped, aborting thread\n",
-		  my_progname);
-#endif /* EXTRA_DEBUG */
-      }
-#if defined(THREAD_SAFE_CLIENT) && !defined(MYSQL_SERVER)
-      if (vio_errno(net->vio) == SOCKET_EINTR)
-      {
-	DBUG_PRINT("warning",("Interrupted write. Retrying..."));
-	continue;
-      }
-#endif /* defined(THREAD_SAFE_CLIENT) && !defined(MYSQL_SERVER) */
-      net->error= 2;				/* Close socket */
-      net->last_errno= (interrupted ? ER_NET_WRITE_INTERRUPTED :
-                               ER_NET_ERROR_ON_WRITE);
-#ifdef MYSQL_SERVER
-      my_error(net->last_errno, MYF(0));
-#endif /* MYSQL_SERVER */
-      break;
-    }
-    pos+=length;
-    update_statistics(thd_increment_bytes_sent(length));
-  }
-#ifndef __WIN__
- end:
-#endif
+  res= net_write_raw_loop(net, packet, length);
+
 #ifdef HAVE_COMPRESS
   if (net->compress)
-    my_free((void*) packet);
+    my_free((void *) packet);
 #endif
-  if (thr_alarm_in_use(&alarmed))
-  {
-    my_bool old_mode;
-    thr_end_alarm(&alarmed);
-    vio_blocking(net->vio, net_blocking, &old_mode);
-  }
-  net->reading_or_writing=0;
-  DBUG_RETURN(((int) (pos != end)));
-}
 
+  net->reading_or_writing= 0;
+
+  DBUG_RETURN(res);
+}
 
 /*****************************************************************************
 ** Read something from server/clinet
 *****************************************************************************/
 
-#ifndef NO_ALARM
+/**
+  Read a determined number of bytes from a network handler.
+
+  @param  net     NET handler.
+  @param  count   The number of bytes to read.
 
-static my_bool net_safe_read(NET *net, uchar *buff, size_t length,
-			     thr_alarm_t *alarmed)
+  @return TRUE on error, FALSE on success.
+*/
+
+static my_bool net_read_raw_loop(NET *net, size_t count)
 {
-  uint retry_count=0;
-  while (length > 0)
+  bool eof= false;
+  unsigned int retry_count= 0;
+  uchar *buf= net->buff + net->where_b;
+
+  while (count)
   {
-    size_t tmp;
-    if ((long) (tmp= vio_read(net->vio, buff, length)) <= 0)
+    size_t recvcnt= vio_read(net->vio, buf, count);
+
+    /* VIO_SOCKET_ERROR (-1) indicates an error. */
+    if (recvcnt == VIO_SOCKET_ERROR)
     {
-      my_bool interrupted = vio_should_retry(net->vio);
-      if (!thr_got_alarm(alarmed) && interrupted)
-      {					/* Probably in MIT threads */
-	if (retry_count++ < net->retry_count)
-	  continue;
-      }
-      return 1;
+      /* A recoverable I/O error occurred? */
+      if (net_should_retry(net, &retry_count))
+        continue;
+      else
+        break;
+    }
+    /* Zero indicates end of file. */
+    else if (!recvcnt)
+    {
+      eof= true;
+      break;
     }
-    length-= tmp;
-    buff+= tmp;
+
+    count-= recvcnt;
+    buf+= recvcnt;
+    update_statistics(thd_increment_bytes_received(recvcnt));
   }
-  return 0;
+
+  /* On failure, propagate the error code. */
+  if (count)
+  {
+    /* Socket should be closed. */
+    net->error= 2;
+
+    /* Interrupted by a timeout? */
+    if (!eof && vio_was_timeout(net->vio))
+      net->last_errno= ER_NET_READ_INTERRUPTED;
+    else
+      net->last_errno= ER_NET_READ_ERROR;
+
+#ifdef MYSQL_SERVER
+    my_error(net->last_errno, MYF(0));
+#endif
+  }
+
+  return test(count);
 }
 
+
 /**
-  Help function to clear the commuication buffer when we get a too big packet.
+  Read the header of a packet. The MySQL protocol packet header
+  consists of the length, in bytes, of the payload (packet data)
+  and a serial number.
 
-  @param net		Communication handle
-  @param remain	Bytes to read
-  @param alarmed	Parameter for thr_alarm()
-  @param alarm_buff	Parameter for thr_alarm()
+  @remark The encoded length is the length of the packet payload,
+          which does not include the packet header.
 
-  @retval
-   0	Was able to read the whole packet
-  @retval
-   1	Got mailformed packet from client
+  @remark The serial number is used to ensure that the packets are
+          received in order. If the packet serial number does not
+          match the expected value, a error is returned.
+
+  @param  net  NET handler.
+
+  @return TRUE on error, FALSE on success.
 */
 
-static my_bool my_net_skip_rest(NET *net, uint32 remain, thr_alarm_t *alarmed,
-				ALARM *alarm_buff)
+static my_bool net_read_packet_header(NET *net)
 {
-  uint32 old=remain;
-  DBUG_ENTER("my_net_skip_rest");
-  DBUG_PRINT("enter",("bytes_to_skip: %u", (uint) remain));
+  uchar pkt_nr;
+  size_t count= NET_HEADER_SIZE;
 
-  /* The following is good for debugging */
-  update_statistics(thd_increment_net_big_packet_count(1));
+  if (net->compress)
+    count+= COMP_HEADER_SIZE;
 
-  if (!thr_alarm_in_use(alarmed))
-  {
-    my_bool old_mode;
-    if (thr_alarm(alarmed,net->read_timeout, alarm_buff) ||
-	vio_blocking(net->vio, TRUE, &old_mode) < 0)
-      DBUG_RETURN(1);				/* Can't setup, abort */
-  }
-  for (;;)
+  if (net_read_raw_loop(net, count))
+    return TRUE;
+
+  DBUG_DUMP("packet_header", net->buff + net->where_b, NET_HEADER_SIZE);
+
+  pkt_nr= net->buff[net->where_b + 3];
+
+  /*
+    Verify packet serial number against the truncated packet counter.
+    The local packet counter must be truncated since its not reset.
+  */
+  if (pkt_nr != (uchar) net->pkt_nr)
   {
-    while (remain > 0)
-    {
-      size_t length= min(remain, net->max_packet);
-      if (net_safe_read(net, net->buff, length, alarmed))
-	DBUG_RETURN(1);
-      update_statistics(thd_increment_bytes_received(length));
-      remain -= (uint32) length;
-    }
-    if (old != MAX_PACKET_LENGTH)
-      break;
-    if (net_safe_read(net, net->buff, NET_HEADER_SIZE, alarmed))
-      DBUG_RETURN(1);
-    old=remain= uint3korr(net->buff);
-    net->pkt_nr++;
+    /* Not a NET error on the client. XXX: why? */
+#if defined(MYSQL_SERVER)
+    my_error(ER_NET_PACKETS_OUT_OF_ORDER, MYF(0));
+#elif defined(EXTRA_DEBUG)
+    /*
+      We don't make noise server side, since the client is expected
+      to break the protocol for e.g. --send LOAD DATA .. LOCAL where
+      the server expects the client to send a file, but the client
+      may reply with a new command instead.
+    */
+    fprintf(stderr, "Error: packets out of order (found %u, expected %u)\n",
+            (uint) pkt_nr, net->pkt_nr);
+    DBUG_ASSERT(pkt_nr == net->pkt_nr);
+#endif
+    return TRUE;
   }
-  DBUG_RETURN(0);
+
+  net->pkt_nr++;
+
+  return FALSE;
 }
-#endif /* NO_ALARM */
 
 
 /**
-  Reads one packet to net->buff + net->where_b.
-  Long packets are handled by my_net_read().
-  This function reallocates the net->buff buffer if necessary.
+  Read one (variable-length) MySQL protocol packet.
+  A MySQL packet consists of a header and a payload.
 
-  @return
-    Returns length of packet.
+  @remark Reads one packet to net->buff + net->where_b.
+  @remark Long packets are handled by my_net_read().
+  @remark The network buffer is expanded if necessary.
+
+  @return The length of the packet, or @packet_error on error.
 */
 
-static ulong
-my_real_read(NET *net, size_t *complen)
+static ulong net_read_packet(NET *net, size_t *complen)
 {
-  uchar *pos;
-  size_t length;
-  uint i,retry_count=0;
-  ulong len=packet_error;
-  thr_alarm_t alarmed;
-#ifndef NO_ALARM
-  ALARM alarm_buff;
-#endif
-  my_bool net_blocking=vio_is_blocking(net->vio);
-  uint32 remain= (net->compress ? NET_HEADER_SIZE+COMP_HEADER_SIZE :
-		  NET_HEADER_SIZE);
-  *complen = 0;
-
-  net->reading_or_writing=1;
-  thr_alarm_init(&alarmed);
-#ifndef NO_ALARM
-  if (net_blocking)
-    thr_alarm(&alarmed,net->read_timeout,&alarm_buff);
-#else
-  /* Read timeout is set in my_net_set_read_timeout */
-#endif /* NO_ALARM */
+  size_t pkt_len, pkt_data_len;
 
-    pos = net->buff + net->where_b;		/* net->packet -4 */
-    for (i=0 ; i < 2 ; i++)
-    {
-      while (remain > 0)
-      {
-	/* First read is done with non blocking mode */
-        if ((long) (length= vio_read(net->vio, pos, remain)) <= 0L)
-        {
-          my_bool interrupted = vio_should_retry(net->vio);
-
-	  DBUG_PRINT("info",("vio_read returned %ld  errno: %d",
-			     (long) length, vio_errno(net->vio)));
-#if !defined(__WIN__) || defined(MYSQL_SERVER)
-	  /*
-	    We got an error that there was no data on the socket. We now set up
-	    an alarm to not 'read forever', change the socket to non blocking
-	    mode and try again
-	  */
-	  if ((interrupted || length == 0) && !thr_alarm_in_use(&alarmed))
-	  {
-	    if (!thr_alarm(&alarmed,net->read_timeout,&alarm_buff)) /* Don't wait too long */
-	    {
-	      my_bool old_mode;
-	      while (vio_blocking(net->vio, TRUE, &old_mode) < 0)
-	      {
-		if (vio_should_retry(net->vio) &&
-		    retry_count++ < net->retry_count)
-		  continue;
-		DBUG_PRINT("error",
-			   ("fcntl returned error %d, aborting thread",
-			    vio_errno(net->vio)));
-#ifdef EXTRA_DEBUG
-		fprintf(stderr,
-			"%s: read: fcntl returned error %d, aborting thread\n",
-			my_progname,vio_errno(net->vio));
-#endif /* EXTRA_DEBUG */
-		len= packet_error;
-		net->error= 2;                 /* Close socket */
-	        net->last_errno= ER_NET_FCNTL_ERROR;
-#ifdef MYSQL_SERVER
-		my_error(ER_NET_FCNTL_ERROR, MYF(0));
-#endif
-		goto end;
-	      }
-	      retry_count=0;
-	      continue;
-	    }
-	  }
-#endif /* (!defined(__WIN__) || defined(MYSQL_SERVER) */
-	  if (thr_alarm_in_use(&alarmed) && !thr_got_alarm(&alarmed) &&
-	      interrupted)
-	  {					/* Probably in MIT threads */
-	    if (retry_count++ < net->retry_count)
-	      continue;
-#ifdef EXTRA_DEBUG
-	    fprintf(stderr, "%s: read looped with error %d, aborting thread\n",
-		    my_progname,vio_errno(net->vio));
-#endif /* EXTRA_DEBUG */
-	  }
-#if defined(THREAD_SAFE_CLIENT) && !defined(MYSQL_SERVER)
-	  if (vio_errno(net->vio) == SOCKET_EINTR)
-	  {
-	    DBUG_PRINT("warning",("Interrupted read. Retrying..."));
-	    continue;
-	  }
-#endif
-	  DBUG_PRINT("error",("Couldn't read packet: remain: %u  errno: %d  length: %ld",
-			      remain, vio_errno(net->vio), (long) length));
-	  len= packet_error;
-	  net->error= 2;				/* Close socket */
-          net->last_errno= (vio_was_interrupted(net->vio) ?
-                                   ER_NET_READ_INTERRUPTED :
-                                   ER_NET_READ_ERROR);
-#ifdef MYSQL_SERVER
-          my_error(net->last_errno, MYF(0));
-#endif
-	  goto end;
-	}
-	remain -= (uint32) length;
-	pos+= length;
-	update_statistics(thd_increment_bytes_received(length));
-      }
-      if (i == 0)
-      {					/* First parts is packet length */
-	ulong helping;
-        DBUG_DUMP("packet_header", net->buff+net->where_b,
-                  NET_HEADER_SIZE);
-	if (net->buff[net->where_b + 3] != (uchar) net->pkt_nr)
-	{
-	  if (net->buff[net->where_b] != (uchar) 255)
-	  {
-	    DBUG_PRINT("error",
-		       ("Packets out of order (Found: %d, expected %u)",
-			(int) net->buff[net->where_b + 3],
-			net->pkt_nr));
-            /* 
-              We don't make noise server side, since the client is expected
-              to break the protocol for e.g. --send LOAD DATA .. LOCAL where
-              the server expects the client to send a file, but the client
-              may reply with a new command instead.
-            */
-#if defined (EXTRA_DEBUG) && !defined (MYSQL_SERVER)
-            fflush(stdout);
-	    fprintf(stderr,"Error: Packets out of order (Found: %d, expected %d)\n",
-		    (int) net->buff[net->where_b + 3],
-		    (uint) (uchar) net->pkt_nr);
-            fflush(stderr);
-            DBUG_ASSERT(0);
-#endif
-	  }
-	  len= packet_error;
-          /* Not a NET error on the client. XXX: why? */
-#ifdef MYSQL_SERVER
-	  my_error(ER_NET_PACKETS_OUT_OF_ORDER, MYF(0));
-#endif
-	  goto end;
-	}
-	net->compress_pkt_nr= ++net->pkt_nr;
-#ifdef HAVE_COMPRESS
-	if (net->compress)
-	{
-          /*
-            The following uint3korr() may read 4 bytes, so make sure we don't
-            read unallocated or uninitialized memory. The right-hand expression
-            must match the size of the buffer allocated in net_realloc().
-          */
-          DBUG_ASSERT(net->where_b + NET_HEADER_SIZE + sizeof(uint32) <=
-                      net->max_packet + NET_HEADER_SIZE + COMP_HEADER_SIZE + 1);
-	  /*
-	    If the packet is compressed then complen > 0 and contains the
-	    number of bytes in the uncompressed packet
-	  */
-	  *complen=uint3korr(&(net->buff[net->where_b + NET_HEADER_SIZE]));
-	}
-#endif
+  *complen= 0;
 
-	len=uint3korr(net->buff+net->where_b);
-	if (!len)				/* End of big multi-packet */
-	  goto end;
-	helping = max(len,*complen) + net->where_b;
-	/* The necessary size of net->buff */
-	if (helping >= net->max_packet)
-	{
-	  if (net_realloc(net,helping))
-	  {
-#if defined(MYSQL_SERVER) && !defined(NO_ALARM)
-	    if (!net->compress &&
-                net->skip_big_packet &&
-		!my_net_skip_rest(net, (uint32) len, &alarmed, &alarm_buff))
-	      net->error= 3;		/* Successfully skiped packet */
-#endif
-	    len= packet_error;          /* Return error and close connection */
-	    goto end;
-	  }
-	}
-	pos=net->buff + net->where_b;
-	remain = (uint32) len;
-      }
-    }
+  net->reading_or_writing= 1;
 
-end:
-  if (thr_alarm_in_use(&alarmed))
+  /* Retrieve packet length and number. */
+  if (net_read_packet_header(net))
+    goto error;
+
+  net->compress_pkt_nr= net->pkt_nr;
+
+#ifdef HAVE_COMPRESS
+  if (net->compress)
   {
-    my_bool old_mode;
-    thr_end_alarm(&alarmed);
-    vio_blocking(net->vio, net_blocking, &old_mode);
+    /*
+      The following uint3korr() may read 4 bytes, so make sure we don't
+      read unallocated or uninitialized memory. The right-hand expression
+      must match the size of the buffer allocated in net_realloc().
+    */
+    DBUG_ASSERT(net->where_b + NET_HEADER_SIZE + sizeof(uint32) <=
+                net->max_packet + NET_HEADER_SIZE + COMP_HEADER_SIZE + 1);
+
+    /*
+      If the packet is compressed then complen > 0 and contains the
+      number of bytes in the uncompressed packet.
+    */
+    *complen= uint3korr(&(net->buff[net->where_b + NET_HEADER_SIZE]));
   }
-  net->reading_or_writing=0;
-#ifdef DEBUG_DATA_PACKETS
-  if (len != packet_error)
-    DBUG_DUMP("data", net->buff+net->where_b, len);
 #endif
-  return(len);
+
+  /* The length of the packet that follows. */
+  pkt_len= uint3korr(net->buff+net->where_b);
+
+  /* End of big multi-packet. */
+  if (!pkt_len)
+    goto end;
+
+  pkt_data_len = max(pkt_len, *complen) + net->where_b;
+
+  /* Expand packet buffer if necessary. */
+  if ((pkt_data_len >= net->max_packet) && net_realloc(net, pkt_data_len))
+    goto error;
+
+  /* Read the packet data (payload). */
+  if (net_read_raw_loop(net, pkt_len))
+    goto error;
+
+end:
+  net->reading_or_writing= 0;
+  return pkt_len;
+
+error:
+  net->reading_or_writing= 0;
+  return packet_error;
 }
 
 
@@ -1025,7 +849,7 @@ my_net_read(NET *net)
   if (!net->compress)
   {
 #endif
-    len = my_real_read(net,&complen);
+    len= net_read_packet(net, &complen);
     if (len == MAX_PACKET_LENGTH)
     {
       /* First packet of a multi-packet.  Concatenate the packets */
@@ -1035,7 +859,7 @@ my_net_read(NET *net)
       {
 	net->where_b += len;
 	total_length += len;
-	len = my_real_read(net,&complen);
+	len= net_read_packet(net, &complen);
       } while (len == MAX_PACKET_LENGTH);
       if (len != packet_error)
 	len+= total_length;
@@ -1127,7 +951,7 @@ my_net_read(NET *net)
       }
 
       net->where_b=buf_length;
-      if ((packet_len = my_real_read(net,&complen)) == packet_error)
+      if ((packet_len= net_read_packet(net, &complen)) == packet_error)
       {
         MYSQL_NET_READ_DONE(1, 0);
 	return packet_error;
@@ -1165,10 +989,8 @@ void my_net_set_read_timeout(NET *net, u
   DBUG_ENTER("my_net_set_read_timeout");
   DBUG_PRINT("enter", ("timeout: %d", timeout));
   net->read_timeout= timeout;
-#ifdef NO_ALARM
   if (net->vio)
     vio_timeout(net->vio, 0, timeout);
-#endif
   DBUG_VOID_RETURN;
 }
 
@@ -1178,9 +1000,8 @@ void my_net_set_write_timeout(NET *net,
   DBUG_ENTER("my_net_set_write_timeout");
   DBUG_PRINT("enter", ("timeout: %d", timeout));
   net->write_timeout= timeout;
-#ifdef NO_ALARM
   if (net->vio)
     vio_timeout(net->vio, 1, timeout);
-#endif
   DBUG_VOID_RETURN;
 }
+

=== modified file 'sql/sql_acl.cc'
--- a/sql/sql_acl.cc	2011-05-30 11:42:03 +0000
+++ b/sql/sql_acl.cc	2011-05-31 13:52:09 +0000
@@ -9542,16 +9542,6 @@ acl_authenticate(THD *thd, uint connect_
   else
     my_ok(thd);
 
-#if defined(MYSQL_SERVER) && !defined(EMBEDDED_LIBRARY)
-  /*
-    Allow the network layer to skip big packets. Although a malicious
-    authenticated session might use this to trick the server to read
-    big packets indefinitely, this is a previously established behavior
-    that needs to be preserved as to not break backwards compatibility.
-  */
-  thd->net.skip_big_packet= TRUE;
-#endif
-
 #ifdef HAVE_PSI_THREAD_INTERFACE
   if (likely(PSI_server != NULL))
   {

=== modified file 'sql/sql_cache.cc'
--- a/sql/sql_cache.cc	2011-05-26 15:20:09 +0000
+++ b/sql/sql_cache.cc	2011-05-31 13:52:09 +0000
@@ -275,14 +275,14 @@ functions:
        - Called before parsing and used to match a statement with the stored
          queries hash.
          If a match is found the cached result set is sent through repeated
-         calls to net_real_write. (note: calling thread doesn't have a regis-
+         calls to net_write_packet. (note: calling thread doesn't have a regis-
          tered result set writer: thd->net.query_cache_query=0)
  2. Query_cache::store_query
        - Called just before handle_select() and is used to register a result
          set writer to the statement currently being processed
          (thd->net.query_cache_query).
  3. query_cache_insert
-       - Called from net_real_write to append a result set to a cached query
+       - Called from net_write_packet to append a result set to a cached query
          if (and only if) this query has a registered result set writer
          (thd->net.query_cache_query).
  4. Query_cache::invalidate
@@ -1407,12 +1407,12 @@ send_data_in_chunks(NET *net, const ucha
 
   while (len > MAX_CHUNK_LENGTH)
   {
-    if (net_real_write(net, packet, MAX_CHUNK_LENGTH))
+    if (net_write_packet(net, packet, MAX_CHUNK_LENGTH))
       return TRUE;
     packet+= MAX_CHUNK_LENGTH;
     len-= MAX_CHUNK_LENGTH;
   }
-  if (len && net_real_write(net, packet, len))
+  if (len && net_write_packet(net, packet, len))
     return TRUE;
 
   return FALSE;

=== modified file 'sql/sql_class.cc'
--- a/sql/sql_class.cc	2011-05-26 15:20:09 +0000
+++ b/sql/sql_class.cc	2011-05-31 13:52:09 +0000
@@ -3287,12 +3287,6 @@ void thd_increment_bytes_received(ulong
 }
 
 
-void thd_increment_net_big_packet_count(ulong length)
-{
-  current_thd->status_var.net_big_packet_count+= length;
-}
-
-
 void THD::set_status_var_init()
 {
   memset(&status_var, 0, sizeof(status_var));

=== modified file 'sql/sql_class.h'
--- a/sql/sql_class.h	2011-05-30 06:25:47 +0000
+++ b/sql/sql_class.h	2011-05-31 13:52:09 +0000
@@ -561,7 +561,6 @@ typedef struct system_status_var
   ulong key_cache_write;
   /* END OF KEY_CACHE parts */
 
-  ulong net_big_packet_count;
   ulong opened_tables;
   ulong opened_shares;
   ulong select_full_join_count;

=== modified file 'tests/mysql_client_test.c'
--- a/tests/mysql_client_test.c	2011-05-26 15:20:09 +0000
+++ b/tests/mysql_client_test.c	2011-05-31 13:52:09 +0000
@@ -19594,6 +19594,46 @@ static void test_bug11766854()
 }
 
 
+/**
+  Bug#54790: Use of non-blocking mode for sockets limits performance
+*/
+
+static void test_bug54790()
+{
+  int rc;
+  MYSQL *lmysql;
+  uint timeout= 2;
+
+  DBUG_ENTER("test_bug54790");
+  myheader("test_bug54790");
+
+  lmysql= mysql_client_init(NULL);
+  DIE_UNLESS(lmysql);
+
+  rc= mysql_options(lmysql, MYSQL_OPT_READ_TIMEOUT, &timeout);
+  DIE_UNLESS(!rc);
+
+  if (!mysql_real_connect(lmysql, opt_host, opt_user, opt_password,
+                          opt_db ? opt_db : "test", opt_port,
+                          opt_unix_socket, 0))
+  {
+    mysql= lmysql;
+    myerror("mysql_real_connect failed");
+    mysql_close(lmysql);
+    exit(1);
+  }
+
+  rc= mysql_query(lmysql, "SELECT SLEEP(100);");
+  myquery_r(rc);
+
+  /* A timeout error (ER_NET_READ_INTERRUPTED) would be more appropriate. */
+  DIE_UNLESS(mysql_errno(lmysql) == CR_SERVER_LOST);
+
+  mysql_close(lmysql);
+
+  DBUG_VOID_RETURN;
+}
+
 /*
   Read and parse arguments and MySQL options from my.cnf
 */
@@ -19938,6 +19978,7 @@ static struct my_tests_st my_tests[]= {
   { "test_bug57058", test_bug57058 },
   { "test_bug56976", test_bug56976 },
   { "test_bug11766854", test_bug11766854 },
+  { "test_bug54790", test_bug54790 },
   { 0, 0 }
 };
 

=== modified file 'vio/CMakeLists.txt'
--- a/vio/CMakeLists.txt	2010-08-12 15:19:57 +0000
+++ b/vio/CMakeLists.txt	2011-05-31 13:52:09 +0000
@@ -17,6 +17,6 @@ INCLUDE_DIRECTORIES(${CMAKE_SOURCE_DIR}/
 ${SSL_INCLUDE_DIRS})
 ADD_DEFINITIONS(${SSL_DEFINES})
 
-SET(VIO_SOURCES  vio.c viosocket.c viossl.c viosslfactories.c)
+SET(VIO_SOURCES vio.c viosocket.c viossl.c viopipe.c vioshm.c viosslfactories.c)
 ADD_CONVENIENCE_LIBRARY(vio ${VIO_SOURCES})
 TARGET_LINK_LIBRARIES(vio ${LIBSOCKET})

=== modified file 'vio/vio.c'
--- a/vio/vio.c	2011-05-26 15:20:09 +0000
+++ b/vio/vio.c	2011-05-31 13:52:09 +0000
@@ -1,4 +1,4 @@
-/* Copyright (C) 2000 MySQL AB
+/* Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
 
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -11,7 +11,7 @@
 
    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA */
 
 /*
   Note that we can't have assertion on file descriptors;  The reason for
@@ -22,24 +22,26 @@
 
 #include "vio_priv.h"
 
-#if defined(__WIN__) || defined(HAVE_SMEM)
+#ifdef _WIN32
 
 /**
-  Stub poll_read method that defaults to indicate that there
-  is data to read.
+  Stub io_wait method that defaults to indicate that
+  requested I/O event is ready.
 
   Used for named pipe and shared memory VIO types.
 
   @param vio      Unused.
+  @param event    Unused.
   @param timeout  Unused.
 
-  @retval FALSE   There is data to read.
+  @retval 1       The requested I/O event has occurred.
 */
 
-static my_bool no_poll_read(Vio *vio __attribute__((unused)),
-                            uint timeout __attribute__((unused)))
+static int no_io_wait(Vio *vio __attribute__((unused)),
+                      enum enum_vio_io_event event __attribute__((unused)),
+                      int timeout __attribute__((unused)))
 {
-  return FALSE;
+  return 1;
 }
 
 #endif
@@ -53,8 +55,8 @@ static my_bool has_no_data(Vio *vio __at
  * Helper to fill most of the Vio* with defaults.
  */
 
-static void vio_init(Vio* vio, enum enum_vio_type type,
-                     my_socket sd, HANDLE hPipe, uint flags)
+static void vio_init(Vio *vio, enum enum_vio_type type,
+                     my_socket sd, uint flags)
 {
   DBUG_ENTER("vio_init");
   DBUG_PRINT("enter", ("type: %d  sd: %d  flags: %d", type, sd, flags));
@@ -65,8 +67,8 @@ static void vio_init(Vio* vio, enum enum
   memset(vio, 0, sizeof(*vio));
   vio->type	= type;
   vio->sd	= sd;
-  vio->hPipe	= hPipe;
   vio->localhost= flags & VIO_LOCALHOST;
+  vio->read_timeout= vio->write_timeout= -1;
   if ((flags & VIO_BUFFERED_READ) &&
       !(vio->read_buffer= (char*)my_malloc(VIO_READ_BUFFER_SIZE, MYF(MY_WME))))
     flags&= ~VIO_BUFFERED_READ;
@@ -80,25 +82,16 @@ static void vio_init(Vio* vio, enum enum
     vio->fastsend	=vio_fastsend;
     vio->viokeepalive	=vio_keepalive;
     vio->should_retry	=vio_should_retry;
-    vio->was_interrupted=vio_was_interrupted;
+    vio->was_timeout    =vio_was_timeout;
     vio->vioclose	=vio_close_pipe;
     vio->peer_addr	=vio_peer_addr;
-    vio->vioblocking	=vio_blocking;
-    vio->is_blocking	=vio_is_blocking;
-
-    vio->poll_read      =no_poll_read;
+    vio->io_wait        =no_io_wait;
     vio->is_connected   =vio_is_connected_pipe;
     vio->has_data       =has_no_data;
-
-    vio->timeout=vio_win32_timeout;
-    /* Set default timeout */
-    vio->read_timeout_ms= INFINITE;
-    vio->write_timeout_ms= INFINITE;
-    vio->pipe_overlapped.hEvent= CreateEvent(NULL, TRUE, FALSE, NULL);
     DBUG_VOID_RETURN;
   }
 #endif
-#ifdef HAVE_SMEM 
+#ifdef HAVE_SMEM
   if (type == VIO_TYPE_SHARED_MEMORY)
   {
     vio->viodelete	=vio_delete;
@@ -108,25 +101,16 @@ static void vio_init(Vio* vio, enum enum
     vio->fastsend	=vio_fastsend;
     vio->viokeepalive	=vio_keepalive;
     vio->should_retry	=vio_should_retry;
-    vio->was_interrupted=vio_was_interrupted;
+    vio->was_timeout    =vio_was_timeout;
     vio->vioclose	=vio_close_shared_memory;
     vio->peer_addr	=vio_peer_addr;
-    vio->vioblocking	=vio_blocking;
-    vio->is_blocking	=vio_is_blocking;
-
-    vio->poll_read      =no_poll_read;
+    vio->io_wait        =no_io_wait;
     vio->is_connected   =vio_is_connected_shared_memory;
     vio->has_data       =has_no_data;
-
-    /* Currently, shared memory is on Windows only, hence the below is ok*/
-    vio->timeout= vio_win32_timeout; 
-    /* Set default timeout */
-    vio->read_timeout_ms= INFINITE;
-    vio->write_timeout_ms= INFINITE;
     DBUG_VOID_RETURN;
   }
-#endif   
-#ifdef HAVE_OPENSSL 
+#endif
+#ifdef HAVE_OPENSSL
   if (type == VIO_TYPE_SSL)
   {
     vio->viodelete	=vio_ssl_delete;
@@ -136,15 +120,13 @@ static void vio_init(Vio* vio, enum enum
     vio->fastsend	=vio_fastsend;
     vio->viokeepalive	=vio_keepalive;
     vio->should_retry	=vio_should_retry;
-    vio->was_interrupted=vio_was_interrupted;
+    vio->was_timeout    =vio_was_timeout;
     vio->vioclose	=vio_ssl_close;
     vio->peer_addr	=vio_peer_addr;
-    vio->vioblocking	=vio_ssl_blocking;
-    vio->is_blocking	=vio_is_blocking;
-    vio->timeout	=vio_timeout;
-    vio->poll_read      =vio_poll_read;
+    vio->io_wait        =vio_io_wait;
     vio->is_connected   =vio_is_connected;
     vio->has_data       =vio_ssl_has_data;
+    vio->timeout        =vio_socket_timeout;
     DBUG_VOID_RETURN;
   }
 #endif /* HAVE_OPENSSL */
@@ -155,32 +137,72 @@ static void vio_init(Vio* vio, enum enum
   vio->fastsend         =vio_fastsend;
   vio->viokeepalive     =vio_keepalive;
   vio->should_retry     =vio_should_retry;
-  vio->was_interrupted  =vio_was_interrupted;
+  vio->was_timeout      =vio_was_timeout;
   vio->vioclose         =vio_close;
   vio->peer_addr        =vio_peer_addr;
-  vio->vioblocking      =vio_blocking;
-  vio->is_blocking      =vio_is_blocking;
-  vio->timeout          =vio_timeout;
-  vio->poll_read        =vio_poll_read;
+  vio->io_wait          =vio_io_wait;
   vio->is_connected     =vio_is_connected;
+  vio->timeout          =vio_socket_timeout;
   vio->has_data=        (flags & VIO_BUFFERED_READ) ?
                             vio_buff_has_data : has_no_data;
   DBUG_VOID_RETURN;
 }
 
 
-/* Reset initialized VIO to use with another transport type */
+/**
+  Reinitialize an existing Vio object.
+
+  @remark Used to rebind an initialized socket-based Vio object
+          to another socket-based transport type. For example,
+          rebind a TCP/IP transport to SSL.
+
+  @param vio    A VIO object.
+  @param type   A socket-based transport type.
+  @param sd     The socket.
+  @param ssl    An optional SSL structure.
+  @param flags  Flags passed to vio_init.
+
+  @return Return value is zero on success.
+*/
 
-void vio_reset(Vio* vio, enum enum_vio_type type,
-               my_socket sd, HANDLE hPipe, uint flags)
+my_bool vio_reset(Vio* vio, enum enum_vio_type type,
+                  my_socket sd, void *ssl, uint flags)
 {
+  int ret= FALSE;
+  Vio old_vio= *vio;
+  DBUG_ENTER("vio_reset");
+
+  /* The only supported rebind is from a socket-based transport type. */
+  DBUG_ASSERT(vio->type == VIO_TYPE_TCPIP || vio->type == VIO_TYPE_SOCKET);
+
+  /*
+    Will be reinitialized depending on the flags.
+    Nonetheless, already buffered inside the SSL layer.
+  */
   my_free(vio->read_buffer);
-  vio_init(vio, type, sd, hPipe, flags);
-}
 
+  vio_init(vio, type, sd, flags);
+
+#ifdef HAVE_OPENSSL
+  vio->ssl_arg= ssl;
+#endif
+
+  /*
+    Propagate the timeout values. Necessary to also propagate
+    the underlying proprieties associated with the timeout,
+    such as the socket blocking mode.
+  */
+  if (old_vio.read_timeout >= 0)
+    ret|= vio_timeout(vio, 0, old_vio.read_timeout);
+
+  if (old_vio.write_timeout >= 0)
+    ret|= vio_timeout(vio, 1, old_vio.write_timeout);
 
-/* Open the socket or TCP/IP connection and read the fnctl() status */
+  DBUG_RETURN(test(ret));
+}
 
+
+/* Create a new VIO for socket or TCP/IP connection. */
 Vio *vio_new(my_socket sd, enum enum_vio_type type, uint flags)
 {
   Vio *vio;
@@ -188,43 +210,16 @@ Vio *vio_new(my_socket sd, enum enum_vio
   DBUG_PRINT("enter", ("sd: %d", sd));
   if ((vio = (Vio*) my_malloc(sizeof(*vio),MYF(MY_WME))))
   {
-    vio_init(vio, type, sd, 0, flags);
+    vio_init(vio, type, sd, flags);
     sprintf(vio->desc,
 	    (vio->type == VIO_TYPE_SOCKET ? "socket (%d)" : "TCP/IP (%d)"),
 	    vio->sd);
-#if !defined(__WIN__)
-#if !defined(NO_FCNTL_NONBLOCK)
-    /*
-      We call fcntl() to set the flags and then immediately read them back
-      to make sure that we and the system are in agreement on the state of
-      things.
-
-      An example of why we need to do this is FreeBSD (and apparently some
-      other BSD-derived systems, like Mac OS X), where the system sometimes
-      reports that the socket is set for non-blocking when it really will
-      block.
-    */
-    fcntl(sd, F_SETFL, 0);
-    vio->fcntl_mode= fcntl(sd, F_GETFL);
-#elif defined(HAVE_SYS_IOCTL_H)			/* hpux */
-    /* Non blocking sockets doesn't work good on HPUX 11.0 */
-    (void) ioctl(sd,FIOSNBIO,0);
-    vio->fcntl_mode &= ~O_NONBLOCK;
-#endif
-#else /* !defined(__WIN__) */
-    {
-      /* set to blocking mode by default */
-      ulong arg=0, r;
-      r = ioctlsocket(sd,FIONBIO,(void*) &arg);
-      vio->fcntl_mode &= ~O_NONBLOCK;
-    }
-#endif
   }
   DBUG_RETURN(vio);
 }
 
 
-#ifdef __WIN__
+#ifdef _WIN32
 
 Vio *vio_new_win32pipe(HANDLE hPipe)
 {
@@ -232,7 +227,15 @@ Vio *vio_new_win32pipe(HANDLE hPipe)
   DBUG_ENTER("vio_new_handle");
   if ((vio = (Vio*) my_malloc(sizeof(Vio),MYF(MY_WME))))
   {
-    vio_init(vio, VIO_TYPE_NAMEDPIPE, 0, hPipe, VIO_LOCALHOST);
+    vio_init(vio, VIO_TYPE_NAMEDPIPE, 0, VIO_LOCALHOST);
+    /* Create an object for event notification. */
+    vio->overlapped.hEvent= CreateEvent(NULL, FALSE, FALSE, NULL);
+    if (vio->overlapped.hEvent == NULL)
+    {
+      my_free(vio);
+      DBUG_RETURN(NULL);
+    }
+    vio->hPipe= hPipe;
     strmov(vio->desc, "named pipe");
   }
   DBUG_RETURN(vio);
@@ -248,7 +251,7 @@ Vio *vio_new_win32shared_memory(HANDLE h
   DBUG_ENTER("vio_new_win32shared_memory");
   if ((vio = (Vio*) my_malloc(sizeof(Vio),MYF(MY_WME))))
   {
-    vio_init(vio, VIO_TYPE_SHARED_MEMORY, 0, 0, VIO_LOCALHOST);
+    vio_init(vio, VIO_TYPE_SHARED_MEMORY, 0, VIO_LOCALHOST);
     vio->handle_file_map= handle_file_map;
     vio->handle_map= handle_map;
     vio->event_server_wrote= event_server_wrote;
@@ -266,6 +269,49 @@ Vio *vio_new_win32shared_memory(HANDLE h
 #endif
 
 
+/**
+  Set timeout for a network send or receive operation.
+
+  @remark A non-infinite timeout causes the socket to be
+          set to non-blocking mode. On infinite timeouts,
+          the socket is set to blocking mode.
+
+  @remark A negative timeout means an infinite timeout.
+
+  @param vio      A VIO object.
+  @param which    Whether timeout is for send (1) or receive (0).
+  @param timeout  Timeout interval in seconds.
+
+  @return FALSE on success, TRUE otherwise.
+*/
+
+int vio_timeout(Vio *vio, uint which, int timeout_sec)
+{
+  int timeout_ms;
+  my_bool old_mode;
+
+  /*
+    Vio timeouts are measured in milliseconds. Check for a possible
+    overflow. In case of overflow, set to infinite.
+  */
+  if (timeout_sec > INT_MAX/1000)
+    timeout_ms= -1;
+  else
+    timeout_ms= (int) (timeout_sec * 1000);
+
+  /* Deduce the current timeout status mode. */
+  old_mode= vio->write_timeout < 0 && vio->read_timeout < 0;
+
+  if (which)
+    vio->write_timeout= timeout_ms;
+  else
+    vio->read_timeout= timeout_ms;
+
+  /* VIO-specific timeout handling. Might change the blocking mode. */
+  return vio->timeout ? vio->timeout(vio, which, old_mode) : 0;
+}
+
+
 void vio_delete(Vio* vio)
 {
   if (!vio)

=== modified file 'vio/vio_priv.h'
--- a/vio/vio_priv.h	2011-04-15 09:33:58 +0000
+++ b/vio/vio_priv.h	2011-05-31 13:52:09 +0000
@@ -1,6 +1,3 @@
-#ifndef VIO_PRIV_INCLUDED
-#define VIO_PRIV_INCLUDED
-
 /* Copyright (c) 2003, 2011, Oracle and/or its affiliates. All rights reserved.
 
    This program is free software; you can redistribute it and/or modify
@@ -16,6 +13,9 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA */
 
+#ifndef VIO_PRIV_INCLUDED
+#define VIO_PRIV_INCLUDED
+
 /* Structures and functions private to the vio package */
 
 #define DONT_MAP_VIO
@@ -26,10 +26,6 @@
 #include <violite.h>
 
 #ifdef _WIN32
-void	vio_win32_timeout(Vio *vio, uint which, uint timeout);
-#endif
-
-#ifdef __WIN__
 size_t vio_read_pipe(Vio *vio, uchar * buf, size_t size);
 size_t vio_write_pipe(Vio *vio, const uchar * buf, size_t size);
 my_bool vio_is_connected_pipe(Vio *vio);
@@ -43,8 +39,9 @@ my_bool vio_is_connected_shared_memory(V
 int vio_close_shared_memory(Vio * vio);
 #endif
 
-void	vio_timeout(Vio *vio,uint which, uint timeout);
 my_bool vio_buff_has_data(Vio *vio);
+int vio_socket_io_wait(Vio *vio, enum enum_vio_io_event event);
+int vio_socket_timeout(Vio *vio, uint which, my_bool old_mode);
 
 #ifdef HAVE_OPENSSL
 #include "my_net.h"			/* needed because of struct in_addr */
@@ -55,9 +52,6 @@ size_t	vio_ssl_write(Vio *vio,const ucha
 /* When the workday is over... */
 int vio_ssl_close(Vio *vio);
 void vio_ssl_delete(Vio *vio);
-
-int vio_ssl_blocking(Vio *vio, my_bool set_blocking_mode, my_bool *old_mode);
-
 my_bool vio_ssl_has_data(Vio *vio);
 
 #endif /* HAVE_OPENSSL */

=== added file 'vio/viopipe.c'
--- a/vio/viopipe.c	1970-01-01 00:00:00 +0000
+++ b/vio/viopipe.c	2011-05-31 13:52:09 +0000
@@ -0,0 +1,122 @@
+/* Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; version 2 of the License.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA */
+
+#include "vio_priv.h"
+
+#ifdef _WIN32
+
+static size_t wait_overlapped_result(Vio *vio, int timeout)
+{
+  size_t ret= (size_t) -1;
+  DWORD transferred, wait_status, timeout_ms;
+
+  timeout_ms= timeout >= 0 ? timeout : INFINITE;
+
+  /* Wait for the overlapped operation to be completed. */
+  wait_status= WaitForSingleObject(vio->overlapped.hEvent, timeout_ms);
+
+  /* The operation might have completed, attempt to retrieve the result. */
+  if (wait_status == WAIT_OBJECT_0)
+  {
+    /* If retrieval fails, a error code will have been set. */
+    if (GetOverlappedResult(vio->hPipe, &vio->overlapped, &transferred, FALSE))
+      ret= transferred;
+  }
+  else
+  {
+    /* Error or timeout, cancel the pending I/O operation. */
+    CancelIo(vio->hPipe);
+
+    /*
+      If the wait timed out, set error code to indicate a
+      timeout error. Otherwise, wait_status is WAIT_FAILED
+      and extended error information was already set.
+    */
+    if (wait_status == WAIT_TIMEOUT)
+      SetLastError(SOCKET_ETIMEDOUT);
+  }
+
+  return ret;
+}
+
+
+size_t vio_read_pipe(Vio *vio, uchar *buf, size_t count)
+{
+  DWORD transferred;
+  size_t ret= (size_t) -1;
+  DBUG_ENTER("vio_read_pipe");
+
+  /* Attempt to read from the pipe (overlapped I/O). */
+  if (ReadFile(vio->hPipe, buf, count, &transferred, &vio->overlapped))
+  {
+    /* The operation completed immediately. */
+    ret= transferred;
+  }
+  /* Read operation is pending completion asynchronously? */
+  else if (GetLastError() == ERROR_IO_PENDING)
+    ret= wait_overlapped_result(vio, vio->read_timeout);
+
+  DBUG_RETURN(ret);
+}
+
+
+size_t vio_write_pipe(Vio *vio, const uchar *buf, size_t count)
+{
+  DWORD transferred;
+  size_t ret= (size_t) -1;
+  DBUG_ENTER("vio_write_pipe");
+
+  /* Attempt to write to the pipe (overlapped I/O). */
+  if (WriteFile(vio->hPipe, buf, count, &transferred, &vio->overlapped))
+  {
+    /* The operation completed immediately. */
+    ret= transferred;
+  }
+  /* Write operation is pending completion asynchronously? */
+  else if (GetLastError() == ERROR_IO_PENDING)
+    ret= wait_overlapped_result(vio, vio->write_timeout);
+
+  DBUG_RETURN(ret);
+}
+
+
+my_bool vio_is_connected_pipe(Vio *vio)
+{
+  if (PeekNamedPipe(vio->hPipe, NULL, 0, NULL, NULL, NULL))
+    return TRUE;
+  else
+    return (GetLastError() != ERROR_BROKEN_PIPE);
+}
+
+
+int vio_close_pipe(Vio *vio)
+{
+  BOOL ret;
+  DBUG_ENTER("vio_close_pipe");
+
+  CancelIo(vio->hPipe);
+  CloseHandle(vio->overlapped.hEvent);
+  DisconnectNamedPipe(vio->hPipe);
+  ret= CloseHandle(vio->hPipe);
+
+  vio->type= VIO_CLOSED;
+  vio->hPipe= NULL;
+  vio->sd= -1;
+
+  DBUG_RETURN(ret);
+}
+
+#endif
+

=== added file 'vio/vioshm.c'
--- a/vio/vioshm.c	1970-01-01 00:00:00 +0000
+++ b/vio/vioshm.c	2011-05-31 13:52:09 +0000
@@ -0,0 +1,217 @@
+/* Copyright (c) 2010, 2011, Oracle and/or its affiliates. All rights reserved.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; version 2 of the License.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA */
+
+#include "vio_priv.h"
+
+#if defined(_WIN32) && defined(HAVE_SMEM)
+
+size_t vio_read_shared_memory(Vio *vio, uchar *buf, size_t size)
+{
+  size_t length;
+  size_t remain_local;
+  char *current_position;
+  HANDLE events[2];
+  DWORD timeout;
+  DBUG_ENTER("vio_read_shared_memory");
+
+  remain_local= size;
+  current_position= buf;
+  timeout= vio->read_timeout >= 0 ? vio->read_timeout : INFINITE;
+
+  events[0]= vio->event_server_wrote;
+  events[1]= vio->event_conn_closed;
+
+  do
+  {
+    if (vio->shared_memory_remain == 0)
+    {
+      DWORD wait_status;
+
+      wait_status= WaitForMultipleObjects(array_elements(events), events,
+                                          FALSE, timeout);
+
+      /*
+        WaitForMultipleObjects can return next values:
+         WAIT_OBJECT_0+0 - event from vio->event_server_wrote
+         WAIT_OBJECT_0+1 - event from vio->event_conn_closed.
+                           We can't read anything
+         WAIT_ABANDONED_0 and WAIT_TIMEOUT - fail.  We can't read anything
+      */
+      if (wait_status != WAIT_OBJECT_0)
+      {
+        /*
+          If wait_status is WAIT_TIMEOUT, set error code to indicate a
+          timeout error. If vio->event_conn_closed was set, use an EOF
+          condition (return value of zero) to indicate that the operation
+          has been aborted.
+        */
+        if (wait_status == WAIT_TIMEOUT)
+          SetLastError(SOCKET_ETIMEDOUT);
+        else if (wait_status == (WAIT_OBJECT_0 + 1))
+          DBUG_RETURN(0);
+
+        DBUG_RETURN(-1);
+      }
+
+      vio->shared_memory_pos= vio->handle_map;
+      vio->shared_memory_remain= uint4korr((ulong*)vio->shared_memory_pos);
+      vio->shared_memory_pos+= 4;
+    }
+
+    length= size;
+
+    if (vio->shared_memory_remain < length)
+       length= vio->shared_memory_remain;
+    if (length > remain_local)
+       length= remain_local;
+
+    memcpy(current_position, vio->shared_memory_pos, length);
+
+    vio->shared_memory_remain-= length;
+    vio->shared_memory_pos+= length;
+    current_position+= length;
+    remain_local-= length;
+
+    if (!vio->shared_memory_remain)
+    {
+      if (!SetEvent(vio->event_client_read))
+        DBUG_RETURN(-1);
+    }
+  } while (remain_local);
+  length= size;
+
+  DBUG_RETURN(length);
+}
+
+
+size_t vio_write_shared_memory(Vio *vio, const uchar *buf, size_t size)
+{
+  size_t length, remain, sz;
+  HANDLE pos;
+  const uchar *current_position;
+  HANDLE events[2];
+  DWORD timeout;
+  DBUG_ENTER("vio_write_shared_memory");
+
+  remain= size;
+  current_position= buf;
+  timeout= vio->write_timeout >= 0 ? vio->write_timeout : INFINITE;
+
+  events[0]= vio->event_server_read;
+  events[1]= vio->event_conn_closed;
+
+  while (remain != 0)
+  {
+    DWORD wait_status;
+
+    wait_status= WaitForMultipleObjects(array_elements(events), events,
+                                        FALSE, timeout);
+
+    if (wait_status != WAIT_OBJECT_0)
+    {
+      /* Set error code to indicate a timeout error or disconnect. */
+      if (wait_status == WAIT_TIMEOUT)
+        SetLastError(SOCKET_ETIMEDOUT);
+      else
+        SetLastError(ERROR_GRACEFUL_DISCONNECT);
+
+      DBUG_RETURN((size_t) -1);
+    }
+
+    sz= (remain > shared_memory_buffer_length ? shared_memory_buffer_length :
+         remain);
+
+    int4store(vio->handle_map, sz);
+    pos= vio->handle_map + 4;
+    memcpy(pos, current_position, sz);
+    remain-= sz;
+    current_position+= sz;
+    if (!SetEvent(vio->event_client_wrote))
+      DBUG_RETURN((size_t) -1);
+  }
+  length= size;
+
+  DBUG_RETURN(length);
+}
+
+
+my_bool vio_is_connected_shared_memory(Vio *vio)
+{
+  return (WaitForSingleObject(vio->event_conn_closed, 0) != WAIT_OBJECT_0);
+}
+
+
+/**
+ Close shared memory and DBUG_PRINT any errors that happen on closing.
+ @return Zero if all closing functions succeed, and nonzero otherwise.
+*/
+int vio_close_shared_memory(Vio * vio)
+{
+  int error_count= 0;
+  DBUG_ENTER("vio_close_shared_memory");
+  if (vio->type != VIO_CLOSED)
+  {
+    /*
+      Set event_conn_closed for notification of both client and server that
+      connection is closed
+    */
+    SetEvent(vio->event_conn_closed);
+    /*
+      Close all handlers. UnmapViewOfFile and CloseHandle return non-zero
+      result if they are success.
+    */
+    if (UnmapViewOfFile(vio->handle_map) == 0)
+    {
+      error_count++;
+      DBUG_PRINT("vio_error", ("UnmapViewOfFile() failed"));
+    }
+    if (CloseHandle(vio->event_server_wrote) == 0)
+    {
+      error_count++;
+      DBUG_PRINT("vio_error", ("CloseHandle(vio->esw) failed"));
+    }
+    if (CloseHandle(vio->event_server_read) == 0)
+    {
+      error_count++;
+      DBUG_PRINT("vio_error", ("CloseHandle(vio->esr) failed"));
+    }
+    if (CloseHandle(vio->event_client_wrote) == 0)
+    {
+      error_count++;
+      DBUG_PRINT("vio_error", ("CloseHandle(vio->ecw) failed"));
+    }
+    if (CloseHandle(vio->event_client_read) == 0)
+    {
+      error_count++;
+      DBUG_PRINT("vio_error", ("CloseHandle(vio->ecr) failed"));
+    }
+    if (CloseHandle(vio->handle_file_map) == 0)
+    {
+      error_count++;
+      DBUG_PRINT("vio_error", ("CloseHandle(vio->hfm) failed"));
+    }
+    if (CloseHandle(vio->event_conn_closed) == 0)
+    {
+      error_count++;
+      DBUG_PRINT("vio_error", ("CloseHandle(vio->ecc) failed"));
+    }
+  }
+  vio->type= VIO_CLOSED;
+  vio->sd=   -1;
+  DBUG_RETURN(error_count);
+}
+
+#endif /* #if defined(_WIN32) && defined(HAVE_SMEM) */
+

=== modified file 'vio/viosocket.c'
--- a/vio/viosocket.c	2011-05-05 23:50:31 +0000
+++ b/vio/viosocket.c	2011-05-31 13:52:09 +0000
@@ -30,33 +30,100 @@
 
 int vio_errno(Vio *vio __attribute__((unused)))
 {
-  return socket_errno;		/* On Win32 this mapped to WSAGetLastError() */
+  /* These transport types are not Winsock based. */
+#ifdef _WIN32
+  if (vio->type == VIO_TYPE_NAMEDPIPE ||
+      vio->type == VIO_TYPE_SHARED_MEMORY)
+    return GetLastError();
+#endif
+
+  /* Mapped to WSAGetLastError() on Win32. */
+  return socket_errno;
 }
 
 
-size_t vio_read(Vio * vio, uchar* buf, size_t size)
+/**
+  Attempt to wait for an I/O event on a socket.
+
+  @param vio      VIO object representing a connected socket.
+  @param event    The type of I/O event (read or write) to wait for.
+
+  @return Return value is -1 on failure, 0 on success.
+*/
+
+int vio_socket_io_wait(Vio *vio, enum enum_vio_io_event event)
 {
-  size_t r;
+  int timeout, ret;
+
+  DBUG_ASSERT(event == VIO_IO_EVENT_READ || event == VIO_IO_EVENT_WRITE);
+
+  /* Choose an appropriate timeout. */
+  if (event == VIO_IO_EVENT_READ)
+    timeout= vio->read_timeout;
+  else
+    timeout= vio->write_timeout;
+
+  /* Wait for input data to become available. */
+  switch (vio_io_wait(vio, event, timeout))
+  {
+  case -1:
+    /* Upon failure, vio_read/write() shall return -1. */
+    ret= -1;
+    break;
+  case  0:
+    /* The wait timed out. */
+    ret= -1;
+    break;
+  default:
+    /* A positive value indicates an I/O event. */
+    ret= 0;
+    break;
+  }
+
+  return ret;
+}
+
+
+/*
+  Define a stub MSG_DONTWAIT if unavailable. In this case, fcntl
+  (or a equivalent) is used to enable non-blocking operations.
+  The flag must be supported in both send and recv operations.
+*/
+#if defined(__linux__)
+#define VIO_USE_DONTWAIT  1
+#define VIO_DONTWAIT      MSG_DONTWAIT
+#else
+#define VIO_DONTWAIT 0
+#endif
+
+
+size_t vio_read(Vio *vio, uchar *buf, size_t size)
+{
+  ssize_t ret;
+  int flags= 0;
   DBUG_ENTER("vio_read");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %u", vio->sd, (long) buf,
-                       (uint) size));
 
-  /* Ensure nobody uses vio_read_buff and vio_read simultaneously */
+  /* Ensure nobody uses vio_read_buff and vio_read simultaneously. */
   DBUG_ASSERT(vio->read_end == vio->read_pos);
-#ifdef __WIN__
-  r = recv(vio->sd, buf, size,0);
-#else
-  errno=0;					/* For linux */
-  r = read(vio->sd, buf, size);
-#endif /* __WIN__ */
-#ifndef DBUG_OFF
-  if (r == (size_t) -1)
+
+  /* If timeout is enabled, do not block if data is unavailable. */
+  if (vio->read_timeout >= 0)
+    flags= VIO_DONTWAIT;
+
+  while ((ret= recv(vio->sd, buf, size, flags)) == -1)
   {
-    DBUG_PRINT("vio_error", ("Got error %d during read",errno));
+    int error= socket_errno;
+
+    /* The operation would block? */
+    if (error != SOCKET_EAGAIN && error != SOCKET_EWOULDBLOCK)
+      break;
+
+    /* Wait for input data to become available. */
+    if ((ret= vio_socket_io_wait(vio, VIO_IO_EVENT_READ)))
+      break;
   }
-#endif /* DBUG_OFF */
-  DBUG_PRINT("exit", ("%ld", (long) r));
-  DBUG_RETURN(r);
+
+  DBUG_RETURN(ret);
 }
 
 
@@ -104,97 +171,144 @@ size_t vio_read_buff(Vio *vio, uchar* bu
 #undef VIO_UNBUFFERED_READ_MIN_SIZE
 }
 
+
 my_bool vio_buff_has_data(Vio *vio)
 {
   return (vio->read_pos != vio->read_end);
 }
 
-size_t vio_write(Vio * vio, const uchar* buf, size_t size)
+
+size_t vio_write(Vio *vio, const uchar* buf, size_t size)
 {
-  size_t r;
+  ssize_t ret;
+  int flags= 0;
   DBUG_ENTER("vio_write");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %u", vio->sd, (long) buf,
-                       (uint) size));
-#ifdef __WIN__
-  r = send(vio->sd, buf, size,0);
-#else
-  r = write(vio->sd, buf, size);
-#endif /* __WIN__ */
-#ifndef DBUG_OFF
-  if (r == (size_t) -1)
+
+  /* If timeout is enabled, do not block. */
+  if (vio->write_timeout >= 0)
+    flags= VIO_DONTWAIT;
+
+  while ((ret= send(vio->sd, buf, size, flags)) == -1)
   {
-    DBUG_PRINT("vio_error", ("Got error on write: %d",socket_errno));
+    int error= socket_errno;
+
+    /* The operation would block? */
+    if (error != SOCKET_EAGAIN && error != SOCKET_EWOULDBLOCK)
+      break;
+
+    /* Wait for the output buffer to become writable.*/
+    if ((ret= vio_socket_io_wait(vio, VIO_IO_EVENT_WRITE)))
+      break;
   }
-#endif /* DBUG_OFF */
-  DBUG_PRINT("exit", ("%u", (uint) r));
-  DBUG_RETURN(r);
+
+  DBUG_RETURN(ret);
 }
 
-int vio_blocking(Vio * vio __attribute__((unused)), my_bool set_blocking_mode,
-		 my_bool *old_mode)
+
+static int vio_set_blocking(Vio *vio, my_bool status)
 {
-  int r=0;
-  DBUG_ENTER("vio_blocking");
+  DBUG_ENTER("vio_set_blocking");
 
-  *old_mode= test(!(vio->fcntl_mode & O_NONBLOCK));
-  DBUG_PRINT("enter", ("set_blocking_mode: %d  old_mode: %d",
-		       (int) set_blocking_mode, (int) *old_mode));
-
-#if !defined(__WIN__)
-#if !defined(NO_FCNTL_NONBLOCK)
-  if (vio->sd >= 0)
-  {
-    int old_fcntl=vio->fcntl_mode;
-    if (set_blocking_mode)
-      vio->fcntl_mode &= ~O_NONBLOCK; /* clear bit */
-    else
-      vio->fcntl_mode |= O_NONBLOCK; /* set bit */
-    if (old_fcntl != vio->fcntl_mode)
-    {
-      r= fcntl(vio->sd, F_SETFL, vio->fcntl_mode);
-      if (r == -1)
-      {
-        DBUG_PRINT("info", ("fcntl failed, errno %d", errno));
-        vio->fcntl_mode= old_fcntl;
-      }
-    }
+#ifdef _WIN32
+  DBUG_ASSERT(vio->type != VIO_TYPE_NAMEDPIPE);
+  DBUG_ASSERT(vio->type != VIO_TYPE_SHARED_MEMORY);
+  {
+    int ret;
+    u_long arg= status ? 0 : 1;
+    ret= ioctlsocket(vio->sd, FIONBIO, &arg);
+    DBUG_RETURN(ret);
   }
 #else
-  r= set_blocking_mode ? 0 : 1;
-#endif /* !defined(NO_FCNTL_NONBLOCK) */
-#else /* !defined(__WIN__) */
-  if (vio->type != VIO_TYPE_NAMEDPIPE && vio->type != VIO_TYPE_SHARED_MEMORY)
-  { 
-    ulong arg;
-    int old_fcntl=vio->fcntl_mode;
-    if (set_blocking_mode)
+  {
+    int flags;
+
+    if ((flags= fcntl(vio->sd, F_GETFL, NULL)) < 0)
+      DBUG_RETURN(-1);
+
+    /*
+      Always set/clear the flag to avoid inheritance issues. This is
+      a issue mainly on Mac OS X Tiger (version 10.4) where although
+      the O_NONBLOCK flag is inherited from the parent socket, the
+      actual non-blocking behavior is not inherited.
+    */
+    if (status)
+      flags&= ~O_NONBLOCK;
+    else
+      flags|= O_NONBLOCK;
+
+    if (fcntl(vio->sd, F_SETFL, flags) == -1)
+      DBUG_RETURN(-1);
+  }
+#endif
+
+  DBUG_RETURN(0);
+}
+
+
+int vio_socket_timeout(Vio *vio,
+                       uint which __attribute__((unused)),
+                       my_bool old_mode __attribute__((unused)))
+{
+  int ret= 0;
+  DBUG_ENTER("vio_socket_timeout");
+
+#if defined(_WIN32)
+  {
+    int optname;
+    DWORD timeout= 0;
+    const char *optval= (const char *) &timeout;
+
+    /*
+      The default socket timeout value is zero, which means an infinite
+      timeout. Values less than 500 milliseconds are interpreted to be of
+      500 milliseconds. Hence, the VIO behavior for zero timeout, which is
+      intended to cause the send or receive operation to fail immediately
+      if no data is available, is not supported on WIN32 and neither is
+      necessary as it's not possible to set the VIO timeout value to zero.
+
+      Assert that the VIO timeout is either positive or set to infinite.
+    */
+    DBUG_ASSERT(which || vio->read_timeout);
+    DBUG_ASSERT(!which || vio->write_timeout);
+
+    if (which)
     {
-      arg = 0;
-      vio->fcntl_mode &= ~O_NONBLOCK; /* clear bit */
+      optname= SO_SNDTIMEO;
+      if (vio->write_timeout > 0)
+        timeout= vio->write_timeout;
     }
     else
     {
-      arg = 1;
-      vio->fcntl_mode |= O_NONBLOCK; /* set bit */
+      optname= SO_RCVTIMEO;
+      if (vio->read_timeout > 0)
+        timeout= vio->read_timeout;
     }
-    if (old_fcntl != vio->fcntl_mode)
-      r = ioctlsocket(vio->sd,FIONBIO,(void*) &arg);
+
+    ret= setsockopt(vio->sd, SOL_SOCKET, optname, optval, sizeof(timeout));
   }
-  else
-    r=  test(!(vio->fcntl_mode & O_NONBLOCK)) != set_blocking_mode;
-#endif /* !defined(__WIN__) */
-  DBUG_PRINT("exit", ("%d", r));
-  DBUG_RETURN(r);
-}
+#else
+  /*
+    The MSG_DONTWAIT trick is not used with SSL sockets as the send and
+    receive I/O operations are wrapped through SSL-specific functions
+    (SSL_read and SSL_write) which are not equivalent to the standard
+    recv(2) and send(2) used in vio_read() and vio_write(). Hence, the
+    socket blocking mode is changed and vio_io_wait() is used to wait
+    for I/O or timeout.
+  */
+#ifdef VIO_USE_DONTWAIT
+  if (vio->type == VIO_TYPE_SSL)
+#endif
+  {
+    /* Deduce what should be the new blocking mode of the socket. */
+    my_bool new_mode= vio->write_timeout < 0 && vio->read_timeout < 0;
 
-my_bool
-vio_is_blocking(Vio * vio)
-{
-  my_bool r;
-  DBUG_ENTER("vio_is_blocking");
-  r = !(vio->fcntl_mode & O_NONBLOCK);
-  DBUG_PRINT("exit", ("%d", (int) r));
-  DBUG_RETURN(r);
+    /* If necessary, update the blocking mode. */
+    if (new_mode != old_mode)
+      ret= vio_set_blocking(vio, new_mode);
+  }
+#endif
+
+  DBUG_RETURN(ret);
 }
 
 
@@ -249,30 +363,37 @@ int vio_keepalive(Vio* vio, my_bool set_
 }
 
 
+/**
+  Indicate whether a I/O operation must be retried later.
+
+  @param vio  A VIO object
+
+  @return Whether a I/O operation should be deferred.
+  @retval TRUE    Temporary failure, retry operation.
+  @retval FALSE   Indeterminate failure.
+*/
+
 my_bool
-vio_should_retry(Vio * vio)
+vio_should_retry(Vio *vio)
 {
-  int en = socket_errno;
-  /*
-    man 2 read write
-      EAGAIN or EWOULDBLOCK when a socket is a non-blocking mode means
-      that the read/write would block.
-    man 7 socket
-      EAGAIN or EWOULDBLOCK when a socket is in a blocking mode means
-      that the corresponding receiving or sending timeout was reached.
-  */
-  return  en == SOCKET_EINTR ||
-          (!vio_is_blocking(vio) &&
-            (en == SOCKET_EAGAIN || en == SOCKET_EWOULDBLOCK));
+  return (vio_errno(vio) == SOCKET_EINTR);
 }
 
 
+/**
+  Indicate whether a I/O operation timed out.
+
+  @param vio  A VIO object
+
+  @return Whether a I/O operation timed out.
+  @retval TRUE    Operation timed out.
+  @retval FALSE   Not a timeout failure.
+*/
+
 my_bool
-vio_was_interrupted(Vio *vio __attribute__((unused)))
+vio_was_timeout(Vio *vio)
 {
-  int en= socket_errno;
-  return (en == SOCKET_EAGAIN || en == SOCKET_EINTR ||
-	  en == SOCKET_EWOULDBLOCK || en == SOCKET_ETIMEDOUT);
+  return (vio_errno(vio) == SOCKET_ETIMEDOUT);
 }
 
 
@@ -526,58 +647,6 @@ my_bool vio_peer_addr(Vio *vio, char *ip
 
 
 /**
-  Indicate whether there is data to read on a given socket.
-
-  @note An exceptional condition event and/or errors are
-        interpreted as if there is data to read.
-
-  @param sd       A connected socket.
-  @param timeout  Maximum time in seconds to poll.
-
-  @retval FALSE   There is data to read.
-  @retval TRUE    There is no data to read.
-*/
-
-static my_bool socket_poll_read(my_socket sd, uint timeout)
-{
-#ifdef __WIN__
-  int res;
-  my_socket fd= sd;
-  fd_set readfds, errorfds;
-  struct timeval tm;
-  DBUG_ENTER("socket_poll_read");
-  tm.tv_sec= timeout;
-  tm.tv_usec= 0;
-  FD_ZERO(&readfds);
-  FD_ZERO(&errorfds);
-  FD_SET(fd, &readfds);
-  FD_SET(fd, &errorfds);
-  /* The first argument is ignored on Windows, so a conversion to int is OK */
-  if ((res= select((int) fd, &readfds, NULL, &errorfds, &tm) <= 0))
-  {
-    DBUG_RETURN(res < 0 ? 0 : 1);
-  }
-  res= FD_ISSET(fd, &readfds) || FD_ISSET(fd, &errorfds);
-  DBUG_RETURN(!res);
-#elif defined(HAVE_POLL)
-  struct pollfd fds;
-  int res;
-  DBUG_ENTER("socket_poll_read");
-  fds.fd=sd;
-  fds.events=POLLIN;
-  fds.revents=0;
-  if ((res=poll(&fds,1,(int) timeout*1000)) <= 0)
-  {
-    DBUG_RETURN(res < 0 ? 0 : 1);		/* Don't return 1 on errors */
-  }
-  DBUG_RETURN(fds.revents & (POLLIN | POLLERR | POLLHUP) ? 0 : 1);
-#else
-  return 0;
-#endif
-}
-
-
-/**
   Retrieve the amount of data that can be read from a socket.
 
   @param vio          A VIO object.
@@ -611,459 +680,345 @@ static my_bool socket_peek_read(Vio *vio
 #endif
 }
 
+#ifndef _WIN32
 
 /**
-  Indicate whether there is data to read on a given socket.
-
-  @remark Errors are interpreted as if there is data to read.
-
-  @param sd       A connected socket.
-  @param timeout  Maximum time in seconds to wait.
-
-  @retval FALSE   There is data (or EOF) to read. Also FALSE if error.
-  @retval TRUE    There is _NO_ data to read or timed out.
+  Set of event flags grouped by operations.
 */
 
-my_bool vio_poll_read(Vio *vio, uint timeout)
-{
-  my_socket sd= vio->sd;
-  DBUG_ENTER("vio_poll_read");
-#ifdef HAVE_OPENSSL
-  if (vio->type == VIO_TYPE_SSL)
-    sd= SSL_get_fd((SSL*) vio->ssl_arg);
-#endif
-  DBUG_RETURN(socket_poll_read(sd, timeout));
-}
-
-
-/**
-  Determine if the endpoint of a connection is still available.
-
-  @remark The socket is assumed to be disconnected if an EOF
-          condition is encountered.
-
-  @param vio      The VIO object.
-
-  @retval TRUE    EOF condition not found.
-  @retval FALSE   EOF condition is signaled.
+/*
+  Linux specific flag used to detect connection shutdown. The flag is
+  also used for half-closed notification, which here is interpreted as
+  if there is data available to be read from the socket.
 */
-
-my_bool vio_is_connected(Vio *vio)
-{
-  uint bytes= 0;
-  DBUG_ENTER("vio_is_connected");
-
-  /* In the presence of errors the socket is assumed to be connected. */
-
-  /*
-    The first step of detecting a EOF condition is veryfing
-    whether there is data to read. Data in this case would
-    be the EOF.
-  */
-  if (vio_poll_read(vio, 0))
-    DBUG_RETURN(TRUE);
-
-  /*
-    The second step is read() or recv() from the socket returning
-    0 (EOF). Unfortunelly, it's not possible to call read directly
-    as we could inadvertently read meaningful connection data.
-    Simulate a read by retrieving the number of bytes available to
-    read -- 0 meaning EOF.
-  */
-  if (socket_peek_read(vio, &bytes))
-    DBUG_RETURN(TRUE);
-
-#ifdef HAVE_OPENSSL
-  /* There might be buffered data at the SSL layer. */
-  if (!bytes && vio->type == VIO_TYPE_SSL)
-    bytes= SSL_pending((SSL*) vio->ssl_arg);
+#ifndef POLLRDHUP
+#define POLLRDHUP 0
 #endif
 
-  DBUG_RETURN(bytes ? TRUE : FALSE);
-}
+/* Data may be read. */
+#define MY_POLL_SET_IN      (POLLIN | POLLPRI)
+/* Data may be written. */
+#define MY_POLL_SET_OUT     (POLLOUT)
+/* An error or hangup. */
+#define MY_POLL_SET_ERR     (POLLERR | POLLHUP | POLLNVAL)
 
-
-void vio_timeout(Vio *vio, uint which, uint timeout)
-{
-#if defined(SO_SNDTIMEO) && defined(SO_RCVTIMEO)
-  int r;
-  DBUG_ENTER("vio_timeout");
-
-  {
-#ifdef __WIN__
-  /* Windows expects time in milliseconds as int */
-  int wait_timeout= (int) timeout * 1000;
-#else
-  /* POSIX specifies time as struct timeval. */
-  struct timeval wait_timeout;
-  wait_timeout.tv_sec= timeout;
-  wait_timeout.tv_usec= 0;
 #endif
 
-  r= setsockopt(vio->sd, SOL_SOCKET, which ? SO_SNDTIMEO : SO_RCVTIMEO,
-                IF_WIN((const char*), (const void*))&wait_timeout,
-                sizeof(wait_timeout));
+/**
+  Wait for an I/O event on a VIO socket.
 
-  }
+  @param vio      VIO object representing a connected socket.
+  @param event    The type of I/O event to wait for.
+  @param timeout  Interval (in milliseconds) to wait for an I/O event.
+                  A negative timeout value means an infinite timeout.
 
-  if (r != 0)
-    DBUG_PRINT("error", ("setsockopt failed: %d, errno: %d", r, socket_errno));
+  @remark sock_errno is set to SOCKET_ETIMEDOUT on timeout.
 
-  DBUG_VOID_RETURN;
-#else
-/*
-  Platforms not suporting setting of socket timeout should either use
-  thr_alarm or just run without read/write timeout(s)
+  @return A three-state value which indicates the operation status.
+  @retval -1  Failure, socket_errno indicates the error.
+  @retval  0  The wait has timed out.
+  @retval  1  The requested I/O event has occurred.
 */
-#endif
-}
-
 
-#ifdef __WIN__
+#ifndef _WIN32
 
-/*
-  Finish pending IO on pipe. Honor wait timeout
-*/
-static size_t pipe_complete_io(Vio* vio, char* buf, size_t size, DWORD timeout_ms)
+int vio_io_wait(Vio *vio, enum enum_vio_io_event event, int timeout)
 {
-  DWORD length;
-  DWORD ret;
+  int ret;
+  short revents= 0;
+  struct pollfd pfd;
+  my_socket sd= vio->sd;
+  DBUG_ENTER("vio_io_wait");
 
-  DBUG_ENTER("pipe_complete_io");
+  memset(&pfd, 0, sizeof(pfd));
+
+  pfd.fd= sd;
 
-  ret= WaitForSingleObject(vio->pipe_overlapped.hEvent, timeout_ms);
   /*
-    WaitForSingleObjects will normally return WAIT_OBJECT_O (success, IO completed)
-    or WAIT_TIMEOUT.
+    Set the poll bitmask describing the type of events.
+    The error flags are only valid in the revents bitmask.
   */
-  if(ret != WAIT_OBJECT_0)
+  switch (event)
   {
-    CancelIo(vio->hPipe);
-    DBUG_PRINT("error",("WaitForSingleObject() returned  %d", ret));
-    DBUG_RETURN((size_t)-1);
+  case VIO_IO_EVENT_READ:
+    pfd.events= MY_POLL_SET_IN;
+    revents= MY_POLL_SET_IN | MY_POLL_SET_ERR | POLLRDHUP;
+    break;
+  case VIO_IO_EVENT_WRITE:
+  case VIO_IO_EVENT_CONNECT:
+    pfd.events= MY_POLL_SET_OUT;
+    revents= MY_POLL_SET_OUT | MY_POLL_SET_ERR;
+    break;
   }
 
-  if (!GetOverlappedResult(vio->hPipe,&(vio->pipe_overlapped),&length, FALSE))
+  /*
+    Wait for the I/O event and return early in case of
+    error or timeout.
+  */
+  switch ((ret= poll(&pfd, 1, timeout)))
   {
-    DBUG_PRINT("error",("GetOverlappedResult() returned last error  %d", 
-      GetLastError()));
-    DBUG_RETURN((size_t)-1);
+  case -1:
+    /* On error, -1 is returned. */
+    break;
+  case 0:
+    /*
+      Set errno to indicate a timeout error.
+      (This is not compiled in on WIN32.)
+    */
+    errno= SOCKET_ETIMEDOUT;
+    break;
+  default:
+    /* Ensure that the requested I/O event has completed. */
+    DBUG_ASSERT(pfd.revents & revents);
+    break;
   }
 
-  DBUG_RETURN(length);
+  DBUG_RETURN(ret);
 }
 
+#else
 
-size_t vio_read_pipe(Vio * vio, uchar *buf, size_t size)
+int vio_io_wait(Vio *vio, enum enum_vio_io_event event, int timeout)
 {
-  DWORD bytes_read;
-  size_t retval;
-  DBUG_ENTER("vio_read_pipe");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %u", vio->sd, (long) buf,
-                       (uint) size));
+  int ret;
+  struct timeval tm;
+  my_socket fd= vio->sd;
+  fd_set readfds, writefds, exceptfds;
+  DBUG_ENTER("vio_io_wait");
 
-  if (ReadFile(vio->hPipe, buf, (DWORD)size, &bytes_read,
-      &(vio->pipe_overlapped)))
+  /* Convert the timeout, in milliseconds, to seconds and microseconds. */
+  if (timeout >= 0)
   {
-    retval= bytes_read;
+    tm.tv_sec= timeout / 1000;
+    tm.tv_usec= (timeout % 1000) * 1000;
   }
-  else
-  {
-    if (GetLastError() != ERROR_IO_PENDING)
-    {
-      DBUG_PRINT("error",("ReadFile() returned last error %d",
-        GetLastError()));
-      DBUG_RETURN((size_t)-1);
-    }
-    retval= pipe_complete_io(vio, buf, size,vio->read_timeout_ms);
-  }
-
-  DBUG_PRINT("exit", ("%lld", (longlong)retval));
-  DBUG_RETURN(retval);
-}
 
+  FD_ZERO(&readfds);
+  FD_ZERO(&writefds);
+  FD_ZERO(&exceptfds);
 
-size_t vio_write_pipe(Vio * vio, const uchar* buf, size_t size)
-{
-  DWORD bytes_written;
-  size_t retval;
-  DBUG_ENTER("vio_write_pipe");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %u", vio->sd, (long) buf,
-                       (uint) size));
+  /* Always receive notification of exceptions. */
+  FD_SET(fd, &exceptfds);
 
-  if (WriteFile(vio->hPipe, buf, (DWORD)size, &bytes_written, 
-      &(vio->pipe_overlapped)))
-  {
-    retval= bytes_written;
-  }
-  else
+  switch (event)
   {
-    if (GetLastError() != ERROR_IO_PENDING)
-    {
-      DBUG_PRINT("vio_error",("WriteFile() returned last error %d",
-        GetLastError()));
-      DBUG_RETURN((size_t)-1);
-    }
-    retval= pipe_complete_io(vio, (char *)buf, size, vio->write_timeout_ms);
+  case VIO_IO_EVENT_READ:
+    /* Readiness for reading. */
+    FD_SET(fd, &readfds);
+    break;
+  case VIO_IO_EVENT_WRITE:
+  case VIO_IO_EVENT_CONNECT:
+    /* Readiness for writing. */
+    FD_SET(fd, &writefds);
+    break;
   }
 
-  DBUG_PRINT("exit", ("%lld", (longlong)retval));
-  DBUG_RETURN(retval);
-}
-
-
-my_bool vio_is_connected_pipe(Vio *vio)
-{
-  if (PeekNamedPipe(vio->hPipe, NULL, 0, NULL, NULL, NULL))
-    return TRUE;
-  else
-    return (GetLastError() != ERROR_BROKEN_PIPE);
-}
-
+  /* The first argument is ignored on Windows. */
+  ret= select(0, &readfds, &writefds, &exceptfds, (timeout >= 0) ? &tm : NULL);
 
-int vio_close_pipe(Vio * vio)
-{
-  int r;
-  DBUG_ENTER("vio_close_pipe");
+  /* Set error code to indicate a timeout error. */
+  if (ret == 0)
+    WSASetLastError(SOCKET_ETIMEDOUT);
+
+  /* Error or timeout? */
+  if (ret <= 0)
+    DBUG_RETURN(ret);
 
-  CancelIo(vio->hPipe);
-  CloseHandle(vio->pipe_overlapped.hEvent);
-  DisconnectNamedPipe(vio->hPipe);
-  r= CloseHandle(vio->hPipe);
-  if (r)
+  /* The requested I/O event is ready? */
+  switch (event)
   {
-    DBUG_PRINT("vio_error", ("close() failed, error: %d",GetLastError()));
-    /* FIXME: error handling (not critical for MySQL) */
+  case VIO_IO_EVENT_READ:
+    ret= test(FD_ISSET(fd, &readfds));
+    break;
+  case VIO_IO_EVENT_WRITE:
+  case VIO_IO_EVENT_CONNECT:
+    ret= test(FD_ISSET(fd, &writefds));
+    break;
   }
-  vio->type= VIO_CLOSED;
-  vio->sd=   -1;
-  DBUG_RETURN(r);
-}
 
+  /* Error conditions pending? */
+  ret|= test(FD_ISSET(fd, &exceptfds));
 
-void vio_win32_timeout(Vio *vio, uint which , uint timeout_sec)
-{
-    DWORD timeout_ms;
-    /*
-      Windows is measuring timeouts in milliseconds. Check for possible int 
-      overflow.
-    */
-    if (timeout_sec > UINT_MAX/1000)
-      timeout_ms= INFINITE;
-    else
-      timeout_ms= timeout_sec * 1000;
+  /* Not a timeout, ensure that a condition was met. */
+  DBUG_ASSERT(ret);
 
-    /* which == 1 means "write", which == 0 means "read".*/
-    if(which)
-      vio->write_timeout_ms= timeout_ms;
-    else
-      vio->read_timeout_ms= timeout_ms;
+  DBUG_RETURN(ret);
 }
 
+#endif /* _WIN32 */
 
-#ifdef HAVE_SMEM
 
-size_t vio_read_shared_memory(Vio * vio, uchar* buf, size_t size)
-{
-  size_t length;
-  size_t remain_local;
-  char *current_position;
-  HANDLE events[2];
-
-  DBUG_ENTER("vio_read_shared_memory");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %d", vio->sd, (long) buf,
-                       size));
+/**
+  Connect to a peer address.
 
-  remain_local = size;
-  current_position=buf;
+  @param vio       A VIO object.
+  @param addr      Socket address containing the peer address.
+  @param len       Length of socket address.
+  @param timeout   Interval (in milliseconds) to wait until a
+                   connection is established.
 
-  events[0]= vio->event_server_wrote;
-  events[1]= vio->event_conn_closed;
+  @retval FALSE   A connection was successfully established.
+  @retval TRUE    A fatal error. See socket_errno.
+*/
 
-  do
-  {
-    if (vio->shared_memory_remain == 0)
-    {
-      /*
-        WaitForMultipleObjects can return next values:
-         WAIT_OBJECT_0+0 - event from vio->event_server_wrote
-         WAIT_OBJECT_0+1 - event from vio->event_conn_closed. We can't read
-		           anything
-         WAIT_ABANDONED_0 and WAIT_TIMEOUT - fail.  We can't read anything
-      */
-      if (WaitForMultipleObjects(array_elements(events), events, FALSE,
-                                 vio->read_timeout_ms) != WAIT_OBJECT_0)
-      {
-        DBUG_RETURN(-1);
-      };
+my_bool
+vio_socket_connect(Vio *vio, struct sockaddr *addr, socklen_t len, int timeout)
+{
+  int ret, wait;
+  DBUG_ENTER("vio_socket_connect");
 
-      vio->shared_memory_pos = vio->handle_map;
-      vio->shared_memory_remain = uint4korr((ulong*)vio->shared_memory_pos);
-      vio->shared_memory_pos+=4;
-    }
+  /* Only for socket-based transport types. */
+  DBUG_ASSERT(vio->type == VIO_TYPE_SOCKET || vio->type == VIO_TYPE_TCPIP);
 
-    length = size;
+  /* If timeout is not infinite, set socket to non-blocking mode. */
+  if ((timeout > -1) && vio_set_blocking(vio, FALSE))
+    DBUG_RETURN(TRUE);
 
-    if (vio->shared_memory_remain < length)
-       length = vio->shared_memory_remain;
-    if (length > remain_local)
-       length = remain_local;
+  /* Initiate the connection. */
+  ret= connect(vio->sd, addr, len);
 
-    memcpy(current_position,vio->shared_memory_pos,length);
+#ifdef _WIN32
+  wait= (ret == SOCKET_ERROR) &&
+        (WSAGetLastError() == WSAEINPROGRESS ||
+         WSAGetLastError() == WSAEWOULDBLOCK);
+#else
+  wait= (ret == -1) && (errno == EINPROGRESS || errno == EALREADY);
+#endif
 
-    vio->shared_memory_remain-=length;
-    vio->shared_memory_pos+=length;
-    current_position+=length;
-    remain_local-=length;
+  /*
+    The connection is in progress. The vio_io_wait() call can be used
+    to wait up to a specified period of time for the connection to
+    succeed.
+
+    If vio_io_wait() returns 0 (after waiting however many seconds),
+    the socket never became writable (host is probably unreachable.)
+    Otherwise, if vio_io_wait() returns 1, then one of two conditions
+    exist:
+
+    1. An error occurred. Use getsockopt() to check for this.
+    2. The connection was set up successfully: getsockopt() will
+       return 0 as an error.
+  */
+  if (wait && (vio_io_wait(vio, VIO_IO_EVENT_CONNECT, timeout) == 1))
+  {
+    int error;
+    IF_WIN(int, socklen_t) optlen= sizeof(error);
+    IF_WIN(char, void) *optval= (IF_WIN(char, void) *) &error;
 
-    if (!vio->shared_memory_remain)
+    /*
+      At this point, we know that something happened on the socket.
+      But this does not means that everything is alright. The connect
+      might have failed. We need to retrieve the error code from the
+      socket layer. We must return success only if we are sure that
+      it was really a success. Otherwise we might prevent the caller
+      from trying another address to connect to.
+    */
+    if (!(ret= getsockopt(vio->sd, SOL_SOCKET, SO_ERROR, optval, &optlen)))
     {
-      if (!SetEvent(vio->event_client_read))
-        DBUG_RETURN(-1);
+#ifdef _WIN32
+      WSASetLastError(error);
+#else
+      errno= error;
+#endif
+      ret= test(error);
     }
-  } while (remain_local);
-  length = size;
+  }
+
+  /* If necessary, restore the blocking mode, but only if connect succeeded. */
+  if ((timeout > -1) && (ret == 0))
+  {
+    if (vio_set_blocking(vio, TRUE))
+      DBUG_RETURN(TRUE);
+  }
 
-  DBUG_PRINT("exit", ("%lu", (ulong) length));
-  DBUG_RETURN(length);
+  DBUG_RETURN(test(ret));
 }
 
 
-size_t vio_write_shared_memory(Vio * vio, const uchar* buf, size_t size)
-{
-  size_t length, remain, sz;
-  HANDLE pos;
-  const uchar *current_position;
-  HANDLE events[2];
+/**
+  Determine if the endpoint of a connection is still available.
 
-  DBUG_ENTER("vio_write_shared_memory");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %d", vio->sd, (long) buf,
-                       size));
+  @remark The socket is assumed to be disconnected if an EOF
+          condition is encountered.
 
-  remain = size;
-  current_position = buf;
+  @param vio      The VIO object.
 
-  events[0]= vio->event_server_read;
-  events[1]= vio->event_conn_closed;
+  @retval TRUE    EOF condition not found.
+  @retval FALSE   EOF condition is signaled.
+*/
 
-  while (remain != 0)
-  {
-    if (WaitForMultipleObjects(array_elements(events), events, FALSE,
-                               vio->write_timeout_ms) != WAIT_OBJECT_0)
-    {
-      DBUG_RETURN((size_t) -1);
-    }
+my_bool vio_is_connected(Vio *vio)
+{
+  uint bytes= 0;
+  DBUG_ENTER("vio_is_connected");
 
-    sz= (remain > shared_memory_buffer_length ? shared_memory_buffer_length :
-         remain);
+  /*
+    The first step of detecting an EOF condition is verifying
+    whether there is data to read. Data in this case would be
+    the EOF. An exceptional condition event and/or errors are
+    interpreted as if there is data to read.
+  */
+  if (!vio_io_wait(vio, VIO_IO_EVENT_READ, 0))
+    DBUG_RETURN(TRUE);
 
-    int4store(vio->handle_map,sz);
-    pos = vio->handle_map + 4;
-    memcpy(pos,current_position,sz);
-    remain-=sz;
-    current_position+=sz;
-    if (!SetEvent(vio->event_client_wrote))
-      DBUG_RETURN((size_t) -1);
+  /*
+    The second step is read() or recv() from the socket returning
+    0 (EOF). Unfortunately, it's not possible to call read directly
+    as we could inadvertently read meaningful connection data.
+    Simulate a read by retrieving the number of bytes available to
+    read -- 0 meaning EOF. In the presence of unrecoverable errors,
+    the socket is assumed to be disconnected.
+  */
+  while (socket_peek_read(vio, &bytes))
+  {
+    if (socket_errno != SOCKET_EINTR)
+      DBUG_RETURN(FALSE);
   }
-  length = size;
-
-  DBUG_PRINT("exit", ("%lu", (ulong) length));
-  DBUG_RETURN(length);
-}
 
+#ifdef HAVE_OPENSSL
+  /* There might be buffered data at the SSL layer. */
+  if (!bytes && vio->type == VIO_TYPE_SSL)
+    bytes= SSL_pending((SSL*) vio->ssl_arg);
+#endif
 
-my_bool vio_is_connected_shared_memory(Vio *vio)
-{
-  return (WaitForSingleObject(vio->event_conn_closed, 0) != WAIT_OBJECT_0);
+  DBUG_RETURN(bytes ? TRUE : FALSE);
 }
 
+#ifndef DBUG_OFF
 
 /**
- Close shared memory and DBUG_PRINT any errors that happen on closing.
- @return Zero if all closing functions succeed, and nonzero otherwise.
-*/
-int vio_close_shared_memory(Vio * vio)
-{
-  int error_count= 0;
-  DBUG_ENTER("vio_close_shared_memory");
-  if (vio->type != VIO_CLOSED)
-  {
-    /*
-      Set event_conn_closed for notification of both client and server that
-      connection is closed
-    */
-    SetEvent(vio->event_conn_closed);
-    /*
-      Close all handlers. UnmapViewOfFile and CloseHandle return non-zero
-      result if they are success.
-    */
-    if (UnmapViewOfFile(vio->handle_map) == 0) 
-    {
-      error_count++;
-      DBUG_PRINT("vio_error", ("UnmapViewOfFile() failed"));
-    }
-    if (CloseHandle(vio->event_server_wrote) == 0)
-    {
-      error_count++;
-      DBUG_PRINT("vio_error", ("CloseHandle(vio->esw) failed"));
-    }
-    if (CloseHandle(vio->event_server_read) == 0)
-    {
-      error_count++;
-      DBUG_PRINT("vio_error", ("CloseHandle(vio->esr) failed"));
-    }
-    if (CloseHandle(vio->event_client_wrote) == 0)
-    {
-      error_count++;
-      DBUG_PRINT("vio_error", ("CloseHandle(vio->ecw) failed"));
-    }
-    if (CloseHandle(vio->event_client_read) == 0)
-    {
-      error_count++;
-      DBUG_PRINT("vio_error", ("CloseHandle(vio->ecr) failed"));
-    }
-    if (CloseHandle(vio->handle_file_map) == 0)
-    {
-      error_count++;
-      DBUG_PRINT("vio_error", ("CloseHandle(vio->hfm) failed"));
-    }
-    if (CloseHandle(vio->event_conn_closed) == 0)
-    {
-      error_count++;
-      DBUG_PRINT("vio_error", ("CloseHandle(vio->ecc) failed"));
-    }
-  }
-  vio->type= VIO_CLOSED;
-  vio->sd=   -1;
-  DBUG_RETURN(error_count);
-}
-#endif /* HAVE_SMEM */
-#endif /* __WIN__ */
+  Number of bytes in the read or socket buffer
 
+  @remark An EOF condition might count as one readable byte.
 
-/**
-  Number of bytes in the read buffer.
-
-  @return number of bytes in the read buffer or < 0 if error.
+  @return number of bytes in one of the buffers or < 0 if error.
 */
 
 ssize_t vio_pending(Vio *vio)
 {
-#ifdef HAVE_OPENSSL
-  SSL *ssl= (SSL*) vio->ssl_arg;
-#endif
+  uint bytes= 0;
 
+  /* Data pending on the read buffer. */
   if (vio->read_pos < vio->read_end)
     return vio->read_end - vio->read_pos;
 
-#ifdef HAVE_OPENSSL
-  if (ssl)
-    return SSL_pending(ssl);
-#endif
+  /* Skip non-socket based transport types. */
+  if (vio->type == VIO_TYPE_TCPIP || vio->type == VIO_TYPE_SOCKET)
+  {
+    /* Obtain number of readable bytes in the socket buffer. */
+    if (socket_peek_read(vio, &bytes))
+      return -1;
+  }
 
-  return 0;
+  /*
+    SSL not checked due to a yaSSL bug in SSL_pending that
+    causes it to attempt to read from the socket.
+  */
+
+  return (ssize_t) bytes;
 }
 
+#endif
 
 /**
   Checks if the error code, returned by vio_getnameinfo(), means it was the

=== modified file 'vio/viossl.c'
--- a/vio/viossl.c	2011-05-27 10:02:10 +0000
+++ b/vio/viossl.c	2011-05-31 13:52:09 +0000
@@ -1,4 +1,4 @@
-/* Copyright (C) 2000 MySQL AB
+/* Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
 
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -11,7 +11,7 @@
 
    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
+   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301  USA */
 
 /*
   Note that we can't have assertion on file descriptors;  The reason for
@@ -54,40 +54,162 @@ report_errors(SSL* ssl)
 #endif
 
 
-size_t vio_ssl_read(Vio *vio, uchar* buf, size_t size)
+/**
+  Obtain the equivalent system error status for the last SSL I/O operation.
+
+  @param ssl_error  The result code of the failed TLS/SSL I/O operation.
+*/
+
+static void ssl_set_sys_error(int ssl_error)
 {
-  size_t r;
-  DBUG_ENTER("vio_ssl_read");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %u  ssl: 0x%lx",
-		       vio->sd, (long) buf, (uint) size, (long) vio->ssl_arg));
+  int error= 0;
 
-  r= SSL_read((SSL*) vio->ssl_arg, buf, size);
-#ifndef DBUG_OFF
-  if (r == (size_t) -1)
-    report_errors((SSL*) vio->ssl_arg);
+  switch (ssl_error)
+  {
+  case SSL_ERROR_ZERO_RETURN:
+    error= SOCKET_ECONNRESET;
+    break;
+  case SSL_ERROR_WANT_READ:
+  case SSL_ERROR_WANT_WRITE:
+#ifdef SSL_ERROR_WANT_CONNECT
+  case SSL_ERROR_WANT_CONNECT:
+#endif
+#ifdef SSL_ERROR_WANT_ACCEPT
+  case SSL_ERROR_WANT_ACCEPT:
+#endif
+    error= SOCKET_EWOULDBLOCK;
+    break;
+  case SSL_ERROR_SSL:
+    /* Protocol error. */
+#ifdef EPROTO
+    error= EPROTO;
+#else
+    error= SOCKET_ECONNRESET;
+#endif
+    break;
+  case SSL_ERROR_SYSCALL:
+  case SSL_ERROR_NONE:
+  default:
+    break;
+  };
+
+  /* Set error status to a equivalent of the SSL error. */
+  if (error)
+  {
+#ifdef _WIN32
+    WSASetLastError(error);
+#else
+    errno= error;
 #endif
-  DBUG_PRINT("exit", ("%u", (uint) r));
-  DBUG_RETURN(r);
+  }
 }
 
 
-size_t vio_ssl_write(Vio *vio, const uchar* buf, size_t size)
+/**
+  Indicate whether a SSL I/O operation must be retried later.
+
+  @param vio  VIO object representing a SSL connection.
+  @param ret  Value returned by a SSL I/O function.
+  @param event[out] The type of I/O event to wait/retry.
+
+  @return Whether a SSL I/O operation should be deferred.
+  @retval TRUE    Temporary failure, retry operation.
+  @retval FALSE   Indeterminate failure.
+*/
+
+static my_bool ssl_should_retry(Vio *vio, int ret, enum enum_vio_io_event *event)
 {
-  size_t r;
-  DBUG_ENTER("vio_ssl_write");
-  DBUG_PRINT("enter", ("sd: %d  buf: 0x%lx  size: %u", vio->sd,
-                       (long) buf, (uint) size));
+  int ssl_error;
+  SSL *ssl= vio->ssl_arg;
+  my_bool should_retry= TRUE;
 
-  r= SSL_write((SSL*) vio->ssl_arg, buf, size);
+  /* Retrieve the result for the SSL I/O operation. */
+  ssl_error= SSL_get_error(ssl, ret);
+
+  /* Retrieve the result for the SSL I/O operation. */
+  switch (ssl_error)
+  {
+  case SSL_ERROR_WANT_READ:
+    *event= VIO_IO_EVENT_READ;
+    break;
+  case SSL_ERROR_WANT_WRITE:
+    *event= VIO_IO_EVENT_WRITE;
+    break;
+  default:
 #ifndef DBUG_OFF
-  if (r == (size_t) -1)
-    report_errors((SSL*) vio->ssl_arg);
+    report_errors(ssl);
 #endif
-  DBUG_PRINT("exit", ("%u", (uint) r));
-  DBUG_RETURN(r);
+    should_retry= FALSE;
+    ssl_set_sys_error(ssl_error);
+    break;
+  }
+
+  return should_retry;
+}
+
+
+size_t vio_ssl_read(Vio *vio, uchar *buf, size_t size)
+{
+  int ret;
+  SSL *ssl= vio->ssl_arg;
+  DBUG_ENTER("vio_ssl_read");
+
+  while ((ret= SSL_read(ssl, buf, size)) < 0)
+  {
+    enum enum_vio_io_event event;
+
+    /* Process the SSL I/O error. */
+    if (!ssl_should_retry(vio, ret, &event))
+      break;
+
+    /* Attempt to wait for an I/O event. */
+    if (vio_socket_io_wait(vio, event))
+      break;
+  }
+
+  DBUG_RETURN(ret < 0 ? -1 : ret);
 }
 
 
+size_t vio_ssl_write(Vio *vio, const uchar *buf, size_t size)
+{
+  int ret;
+  SSL *ssl= vio->ssl_arg;
+  DBUG_ENTER("vio_ssl_write");
+
+  while ((ret= SSL_write(ssl, buf, size)) < 0)
+  {
+    enum enum_vio_io_event event;
+
+    /* Process the SSL I/O error. */
+    if (!ssl_should_retry(vio, ret, &event))
+      break;
+
+    /* Attempt to wait for an I/O event. */
+    if (vio_socket_io_wait(vio, event))
+      break;
+  }
+
+  DBUG_RETURN(ret < 0 ? -1 : ret);
+}
+
+#ifdef HAVE_YASSL
+
+/* Emulate a blocking recv() call with vio_read(). */
+static long yassl_recv(void *ptr, void *buf, size_t len)
+{
+  return vio_read(ptr, buf, len);
+}
+
+
+/* Emulate a blocking send() call with vio_write(). */
+static long yassl_send(void *ptr, const void *buf, size_t len)
+{
+  return vio_write(ptr, buf, len);
+}
+
+#endif
+
 int vio_ssl_close(Vio *vio)
 {
   int r= 0;
@@ -145,26 +267,60 @@ void vio_ssl_delete(Vio *vio)
 }
 
 
+/** SSL handshake handler. */
+typedef int (*ssl_handshake_func_t)(SSL*);
+
+
+/**
+  Loop and wait until a SSL handshake is completed.
+
+  @param vio    VIO object representing a SSL connection.
+  @param ssl    SSL structure for the connection.
+  @param func   SSL handshake handler.
+
+  @return Return value is 1 on success.
+*/
+
+static int ssl_handshake_loop(Vio *vio, SSL *ssl, ssl_handshake_func_t func)
+{
+  int ret;
+
+  vio->ssl_arg= ssl;
+
+  /* Initiate the SSL handshake. */
+  while ((ret= func(ssl)) < 1)
+  {
+    enum enum_vio_io_event event;
+
+    /* Process the SSL I/O error. */
+    if (!ssl_should_retry(vio, ret, &event))
+      break;
+
+    /* Wait for I/O so that the handshake can proceed. */
+    if (vio_socket_io_wait(vio, event))
+      break;
+  }
+
+  vio->ssl_arg= NULL;
+
+  return ret;
+}
+
+
 static int ssl_do(struct st_VioSSLFd *ptr, Vio *vio, long timeout,
-                  int (*connect_accept_func)(SSL*), unsigned long *errptr)
+                  ssl_handshake_func_t func, unsigned long *errptr)
 {
   int r;
   SSL *ssl;
-  my_bool unused;
-  my_bool was_blocking;
 
   DBUG_ENTER("ssl_do");
   DBUG_PRINT("enter", ("ptr: 0x%lx, sd: %d  ctx: 0x%lx",
                        (long) ptr, vio->sd, (long) ptr->ssl_context));
 
-  /* Set socket to blocking if not already set */
-  vio_blocking(vio, 1, &was_blocking);
-
   if (!(ssl= SSL_new(ptr->ssl_context)))
   {
     DBUG_PRINT("error", ("SSL_new failure"));
     *errptr= ERR_get_error();
-    vio_blocking(vio, was_blocking, &unused);
     DBUG_RETURN(1);
   }
   DBUG_PRINT("info", ("ssl: 0x%lx timeout: %ld", (long) ssl, timeout));
@@ -172,12 +328,25 @@ static int ssl_do(struct st_VioSSLFd *pt
   SSL_SESSION_set_timeout(SSL_get_session(ssl), timeout);
   SSL_set_fd(ssl, vio->sd);
 
-  if ((r= connect_accept_func(ssl)) < 1)
+  /*
+    Since yaSSL does not support non-blocking send operations, use
+    special transport functions that properly handles non-blocking
+    sockets. These functions emulate the behavior of blocking I/O
+    operations by waiting for I/O to become available.
+  */
+#ifdef HAVE_YASSL
+  /* Set first argument of the transport functions. */
+  yaSSL_transport_set_ptr(ssl, vio);
+  /* Set functions to use in order to send and receive data. */
+  yaSSL_transport_set_recv_function(ssl, yassl_recv);
+  yaSSL_transport_set_send_function(ssl, yassl_send);
+#endif
+
+  if ((r= ssl_handshake_loop(vio, ssl, func)) < 1)
   {
     DBUG_PRINT("error", ("SSL_connect/accept failure"));
     *errptr= SSL_get_error(ssl, r);
     SSL_free(ssl);
-    vio_blocking(vio, was_blocking, &unused);
     DBUG_RETURN(1);
   }
 
@@ -186,8 +355,8 @@ static int ssl_do(struct st_VioSSLFd *pt
     change type, set sd to the fd used when connecting
     and set pointer to the SSL structure
   */
-  vio_reset(vio, VIO_TYPE_SSL, SSL_get_fd(ssl), 0, 0);
-  vio->ssl_arg= (void*)ssl;
+  if (vio_reset(vio, VIO_TYPE_SSL, SSL_get_fd(ssl), ssl, 0))
+    DBUG_RETURN(1);
 
 #ifndef DBUG_OFF
   {
@@ -237,16 +406,6 @@ int sslconnect(struct st_VioSSLFd *ptr,
 }
 
 
-int vio_ssl_blocking(Vio *vio __attribute__((unused)),
-		     my_bool set_blocking_mode,
-		     my_bool *old_mode)
-{
-  /* Mode is always blocking */
-  *old_mode= 1;
-  /* Return error if we try to change to non_blocking mode */
-  return (set_blocking_mode ? 0 : 1);
-}
-
 my_bool vio_ssl_has_data(Vio *vio)
 {
   return SSL_pending(vio->ssl_arg) > 0 ? TRUE : FALSE;


Attachment: [text/bzr-bundle] bzr/davi.arnaut@oracle.com-20110531135209-8kxz4np8c4gav6s2.bundle
Thread
bzr commit into mysql-trunk branch (davi:3134) Bug#11758972 Bug#11762221Davi Arnaut31 May