Hi Davi,
Overall , great patch. Removes a lot of organically grown code, and replacement is better
understandable and well commented . Some
questions and comments below (comments are suggestions and minor nitpicking , nothing big)
I have run sysbench RO on my Windows, the changes seem to be within noise, but it looks
like there was not many changes for Windows
in the server..
4 8 16 32 64 128 256
patch : 1104 1175 1203 1189 1185 1177 1162
no-patch : 1116 1200 1193 1171 1173 1160 1131
I think it makes sense to also run sysbench point_selects, I believe it measures
client/server overhead best, as the queries in
server are very fast.
> -----Original Message-----
> From: Davi Arnaut [mailto:davi.arnaut@stripped]
> Sent: Tuesday, November 16, 2010 11:10 AM
> To: commits@stripped
> Subject: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790
>
> # At a local mysql-5.5-runtime repository of davi
>
> 3186 Davi Arnaut 2010-11-16
> Bug#54790: Use of non-blocking mode for sockets limits performance
>
[skip]
>
> The solution is to remove alarm based timeouts from the
> protocol layer and push timeout handling down to the
> virtual I/O layer. This approach allows the handling of
> socket timeouts on a platform-specific basis. The blocking
> mode of the socket is no longer exported and VIO read and
> write operations will either complete or fail with a error
> or timeout.
>
> On Linux, the MSG_DONTWAIT flag is used to enable non-blocking
> send and receive operations. If the operation would block,
> poll() is used to wait for readiness or until a timeout
> occurs. This strategy avoids the need to set the socket
> timeout and blocking mode twice per query.
>
> On Windows, as before, the timeout is set on a per-socket
> fashion. In all remaining operating systems, the socket is
> set to non-blocking mode and poll() is used to wait for
> readiness or until a timeout occurs.
A comment how it is handled on neither Windows nor Linux should be added for
completeness. Besides, also a comment about why
setsockopt(SO_SNDTIMEO, SO_RECVTIMEO) are not used even if available and working (which
if I recall correctly is everywhere except
HPUX). I do know you're planning to extend use of poll() for other helpful things, but
others won't know that.
> @ sql/net_serv.cc
> Remove net_data_is_ready, it is no longer necessary as its
> functionality is now implemented by vio_io_wait.
>
> Remove the use of signal based alarms for timeouts. Instead,
> rely on the underlying VIO layer to provide socket timeouts
> and control the blocking mode. The error handling is changed
> to use vio_was_timeout() to recognize a timeout occurrence
> and vio_should_retry() to detect interrupted I/O operations.
>
> The loop in the packet reading path (my_real_path) is unrolled
> into two steps: reading the packet header and payload. Each
> step is now documented and should be easier to understand.
>
> net_clear() no longer drains the socket buffer. Unread data
> on the socket constitutes a protocol violation and shouldn't
> be skipped. The previous behavior was race prone anyway
> as it only drained data on the socket buffer, missing any
> data in transit. Instead, net_clear() now just checks if
> the endpoint is still connected.
>
> Remove use of the MYSQL_INSTANCE_MANAGER macro.
> @ vio/vio.c
> Rename no_poll_read to no_io_wait. The stub method is used
> for the transport types named pipe and shared memory.
Why actually no_io_wait? It should be rather simple to add one at least to shared memory.
Or, we do not need it for these
transports, as because timeouts are functional ?
> @ vio/viosocket.c
> Use vio_ssl_errno to retrieve the error number if the VIO
> type is SSL.
> Move code related to the transport types named pipe and
> shared memory to their own files.
Very good!
> === modified file 'mysql-test/t/ssl.test'
> +LET $ID= `SELECT connection_id()`;
> +SET @@SESSION.wait_timeout = 5;
> +
> +connection default;
> +
> +let $wait_condition=
> + SELECT COUNT(*) = 0 FROM INFORMATION_SCHEMA.PROCESSLIST
> + WHERE ID = $ID;
> +--source include/wait_condition.inc
It waits for 5 seconds, right? Maybe 1 second is already enough (mtr is not being too
productive during the wait )
> === modified file 'mysql-test/t/wait_timeout.test'
> --- a/mysql-test/t/wait_timeout.test 2010-10-19 11:54:28 +0000
> +++ b/mysql-test/t/wait_timeout.test 2010-11-16 10:10:18 +0000
> # The last connect is to keep tools checking the current test happy.
> +LET $ID= `SELECT connection_id()`;
> +SET @@SESSION.wait_timeout = 5;
> +
> +connection default;
> +
> +let $wait_condition=
> + SELECT COUNT(*) = 0 FROM INFORMATION_SCHEMA.PROCESSLIST
> + WHERE ID = $ID;
> +--source include/wait_condition.inc
Same here.
> === modified file 'sql-common/client.c'
> -#ifdef __WIN__
> +#ifdef _WIN32
Thanks! (Here and on different other places :)
> +int vio_timeout(Vio *vio, uint which, int timeout_sec)
[skip]
> + /* Deduce the current timeout status mode. */
> + old_mode= vio->write_timeout < 0 && vio->read_timeout < 0;
Maybe more obvious variable name than old_mode (was_blocking?), or more explanation what
and how was deduced.
> === modified file 'vio/viosocket.c'
> --- a/vio/viosocket.c 2010-08-16 12:50:27 +0000
> +/*
> + Define a stub MSG_DONTWAIT if unavailable. In this case, fcntl
> + (or a equivalent) is used to enable non-blocking operations.
> + The flag must be supported in both send and recv operations.
> +*/
> +#if defined(__linux__)
> +#define VIO_USE_DONTWAIT 1
> +#define VIO_DONTWAIT MSG_DONTWAIT
> +#else
> +#define VIO_DONTWAIT 0
> +#endif
I have to admit I have not fully understood why it is a hard requirement to support
MSG_DONTWAIT in both send() and recv(). In
recv() , it is supported by Linux, Solaris and FreeBSD,while in send() it is supported by
Linux only. So maybe we can take
advantage of the optimization here by defining 2 constants say VIO_MSG_DONTWAIT_SEND (
== MSG_DONTWAIT on Linux, 0 everywhere
else) , and VIO_MSG_DONTWAIT_RECV (== MSG_DONTWAIT on Linux, Solaris and BSD).
> + /* Deduce what should be the new blocking mode of the socket. */
> + my_bool new_mode= vio->write_timeout < 0 && vio->read_timeout
> < 0;
> +
> + /* If necessary, update the blocking mode. */
> + if (new_mode != old_mode)
> + ret= vio_set_blocking(vio, new_mode);
See earlier comment, I think it would be better understandable if old_mode were named
was_blocking and new_mode were named
is_blocking or something like this..
> +static int
> +select_poll(my_socket fd, enum enum_vio_io_event event, struct timeval *timeout)
{
> + int ret;
> + fd_set readfds, writefds, exceptfds;
This is minor , but you only need 2 fd_sets, you never use all 3 together, only
readfds/exceptfds or writefds/exceptfds, so you
might just have fds and exceptfds .
> + /* The first argument is ignored on Windows. */
> + ret= select(IF_WIN(0, fd + 1), &readfds, &writefds, &exceptfds,
> timeout);
It is already under #ifdef _WIN32, no IF_WIN required.
> + IF_WIN(int, socklen_t) optlen= sizeof(error);
> + IF_WIN(char, void) *optval= (IF_WIN(char, void) *) &error;
Maybe there is a way to reduce 3 IF_WIN s to a single #ifdef _WIN32 ?