List:Commits« Previous MessageNext Message »
From:Davi Arnaut Date:November 18 2010 4:44pm
Subject:Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790
View as plain text  
Hi Vladislav,

On 11/18/10 12:39 PM, Vladislav Vaintroub wrote:
> Hi Davi, Overall , great patch. Removes a lot of organically grown
> code, and replacement is better understandable and well commented .
> Some questions and comments below (comments are suggestions and minor
> nitpicking , nothing big) I have run  sysbench RO on my Windows, the
> changes seem to be within noise, but it looks like there was not many
> changes for Windows in the server..
>
> 4          8        16      32     64      128   256 patch        :
> 1104 1175 1203 1189 1185 1177 1162 no-patch : 1116 1200 1193 1171
> 1173 1160 1131
>
> I think it makes sense to also run  sysbench point_selects,  I
> believe it measures client/server overhead best, as the queries in
> server are very fast.

OK, will do.

>
>> -----Original Message----- From: Davi Arnaut
>> [mailto:davi.arnaut@stripped] Sent: Tuesday, November 16, 2010
>> 11:10 AM To: commits@stripped Subject: bzr commit into
>> mysql-5.5-runtime branch (davi:3186) Bug#54790
>>
>> # At a local mysql-5.5-runtime repository of davi
>>
>> 3186 Davi Arnaut	2010-11-16 Bug#54790: Use of non-blocking mode for
>> sockets limits performance
>>
>
> [skip]
>>
>> The solution is to remove alarm based timeouts from the protocol
>> layer and push timeout handling down to the virtual I/O layer. This
>> approach allows the handling of socket timeouts on a
>> platform-specific basis. The blocking mode of the socket is no
>> longer exported and VIO read and write operations will either
>> complete or fail with a error or timeout.
>>
>> On Linux, the MSG_DONTWAIT flag is used to enable non-blocking send
>> and receive operations. If the operation would block, poll() is
>> used to wait for readiness or until a timeout occurs. This strategy
>> avoids the need to set the socket timeout and blocking mode twice
>> per query.
>>
>> On Windows, as before, the timeout is set on a per-socket fashion.
>> In all remaining operating systems, the socket is set to
>> non-blocking mode and poll() is used to wait for readiness or until
>> a timeout occurs.
>
> A comment how it is handled on neither Windows nor Linux  should be
> added for completeness. Besides, also a comment about why
> setsockopt(SO_SNDTIMEO, SO_RECVTIMEO) are not used even if available
> and  working (which if I recall correctly is everywhere except HPUX).
> I do know you're planning to extend use of poll() for other helpful
> things, but others won't know that.

I'll add a explanation. Essentially, it allows for a single, common and 
shared implementation. If Windows XP supported WSAPoll(), I would have 
gone for poll even there.

>
>> @ sql/net_serv.cc Remove net_data_is_ready, it is no longer
>> necessary as its functionality is now implemented by vio_io_wait.
>>
>> Remove the use of signal based alarms for timeouts. Instead, rely
>> on the underlying VIO layer to provide socket timeouts and control
>> the blocking mode. The error handling is changed to use
>> vio_was_timeout() to recognize a timeout occurrence and
>> vio_should_retry() to detect interrupted I/O operations.
>>
>> The loop in the packet reading path (my_real_path) is unrolled into
>> two steps: reading the packet header and payload. Each step is now
>> documented and should be easier to understand.
>>
>> net_clear() no longer drains the socket buffer. Unread data on the
>> socket constitutes a protocol violation and shouldn't be skipped.
>> The previous behavior was race prone anyway as it only drained data
>> on the socket buffer, missing any data in transit. Instead,
>> net_clear() now just checks if the endpoint is still connected.
>>
>> Remove use of the MYSQL_INSTANCE_MANAGER macro. @ vio/vio.c Rename
>> no_poll_read to no_io_wait. The stub method is used for the
>> transport types named pipe and shared memory.
>
> Why actually no_io_wait? It should be rather simple to add one at
> least to shared memory. Or, we do not need it for these transports,
> as because timeouts are functional ?

Correct, we do not need it, for the timeout case -- and nor in other 
scenarios where vio_io_wait is used.

>
>> @ vio/viosocket.c Use vio_ssl_errno to retrieve the error number if
>> the VIO type is SSL.
>
>
>> Move code related to the transport types named pipe and shared
>> memory to their own files.
>
> Very good!
>
>> === modified file 'mysql-test/t/ssl.test'
>
>> +LET $ID= `SELECT connection_id()`; +SET @@SESSION.wait_timeout =
>> 5; + +connection default; + +let $wait_condition= +  SELECT
>> COUNT(*) = 0 FROM INFORMATION_SCHEMA.PROCESSLIST +  WHERE ID =
>> $ID; +--source include/wait_condition.inc
>
> It waits for 5 seconds, right?  Maybe 1 second is already enough (mtr
> is not being too productive during the wait )

Will reduce it to 2, which is the value used by similar tests.

>> === modified file 'mysql-test/t/wait_timeout.test' ---
>> a/mysql-test/t/wait_timeout.test	2010-10-19 11:54:28 +0000 +++
>> b/mysql-test/t/wait_timeout.test	2010-11-16 10:10:18 +0000
>
>> # The last connect is to keep tools checking the current test
>> happy. +LET $ID= `SELECT connection_id()`; +SET
>> @@SESSION.wait_timeout = 5; + +connection default; + +let
>> $wait_condition= +  SELECT COUNT(*) = 0 FROM
>> INFORMATION_SCHEMA.PROCESSLIST +  WHERE ID = $ID; +--source
>> include/wait_condition.inc
>
> Same here.
>
>> === modified file 'sql-common/client.c'
>
>> -#ifdef __WIN__ +#ifdef _WIN32
>
> Thanks! (Here and on different other places :)
>
>> +int vio_timeout(Vio *vio, uint which, int timeout_sec)
> [skip]
>> +  /* Deduce the current timeout status mode. */ +  old_mode=
>> vio->write_timeout<  0&&  vio->read_timeout<  0;
>
> Maybe more obvious variable name than old_mode (was_blocking?), or
> more explanation what and how was deduced.

It is deducing a change in the mode of the timeout values, whether both 
of them represent infinite timeouts. Blocking mode is on a layer below, 
so it doesn't make much sense. I'll improve the comment.

>
>> === modified file 'vio/viosocket.c' --- a/vio/viosocket.c
>> 2010-08-16 12:50:27 +0000
>
>
>> +/* +  Define a stub MSG_DONTWAIT if unavailable. In this case,
>> fcntl +  (or a equivalent) is used to enable non-blocking
>> operations. +  The flag must be supported in both send and recv
>> operations. +*/ +#if defined(__linux__) +#define VIO_USE_DONTWAIT
>> 1 +#define VIO_DONTWAIT      MSG_DONTWAIT +#else +#define
>> VIO_DONTWAIT 0 +#endif
>
> I have to admit I have not fully understood why it is a hard
> requirement to support MSG_DONTWAIT  in both send() and recv().  In
> recv() , it is supported by Linux, Solaris and FreeBSD,while  in
> send() it is supported by Linux only. So maybe we can take advantage
> of the optimization here   by defining 2 constants say
> VIO_MSG_DONTWAIT_SEND ( == MSG_DONTWAIT on Linux,  0 everywhere else)
> ,  and VIO_MSG_DONTWAIT_RECV (== MSG_DONTWAIT on Linux, Solaris and
> BSD).

Currently there is no way to take advantage of the optimization just for 
recv(). This is because there will always be a timeout for sending and 
receiving, as there is no way to just set a timeout just for receive. 
Since in order to implement the send timeout we need to change the 
socket to non-blocking mode, the use of the flag in receive becomes 
useless as the socket is already in non-blocking mode. If this changes 
in the future, we can revisit it. Otherwise, it would just be dead code.

>
>> +    /* Deduce what should be the new blocking mode of the socket.
>> */ +    my_bool new_mode= vio->write_timeout<  0&&
>> vio->read_timeout<  0; + +    /* If necessary, update the blocking
>> mode. */ +    if (new_mode != old_mode) +      ret=
>> vio_set_blocking(vio, new_mode);
>
> See earlier comment, I think it  would be better understandable if
> old_mode  were named  was_blocking and new_mode  were named
> is_blocking  or something like this..

But the information is dependent upon the timeout value, not the 
blocking mode. You have to consider what we use as the source of the 
information. I'll add a comment to clarify.

>
>> +static int +select_poll(my_socket fd, enum enum_vio_io_event
>> event, struct timeval *timeout)
>
> {
>> +  int ret; +  fd_set readfds, writefds, exceptfds;
>
> This is minor  , but you only need 2 fd_sets, you never use all 3
> together, only readfds/exceptfds or writefds/exceptfds, so you might
> just have fds and exceptfds .

But select takes three fds. I would have to either use a pointer or a 
conditional branch to pass the fds as the second or third parameter.

>> +  /* The first argument is ignored on Windows. */ +  ret=
>> select(IF_WIN(0, fd + 1),&readfds,&writefds,&exceptfds, timeout);
>
> It is already under #ifdef _WIN32, no IF_WIN required.
>
>> +    IF_WIN(int, socklen_t) optlen= sizeof(error); +
>> IF_WIN(char, void) *optval= (IF_WIN(char, void) *)&error;
>
> Maybe there is a way to reduce 3 IF_WIN s to a single #ifdef _WIN32
> ?

Personally, the result looked more ugly two me. I would actually prefer 
proper typedefs here. Do we set socklen_t to int on Windows? I could 
introduce another type for the argument to setsockopt.

Regards,

Davi
Thread
bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut16 Nov
Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut18 Nov
  • RE: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Vladislav Vaintroub19 Nov
    • Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut19 Nov
      • RE: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Vladislav Vaintroub19 Nov
        • Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut19 Nov
          • RE: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Vladislav Vaintroub19 Nov
            • Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut19 Nov
            • Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut19 Nov
              • RE: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Vladislav Vaintroub19 Nov
                • Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut19 Nov
                  • RE: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Vladislav Vaintroub19 Nov
                    • Re: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Davi Arnaut19 Nov
RE: bzr commit into mysql-5.5-runtime branch (davi:3186) Bug#54790Vladislav Vaintroub19 Nov