List: Commits
From: Alfranio Correia
Date: January 24 2011 3:56pm
Subject: Re: bzr commit into mysql-trunk branch (daogang.qu:3446) WL#4925
Hi Daogang,

On 01/24/2011 06:06 AM, Daogang Qu wrote:
> 2011-01-21 19:12, Alfranio Correia wrote:
>> STATUS
>> ------
>>
>> Not approved as there are several problems
>> with the current design and implementation.
>>
>> REQUESTS
>> --------
>>
>> 1. It is missing a set of variables to implement an automated mechanism to
>> pre-allocate binary logs. For example,
>>
>>    1.1 How do I know if there is any pre-allocated file that is not in use?
>>    
> The unused files that were preallocated previously will be used when the
> server restarts.
> Why do we need to know about the unused files? We can find them with the
> mysqlbinlog client tool
> if we really want to know.


How do I know when new files should be pre-allocated? By trial and error?


>>    1.2 How do I get its maximum size?
>>    
> The maximum size is limited by max-binlog-size. The file size is specified
> when preallocating and equals max-binlog-size by default.
>>    1.3 How do I get the actual size of the current file?
>>    
> 
> The "ulonglong actual_size" member variable in MYSQL_BIN_LOG class is introduced for
> managing the latest binlog's actual size.
>

How do I access this information? SHOW? SELECT?


> 
>> Note this is not a complete list and we may provide different alternatives to
>> achieve the same goal. Look at the questions above as just examples.
>>
>> 2. Why do we need a separate client to pre-allocate binary logs?
>>    
>
> One issue is that in some OS/filesystems (e.g. ext3 on RHEL5) posix_fallocate()
> is defined but does zero filling internally, which takes a lot of time and
> causes massive writes. To avoid this scenario, we introduced the separate client
> for pre-allocating binary logs before the MySQL server starts.
> 


So the problem here is I/O? How would a new client reduce I/O? Clearly, if the
slave is not running, you can pre-allocate files without hurting performance.
However, once the slave has started, how do you plan to do that? There is no
magic bullet here, and in my opinion the client is useless in most cases,
although you can keep it provided the problem I point out below (item 3) is
not a real issue.
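
To make the I/O point concrete, here is a minimal Python sketch of reserving
space for a binary log file. It is not the WL#4925 patch; the function name
`preallocate` and the file name are hypothetical. `os.posix_fallocate` wraps the
same posix_fallocate() call discussed above, so on a filesystem without native
support the zero filling may still happen regardless of which process issues
the call.

```python
import os
import tempfile


def preallocate(path, size):
    """Reserve `size` bytes for `path` and return the resulting file size."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o640)
    try:
        try:
            # May be emulated by writing zeros where the filesystem
            # has no native fallocate support.
            os.posix_fallocate(fd, 0, size)
        except (AttributeError, OSError):
            # Assumption: fall back to extending the file, which may
            # leave it sparse rather than truly reserved.
            os.ftruncate(fd, size)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as d:
    # "master.000001" is only an illustrative file name.
    print(preallocate(os.path.join(d, "master.000001"), 1024 * 1024))
```

Running the same call from a separate client moves the writes to another
process, but the disk still sees them.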


>> I don't buy the argument of performance as described in the WL because the time
>> to pre-allocate the files is taken into account in the performance assessment
>> and creating the client does not fix anything.
>>    
> The time spent statically pre-allocating files should not be counted in the
> performance assessment, because it is done before the server starts.
> 
> 

Ignore my comment here and see what I wrote above.


> 
>> The issue here should be seen from a different perspective: "Pre-allocating files
>> will increase disk activity and as such will interfere with performance". So
>> the files need to be created when the server is idle and when there is really a
>> need to do so.
>>    
> These files are preallocated by the client tool when the server is stopped
> or idle.

The slave will only rarely be stopped. So when it is idle, how do you decide
whether it is necessary to pre-allocate new files?


>> Note that I am ok with the client provided item 3 is not a problem. However I
>> prefer a new command.
>>    
> What's your suggestion for the new command?
>> 3. Isn't there any race condition between rotation and pre-allocation?
>>
>>     server --->  decides to use master.X
>>
>>     mysqlbinlogalloc --->  creates master.X
>>
>>     server --->  fails while creating master.X (booom)
>>    
> No. Static and automatic preallocation are compatible. One will not
> preallocate a file if the other is already preallocating it. So no conflict here.


There is a race condition here. Are you assuming that mysqlbinlogalloc will
only be used when the server is offline or the slave is stopped? If not, how can you
prevent the race condition? If so, I think this is a very bad design decision.
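
One common way to close this kind of window is to make file creation itself the
arbitration point. The sketch below is only an illustration under that
assumption, not what the patch does: `claim_binlog` is a hypothetical helper,
and it relies on `O_CREAT | O_EXCL`, which makes creation fail if the file
already exists.

```python
import os
import tempfile


def claim_binlog(path):
    """Create `path` only if it does not exist yet.

    Returns True if this caller created the file, False if someone
    else (server or client) got there first.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o640)
    except FileExistsError:
        return False
    os.close(fd)
    return True


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "master.X")   # name taken from the scenario above
    first = claim_binlog(target)           # e.g. mysqlbinlogalloc
    second = claim_binlog(target)          # e.g. the server rotating
    print(first, second)
```

The loser of the race learns about it from the return value and can pick the
next file name instead of failing.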

> 
>> 4. Using the binlog_preallocate to decide if either the old or new behavior should
>>     be used is not a good idea. We may pre-allocate files and start the server with
>>     binlog_preallocate <> 0 and then forget and restart the server with binlog_preallocate
>>     = 0. In this case, we are going to have dangling files consuming lots of disk space.
>>
>>     If we can, I think we must avoid situations where a user's mistake may cause problems.
>>    
> Maybe. But how to avoid this?


See comments in what follows.

>> 5. Using binlog_preallocate > 1 to automatically pre-allocate binary log files is not
>>     a good idea because users will not have control over when the pre-allocation will happen.
>>     See item 1.
>>    
> I don't think the user should control something in the process as it's an
> automatic preallocation.

On the contrary, the user (i.e. the DBA) should be responsible for checking, either
manually or automatically, whether the server is idle and whether it is necessary to
pre-allocate a new file. The idea is to provide the means to define policies on when
pre-allocation happens.
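
As a sketch of what such a policy could look like from the DBA's side, assuming
the server exposed the two inputs asked for in item 1 (whether it is idle and
how many pre-allocated files are still unused), the decision reduces to a small
predicate. All names and the `min_spare` threshold are hypothetical.

```python
def should_preallocate(server_idle, spare_files, min_spare=1):
    """Decide whether a new binary log file should be pre-allocated now.

    server_idle -- True if the server is currently idle (assumed input)
    spare_files -- number of pre-allocated files not yet in use (assumed input)
    min_spare   -- policy knob: how many spare files to keep around
    """
    return bool(server_idle) and spare_files < min_spare


# Idle server with no spare file: pre-allocate.
print(should_preallocate(server_idle=True, spare_files=0))
# Busy server: wait, even if no spare file is left.
print(should_preallocate(server_idle=False, spare_files=0))
```

Without status variables that provide these inputs, neither a script nor the
DBA can evaluate such a policy.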


>> 6. Why do you need to keep the old behavior where new files are created after searching
>>     in the filesystem? I know that this is done just for safety in order to avoid overwriting
>>     a possibly useful binary log. However, this case will only happen if someone manually
>>     copies files around.
>>    
> The old behavior is good. It can guarantee that Static and automatic 
> preallocation are compatible.

Supporting both behaviors just complicates the code.


>>     Besides, we have committed several patches towards making the index file the source
>>     of truth. So why don't we keep just the new behavior where the index file is the source
>>     of truth?
>>
>>     If overwriting a file is really a concern, I think we should use two states in the
>>     binary log:
>>       . LOG_EVENT_BINLOG_IN_USE_F
>>       . LOG_EVENT_BINLOG_PRE_ALLOC
>>
>>     So, if the new file is pre-allocated, its status would be LOG_EVENT_BINLOG_PRE_ALLOC
>>     and the server would be able to know if it is safe to overwrite it or not.
>>    
> It's not necessary to overwrite a preallocated file. So no requirement for
> LOG_EVENT_BINLOG_PRE_ALLOC flag.


I don't agree. See Libing's reply and my arguments against the option binlog_preallocate.
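
To illustrate the two-state idea from item 6, here is a minimal sketch of the
check the server could make before reusing a file. The flag names come from the
review above; the bit values and the helper `safe_to_overwrite` are assumptions
made for this example only.

```python
# Bit values are illustrative assumptions, not taken from the patch.
LOG_EVENT_BINLOG_IN_USE_F = 0x1    # file is (or was) being written by a server
LOG_EVENT_BINLOG_PRE_ALLOC = 0x2   # proposed: file was only pre-allocated


def safe_to_overwrite(flags):
    """A file may be reused only if it is marked pre-allocated and not in use."""
    pre_allocated = bool(flags & LOG_EVENT_BINLOG_PRE_ALLOC)
    in_use = bool(flags & LOG_EVENT_BINLOG_IN_USE_F)
    return pre_allocated and not in_use


print(safe_to_overwrite(LOG_EVENT_BINLOG_PRE_ALLOC))   # pre-allocated, unused
print(safe_to_overwrite(LOG_EVENT_BINLOG_IN_USE_F))    # real log in use
print(safe_to_overwrite(0))                            # ordinary closed log
```

With the state stored in the file itself, the decision no longer depends on the
value binlog_preallocate had when the server was last started.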

>> 7. If you decide to keep binlog_preallocate, please run mtr with binlog_preallocate = 1,
>>     because there are several tests failing that were not supposed to fail and the binary
>>     log is pre-allocated when this is not supposed to happen.
>>    
> The automatic preallocation will be started if we run mtr with 
> binlog_preallocate = 1.
> So it's possible that some binlog related tests will fail.

binlog_preallocate = 1 means use a pre-allocated file, and binlog_preallocate > 1 means
use a pre-allocated file and also pre-allocate automatically.


I don't see why the first case should generate any errors in most of the test cases.
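
For reference, the option semantics as discussed in this thread can be written
down as a small mapping. This is a sketch of my reading of the current design,
not code from the patch; the function name and the returned pair are
hypothetical.

```python
def preallocate_mode(binlog_preallocate):
    """Map the option value to (use_preallocated_file, auto_preallocate)."""
    if binlog_preallocate < 0:
        raise ValueError("binlog_preallocate must be >= 0")
    if binlog_preallocate == 0:
        return (False, False)   # old behavior: no pre-allocated files used
    if binlog_preallocate == 1:
        return (True, False)    # use an existing pre-allocated file only
    return (True, True)         # use one and also pre-allocate automatically


for value in (0, 1, 2):
    print(value, preallocate_mode(value))
```

With binlog_preallocate = 1 nothing new is allocated, so in most tests the only
difference should be which file gets written.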

Cheers.

>> 8. State clearly what the purpose of the test cases is, what assertions are to be checked,
>> etc.
>>    
> Will do. Thanks!
>> 9. See several comments in-line.
>>    
> Will reply one by one after the design discussion. Thanks!
>> SUGGESTIONS
>> -----------
>>
>> 1. Create two additional WLs. One for a test case and another for performance assessment.
>> Check this with Luis. Although I think this should be done, there may be some bureaucratic
>> problems.
>>    
> Sounds good. Luis, how about the suggestion?
> 
> Best Regards,
> 
> 
> Daogang
>>
>>