List:Internals« Previous MessageNext Message »
From:MARK CALLAGHAN Date:May 4 2010 4:13pm
Subject:Re: error handling for MySQL binary log
View as plain text  
Update function declarations with "__attribute__((warn_unused_result)"
and you might find lots of interesting things.
http://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Function-Attributes.html#index-g_t_0040code_007bwarn_005funused_005fresult_007d-attribute-2445

See these for examples:
http://bugs.mysql.com/bug.php?id=25058
http://bugs.mysql.com/bug.php?id=40492
http://bugs.mysql.com/bug.php?id=37148

One of the problems was that callers to mysql_bin_log.write() did not
check for an error. That appears to have been fixed. I assume a lot
more work remains as there are only two references to
"warn_unused_result" in 5.1 source and MySQL still generates a lot of
warnings during compilation.

There was a bug a few years ago in the replication code that caused a
lot of downtime for us from code that did not check the return value
for a function that failed when unable to allocate memory. We messed
up by getting the server to run out of memory. But the silent
corruption that followed was a serious pain for us and one of the bugs
is still open -- http://bugs.mysql.com/bug.php?id=40492.

On Tue, May 4, 2010 at 7:32 AM, Kristian Nielsen
<knielsen@stripped> wrote:
> Hi,
>
> While working on the binlog code in sql/log.cc, I discovered what I think is a
> number of problems with the error handling in the code that writes events to
> the binary log. At this point I do not have much in the way of concrete
> suggestions for how to improve this (sorry!), but I thought it might be useful
> to at least mention the issues, rather than just ignore it and let it be
> forgotten until someone else hits the same issues (and maybe someone can
> correct me if I misunderstood the code).
>
> Suppose that mysqld gets an error writing to the binlog file. Eg. I tested
> with an error insert like this in sql/log.cc MYSQL_BIN_LOG::write_cache():
>
>     }
>
>     /* Write data to the binary log file */
> +    DBUG_EXECUTE_IF("fail_binlog_write_1", return ER_ERROR_ON_WRITE;);
>     if (my_b_write(&log_file, cache->read_pos, length))
>       return ER_ERROR_ON_WRITE;
>     cache->read_pos=cache->read_end;        
>   // Mark buffer used up
>
> With this, I found a number of problems/bugs in the code:
>
>
> 1. A debug built server crashes with an assertion in net_end_statement()
> because the binlog code causes the statement to fail without setting an
> error status with my_error() or similar. The reason is this statement:
>
>      if ((write_error= write_cache(cache, false, false)))
>         goto err;
>
> write_cache does not set any error status with my_error(), and nor does the
> code at the err: label, due to assigning to write_error (write_error seems to
> be a flag that disables further error reporting, to prevent reporting write
> errors multiple times). So the statement fails without setting any error code.
>
> Also, this is the only place that write_error can ever be cleared, which seems
> wrong. My guess is that the write_error stuff is some old code that is no
> longer fully working...
>
>
> 2. MYSQL_BIN_LOG::write(THD *, IO_CACHE *, Log_event *_event, bool) and other
> places in the code reports errors with sql_print_error() instead of my_error()
> or similar. Again this causes the assertion in net_end_statement(), as the
> current SQL statement fails but no error status is set.
>
>
> 3. If writing the binlog fails for an InnoDB transaction, what happens is that
> first innobase_prepare() is called, which logs the prepare_commit_mutex, then
> binlogging is tried and fails, then innobase_rollback() is called to roll back
> the transaction due to the failure. But innobase_rollback() is missing code to
> unlock the prepare_commit_mutex, so we get an assertion about locking an
> already locked mutex in the next transaction.
>
> Eg. innobase_rollback() needs code like this:
>
>    if (trx->active_trans == 2) {
>
>          
>  pthread_mutex_unlock(&prepare_commit_mutex);
>    }
>
>
> 4. When MYSQL_BIN_LOG::write_cache() bails out due to error, it does not get
> to this code near the end:
>
>    cache->read_pos=cache->read_end;        
>    // Mark buffer used up
>
> Because of this, binlog_trx_data::empty() will return true, and
> binlog_trx_data::reset() will not call binlog_trx_data::truncate() to
> re-initialise the transaction cache. This means that after an error,
> binlogging for subsequent transactions in the same session will just silently
> be ignored, as the transaction cache is in READ_CACHE mode rather than
> WRITE_CACHE.
>
> It seems the logic in is a bit flawed, and it is kind of by accident that the
> transaction cache ends up with empty() returning false when write_cache()
> succeeds.
>
>
> Unless I am missing something, it seems that currently the server is not
> really able to handle/recover from a failure to write the binary log...
>
> Hope this helps,
>
>  - Kristian.
>
> --
> MySQL Internals Mailing List
> For list archives: http://lists.mysql.com/internals
> To unsubscribe:    http://lists.mysql.com/internals?unsub=1
>
>



-- 
Mark Callaghan
mdcallag@stripped
Thread
error handling for MySQL binary logKristian Nielsen4 May
  • Re: error handling for MySQL binary logMARK CALLAGHAN4 May
  • Re: error handling for MySQL binary logStewart Smith5 May
  • Re: error handling for MySQL binary logHe Zhenxing28 May