Hi,
Thank you for reporting this problem, We are working on this problem,
bug#37148 has been pushed, which fixed some obvious cases, and the
WL#1790, which will try to fix all of them.
Kristian Nielsen wrote:
> Hi,
>
> While working on the binlog code in sql/log.cc, I discovered what I think is a
> number of problems with the error handling in the code that writes events to
> the binary log. At this point I do not have much in the way of concrete
> suggestions for how to improve this (sorry!), but I thought it might be useful
> to at least mention the issues, rather than just ignore it and let it be
> forgotten until someone else hits the same issues (and maybe someone can
> correct me if I misunderstood the code).
>
> Suppose that mysqld gets an error writing to the binlog file. Eg. I tested
> with an error insert like this in sql/log.cc MYSQL_BIN_LOG::write_cache():
>
> }
>
> /* Write data to the binary log file */
> + DBUG_EXECUTE_IF("fail_binlog_write_1", return ER_ERROR_ON_WRITE;);
> if (my_b_write(&log_file, cache->read_pos, length))
> return ER_ERROR_ON_WRITE;
> cache->read_pos=cache->read_end; // Mark buffer used up
>
> With this, I found a number of problems/bugs in the code:
>
>
> 1. A debug built server crashes with an assertion in net_end_statement()
> because the binlog code causes the statement to fail without setting an
> error status with my_error() or similar. The reason is this statement:
>
> if ((write_error= write_cache(cache, false, false)))
> goto err;
>
> write_cache does not set any error status with my_error(), and nor does the
> code at the err: label, due to assigning to write_error (write_error seems to
> be a flag that disables further error reporting, to prevent reporting write
> errors multiple times). So the statement fails without setting any error code.
>
> Also, this is the only place that write_error can ever be cleared, which seems
> wrong. My guess is that the write_error stuff is some old code that is no
> longer fully working...
>
>
> 2. MYSQL_BIN_LOG::write(THD *, IO_CACHE *, Log_event *_event, bool) and other
> places in the code reports errors with sql_print_error() instead of my_error()
> or similar. Again this causes the assertion in net_end_statement(), as the
> current SQL statement fails but no error status is set.
>
>
> 3. If writing the binlog fails for an InnoDB transaction, what happens is that
> first innobase_prepare() is called, which logs the prepare_commit_mutex, then
> binlogging is tried and fails, then innobase_rollback() is called to roll back
> the transaction due to the failure. But innobase_rollback() is missing code to
> unlock the prepare_commit_mutex, so we get an assertion about locking an
> already locked mutex in the next transaction.
>
> Eg. innobase_rollback() needs code like this:
>
> if (trx->active_trans == 2) {
>
> pthread_mutex_unlock(&prepare_commit_mutex);
> }
>
>
> 4. When MYSQL_BIN_LOG::write_cache() bails out due to error, it does not get
> to this code near the end:
>
> cache->read_pos=cache->read_end; // Mark buffer used up
>
> Because of this, binlog_trx_data::empty() will return true, and
> binlog_trx_data::reset() will not call binlog_trx_data::truncate() to
> re-initialise the transaction cache. This means that after an error,
> binlogging for subsequent transactions in the same session will just silently
> be ignored, as the transaction cache is in READ_CACHE mode rather than
> WRITE_CACHE.
>
> It seems the logic in is a bit flawed, and it is kind of by accident that the
> transaction cache ends up with empty() returning false when write_cache()
> succeeds.
>
>
> Unless I am missing something, it seems that currently the server is not
> really able to handle/recover from a failure to write the binary log...
>
> Hope this helps,
>
> - Kristian.
>