List:Commits« Previous MessageNext Message »
From:Alfranio Correia Date:November 17 2010 2:10pm
Subject:Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493
View as plain text  
Hi Daogang,

Great work.

Please, the patch is almost ready.

See some comments in-line and the some major questions/requests in what follows.


STATUS
------
Not approved.

REQUIRED
--------

1. Pull from trunk and replace DBUG_ABORT by DBUG_SUICIDE.

2. Move the test case from this WL to WL#5440. Besides, the
part of the test that checks the index should go into binlog_index.test.

3. Try to find better names to both

   . append_file_name_to_index_file
   . update_log_index

The first is used when adding a new log and the second when removing log(s).

4. The crash-safety for the index file is based on the assumption that the
"rename" operation is atomic. Have you checked if this is true for all
platforms and possible filesystems? You haven't it is better to send an
email to dev-private asking for input.

This is an important point and should be carefully documented.

5. Is this fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F right?

int MYSQL_BIN_LOG::recover(IO_CACHE *log, Format_description_log_event *fdle,
                             my_off_t *valid_pos)
{

   fdle->flags&= ~LOG_EVENT_BINLOG_IN_USE_F; // abort on the first error

6. I think the description on the possible solutions for this WL is missing
some important points:

    1) Extending Binlog File Header.

    . It is not necessary an extra write per event. In fact, we should flush
    the binlog cache as usual and sync the file, do the extra write and sync
    again. The extra activity happens everytime we need to flush the binlog
    cache and not when we need to write an event.


    . This extra cost can be easily mitigate by a group commit.

    2) and 3)

    . I think that both 2 and 3 in the WL will not handle the case that a transaction
    is partially written to the binlog, i.e.

    T = b, e_1, e_2, e_3, c

    but due to a crash, the binlog has

    b, e_1, e_2

    How do you handle this case?


SUGGESTIONS
-----------

1. The crash-safety for the binlog is based on the assumption that the
"my_chsize" is idempotent. Howerver, my_chsize may not be an atomic operation.
So I think we should inject some faults in order to make sure that the master
will be safe if the recovery fails.

2. The same idea apply to the indexes.


Cheers.


On 11/15/2010 09:04 AM, Dao-Gang.Qu@stripped wrote:
> #At file:///home/daogang/bzrwork/wl5493/mysql-next-mr/ based on
> revid:alexander.nozdrin@stripped
>
>   3203 Dao-Gang.Qu@stripped	2010-11-15
>        WL#5493 Binlog crash-safe when master crashed
>
>        Trim the crashed binlog file to last valid transaction or event
>        (non-transaction) base on binlog size(valid_pos) when MYSQL server
>        crashed in the middle of binlog. And add a temp file to guarantee
>        the binlog index file to be crash safe.
>       @ mysql-test/r/crash_commit_before.result
>          Removed the result file. Because its test file is removed.
>       @ mysql-test/suite/rpl/r/rpl_crashed_master.result
>          Removed the test file. Because the test case is tested
>          in rpl_crashed_master.test file.
>       @ mysql-test/suite/rpl/t/rpl_crashed_master.test
>          Added test to verify if the crashed binlog file is trimmed to
>          last valid transaction or event(non-transaction) correctly.
>          And if the binlog index is crash safe.
>       @ mysql-test/t/crash_commit_before-master.opt
>          Removed the opt file. Because its test file is removed.
>       @ mysql-test/t/crash_commit_before.test
>          Removed the test file. Because the test case is tested
>          in rpl_crashed_master.test file.
>       @ sql/binlog.cc
>          Added code to trim the crashed binlog file to last valid transaction
>          or event(non-transaction) base on valid_pos, make binlog index to be
>          crash safe.
>       @ sql/binlog.h
>          Added code to define these functions for binlog index crash safe.
>
>      removed:
>        mysql-test/r/crash_commit_before.result
>        mysql-test/t/crash_commit_before-master.opt
>        mysql-test/t/crash_commit_before.test
>      added:
>        mysql-test/suite/rpl/r/rpl_crashed_master.result
>        mysql-test/suite/rpl/t/rpl_crashed_master.test
>      modified:
>        sql/binlog.cc
>        sql/binlog.h
> === removed file 'mysql-test/r/crash_commit_before.result'
> --- a/mysql-test/r/crash_commit_before.result	2007-04-03 09:36:33 +0000
> +++ b/mysql-test/r/crash_commit_before.result	1970-01-01 00:00:00 +0000
> @@ -1,14 +0,0 @@
> -CREATE TABLE t1(a int) engine=innodb;
> -START TRANSACTION;
> -insert into t1 values(9);
> -SET SESSION debug="d,crash_commit_before";
> -COMMIT;
> -ERROR HY000: Lost connection to MySQL server during query
> -SHOW CREATE TABLE t1;
> -Table	Create Table
> -t1	CREATE TABLE `t1` (
> -  `a` int(11) DEFAULT NULL
> -) ENGINE=InnoDB DEFAULT CHARSET=latin1
> -SELECT * FROM t1;
> -a
> -DROP TABLE t1;
>
> === added file 'mysql-test/suite/rpl/r/rpl_crashed_master.result'
> --- a/mysql-test/suite/rpl/r/rpl_crashed_master.result	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/r/rpl_crashed_master.result	2010-11-15 09:04:32 +0000
> @@ -0,0 +1,262 @@
> +stop slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +reset master;
> +reset slave;
> +drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
> +start slave;
> +STOP SLAVE;
> +RESET MASTER;
> +START SLAVE;
> +call mtr.add_suppression("Attempting backtrace");
> +call mtr.add_suppression("allocated tablespace *., old maximum was 0");
> +call mtr.add_suppression("Error in Log_event::read_log_event()");
> +call mtr.add_suppression("Buffered warning: Performance schema disabled");
> +CREATE TABLE t1(a LONGBLOB) ENGINE=INNODB;
> +# Test case1: Set DEBUG POINT before binlog to make
> +#             the master crash for transaction
> +BEGIN;
> +INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +SET SESSION debug="d,crash_commit_after_prepare";
> +COMMIT;
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the transaction statements will not be binlogged
> +show binlog events in 'master-bin.000001' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000001	#	Query	#	#	use `test`; CREATE TABLE t1(a LONGBLOB) ENGINE=INNODB
> +# On master, test the data will be rolled back after restart.
> +SELECT COUNT(*) FROM t1;
> +COUNT(*)
> +0
> +# On slave, test replication will work fine, and the data
> +#           is not replicated
> +SELECT COUNT(*) FROM t1;
> +COUNT(*)
> +0
> +# Test case2: Set DEBUG POINT after binlog, and before the date
> +#             is committed to make crash for transaction
> +BEGIN;
> +INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +SET SESSION debug="d,crash_commit_after_log";
> +COMMIT;
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the transaction statements will be binlogged
> +show binlog events in 'master-bin.000002' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000002	#	Query	#	#	BEGIN
> +master-bin.000002	#	Table_map	#	#	table_id: # (test.t1)
> +master-bin.000002	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000002	#	Table_map	#	#	table_id: # (test.t1)
> +master-bin.000002	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000002	#	Table_map	#	#	table_id: # (test.t1)
> +master-bin.000002	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000002	#	Xid	#	#	COMMIT /* XID */
> +# On master, test the data will be recovered after the master restart
> +SELECT COUNT(*) FROM t1;
> +COUNT(*)
> +3
> +# On slave, test replication will work fine, and the data is replicated
> +SELECT COUNT(*) FROM t1;
> +COUNT(*)
> +3
> +DROP TABLE t1;
> +include/stop_slave.inc
> +CREATE TABLE t1(a LONGBLOB) ENGINE=INNODB;
> +# Test case3: Set DEBUG POINT in the middle of binlog to
> +#             make the master crash for transaction.
> +SET SESSION debug="d,half_binlogged_transaction";
> +BEGIN;
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +COMMIT;
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the halfly binlogged transaction will be trimmed
> +# from the crashed binlog file
> +show binlog events in 'master-bin.000003' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000003	#	Query	#	#	use `test`; CREATE TABLE t1(a LONGBLOB) ENGINE=INNODB
> +# Test the data will not be recovered successfully
> +# after the master restart.
> +SELECT COUNT(*) FROM t1;
> +COUNT(*)
> +0
> +# Test case4: Set DEBUG POINT in the middle of binlog to
> +#             make the master crash for non-transaction.
> +SET SESSION debug="d,half_binlogged_transaction";
> +CREATE TABLE t2(a LONGBLOB) ENGINE=MYISAM;
> +INSERT INTO t2 (a) VALUES (REPEAT('a',16384));
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the halfly binlogged non-transaction statement will be trimmed
> +# from the crashed binlog file
> +show binlog events in 'master-bin.000004' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000004	#	Query	#	#	use `test`; CREATE TABLE t2(a LONGBLOB) ENGINE=MYISAM
> +# Test the data will not be recovered successfully
> +# after the master restart.
> +SELECT COUNT(*) FROM t2;
> +COUNT(*)
> +0
> +# Test case5: Set DEBUG POINT before rename index file to make the master
> +#             crash when appending a binlog file name to index file.
> +show binary logs;
> +Log_name	File_size
> +master-bin.000001	#
> +master-bin.000002	#
> +master-bin.000003	#
> +master-bin.000004	#
> +master-bin.000005	#
> +insert into t1 values (1);
> +select count(*) from t1;
> +count(*)
> +1
> +SET SESSION debug="+d,crash_create_before_rename_index_file";
> +flush logs;
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the statement will be binlogged rightly before the master crashes.
> +show binlog events in 'master-bin.000005' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000005	#	Query	#	#	BEGIN
> +master-bin.000005	#	Table_map	#	#	table_id: # (test.t1)
> +master-bin.000005	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000005	#	Xid	#	#	COMMIT /* XID */
> +master-bin.000005	#	Rotate	#	#	master-bin.000006;pos=4
> +# Test the statement will be binlogged rightly after the master restarts.
> +insert into t1 values (2);
> +show binlog events in 'master-bin.000006' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000006	#	Query	#	#	BEGIN
> +master-bin.000006	#	Table_map	#	#	table_id: # (test.t1)
> +master-bin.000006	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000006	#	Xid	#	#	COMMIT /* XID */
> +select count(*) from t1;
> +count(*)
> +2
> +# Test the index file is complete.
> +show binary logs;
> +Log_name	File_size
> +master-bin.000001	#
> +master-bin.000002	#
> +master-bin.000003	#
> +master-bin.000004	#
> +master-bin.000005	#
> +master-bin.000006	#
> +# Test case6: Set DEBUG POINT after rename index file to make the master
> +#             crash when appending a binlog file name to index file.
> +insert into t2 values (1);
> +select count(*) from t2;
> +count(*)
> +1
> +SET SESSION debug="+d,crash_create_after_rename_index_file";
> +flush logs;
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the statement will be binlogged rightly before the master crashes.
> +show binlog events in 'master-bin.000006' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000006	#	Query	#	#	BEGIN
> +master-bin.000006	#	Table_map	#	#	table_id: # (test.t2)
> +master-bin.000006	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000006	#	Query	#	#	COMMIT
> +master-bin.000006	#	Rotate	#	#	master-bin.000007;pos=4
> +# Test the statement will be binlogged rightly after the master restarts.
> +insert into t2 values (2);
> +show binlog events in 'master-bin.000008' from<binlog_start>;
> +Log_name	Pos	Event_type	Server_id	End_log_pos	Info
> +master-bin.000008	#	Query	#	#	BEGIN
> +master-bin.000008	#	Table_map	#	#	table_id: # (test.t2)
> +master-bin.000008	#	Write_rows	#	#	table_id: # flags: STMT_END_F
> +master-bin.000008	#	Query	#	#	COMMIT
> +select count(*) from t2;
> +count(*)
> +2
> +# Test the index file is complete.
> +show binary logs;
> +Log_name	File_size
> +master-bin.000001	#
> +master-bin.000002	#
> +master-bin.000003	#
> +master-bin.000004	#
> +master-bin.000005	#
> +master-bin.000006	#
> +master-bin.000007	#
> +master-bin.000008	#
> +DROP TABLE t1, t2;
> +# Test case7: Set DEBUG POINT after rename index file to make the master
> +#             crash when purging the index file.
> +show binary logs;
> +Log_name	File_size
> +master-bin.000001	#
> +master-bin.000002	#
> +master-bin.000003	#
> +master-bin.000004	#
> +master-bin.000005	#
> +master-bin.000006	#
> +master-bin.000007	#
> +master-bin.000008	#
> +SET SESSION debug="+d,crash_create_after_rename_index_file";
> +purge binary logs TO 'master-bin.000004';
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the index file is complete and is purged successfully
> +show binary logs;
> +Log_name	File_size
> +master-bin.000004	#
> +master-bin.000005	#
> +master-bin.000006	#
> +master-bin.000007	#
> +master-bin.000008	#
> +master-bin.000009	#
> +# Test case8: Set DEBUG POINT befor rename index file to make the master
> +#             crash when purging the index file.
> +show binary logs;
> +Log_name	File_size
> +master-bin.000004	#
> +master-bin.000005	#
> +master-bin.000006	#
> +master-bin.000007	#
> +master-bin.000008	#
> +master-bin.000009	#
> +SET SESSION debug="+d,crash_create_before_rename_index_file";
> +purge binary logs TO 'master-bin.000006';
> +ERROR HY000: Lost connection to MySQL server during query
> +# Restart the master server
> +# Test the index file is complete, although is not purged successfully.
> +show binary logs;
> +Log_name	File_size
> +master-bin.000004	#
> +master-bin.000005	#
> +master-bin.000006	#
> +master-bin.000007	#
> +master-bin.000008	#
> +master-bin.000009	#
> +master-bin.000010	#
>
> === added file 'mysql-test/suite/rpl/t/rpl_crashed_master.test'
> --- a/mysql-test/suite/rpl/t/rpl_crashed_master.test	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/rpl/t/rpl_crashed_master.test	2010-11-15 09:04:32 +0000
> @@ -0,0 +1,331 @@
> +#
> +# WL#5493&  WL#5440
> +# Test case1 verifies if the transaction statements will not be
> +# binlogged and replication will work fine, but the data will
> +# be rolled back on master after the master restarted when setting
> +# DEBUG POINT before binlog to make the master crash.
> +#
> +# Test case2 verifies if the transaction statements will be
> +# binlogged and replication will work fine, and the data will
> +# be recovered after the master restarted when setting DEBUG
> +# POINT after binlog, and before the date is committed to make
> +# the master crash.
> +#
> +# Test case3 verifies if the halfly binlogged transaction
> +# statements will be trimmed from the crashed binlog file
> +# and the data will not be recovered successfully after
> +# the master restarted when setting DEBUG POINT in the
> +# middle of binlog to make the master crash
> +#
> +# Test case4 verifies if the halfly binlogged non-transaction
> +# statement will be trimmed from the crashed binlog file
> +# and the data will not be recovered successfully after
> +# the master restarted when setting DEBUG POINT in the
> +# middle of binlog to make the master crash.
> +#
> +# Test case5 verifies if the index file is complete and the
> +# statements well be binlogged rightly after the master restarts
> +# when setting DEBUG POINT before rename index file and appending
> +# binlog file name to index file to cause the master crash.
> +#
> +# Test case6 verifies if the index file is complete and the statements
> +# will be binlogged rightly after the master restarts when setting
> +# DEBUG POINT after rename index file and appending binlog file name
> +# to index file to cause the master crash.
> +#
> +# Test case7 verifies if the index file is complete and is purged
> +# successfully after the master restarts when setting DEBUG POINT
> +# after rename index file and purging the index file to cause the
> +# master crash.
> +#
> +# Test case8 verifies if the index file is complete, although is
> +# not purged successfully after the master restarts when setting
> +# DEBUG POINT before rename index file and purging the index file
> +# to cause the master crash.
> +#
> +
> +
> +# Don't test this under valgrind, memory leaks will occur
> +-- source include/not_valgrind.inc
> +-- source include/not_embedded.inc
> +-- source include/master-slave.inc
> +-- source include/have_debug.inc
> +-- source include/have_innodb.inc
> +-- source include/have_binlog_format_row.inc
> +
> +# Reset master
> +connection slave;
> +STOP SLAVE;
> +--source include/wait_for_slave_to_stop.inc
> +
> +connection master;
> +RESET MASTER;
> +
> +connection slave;
> +START SLAVE;
> +--source include/wait_for_slave_to_start.inc
> +
> +connection master;
> +call mtr.add_suppression("Attempting backtrace");
> +call mtr.add_suppression("allocated tablespace *., old maximum was 0");
> +call mtr.add_suppression("Error in Log_event::read_log_event()");
> +call mtr.add_suppression("Buffered warning: Performance schema disabled");
> +
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +CREATE TABLE t1(a LONGBLOB) ENGINE=INNODB;
> +
> +-- echo # Test case1: Set DEBUG POINT before binlog to make
> +-- echo #             the master crash for transaction
> +#SET SESSION debug="d,crash_trans_commit_before_binlog";
> +BEGIN;
> +let $rows= 3;
> +WHILE($rows)
> +{
> +  INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +  dec $rows;
> +}
> +# Write file to make mysql-test-run.pl expect crash and restart
> +-- exec echo "wait">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +SET SESSION debug="d,crash_commit_after_prepare";
> +# Run the crashing query
> +-- error 2013
> +COMMIT;
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the transaction statements will not be binlogged
> +-- source include/show_binlog_events.inc
> +
> +-- echo # On master, test the data will be rolled back after restart.
> +SELECT COUNT(*) FROM t1;
> +
> +sync_slave_with_master;
> +-- echo # On slave, test replication will work fine, and the data
> +-- echo #           is not replicated
> +SELECT COUNT(*) FROM t1;
> +
> +connection master;
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +-- echo # Test case2: Set DEBUG POINT after binlog, and before the date
> +-- echo #             is committed to make crash for transaction
> +#SET SESSION debug="d,crash_trans_commit_after_binlog";
> +BEGIN;
> +let $rows= 3;
> +WHILE($rows)
> +{
> +  INSERT INTO t1 (a) VALUES (REPEAT('a',2));
> +  dec $rows;
> +}
> +# Write file to make mysql-test-run.pl expect crash and restart
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +SET SESSION debug="d,crash_commit_after_log";
> +# Run the crashing query
> +-- error 2013
> +COMMIT;
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the transaction statements will be binlogged
> +-- source include/show_binlog_events.inc
> +
> +-- echo # On master, test the data will be recovered after the master restart
> +SELECT COUNT(*) FROM t1;
> +
> +sync_slave_with_master;
> +-- echo # On slave, test replication will work fine, and the data is replicated
> +SELECT COUNT(*) FROM t1;
> +
> +connection master;
> +DROP TABLE t1;
> +sync_slave_with_master;
> +
> +-- source include/stop_slave.inc
> +
> +connection master;
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +# Test transaction with large data inserted
> +CREATE TABLE t1(a LONGBLOB) ENGINE=INNODB;
> +
> +-- echo # Test case3: Set DEBUG POINT in the middle of binlog to
> +-- echo #             make the master crash for transaction.
> +SET SESSION debug="d,half_binlogged_transaction";
> +BEGIN;
> +let $rows= 24;
> +WHILE($rows)
> +{
> +  INSERT INTO t1 (a) VALUES (REPEAT('a',6144));
> +  dec $rows;
> +}
> +# Write file to make mysql-test-run.pl expect crash and restart
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +# Run the crashing query
> +-- error 2013
> +COMMIT;
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the halfly binlogged transaction will be trimmed
> +-- echo # from the crashed binlog file
> +-- source include/show_binlog_events.inc
> +
> +-- echo # Test the data will not be recovered successfully
> +-- echo # after the master restart.
> +SELECT COUNT(*) FROM t1;
> +
> +-- echo # Test case4: Set DEBUG POINT in the middle of binlog to
> +-- echo #             make the master crash for non-transaction.
> +SET SESSION debug="d,half_binlogged_transaction";
> +# Write file to make mysql-test-run.pl expect crash and restart
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +CREATE TABLE t2(a LONGBLOB) ENGINE=MYISAM;
> +-- error 2013
> +INSERT INTO t2 (a) VALUES (REPEAT('a',16384));
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the halfly binlogged non-transaction statement will be trimmed
> +-- echo # from the crashed binlog file
> +-- source include/show_binlog_events.inc
> +
> +-- echo # Test the data will not be recovered successfully
> +-- echo # after the master restart.
> +SELECT COUNT(*) FROM t2;
> +
> +-- echo # Test case5: Set DEBUG POINT before rename index file to make the master
> +-- echo #             crash when appending a binlog file name to index file.
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +-- source include/show_binary_logs.inc
> +insert into t1 values (1);
> +select count(*) from t1;
> +# Write file to make mysql-test-run.pl expect crash and restart
> +SET SESSION debug="+d,crash_create_before_rename_index_file";
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +
> +--error 2013
> +flush logs;
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the statement will be binlogged rightly before the master crashes.
> +-- source include/show_binlog_events.inc
> +
> +-- echo # Test the statement will be binlogged rightly after the master restarts.
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +insert into t1 values (2);
> +-- source include/show_binlog_events.inc
> +select count(*) from t1;
> +
> +-- echo # Test the index file is complete.
> +-- source include/show_binary_logs.inc
> +
> +-- echo # Test case6: Set DEBUG POINT after rename index file to make the master
> +-- echo #             crash when appending a binlog file name to index file.
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +
> +insert into t2 values (1);
> +select count(*) from t2;
> +
> +# Write file to make mysql-test-run.pl expect crash and restart
> +SET SESSION debug="+d,crash_create_after_rename_index_file";
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +
> +-- error 2013
> +flush logs;
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the statement will be binlogged rightly before the master crashes.
> +-- source include/show_binlog_events.inc
> +
> +-- echo # Test the statement will be binlogged rightly after the master restarts.
> +-- let $binlog_start= query_get_value(SHOW MASTER STATUS, Position, 1)
> +-- let $binlog_file= query_get_value(SHOW MASTER STATUS, File, 1)
> +insert into t2 values (2);
> +-- source include/show_binlog_events.inc
> +select count(*) from t2;
> +
> +-- echo # Test the index file is complete.
> +-- source include/show_binary_logs.inc
> +
> +DROP TABLE t1, t2;
> +
> +-- echo # Test case7: Set DEBUG POINT after rename index file to make the master
> +-- echo #             crash when purging the index file.
> +
> +-- source include/show_binary_logs.inc
> +# Write file to make mysql-test-run.pl expect crash and restart
> +SET SESSION debug="+d,crash_create_after_rename_index_file";
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +
> +-- error 2013
> +purge binary logs TO 'master-bin.000004';
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the index file is complete and is purged successfully
> +-- source include/show_binary_logs.inc
> +
> +-- echo # Test case8: Set DEBUG POINT befor rename index file to make the master
> +-- echo #             crash when purging the index file.
> +
> +-- source include/show_binary_logs.inc
> +# Write file to make mysql-test-run.pl expect crash and restart
> +SET SESSION debug="+d,crash_create_before_rename_index_file";
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +
> +-- error 2013
> +purge binary logs TO 'master-bin.000006';
> +
> +-- source include/wait_until_disconnected.inc
> +-- enable_reconnect
> +-- echo # Restart the master server
> +-- exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> +-- source include/wait_until_connected_again.inc
> +-- disable_reconnect
> +
> +-- echo # Test the index file is complete, although is not purged successfully.
> +-- source include/show_binary_logs.inc
> +
>
> === removed file 'mysql-test/t/crash_commit_before-master.opt'
> --- a/mysql-test/t/crash_commit_before-master.opt	2010-06-21 07:58:54 +0000
> +++ b/mysql-test/t/crash_commit_before-master.opt	1970-01-01 00:00:00 +0000
> @@ -1,3 +0,0 @@
> ---skip-stack-trace --skip-core-file
> ---default-storage-engine=MyISAM
> ---innodb-file-per-table=0
>
> === removed file 'mysql-test/t/crash_commit_before.test'
> --- a/mysql-test/t/crash_commit_before.test	2007-12-12 17:19:24 +0000
> +++ b/mysql-test/t/crash_commit_before.test	1970-01-01 00:00:00 +0000
> @@ -1,35 +0,0 @@
> --- source include/not_embedded.inc
> -# Don't test this under valgrind, memory leaks will occur
> ---source include/not_valgrind.inc
> -
> -# Binary must be compiled with debug for crash to occur
> ---source include/have_debug.inc
> -
> ---source include/have_innodb.inc
> -
> -CREATE TABLE t1(a int) engine=innodb;
> -START TRANSACTION;
> -insert into t1 values(9);
> -
> -# Setup the mysqld to crash at certain point
> -SET SESSION debug="d,crash_commit_before";
> -
> -# Write file to make mysql-test-run.pl expect crash and restart
> ---exec echo "restart">  $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
> -
> -# Run the crashing query
> ---error 2013
> -COMMIT;
> -
> -# Turn on reconnect
> ---enable_reconnect
> -
> -# Call script that will poll the server waiting for it to be back online again
> ---source include/wait_until_connected_again.inc
> -
> -SHOW CREATE TABLE t1;
> -
> -SELECT * FROM t1;
> -
> -
> -DROP TABLE t1;
>
> === modified file 'sql/binlog.cc'
> --- a/sql/binlog.cc	2010-10-08 14:35:24 +0000
> +++ b/sql/binlog.cc	2010-11-15 09:04:32 +0000
> @@ -1123,58 +1123,49 @@ int query_error_code(THD *thd, bool not_
>
>
>   /**
> -  Move all data up in a file in an filename index file.
> +  Copy content of 'from' file from offset to 'to' file.
>
> -    We do the copy outside of the IO_CACHE as the cache buffers would just
> -    make things slower and more complicated.
> -    In most cases the copy loop should only do one read.
> +  - We do the copy outside of the IO_CACHE as the cache
> +  buffers would just make things slower and more complicated.
> +  In most cases the copy loop should only do one read.
>
> -  @param index_file			File to move
> -  @param offset			Move everything from here to beginning
> +  @param from          File to copy
> +  @param to            File to copy to
> +  @param offset        Copy content of 'from' file from here to 'to' file

Replace

+  @param offset        Copy content of 'from' file from here to 'to' file

by

+  @param offset        Offset in 'from' file.


>
> -  @note
> -    File will be truncated to be 'offset' shorter or filled up with newlines
>
>     @retval
> -    0	ok
> +    0    ok
> +  @retval
> +    -1    error
>   */
> -
> -#ifdef HAVE_REPLICATION
> -
> -static bool copy_up_file_and_fill(IO_CACHE *index_file, my_off_t offset)
> +static bool copy_file(IO_CACHE *from, IO_CACHE *to, my_off_t offset)
>   {
>     int bytes_read;
> -  my_off_t init_offset= offset;
> -  File file= index_file->file;
>     uchar io_buf[IO_SIZE*2];
> -  DBUG_ENTER("copy_up_file_and_fill");
> +  DBUG_ENTER("copy_file");
>
> -  for (;; offset+= bytes_read)
> +  mysql_file_seek(from->file, offset, MY_SEEK_SET, MYF(0));
> +  do
>     {
> -    mysql_file_seek(file, offset, MY_SEEK_SET, MYF(0));
> -    if ((bytes_read= (int) mysql_file_read(file, io_buf, sizeof(io_buf),
> +    if ((bytes_read= (int) mysql_file_read(from->file, io_buf, sizeof(io_buf),
>                                              MYF(MY_WME)))
> -	<  0)
> +<  0)
>         goto err;
>       if (!bytes_read)
> -      break;					// end of file
> -    mysql_file_seek(file, offset-init_offset, MY_SEEK_SET, MYF(0));
> -    if (mysql_file_write(file, io_buf, bytes_read, MYF(MY_WME | MY_NABP)))
> +      break;                                    // end of file


Please, either remove the

   if (!bytes_read)
     break;

or replace the do/while by a for(;;) as the execution will not leave the loop
through the "while".


> +    if (mysql_file_write(to->file, io_buf, bytes_read, MYF(MY_WME | MY_NABP)))
>         goto err;
> -  }
> -  /* The following will either truncate the file or fill the end with \n' */
> -  if (mysql_file_chsize(file, offset - init_offset, '\n', MYF(MY_WME)) ||
> -      mysql_file_sync(file, MYF(MY_WME)))
> -    goto err;
> +  } while (bytes_read);
>
> -  /* Reset data in old index cache */
> -  reinit_io_cache(index_file, READ_CACHE, (my_off_t) 0, 0, 1);
>     DBUG_RETURN(0);
>
>   err:
>     DBUG_RETURN(1);
>   }
>
> +
> +#ifdef HAVE_REPLICATION
>   /**
>      Load data's io cache specific hook to be executed
>      before a chunk of data is being read into the cache's buffer
> @@ -1677,18 +1675,16 @@ bool MYSQL_BIN_LOG::open(const char *log
>   #endif
>
>         DBUG_ASSERT(my_b_inited(&index_file) != 0);

Do you still need this?

> -      reinit_io_cache(&index_file, WRITE_CACHE,
> -                      my_b_filelength(&index_file), 0, 0);

>         /*
>           As this is a new log file, we write the file name to the index
>           file. As every time we write to the index file, we sync it.
>         */

This comment is not valid.

> +#ifdef HAVE_REPLICATION
> +      DBUG_EXECUTE_IF("crash_create_before_update_index", DBUG_ABORT(););
> +#endif


This crash point is equivalent to crash_create_critical_before_update_index.
Besides, this crash_create_before_update_index is not used.


>         if (DBUG_EVALUATE_IF("fault_injection_updating_index", 1, 0) ||
> -          my_b_write(&index_file, (uchar*) log_file_name,
> -                     strlen(log_file_name)) ||
> -          my_b_write(&index_file, (uchar*) "\n", 1) ||
> -          flush_io_cache(&index_file) ||
> -          mysql_file_sync(index_file.file, MYF(MY_WME)))
> +          append_file_name_to_index_file((uchar*) log_file_name,
> +                                         strlen(log_file_name), need_mutex))
>           goto err;
>
>   #ifdef HAVE_REPLICATION
> @@ -1725,6 +1721,133 @@ shutdown the MySQL server and restart it
>   }
>
>
> +/**
> +  Move crash safe index file to index file.
> +
> +  @param neet_mutex    Set it to FALSE if its caller already have a
> +                       lock on LOCK_index

Please, replace

+  @param neet_mutex    Set it to FALSE if its caller already have a
+                       lock on LOCK_index

by

+  @param neet_mutex    Set it to FALSE if its caller already have a
 > +                       lock on LOCK_index



> +
> +  @retval
> +    0    ok
> +  @retval
> +    -1    error
> +*/
> +int MYSQL_BIN_LOG::move_crash_safe_index_file_to_index_file(bool need_mutex)
> +{
> +  File fd= -1;
> +  DBUG_ENTER("MYSQL_BIN_LOG::move_crash_safe_index_file_to_index_file");
> +
> +  if (need_mutex)
> +    mysql_mutex_lock(&LOCK_index);

Please, add an assertion here to guarantee that the caller has locked
the mutex.

> +  if (my_b_inited(&index_file))
> +  {
> +    end_io_cache(&index_file);
> +    if (mysql_file_close(index_file.file, MYF(0))<  0)
> +    {

Why don't you use my_close?

> +      sql_print_error("MYSQL_BIN_LOG::move_crash_safe_index_file_to_index_file "
> +                      "failed to close the index file.");
> +      goto err;
> +    }
> +    bzero((char*)&index_file, sizeof(index_file));
> +  }
> +
> +  DBUG_EXECUTE_IF("crash_create_before_rename_index_file", DBUG_ABORT(););
> +  if (my_rename(crash_safe_index_file_name, index_file_name, MYF(MY_WME)))
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::move_crash_safe_index_file_to_index_file "
> +                    "failed to move crash_safe_index_file to index file.");
> +    goto err;
> +  }
> +  DBUG_EXECUTE_IF("crash_create_after_rename_index_file", DBUG_ABORT(););
> +
> +  if ((fd= mysql_file_open(key_file_binlog_index,
> +                           index_file_name,
> +                           O_RDWR | O_CREAT | O_BINARY,
> +                           MYF(MY_WME)))<  0 ||
> +           mysql_file_sync(fd, MYF(MY_WME)) ||
> +           init_io_cache(&index_file, fd, IO_SIZE, READ_CACHE,
> +                         mysql_file_seek(fd, 0L, MY_SEEK_END, MYF(0)),
> +                                         0, MYF(MY_WME | MY_WAIT_IF_FULL)))
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::move_crash_safe_index_file_to_index_file "
> +                      "failed to open the index file.");
> +    goto err;
> +  }

my_open?


> +
> +  if (need_mutex)
> +    mysql_mutex_unlock(&LOCK_index);
> +  DBUG_RETURN(0);
> +
> +err:
> +  if (need_mutex)
> +    mysql_mutex_unlock(&LOCK_index);
> +  DBUG_RETURN(-1);
> +}


Why don't you change the logic to avoid duplicating

  +  if (need_mutex)
  +    mysql_mutex_unlock(&LOCK_index);

?


> +
> +
> +/**
> +  Append log file name to index file.
> +
> +  - To make crash safe, we copy all the content of index file
> +  to crash safe index file firstly and then append the log
> +  file name to the crash safe index file. Finally move the
> +  crash safe index file to index file.
> +
> +  @retval
> +    0   ok
> +  @retval
> +    -1   error
> +*/
> +int MYSQL_BIN_LOG::append_file_name_to_index_file(uchar* log_file_name,
> +                                                  int name_len, bool need_mutex)
> +{
> +  DBUG_ENTER("MYSQL_BIN_LOG::append_file_name_to_index_file");
> +
> +  if (open_crash_safe_index_file())
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::append_file_name_to_index_file "
> +                    "failed to open the crash safe index file.");
> +    goto err;
> +  }
> +
> +  if (copy_file(&index_file,&crash_safe_index_file, 0))
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::append_file_name_to_index_file "
> +                    "failed to copy index file to crash safe index file.");
> +    goto err;
> +  }
> +
> +  if (my_b_write(&crash_safe_index_file, log_file_name, name_len) ||
> +      my_b_write(&crash_safe_index_file, (uchar*) "\n", 1) ||
> +      flush_io_cache(&crash_safe_index_file) ||
> +      mysql_file_sync(crash_safe_index_file.file, MYF(MY_WME)))
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::append_file_name_to_index_file "
> +                    "failed to append log file name: %s, to crash "
> +                    "safe index file.", log_file_name);
> +    goto err;
> +  }

my_sync?

> +
> +  if (close_crash_safe_index_file())
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::append_file_name_to_index_file "
> +                    "failed to close the crash safe index file.");
> +    goto err;
> +  }
> +
> +  if (move_crash_safe_index_file_to_index_file(need_mutex))
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::append_file_name_to_index_file "
> +                    "failed to move crash safe index file to index file.");
> +    goto err;
> +  }
> +
> +  DBUG_RETURN(0);
> +
> +err:
> +  DBUG_RETURN(-1);
> +}
> +
>   int MYSQL_BIN_LOG::get_current_log(LOG_INFO* linfo)
>   {
>     mysql_mutex_lock(&LOCK_log);
> @@ -2048,6 +2171,92 @@ err:
>
>



> +int MYSQL_BIN_LOG::open_crash_safe_index_file()
> +{
> +  int error= 0;
> +  File file= -1;
> +
> +  DBUG_ENTER("MYSQL_BIN_LOG::open_crash_safe_index_file");
> +
> +  if (!my_b_inited(&crash_safe_index_file))
> +  {
> +    if ((file= my_open(crash_safe_index_file_name, O_RDWR | O_CREAT | O_BINARY,
> +                       MYF(MY_WME | ME_WAITTANG)))<  0  ||
> +        init_io_cache(&crash_safe_index_file, file, IO_SIZE, WRITE_CACHE,
> +                      0, 0, MYF(MY_WME | MY_NABP | MY_WAIT_IF_FULL)))
> +    {
> +      error= 1;
> +      sql_print_error("MYSQL_BIN_LOG::open_crash_safe_index_file failed "
> +                      "to open register file.");
> +    }
> +  }
> +  DBUG_RETURN(error);
> +}
> +

Please, s/to open register file/temporary index file/

> +
> +  @retval
> +    0    ok
> +  @retval
> +    LOG_INFO_IO    Got IO error while reading/writing file
> +*/
>   int MYSQL_BIN_LOG::update_log_index(LOG_INFO* log_info, bool need_update_threads)
>   {

Why do you introduced LOG_INFO_IO here and not in other places, e.g.
append_file_name_to_index_file?
Besides, the error is not mentioned in other functions that call this one, e.g.
purge_logs, and therefore is hard to see that the error is being used.


> -  if (copy_up_file_and_fill(&index_file, log_info->index_file_start_offset))
> -    return LOG_INFO_IO;
> +  if (open_crash_safe_index_file())
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::update_log_index failed to "
> +                    "open the crash safe index file.");
> +    goto err;
> +  }
> +
> +  if (copy_file(&index_file,&crash_safe_index_file,
> +                log_info->index_file_start_offset))
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::update_log_index failed to "
> +                    "copy index file to crash safe index file.");
> +    goto err;
> +  }
> +
> +  if (close_crash_safe_index_file())
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::update_log_index failed to "
> +                    "close the crash safe index file.");
> +    goto err;
> +  }
> +  if (move_crash_safe_index_file_to_index_file(FALSE))
> +  {
> +    sql_print_error("MYSQL_BIN_LOG::update_log_index failed to "
> +                    "move crash safe index file to index file.");
> +    goto err;
> +  }
>
>     // now update offsets in index file for running threads
>     if (need_update_threads)
>       adjust_linfo_offsets(log_info->index_file_start_offset);
>     return 0;
> +
> +err:
> +  return LOG_INFO_IO;
>   }
>
>   /**
> @@ -3474,9 +3729,9 @@ bool MYSQL_BIN_LOG::write(THD *thd, IO_C
>           goto err;
>
>         bool synced= 0;
> +      DBUG_EXECUTE_IF("half_binlogged_transaction", DBUG_ABORT(););
>         if (flush_and_sync(&synced))
>           goto err;
> -      DBUG_EXECUTE_IF("half_binlogged_transaction", DBUG_ABORT(););
>         if (cache->error)				// Error on read
>         {
>           sql_print_error(ER(ER_ERROR_ON_READ), cache->file_name, errno);
> @@ -3734,6 +3989,9 @@ int MYSQL_BIN_LOG::open(const char *opt_
>       Log_event  *ev=0;
>       Format_description_log_event fdle(BINLOG_VERSION);
>       char        log_name[FN_REFLEN];
> +    my_off_t    valid_pos= 0;
> +    my_off_t    binlog_size;
> +    MY_STAT     s;
>
>       if (! fdle.is_valid())
>         goto err;
> @@ -3755,12 +4013,16 @@ int MYSQL_BIN_LOG::open(const char *opt_
>         goto err;
>       }
>
> +    my_stat(log_name,&s, MYF(0));
> +    binlog_size= s.st_size;
> +
>       if ((ev= Log_event::read_log_event(&log, 0,&fdle))&&
>           ev->get_type_code() == FORMAT_DESCRIPTION_EVENT&&
>           ev->flags&  LOG_EVENT_BINLOG_IN_USE_F)
>       {
>         sql_print_information("Recovering after a crash using %s", opt_name);
> -      error= recover(&log, (Format_description_log_event *)ev);
> +      valid_pos= my_b_tell(&log);
> +      error= recover(&log, (Format_description_log_event *)ev,&valid_pos);
>       }
>       else
>         error=0;
> @@ -3771,6 +4033,51 @@ int MYSQL_BIN_LOG::open(const char *opt_
>
>       if (error)
>         goto err;
> +
> +    /* Trim the crashed binlog file to last valid transaction
> +      or event (non-transaction) base on valid_pos. */
> +    if (valid_pos>  0)
> +    {
> +      if ((file= mysql_file_open(key_file_binlog, log_name,
> +                                 O_RDWR | O_BINARY, MYF(MY_WME)))<  0)
> +      {
> +        sql_print_error("Failed to open the crashed binlog file "
> +                        "when master server is recovering it.");
> +        return -1;
> +      }
> +
> +      /* Change binlog file size to valid_pos */
> +      if (valid_pos<  binlog_size)
> +      {
> +        if (my_chsize(file, valid_pos, 0, MYF(MY_WME)))
> +        {
> +          sql_print_error("Failed to trim the crashed binlog file "
> +                          "when master server is recovering it.");
> +          mysql_file_close(file, MYF(MY_WME));
> +          return -1;
> +        }
> +        else
> +        {
> +          sql_print_information("Crashed binlog file %s size is %llu, "
> +                                "but recovered up to %llu. Binlog trimmed to %llu
> bytes.",
> +                                log_name, binlog_size, valid_pos, valid_pos);
> +        }
> +      }
> +
> +      /* Clear LOG_EVENT_BINLOG_IN_USE_F */
> +      my_off_t offset= BIN_LOG_HEADER_SIZE + FLAGS_OFFSET;
> +      uchar flags= 0;
> +      if (mysql_file_pwrite(file,&flags, 1, offset, MYF(0)) != 1)
> +      {
> +        sql_print_error("Failed to clear LOG_EVENT_BINLOG_IN_USE_F "
> +                        "for the crashed binlog file when master "
> +                        "server is recovering it.");
> +        mysql_file_close(file, MYF(MY_WME));
> +        return -1;
> +      }
> +
> +      mysql_file_close(file, MYF(MY_WME));
> +    } //end if
>     }
>
>   err:
> @@ -3820,11 +4127,13 @@ void MYSQL_BIN_LOG::unlog(ulong cookie,
>     rotate_and_purge(0);     // as ::write() did not rotate
>   }
>
> -int MYSQL_BIN_LOG::recover(IO_CACHE *log, Format_description_log_event *fdle)
> +int MYSQL_BIN_LOG::recover(IO_CACHE *log, Format_description_log_event *fdle,
> +                            my_off_t *valid_pos)


Please, add doxygen comments.

>   {
>     Log_event  *ev;
>     HASH xids;
>     MEM_ROOT mem_root;
> +  my_off_t last_valid_pos= *valid_pos;
>
>     if (! fdle->is_valid() ||
>         my_hash_init(&xids,&my_charset_bin, TC_LOG_PAGE_SIZE/3, 0,
> @@ -3837,6 +4146,15 @@ int MYSQL_BIN_LOG::recover(IO_CACHE *log
>
>     while ((ev= Log_event::read_log_event(log,0,fdle))&&  ev->is_valid())
>     {
> +    /*
> +      Recored valid pos for crahsed binlog file
> +      which contains incorrect events.
> +    */

Typo: s/Recored/Recorded and s/crahsed/crashed

> +    if (ev->get_type_code() == QUERY_EVENT&&
> +        !strcmp(((Query_log_event*)ev)->query, "BEGIN"))
> +      *valid_pos= last_valid_pos;
> +    last_valid_pos= my_b_tell(log);
> +

See comments above.

>       if (ev->get_type_code() == XID_EVENT)
>       {
>         Xid_log_event *xev=(Xid_log_event *)ev;
> @@ -3848,6 +4166,13 @@ int MYSQL_BIN_LOG::recover(IO_CACHE *log
>       delete ev;
>     }
>
> +  /*
> +    Recored valid pos for crashed binlog file
> +    which did not contain incorrect events.
> +  */
> +  if (!log->error)
> +    *valid_pos= last_valid_pos;
> +
>     if (ha_recover(&xids))
>       goto err2;


Thread
bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Dao-Gang.Qu15 Nov
  • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Alfranio Correia17 Nov
    • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Daogang Qu19 Nov
      • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Alfranio Correia19 Nov
        • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Alfranio Correia19 Nov
          • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Daogang Qu22 Nov
            • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Daogang Qu22 Nov
              • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Alfranio Correia22 Nov
                • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Daogang Qu23 Nov
        • Re: bzr commit into mysql-next-mr branch (Dao-Gang.Qu:3203) WL#5493Daogang Qu22 Nov