List:Commits« Previous MessageNext Message »
From:vasil.dimov Date:July 3 2010 6:17pm
Subject:bzr push into mysql-5.1-innodb branch (vasil.dimov:3527 to 3535) Bug#54311
View as plain text  
 3535 Jimmy Yang	2010-06-30
      Port fix for bug #54311 from mysql-trunk-innodb to mysql-5.1-innodb codeline.
      Bug #54311: Crash on CHECK PARTITION after concurrent LOAD DATA
      and adaptive_hash_index=OFF

    modified:
      storage/innodb_plugin/ChangeLog
      storage/innodb_plugin/btr/btr0sea.c
      storage/innodb_plugin/ha/ha0ha.c
      storage/innodb_plugin/handler/ha_innodb.cc
      storage/innodb_plugin/include/btr0sea.h
 3534 Marko Mäkelä	2010-06-30
      Bug#54358 follow-up: Correct some error handling.

    modified:
      storage/innodb_plugin/row/row0sel.c
 3533 Marko Mäkelä	2010-06-30
      Correct some comments that were added in the fix of Bug #54358
      (READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED columns).
      
      Records that lack incompletely written externally stored columns may
      be accessed by READ UNCOMMITTED transaction even without involving a
      crash during an INSERT or UPDATE operation. I verified this as follows.
      
      (1) added a delay after the mini-transaction for writing the clustered
      index 'stub' record was committed (patch attached)
      (2) started mysqld in gdb, setting breakpoints to the where the
      assertions about READ UNCOMMITTED were added in the bug fix
      (3) invoked ibtest3 --create-options=key_block_size=2
      to create BLOBs in a COMPRESSED table
      (4) invoked the following:
      yes 'set transaction isolation level read uncommitted;
      checksum table blobt3;select sleep(1);'|mysql -uroot test
      (5) noted that one of the breakpoints was triggered
      (return(NULL) in btr_rec_copy_externally_stored_field())
      
      === modified file 'storage/innodb_plugin/row/row0ins.c'
      --- storage/innodb_plugin/row/row0ins.c	2010-06-30 08:17:25 +0000
      +++ storage/innodb_plugin/row/row0ins.c	2010-06-30 08:17:25 +0000
      @@ -2120,6 +2120,7 @@ function_exit:
       		rec_t*	rec;
       		ulint*	offsets;
       		mtr_start(&mtr);
      +		os_thread_sleep(5000000);
       
       		btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE,
       					    BTR_MODIFY_TREE, &cursor, 0,
      
      === modified file 'storage/innodb_plugin/row/row0upd.c'
      --- storage/innodb_plugin/row/row0upd.c	2010-06-30 08:11:55 +0000
      +++ storage/innodb_plugin/row/row0upd.c	2010-06-30 08:11:55 +0000
      @@ -1763,6 +1763,7 @@ row_upd_clust_rec(
       		rec_offs_init(offsets_);
       
       		mtr_start(mtr);
      +		os_thread_sleep(5000000);
       
       		ut_a(btr_pcur_restore_position(BTR_MODIFY_TREE, pcur, mtr));
       		rec = btr_cur_get_rec(btr_cur);

    modified:
      storage/innodb_plugin/btr/btr0cur.c
      storage/innodb_plugin/row/row0sel.c
 3532 Marko Mäkelä	2010-06-29
      ChangeLog entry for Bug #54408

    modified:
      storage/innodb_plugin/ChangeLog
 3531 Marko Mäkelä	2010-06-29
      Bug#54408: txn rollback after recovery: row0umod.c:673
      dict_table_get_format(index->table)
      
      The REDUNDANT and COMPACT formats store a local 768-byte prefix of
      each externally stored column. No row_ext cache is needed, but we
      initialized one nevertheless. When the BLOB pointer was zero, we would
      ignore the locally stored prefix as well. This triggered an assertion
      failure in row_undo_mod_upd_exist_sec().
      
      row_build(): Allow ext==NULL when a REDUNDANT or COMPACT table
      contains externally stored columns.
      
      row_undo_search_clust_to_pcur(), row_upd_store_row(): Invoke
      row_build() with ext==NULL on REDUNDANT and COMPACT tables.
      
      rb://382 approved by Jimmy Yang

    modified:
      storage/innodb_plugin/row/row0row.c
      storage/innodb_plugin/row/row0undo.c
      storage/innodb_plugin/row/row0upd.c
 3530 Marko Mäkelä	2010-06-29
      ChangeLog entry for Bug #54358

    modified:
      storage/innodb_plugin/ChangeLog
 3529 Marko Mäkelä	2010-06-29
      Bug#54358: READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED
      columns
      
      When the server crashes after a record stub has been inserted and
      before all its off-page columns have been written, the record will
      contain incomplete off-page columns after crash recovery. Such records
      may only be accessed at the READ UNCOMMITTED isolation level or when
      rolling back a recovered transaction in recv_recovery_rollback_active().
      Skip these records at the READ UNCOMMITTED isolation level.
      
      TODO: Add assertions for checking the above assumptions hold when an
      incomplete BLOB is encountered.
      
      btr_rec_copy_externally_stored_field(): Return NULL if the field is
      incomplete.
      
      row_prebuilt_t::templ_contains_blob: Clarify what "BLOB" means in this
      context. Hint: MySQL BLOBs are not the same as InnoDB BLOBs.
      
      row_sel_store_mysql_rec(): Return FALSE if not all columns could be
      retrieved. Previously this function always returned TRUE.  Assert that
      the record is not delete-marked.
      
      row_sel_push_cache_row_for_mysql(): Return FALSE if not all columns
      could be retrieved.
      
      row_search_for_mysql(): Skip records containing incomplete off-page
      columns. Assert that the transaction isolation level is READ
      UNCOMMITTED.
      
      rb://380 approved by Jimmy Yang

    modified:
      storage/innodb_plugin/btr/btr0cur.c
      storage/innodb_plugin/include/btr0cur.h
      storage/innodb_plugin/include/row0mysql.h
      storage/innodb_plugin/row/row0merge.c
      storage/innodb_plugin/row/row0sel.c
 3528 Jimmy Yang	2010-06-28
      Check in fix for bug #53756: "ALTER TABLE ADD PRIMARY KEY affects
      crash recovery"
      
      rb://369 approved by Marko

    added:
      mysql-test/suite/innodb/r/innodb_bug53756.result
      mysql-test/suite/innodb/t/innodb_bug53756.test
    modified:
      storage/innobase/dict/dict0load.c
 3527 Sunny Bains	2010-06-25
      Fix bug#54583. This change reverses rsvn:1350 by getting rid of a bogus assertion
      and clarifies the invariant in dict_table_get_on_id().
            
      In Mar 2007 Marko observed a crash during recovery, the crash resulted from
      an UNDO operation on a system table. His solution was to acquire an X lock on
      the data dictionary, this in hindsight was an overkill. It is unclear what
      caused the crash, current hypothesis is that it was a memory corruption.
            
      The X lock results in performance issues by when undoing changes due to
      rollback during normal operation on regular tables.
            
      Why the change is safe:
      ======================
      The InnoDB code has changed since the original X lock change was made. In the
      new code we always lock the data dictionary in X mode during startup when
      UNDOing operations on the system tables (this is a given). This ensures that
      the crash Marko observed cannot happen as long as all transactions that update
      the system tables follow the standard rules by setting the appropriate DICT_OP
      flag when writing the log records when they make the changes.
            
      If transactions violate the above mentioned rule then during recovery (at
      startup) the rollback code (see trx0roll.c) will not acquire the X lock
      and we will see the crash again.  This will however be a different bug.

    modified:
      storage/innobase/dict/dict0dict.c
      storage/innobase/include/sync0sync.h
      storage/innobase/row/row0undo.c
      storage/innodb_plugin/dict/dict0dict.c
      storage/innodb_plugin/include/sync0sync.h
      storage/innodb_plugin/row/row0undo.c
=== added file 'mysql-test/suite/innodb/r/innodb_bug53756.result'
--- a/mysql-test/suite/innodb/r/innodb_bug53756.result	1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/innodb/r/innodb_bug53756.result	revid:jimmy.yang@stripped0701050601-i58b3e1c1o78w8ou
@@ -0,0 +1,118 @@
+DROP TABLE IF EXISTS bug_53756 ;
+CREATE TABLE bug_53756 (pk INT, c1 INT) ENGINE=InnoDB;
+ALTER TABLE bug_53756 ADD PRIMARY KEY (pk);
+INSERT INTO bug_53756 VALUES(1, 11), (2, 22), (3, 33), (4, 44);
+
+# Select a less restrictive isolation level.
+SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
+SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
+COMMIT;
+
+# Start a transaction in the default connection for isolation.
+START TRANSACTION;
+SELECT @@tx_isolation;
+@@tx_isolation
+READ-COMMITTED
+SELECT * FROM bug_53756;
+pk	c1
+1	11
+2	22
+3	33
+4	44
+
+# connection con1 deletes row 1
+START TRANSACTION;
+SELECT @@tx_isolation;
+@@tx_isolation
+READ-COMMITTED
+DELETE FROM bug_53756 WHERE pk=1;
+
+# connection con2 deletes row 2
+START TRANSACTION;
+SELECT @@tx_isolation;
+@@tx_isolation
+READ-COMMITTED
+DELETE FROM bug_53756 WHERE pk=2;
+
+# connection con3 updates row 3
+START TRANSACTION;
+SELECT @@tx_isolation;
+@@tx_isolation
+READ-COMMITTED
+UPDATE bug_53756 SET c1=77 WHERE pk=3;
+
+# connection con4 updates row 4
+START TRANSACTION;
+SELECT @@tx_isolation;
+@@tx_isolation
+READ-COMMITTED
+UPDATE bug_53756 SET c1=88 WHERE pk=4;
+
+# connection con5 inserts row 5
+START TRANSACTION;
+SELECT @@tx_isolation;
+@@tx_isolation
+READ-COMMITTED
+INSERT INTO bug_53756 VALUES(5, 55);
+
+# connection con6 inserts row 6
+START TRANSACTION;
+SELECT @@tx_isolation;
+@@tx_isolation
+READ-COMMITTED
+INSERT INTO bug_53756 VALUES(6, 66);
+
+# connection con1 commits.
+COMMIT;
+
+# connection con3 commits.
+COMMIT;
+
+# connection con4 rolls back.
+ROLLBACK;
+
+# connection con6 rolls back.
+ROLLBACK;
+
+# The connections 2 and 5 stay open.
+
+# connection default selects resulting data.
+# Delete of row 1 was committed.
+# Dpdate of row 3 was committed.
+# Due to isolation level read committed, these should be included.
+# All other changes should not be included.
+SELECT * FROM bug_53756;
+pk	c1
+2	22
+3	77
+4	44
+
+# connection default
+#
+# Crash server.
+START TRANSACTION;
+INSERT INTO bug_53756 VALUES (666,666);
+SET SESSION debug="+d,crash_commit_before";
+COMMIT;
+ERROR HY000: Lost connection to MySQL server during query
+
+#
+# disconnect con1, con2, con3, con4, con5, con6.
+#
+# Restart server.
+
+#
+# Select recovered data.
+# Delete of row 1 was committed.
+# Update of row 3 was committed.
+# These should be included.
+# All other changes should not be included.
+# Delete of row 2 and insert of row 5 should be rolled back
+SELECT * FROM bug_53756;
+pk	c1
+2	22
+3	77
+4	44
+
+# Clean up.
+DROP TABLE bug_53756;

=== added file 'mysql-test/suite/innodb/t/innodb_bug53756.test'
--- a/mysql-test/suite/innodb/t/innodb_bug53756.test	1970-01-01 00:00:00 +0000
+++ b/mysql-test/suite/innodb/t/innodb_bug53756.test	revid:jimmy.yang@stripped-i58b3e1c1o78w8ou
@@ -0,0 +1,184 @@
+# This is the test case for bug #53756. Alter table operation could
+# leave a deleted record for the temp table (later renamed to the altered
+# table) in the SYS_TABLES secondary index, we should ignore this row and
+# find the first non-deleted row for the specified table_id when load table
+# metadata in the function dict_load_table_on_id() during crash recovery.
+
+#
+# innobackup needs to connect to the server. Not supported in embedded.
+--source include/not_embedded.inc
+#
+# This test case needs to crash the server. Needs a debug server.
+--source include/have_debug.inc
+#
+# Don't test this under valgrind, memory leaks will occur.
+--source include/not_valgrind.inc
+#
+# This test case needs InnoDB.
+--source include/have_innodb.inc
+
+#
+# Precautionary clean up.
+#
+--disable_warnings
+DROP TABLE IF EXISTS bug_53756 ;
+--enable_warnings
+
+#
+# Create test data.
+#
+CREATE TABLE bug_53756 (pk INT, c1 INT) ENGINE=InnoDB;
+ALTER TABLE bug_53756 ADD PRIMARY KEY (pk);
+INSERT INTO bug_53756 VALUES(1, 11), (2, 22), (3, 33), (4, 44);
+
+--echo
+--echo # Select a less restrictive isolation level.
+# Don't use user variables. They won't survive server crash.
+--let $global_isolation= `SELECT @@global.tx_isolation`;
+--let $session_isolation= `SELECT @@session.tx_isolation`;
+SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
+SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
+COMMIT;
+
+--echo
+--echo # Start a transaction in the default connection for isolation.
+START TRANSACTION;
+SELECT @@tx_isolation;
+SELECT * FROM bug_53756;
+
+--echo
+--echo # connection con1 deletes row 1
+--connect (con1,localhost,root,,)
+START TRANSACTION;
+SELECT @@tx_isolation;
+DELETE FROM bug_53756 WHERE pk=1;
+
+--echo
+--echo # connection con2 deletes row 2
+--connect (con2,localhost,root,,)
+START TRANSACTION;
+SELECT @@tx_isolation;
+DELETE FROM bug_53756 WHERE pk=2;
+
+--echo
+--echo # connection con3 updates row 3
+--connect (con3,localhost,root,,)
+START TRANSACTION;
+SELECT @@tx_isolation;
+UPDATE bug_53756 SET c1=77 WHERE pk=3;
+
+--echo
+--echo # connection con4 updates row 4
+--connect (con4,localhost,root,,)
+START TRANSACTION;
+SELECT @@tx_isolation;
+UPDATE bug_53756 SET c1=88 WHERE pk=4;
+
+--echo
+--echo # connection con5 inserts row 5
+--connect (con5,localhost,root,,)
+START TRANSACTION;
+SELECT @@tx_isolation;
+INSERT INTO bug_53756 VALUES(5, 55);
+
+--echo
+--echo # connection con6 inserts row 6
+--connect (con6,localhost,root,,)
+START TRANSACTION;
+SELECT @@tx_isolation;
+INSERT INTO bug_53756 VALUES(6, 66);
+
+--echo
+--echo # connection con1 commits.
+--connection con1
+COMMIT;
+
+--echo
+--echo # connection con3 commits.
+--connection con3
+COMMIT;
+
+--echo
+--echo # connection con4 rolls back.
+--connection con4
+ROLLBACK;
+
+--echo
+--echo # connection con6 rolls back.
+--connection con6
+ROLLBACK;
+
+--echo
+--echo # The connections 2 and 5 stay open.
+
+--echo
+--echo # connection default selects resulting data.
+--echo # Delete of row 1 was committed.
+--echo # Dpdate of row 3 was committed.
+--echo # Due to isolation level read committed, these should be included.
+--echo # All other changes should not be included.
+--connection default
+SELECT * FROM bug_53756;
+
+--echo
+--echo # connection default
+--connection default
+--echo #
+--echo # Crash server.
+#
+# Write file to make mysql-test-run.pl expect the "crash", but don't start
+# it until it's told to
+--exec echo "wait" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+#
+START TRANSACTION;
+INSERT INTO bug_53756 VALUES (666,666);
+#
+# Request a crash on next execution of commit.
+SET SESSION debug="+d,crash_commit_before";
+#
+# Execute the statement that causes the crash.
+--error 2013
+COMMIT;
+--echo
+--echo #
+--echo # disconnect con1, con2, con3, con4, con5, con6.
+--disconnect con1
+--disconnect con2
+--disconnect con3
+--disconnect con4
+--disconnect con5
+--disconnect con6
+--echo #
+--echo # Restart server.
+#
+# Write file to make mysql-test-run.pl start up the server again
+--exec echo "restart" > $MYSQLTEST_VARDIR/tmp/mysqld.1.expect
+#
+# Turn on reconnect
+--enable_reconnect
+#
+# Call script that will poll the server waiting for it to be back online again
+--source include/wait_until_connected_again.inc
+#
+# Turn off reconnect again
+--disable_reconnect
+--echo
+
+--echo #
+--echo # Select recovered data.
+--echo # Delete of row 1 was committed.
+--echo # Update of row 3 was committed.
+--echo # These should be included.
+--echo # All other changes should not be included.
+--echo # Delete of row 2 and insert of row 5 should be rolled back
+SELECT * FROM bug_53756;
+
+--echo
+--echo # Clean up.
+DROP TABLE bug_53756;
+
+--disable_query_log
+eval SET GLOBAL tx_isolation= '$global_isolation';
+eval SET SESSION tx_isolation= '$session_isolation';
+--enable_query_log
+

=== modified file 'storage/innobase/dict/dict0load.c'
--- a/storage/innobase/dict/dict0load.c	revid:sunny.bains@stripped5081841-ppulnkjk1qlazh82
+++ b/storage/innobase/dict/dict0load.c	revid:jimmy.yang@stripped1c1o78w8ou
@@ -927,6 +927,8 @@ dict_load_table_on_id(
 
 	ut_ad(mutex_own(&(dict_sys->mutex)));
 
+	table = NULL;
+
 	/* NOTE that the operation of this function is protected by
 	the dictionary mutex, and therefore no deadlocks can occur
 	with other dictionary operations. */
@@ -953,15 +955,17 @@ dict_load_table_on_id(
 				  BTR_SEARCH_LEAF, &pcur, &mtr);
 	rec = btr_pcur_get_rec(&pcur);
 
-	if (!btr_pcur_is_on_user_rec(&pcur, &mtr)
-	    || rec_get_deleted_flag(rec, 0)) {
+	if (!btr_pcur_is_on_user_rec(&pcur, &mtr)) {
 		/* Not found */
+		goto func_exit;
+	}
 
-		btr_pcur_close(&pcur);
-		mtr_commit(&mtr);
-		mem_heap_free(heap);
-
-		return(NULL);
+	/* Find the first record that is not delete marked */
+	while (rec_get_deleted_flag(rec, 0)) {
+		if (!btr_pcur_move_to_next_user_rec(&pcur, &mtr)) {
+			goto func_exit;
+		}
+		rec = btr_pcur_get_rec(&pcur);
 	}
 
 	/*---------------------------------------------------*/
@@ -974,19 +978,14 @@ dict_load_table_on_id(
 
 	/* Check if the table id in record is the one searched for */
 	if (ut_dulint_cmp(table_id, mach_read_from_8(field)) != 0) {
-
-		btr_pcur_close(&pcur);
-		mtr_commit(&mtr);
-		mem_heap_free(heap);
-
-		return(NULL);
+		goto func_exit;
 	}
 
 	/* Now we get the table name from the record */
 	field = rec_get_nth_field_old(rec, 1, &len);
 	/* Load the table definition to memory */
 	table = dict_load_table(mem_heap_strdupl(heap, (char*) field, len));
-
+func_exit:
 	btr_pcur_close(&pcur);
 	mtr_commit(&mtr);
 	mem_heap_free(heap);

=== modified file 'storage/innodb_plugin/ChangeLog'
--- a/storage/innodb_plugin/ChangeLog	revid:sunny.bains@stripped841-ppulnkjk1qlazh82
+++ b/storage/innodb_plugin/ChangeLog	revid:jimmy.yang@strippedw8ou
@@ -1,3 +1,21 @@
+2010-06-30	The InnoDB Team
+
+	* btr/btr0sea.c, ha/ha0ha.c, handler/ha_innodb.cc, include/btr0sea.h:
+	Fix Bug#54311 Crash on CHECK PARTITION after concurrent LOAD DATA
+	and adaptive_hash_index=OFF
+
+2010-06-29	The InnoDB Team
+	* row/row0row.c, row/row0undo.c, row/row0upd.c:
+	Fix Bug#54408 txn rollback after recovery: row0umod.c:673
+	dict_table_get_format(index->table)
+
+2010-06-29	The InnoDB Team
+
+	* btr/btr0cur.c, include/btr0cur.h,
+	include/row0mysql.h, row/row0merge.c, row/row0sel.c:
+	Fix Bug#54358 READ UNCOMMITTED access failure of off-page DYNAMIC
+	or COMPRESSED columns
+
 2010-06-24	The InnoDB Team
 
 	* handler/ha_innodb.cc:

=== modified file 'storage/innodb_plugin/btr/btr0cur.c'
--- a/storage/innodb_plugin/btr/btr0cur.c	revid:sunny.bains@oracle.com-20100625081841-ppulnkjk1qlazh82
+++ b/storage/innodb_plugin/btr/btr0cur.c	revid:jimmy.yang@stripped1050601-i58b3e1c1o78w8ou
@@ -4814,7 +4814,7 @@ btr_copy_externally_stored_field(
 
 /*******************************************************************//**
 Copies an externally stored field of a record to mem heap.
-@return	the field copied to heap */
+@return	the field copied to heap, or NULL if the field is incomplete */
 UNIV_INTERN
 byte*
 btr_rec_copy_externally_stored_field(
@@ -4844,6 +4844,18 @@ btr_rec_copy_externally_stored_field(
 
 	data = rec_get_nth_field(rec, offsets, no, &local_len);
 
+	ut_a(local_len >= BTR_EXTERN_FIELD_REF_SIZE);
+
+	if (UNIV_UNLIKELY
+	    (!memcmp(data + local_len - BTR_EXTERN_FIELD_REF_SIZE,
+		     field_ref_zero, BTR_EXTERN_FIELD_REF_SIZE))) {
+		/* The externally stored field was not written yet.
+		This record should only be seen by
+		recv_recovery_rollback_active() or any
+		TRX_ISO_READ_UNCOMMITTED transactions. */
+		return(NULL);
+	}
+
 	return(btr_copy_externally_stored_field(len, data,
 						zip_size, local_len, heap));
 }

=== modified file 'storage/innodb_plugin/btr/btr0sea.c'
--- a/storage/innodb_plugin/btr/btr0sea.c	revid:sunny.bains@strippedk1qlazh82
+++ b/storage/innodb_plugin/btr/btr0sea.c	revid:jimmy.yang@stripped
@@ -46,6 +46,7 @@ Created 2/17/1996 Heikki Tuuri
 /** Flag: has the search system been enabled?
 Protected by btr_search_latch and btr_search_enabled_mutex. */
 UNIV_INTERN char		btr_search_enabled	= TRUE;
+UNIV_INTERN ibool		btr_search_fully_disabled = FALSE;
 
 /** Mutex protecting btr_search_enabled */
 static mutex_t			btr_search_enabled_mutex;
@@ -201,12 +202,19 @@ btr_search_disable(void)
 	mutex_enter(&btr_search_enabled_mutex);
 	rw_lock_x_lock(&btr_search_latch);
 
+	/* Disable access to hash index, also tell ha_insert_for_fold()
+	stop adding new nodes to hash index, but still allow updating
+	existing nodes */
 	btr_search_enabled = FALSE;
 
 	/* Clear all block->is_hashed flags and remove all entries
 	from btr_search_sys->hash_index. */
 	buf_pool_drop_hash_index();
 
+	/* hash index has been cleaned up, disallow any operation to
+	the hash index */
+	btr_search_fully_disabled = TRUE;
+
 	/* btr_search_enabled_mutex should guarantee this. */
 	ut_ad(!btr_search_enabled);
 
@@ -225,6 +233,7 @@ btr_search_enable(void)
 	rw_lock_x_lock(&btr_search_latch);
 
 	btr_search_enabled = TRUE;
+	btr_search_fully_disabled = FALSE;
 
 	rw_lock_x_unlock(&btr_search_latch);
 	mutex_exit(&btr_search_enabled_mutex);
@@ -1363,7 +1372,7 @@ btr_search_build_page_hash_index(
 
 	rw_lock_x_lock(&btr_search_latch);
 
-	if (UNIV_UNLIKELY(!btr_search_enabled)) {
+	if (UNIV_UNLIKELY(btr_search_fully_disabled)) {
 		goto exit_func;
 	}
 

=== modified file 'storage/innodb_plugin/ha/ha0ha.c'
--- a/storage/innodb_plugin/ha/ha0ha.c	revid:sunny.bains@strippedh82
+++ b/storage/innodb_plugin/ha/ha0ha.c	revid:jimmy.yang@stripped
@@ -31,9 +31,7 @@ Created 8/22/1994 Heikki Tuuri
 #ifdef UNIV_DEBUG
 # include "buf0buf.h"
 #endif /* UNIV_DEBUG */
-#ifdef UNIV_SYNC_DEBUG
-# include "btr0sea.h"
-#endif /* UNIV_SYNC_DEBUG */
+#include "btr0sea.h"
 #include "page0page.h"
 
 /*************************************************************//**
@@ -127,7 +125,8 @@ ha_clear(
 /*************************************************************//**
 Inserts an entry into a hash table. If an entry with the same fold number
 is found, its node is updated to point to the new data, and no new node
-is inserted.
+is inserted. If btr_search_enabled is set to FALSE, we will only allow
+updating existing nodes, but no new node is allowed to be added.
 @return	TRUE if succeed, FALSE if no more memory could be allocated */
 UNIV_INTERN
 ibool
@@ -174,6 +173,7 @@ ha_insert_for_fold_func(
 				prev_block->n_pointers--;
 				block->n_pointers++;
 			}
+			ut_ad(!btr_search_fully_disabled);
 # endif /* !UNIV_HOTBACKUP */
 
 			prev_node->block = block;
@@ -186,6 +186,13 @@ ha_insert_for_fold_func(
 		prev_node = prev_node->next;
 	}
 
+	/* We are in the process of disabling hash index, do not add
+	new chain node */
+	if (!btr_search_enabled) {
+		ut_ad(!btr_search_fully_disabled);
+		return(TRUE);
+	}
+
 	/* We have to allocate a new chain node */
 
 	node = mem_heap_alloc(hash_get_heap(table, fold), sizeof(ha_node_t));

=== modified file 'storage/innodb_plugin/handler/ha_innodb.cc'
--- a/storage/innodb_plugin/handler/ha_innodb.cc	revid:sunny.bains@stripped1qlazh82
+++ b/storage/innodb_plugin/handler/ha_innodb.cc	revid:jimmy.yang@stripped8ou
@@ -2270,6 +2270,7 @@ innobase_change_buffering_inited_ok:
 	/* Get the current high water mark format. */
 	innobase_file_format_check = (char*) trx_sys_file_format_max_get();
 
+	btr_search_fully_disabled = (!btr_search_enabled);
 	DBUG_RETURN(FALSE);
 error:
 	DBUG_RETURN(TRUE);

=== modified file 'storage/innodb_plugin/include/btr0cur.h'
--- a/storage/innodb_plugin/include/btr0cur.h	revid:sunny.bains@stripped
+++ b/storage/innodb_plugin/include/btr0cur.h	revid:jimmy.yang@stripped
@@ -570,7 +570,7 @@ btr_copy_externally_stored_field_prefix(
 	ulint		local_len);/*!< in: length of data, in bytes */
 /*******************************************************************//**
 Copies an externally stored field of a record to mem heap.
-@return	the field copied to heap */
+@return	the field copied to heap, or NULL if the field is incomplete */
 UNIV_INTERN
 byte*
 btr_rec_copy_externally_stored_field(

=== modified file 'storage/innodb_plugin/include/btr0sea.h'
--- a/storage/innodb_plugin/include/btr0sea.h	revid:sunny.bains@strippedk1qlazh82
+++ b/storage/innodb_plugin/include/btr0sea.h	revid:jimmy.yang@strippedu
@@ -190,7 +190,13 @@ btr_search_validate(void);
 
 /** Flag: has the search system been enabled?
 Protected by btr_search_latch and btr_search_enabled_mutex. */
-extern char btr_search_enabled;
+extern char	btr_search_enabled;
+
+/** Flag: whether the search system has completed its disabling process,
+It is set to TRUE right after buf_pool_drop_hash_index() in
+btr_search_disable(), indicating hash index entries are cleaned up.
+Protected by btr_search_latch and btr_search_enabled_mutex. */
+extern ibool	btr_search_fully_disabled;
 
 /** The search info struct in an index */
 struct btr_search_struct{

=== modified file 'storage/innodb_plugin/include/row0mysql.h'
--- a/storage/innodb_plugin/include/row0mysql.h	revid:sunny.bains@oracle.com-20100625081841-ppulnkjk1qlazh82
+++ b/storage/innodb_plugin/include/row0mysql.h	revid:jimmy.yang@oracle.com-20100701050601-i58b3e1c1o78w8ou
@@ -622,7 +622,11 @@ struct row_prebuilt_struct {
 					the secondary index, then this is
 					set to TRUE */
 	unsigned	templ_contains_blob:1;/*!< TRUE if the template contains
-					BLOB column(s) */
+					a column with DATA_BLOB ==
+					get_innobase_type_from_mysql_type();
+					not to be confused with InnoDB
+					externally stored columns
+					(VARCHAR can be off-page too) */
 	mysql_row_templ_t* mysql_template;/*!< template used to transform
 					rows fast between MySQL and Innobase
 					formats; memory for this template

=== modified file 'storage/innodb_plugin/row/row0merge.c'
--- a/storage/innodb_plugin/row/row0merge.c	revid:sunny.bains@stripped
+++ b/storage/innodb_plugin/row/row0merge.c	revid:jimmy.yang@stripped
@@ -1780,6 +1780,11 @@ row_merge_copy_blobs(
 		(below). */
 		data = btr_rec_copy_externally_stored_field(
 			mrec, offsets, zip_size, i, &len, heap);
+		/* Because we have locked the table, any records
+		written by incomplete transactions must have been
+		rolled back already. There must not be any incomplete
+		BLOB columns. */
+		ut_a(data);
 
 		dfield_set_data(field, data, len);
 	}

=== modified file 'storage/innodb_plugin/row/row0row.c'
--- a/storage/innodb_plugin/row/row0row.c	revid:sunny.bains@stripped00625081841-ppulnkjk1qlazh82
+++ b/storage/innodb_plugin/row/row0row.c	revid:jimmy.yang@strippedi58b3e1c1o78w8ou
@@ -294,7 +294,13 @@ row_build(
 
 	ut_ad(dtuple_check_typed(row));
 
-	if (j) {
+	if (!ext) {
+		/* REDUNDANT and COMPACT formats store a local
+		768-byte prefix of each externally stored
+		column. No cache is needed. */
+		ut_ad(dict_table_get_format(index->table)
+		      < DICT_TF_FORMAT_ZIP);
+	} else if (j) {
 		*ext = row_ext_create(j, ext_cols, row,
 				      dict_table_zip_size(index->table),
 				      heap);

=== modified file 'storage/innodb_plugin/row/row0sel.c'
--- a/storage/innodb_plugin/row/row0sel.c	revid:sunny.bains@stripped-20100625081841-ppulnkjk1qlazh82
+++ b/storage/innodb_plugin/row/row0sel.c	revid:jimmy.yang@stripped601-i58b3e1c1o78w8ou
@@ -416,7 +416,7 @@ row_sel_fetch_columns(
 							      field_no))) {
 
 				/* Copy an externally stored field to the
-				temporary heap */
+				temporary heap, if possible. */
 
 				heap = mem_heap_create(1);
 
@@ -425,6 +425,17 @@ row_sel_fetch_columns(
 					dict_table_zip_size(index->table),
 					field_no, &len, heap);
 
+				/* data == NULL means that the
+				externally stored field was not
+				written yet. This record
+				should only be seen by
+				recv_recovery_rollback_active() or any
+				TRX_ISO_READ_UNCOMMITTED
+				transactions. The InnoDB SQL parser
+				(the sole caller of this function)
+				does not implement READ UNCOMMITTED,
+				and it is not involved during rollback. */
+				ut_a(data);
 				ut_a(len != UNIV_SQL_NULL);
 
 				needs_copy = TRUE;
@@ -926,6 +937,7 @@ row_sel_get_clust_rec(
 	when plan->clust_pcur was positioned.  The latch will not be
 	released until mtr_commit(mtr). */
 
+	ut_ad(!rec_get_deleted_flag(clust_rec, rec_offs_comp(offsets)));
 	row_sel_fetch_columns(index, clust_rec, offsets,
 			      UT_LIST_GET_FIRST(plan->columns));
 	*out_rec = clust_rec;
@@ -1628,6 +1640,13 @@ skip_lock:
 				}
 
 				if (old_vers == NULL) {
+					/* The record does not exist
+					in our read view. Skip it, but
+					first attempt to determine
+					whether the index segment we
+					are searching through has been
+					exhausted. */
+
 					offsets = rec_get_offsets(
 						rec, index, offsets,
 						ULINT_UNDEFINED, &heap);
@@ -2647,9 +2666,8 @@ Convert a row in the Innobase format to 
 Note that the template in prebuilt may advise us to copy only a few
 columns to mysql_rec, other columns are left blank. All columns may not
 be needed in the query.
-@return TRUE if success, FALSE if could not allocate memory for a BLOB
-(though we may also assert in that case) */
-static
+@return TRUE on success, FALSE if not all columns could be retrieved */
+static __attribute__((warn_unused_result))
 ibool
 row_sel_store_mysql_rec(
 /*====================*/
@@ -2672,6 +2690,7 @@ row_sel_store_mysql_rec(
 	ut_ad(prebuilt->mysql_template);
 	ut_ad(prebuilt->default_rec);
 	ut_ad(rec_offs_validate(rec, NULL, offsets));
+	ut_ad(!rec_get_deleted_flag(rec, rec_offs_comp(offsets)));
 
 	if (UNIV_LIKELY_NULL(prebuilt->blob_heap)) {
 		mem_heap_free(prebuilt->blob_heap);
@@ -2719,6 +2738,21 @@ row_sel_store_mysql_rec(
 				dict_table_zip_size(prebuilt->table),
 				templ->rec_field_no, &len, heap);
 
+			if (UNIV_UNLIKELY(!data)) {
+				/* The externally stored field
+				was not written yet. This
+				record should only be seen by
+				recv_recovery_rollback_active()
+				or any TRX_ISO_READ_UNCOMMITTED
+				transactions. */
+
+				if (extern_field_heap) {
+					mem_heap_free(extern_field_heap);
+				}
+
+				return(FALSE);
+			}
+
 			ut_a(len != UNIV_SQL_NULL);
 		} else {
 			/* Field is stored in the row. */
@@ -3136,9 +3170,10 @@ row_sel_pop_cached_row_for_mysql(
 }
 
 /********************************************************************//**
-Pushes a row for MySQL to the fetch cache. */
-UNIV_INLINE
-void
+Pushes a row for MySQL to the fetch cache.
+@return TRUE on success, FALSE if the record contains incomplete BLOBs */
+UNIV_INLINE __attribute__((warn_unused_result))
+ibool
 row_sel_push_cache_row_for_mysql(
 /*=============================*/
 	row_prebuilt_t*	prebuilt,	/*!< in: prebuilt struct */
@@ -3180,10 +3215,11 @@ row_sel_push_cache_row_for_mysql(
 				  prebuilt->fetch_cache[
 					  prebuilt->n_fetch_cached],
 				  prebuilt, rec, offsets))) {
-		ut_error;
+		return(FALSE);
 	}
 
 	prebuilt->n_fetch_cached++;
+	return(TRUE);
 }
 
 /*********************************************************************//**
@@ -3578,11 +3614,21 @@ row_search_for_mysql(
 
 				if (!row_sel_store_mysql_rec(buf, prebuilt,
 							     rec, offsets)) {
-					err = DB_TOO_BIG_RECORD;
+					/* Only fresh inserts may contain
+					incomplete externally stored
+					columns. Pretend that such
+					records do not exist. Such
+					records may only be accessed
+					at the READ UNCOMMITTED
+					isolation level or when
+					rolling back a recovered
+					transaction. Rollback happens
+					at a lower level, not here. */
+					ut_a(trx->isolation_level
+					     == TRX_ISO_READ_UNCOMMITTED);
 
-					/* We let the main loop to do the
-					error handling */
-					goto shortcut_fails_too_big_rec;
+					/* Proceed as in case SEL_RETRY. */
+					break;
 				}
 
 				mtr_commit(&mtr);
@@ -3622,7 +3668,7 @@ release_search_latch_if_needed:
 			default:
 				ut_ad(0);
 			}
-shortcut_fails_too_big_rec:
+
 			mtr_commit(&mtr);
 			mtr_start(&mtr);
 		}
@@ -4357,9 +4403,18 @@ requires_clust_rec:
 		not cache rows because there the cursor is a scrollable
 		cursor. */
 
-		row_sel_push_cache_row_for_mysql(prebuilt, result_rec,
-						 offsets);
-		if (prebuilt->n_fetch_cached == MYSQL_FETCH_CACHE_SIZE) {
+		if (!row_sel_push_cache_row_for_mysql(prebuilt, result_rec,
+						      offsets)) {
+			/* Only fresh inserts may contain incomplete
+			externally stored columns. Pretend that such
+			records do not exist. Such records may only be
+			accessed at the READ UNCOMMITTED isolation
+			level or when rolling back a recovered
+			transaction. Rollback happens at a lower
+			level, not here. */
+			ut_a(trx->isolation_level == TRX_ISO_READ_UNCOMMITTED);
+		} else if (prebuilt->n_fetch_cached
+			   == MYSQL_FETCH_CACHE_SIZE) {
 
 			goto got_row;
 		}
@@ -4375,9 +4430,17 @@ requires_clust_rec:
 		} else {
 			if (!row_sel_store_mysql_rec(buf, prebuilt,
 						     result_rec, offsets)) {
-				err = DB_TOO_BIG_RECORD;
-
-				goto lock_wait_or_error;
+				/* Only fresh inserts may contain
+				incomplete externally stored
+				columns. Pretend that such records do
+				not exist. Such records may only be
+				accessed at the READ UNCOMMITTED
+				isolation level or when rolling back a
+				recovered transaction. Rollback
+				happens at a lower level, not here. */
+				ut_a(trx->isolation_level
+				     == TRX_ISO_READ_UNCOMMITTED);
+				goto next_rec;
 			}
 		}
 

=== modified file 'storage/innodb_plugin/row/row0undo.c'
--- a/storage/innodb_plugin/row/row0undo.c	revid:sunny.bains@oracle.com-20100625081841-ppulnkjk1qlazh82
+++ b/storage/innodb_plugin/row/row0undo.c	revid:jimmy.yang@stripped01050601-i58b3e1c1o78w8ou
@@ -199,8 +199,24 @@ row_undo_search_clust_to_pcur(
 
 		ret = FALSE;
 	} else {
+		row_ext_t**	ext;
+
+		if (dict_table_get_format(node->table) >= DICT_TF_FORMAT_ZIP) {
+			/* In DYNAMIC or COMPRESSED format, there is
+			no prefix of externally stored columns in the
+			clustered index record. Build a cache of
+			column prefixes. */
+			ext = &node->ext;
+		} else {
+			/* REDUNDANT and COMPACT formats store a local
+			768-byte prefix of each externally stored
+			column. No cache is needed. */
+			ext = NULL;
+			node->ext = NULL;
+		}
+
 		node->row = row_build(ROW_COPY_DATA, clust_index, rec,
-				      offsets, NULL, &node->ext, node->heap);
+				      offsets, NULL, ext, node->heap);
 		if (node->update) {
 			node->undo_row = dtuple_copy(node->row, node->heap);
 			row_upd_replace(node->undo_row, &node->undo_ext,

=== modified file 'storage/innodb_plugin/row/row0upd.c'
--- a/storage/innodb_plugin/row/row0upd.c	revid:sunny.bains@stripped
+++ b/storage/innodb_plugin/row/row0upd.c	revid:jimmy.yang@oracle.com-20100701050601-i58b3e1c1o78w8ou
@@ -1398,6 +1398,7 @@ row_upd_store_row(
 	dict_index_t*	clust_index;
 	rec_t*		rec;
 	mem_heap_t*	heap		= NULL;
+	row_ext_t**	ext;
 	ulint		offsets_[REC_OFFS_NORMAL_SIZE];
 	const ulint*	offsets;
 	rec_offs_init(offsets_);
@@ -1414,8 +1415,22 @@ row_upd_store_row(
 
 	offsets = rec_get_offsets(rec, clust_index, offsets_,
 				  ULINT_UNDEFINED, &heap);
+
+	if (dict_table_get_format(node->table) >= DICT_TF_FORMAT_ZIP) {
+		/* In DYNAMIC or COMPRESSED format, there is no prefix
+		of externally stored columns in the clustered index
+		record. Build a cache of column prefixes. */
+		ext = &node->ext;
+	} else {
+		/* REDUNDANT and COMPACT formats store a local
+		768-byte prefix of each externally stored column.
+		No cache is needed. */
+		ext = NULL;
+		node->ext = NULL;
+	}
+
 	node->row = row_build(ROW_COPY_DATA, clust_index, rec, offsets,
-			      NULL, &node->ext, node->heap);
+			      NULL, ext, node->heap);
 	if (node->is_delete) {
 		node->upd_row = NULL;
 		node->upd_ext = NULL;

Attachment: [text/bzr-bundle] bzr/jimmy.yang@oracle.com-20100701050601-i58b3e1c1o78w8ou.bundle
Thread
bzr push into mysql-5.1-innodb branch (vasil.dimov:3527 to 3535) Bug#54311vasil.dimov3 Jul