List:Maria Storage Engine« Previous MessageNext Message »
From:Michael Widenius Date:January 12 2009 6:44pm
Subject:two tail pages and a recovery crash
View as plain text  
Hi!

>>>>> "Guilhem" == Guilhem Bichot <guilhem@stripped> writes:

Guilhem> Hello Monty,
Guilhem> A mystery of the block record format.

Guilhem> In the latest 5.1-maria in pushbuild2 (BUG#41493) there is such crash at 
Guilhem> recovery:
Guilhem> 081218 16:40:52 [Note] mysqld: Maria engine: starting recovery
Guilhem> recovered pages: 0% 19% 29% 42% 58% 75% 93%mysqld: ma_blockrec.c:6258: 
Guilhem> _ma_apply_redo_purge_row_head_or_tail: Assertion `(buff[7] & 7) == 
Guilhem> (uchar) page_type' failed.

Guilhem> In gdb I see that page_type is TAIL page but in buff[7] there is BLOB.

Guilhem> By using the Maria log from pushbuild2 I can reproduce it.
Guilhem> The weirdness seems to come from the previous changes done to this page, 
Guilhem> here's what I find in the trace of recovery (I extended recovery by 
Guilhem> adding some printouts for insert_row_blobs):

<cut>

Guilhem> You see: it's a row update, and it has *two* REDO_INSERT_ROW_TAIL (page 
Guilhem> 1 then page 2) and then REDO_INSERT_ROW_HEAD (page 597) then 
Guilhem> REDO_INSERT_ROW_BLOBS which covers page 2: how can page 2 be, in the 
Guilhem> same group of records, first an inserted tail and then an inserted BLOB? 
Guilhem> Is that theoretically possible? And is it normal to have two tails?

There is nothing strange with the two row_insert_row_tail.

The process of writing a row goes like this:

- First write the tails for all blobs
- Then we write the tails for the on-blob fields.
- After that we write the head block
- Then all full blob pages

This is described in a comment in write_block_record().

I was able to repeat it with this simple test:

create table t1 (a int, c varchar(10000), b blob) engine=maria;
insert into t1 values (1,repeat('c',9000), repeat("b",9000));
update t1 set c=repeat('d',9500), b=repeat('e',10000);

Here is the output for the update from maria_read_log:
Rec#12 LSN (1,0x68ee) short_trid 42217 long_transaction_id(num_type:36) len 6
Transaction long_trid 2 short_trid 42217 starts, undo_lsn (0,0x0) first_undo_lsn (0,0x0)
Rec#13 LSN (1,0x68f7) short_trid 42217 redo_free_head_or_tail(num_type:9) len 7
Rec#14 LSN (1,0x6901) short_trid 42217 redo_free_head_or_tail(num_type:9) len 7
Rec#15 LSN (1,0x690b) short_trid 42217 redo_free_blocks(num_type:8) len 18
Rec#16 LSN (1,0x6923) short_trid 42217 redo_new_row_tail(num_type:4) len 1828
Rec#17 LSN (1,0x704f) short_trid 42217 redo_new_row_tail(num_type:4) len 1328
Rec#18 LSN (1,0x7587) short_trid 42217 redo_insert_row_head(num_type:1) len 56
Rec#19 LSN (1,0x75c5) short_trid 42217 redo_insert_row_blobs(num_type:5) len 16388
Rec#20 LSN (1,0xb5e3) short_trid 42217 undo_row_update(num_type:20) len 18063
   ends a group:
   Rec#0 LSN (1,0x68f7) short_trid 42217 redo_free_head_or_tail(num_type:9) len 7
   For table of short id 38, './test/t1' page 5, applying record
   Rec#0 LSN (1,0x6901) short_trid 42217 redo_free_head_or_tail(num_type:9) len 7
   For table of short id 38, './test/t1' page 2, applying record
   Rec#0 LSN (1,0x690b) short_trid 42217 redo_free_blocks(num_type:8) len 18
   For table of short id 38, './test/t1', applying record
   Rec#0 LSN (1,0x6923) short_trid 42217 redo_new_row_tail(num_type:4) len 1828
   For table of short id 38, './test/t1' page 1, applying record
   Rec#0 LSN (1,0x704f) short_trid 42217 redo_new_row_tail(num_type:4) len 1328
   For table of short id 38, './test/t1' page 2, applying record
   Rec#0 LSN (1,0x7587) short_trid 42217 redo_insert_row_head(num_type:1) len 56
   For table of short id 38, './test/t1' page 3, applying record
   Rec#0 LSN (1,0x75c5) short_trid 42217 redo_insert_row_blobs(num_type:5) len 16388
   For table of short id 38, './test/t1', applying record
   Rec#0 LSN (1,0xb5e3) short_trid 42217 undo_row_update(num_type:20) len 18063

I was able to use the above log to recreate the t1.MA? tables, so it
looks like the base logic is correct and working.

I even tried to test things with several updates that modifies the row
in various ways, but it just worked for me:

create table t1 (a int, c varchar(10000), b blob) engine=maria;
insert into t1 values (1,repeat('c',9000), repeat("b",9000));
insert into t1 values (2,"", repeat("b",9000));
insert into t1 values (3,"", "");
update t1 set c=repeat('d',9500), b=repeat('e',10000) where a=1;
update t1 set c=repeat('d',9500), b=repeat('e',10000) where a=2;
update t1 set c=repeat('d',9500), b=repeat('e',10000) where a=3;
update t1 set c=repeat('d',9500), b=repeat('e',20000) where a=1;
update t1 set c=repeat('d',9500), b="" where a=2;
update t1 set c="", b=repeat('e,10000) where a=3;

Guilhem> After that, page 2 is a blob, and recovery breaks later on:

Guilhem> Rec#2437 LSN (1,0x5dd461e) short_trid 46587 
Guilhem> redo_purge_row_tail(num_type:7) len 8
Guilhem> Rec#2438 LSN (1,0x5dd4629) short_trid 46587 redo_free_blocks(num_type:8) 
Guilhem> len 11
Guilhem> Rec#2439 LSN (1,0x5dd463a) short_trid 46587 
Guilhem> redo_insert_row_tail(num_type:2) len 3140
Guilhem> Rec#2440 LSN (1,0x5dd5286) short_trid 46587 
Guilhem> redo_insert_row_head(num_type:1) len 56
Guilhem> Rec#2441 LSN (1,0x5dd52c4) short_trid 46587 
Guilhem> redo_insert_row_blobs(num_type:5) len 8197
Guilhem> Rec#2442 LSN (1,0x5dd72db) short_trid 46587 undo_row_update(num_type:20) 
Guilhem> len 47
Guilhem>     ends a group:
Guilhem>     Rec#0 LSN (1,0x5dd461e) short_trid 46587 
Guilhem> redo_purge_row_tail(num_type:7) len 8
Guilhem>     For table of short id 39, './test/table100_maria_page_int_autoinc' 
Guilhem> page 2, applying record

Guilhem> because it wants to purge a tail piece in page 2 and finds that page 2 
Guilhem> is BLOB.

As there any way to repeat the problem at a local computer?
(Have a trace of when this happens would be very usefull)
(It would be nice if pushbuild2 could be run in a repeatable mode...)

If one would know where the blob originally was written (or if there
even was a blob with a full blob page) would help a lot in solving this.

Possible bugs that could cause this:

- We log the wrong disk address for the blob.
- There was no full pages for the blob, but we logged them anyway

It would also be nice to know if the table was correct before we run
recover on it.

Regards,
Monty
Thread
two tail pages and a recovery crashGuilhem Bichot18 Dec
  • Re: two tail pages and a recovery crashGuilhem Bichot19 Dec
  • re: two tail pages and a recovery crashMichael Widenius22 Dec
  • re: two tail pages and a recovery crashMichael Widenius27 Dec
  • two tail pages and a recovery crashMichael Widenius12 Jan