List:Maria Storage Engine« Previous MessageNext Message »
From:Michael Widenius Date:April 8 2008 10:19am
Subject:re: setting key_del when applying REDOs: bug?
View as plain text  
Hi!

>>>>> "Guilhem" == Guilhem Bichot <guilhem@stripped> writes:

Guilhem> Hello,
Guilhem> in _ma_apply_redo_index_new_page() there is:

Guilhem>   /* free_page is 0 if we shouldn't set key_del */
Guilhem>   if (free_page)
Guilhem>   {
Guilhem>     if (free_page != IMPOSSIBLE_PAGE_NO)
Guilhem>       share->state.key_del= (my_off_t) free_page * share->block_size;
Guilhem>     else
Guilhem>       share->state.key_del= HA_OFFSET_ERROR;
Guilhem>   }

Guilhem> and in _ma_apply_redo_index_free_page() there is:
Guilhem>   share->state.key_del= ((my_off_t) page) * share->block_size;

Guilhem> I wonder: shouldn't both settings be included in a
Guilhem> if (cmp_translog_addr(lsn, share->state.is_of_horizon) >= 0)
Guilhem>   { set key_del }
Guilhem> ?

Guilhem> Otherwise, imagine the following.

Guilhem> At runtime:
Guilhem>  key_del is empty
Guilhem>  DELETE ...; frees page 1 (key_del=1)
Guilhem>  DELETE ...; frees page 2 (key_del=2, page 2 points to page 1)
Guilhem>  page 2 gets flushed, not page 1
Guilhem>  checkpoint happens (page 1 is in list of dirty pages, not page 2),
Guilhem>  which flushes state (so key_del=2 goes to disk)

Guilhem> At REDO applying:
Guilhem>  checkpoint says REDO for page 2 should not even be read and page 2
Guilhem>  should not even be loaded from disk
Guilhem>  key_del is 2

Where in the code is this logic to skip executing redo entries ?
The current code for REDO_INDEX_FREE_PAGE assumes that the log entry is
always executed (even if the page is not read).

It's possible to skip the code in _ma_apply_redo_index_new_page()
starting from if (!(buff= pagecache_read(...), but I don't now if it's
good to skip the first part of the code (setting key_del, key_root and
file_size)

Guilhem>  REDO for freeing of page 1 is executed as page 1 is in dirty-pages
Guilhem>  list of checkpoint, sets key_del to 1 (because if() missing)
Guilhem>  REDO for freeing page 2 is skipped

Guilhem> So we have page 2 marked deleted, pointing to page 1, and key_del is 1
Guilhem> instead of 2: this makes a wrong table (page 2 is lost forever).

Yes, this may happen if we don't execute the first part of
 _ma_apply_redo_index_new_page() for all pages.

I assume adding a test of cmp_translog_addr() should also be safe.

Guilhem> Scenario with INSERT is worse:

Guilhem> At runtime:
Guilhem>  key_del is 2 (page 2 and 1 have been deleted, 2 points to 1)
Guilhem>  INSERT ...; uses page 2 (key_del= 1)
Guilhem>  INSERT ...; uses page 1 (key_del empty)
Guilhem>  page 1 gets flushed, not page 2
Guilhem>  checkpoint happens (page 2 is in list of dirty pages, not page 1),
Guilhem>  which flushes state (so key_del=empty goes to disk)

Guilhem> At REDO applying:
Guilhem>  checkpoint says REDO for page 1 should not even be read and page 1
Guilhem>  should not even be loaded from disk
Guilhem>  key_del empty
Guilhem>  REDO for new-ing of page 2 is executed as page 2 is in dirty-pages
Guilhem>  list of checkpoint, sets key_del to 1 (because if() missing)
Guilhem>  REDO for new-ing page 1 is skipped

Guilhem> So we have key_del=1 though page 1 is a non-deleted page, next INSERT
Guilhem> will have a problem (error or destroying important data).

Guilhem> Do you remember of a reason why the if() is not there? Can I add it in
Guilhem> both functions?

The reason is that the code assumes that this code part is executed
for all REDO, in which case it should always be safe to set the del.

So the question is at what point you plan to skip reading the pages.
If you want to skip calling _ma_apply_redo_index_new_page()
altogether, then you need to add tests for cmp_translog_addr().

If you only want to skip reading the not used pages and add the test
if a page should be read just before pagecache_read(), then the
current code is probably safe.

Feel free to go with either method.

Regards,
Monty




Thread
re: setting key_del when applying REDOs: bug?Michael Widenius8 Apr