Hello,
Michael Widenius wrote, on 11/11/2008 08:34 PM:
>>>>>> "G" == Guilhem Bichot <guilhem@stripped> writes:
> G> I have some questions about the specs.
>
>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>> TASK...........: Versioning for delete & update (for transactional tables)
>>> DESCRIPTION:
>>>
>>> Versioning for delete & update (for transactional tables)
>>>
>>>
>>> HIGH-LEVEL SPECIFICATION:
>>>
>>> For delete, instead of physically deleting rows when maria_delete() is
>>> called, we will change the delete internally to an update where the
>>> row and its keys are tagged with the current transaction id as their
>>> delete trans id.
>
> G> Will there be a new type of REDO log record to describe this change done
> G> to the row or key? Or does some existing REDO log record need to be
> G> slightly changed?
> G> If there is a "yes" to one of these questions, we need to update
> G> recovery's code to add/modify "log record replay" functions.
>
> Haven't thought about this in detail yet. I don't think we need a new
> type of REDO. The main change is that instead of doing a delete of
> the row we will be doing an update.
Ok.
And will there be a new type of UNDO?
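
To make the "delete becomes an update" idea concrete, here is a minimal sketch in Python. It is not Maria code: the Row class, versioned_delete() and can_purge() are hypothetical names, and the purge rule (a tagged row may be removed only once every active transaction is newer than the deleting one) is my assumption, not something the WL states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    data: str
    create_trid: int                   # transaction that created the row
    delete_trid: Optional[int] = None  # None means "not deleted"


def versioned_delete(row: Row, current_trid: int) -> None:
    """Tag the row with the deleting transaction instead of removing it."""
    row.delete_trid = current_trid


def can_purge(row: Row, oldest_active_trid: int) -> bool:
    """Assumed rule: purge only when no active transaction predates the delete."""
    return row.delete_trid is not None and row.delete_trid < oldest_active_trid


row = Row("some row", create_trid=1010)
versioned_delete(row, current_trid=2011)
print(row.delete_trid, can_purge(row, 2000), can_purge(row, 3000))
```

The row stays physically in place after versioned_delete(), which is why the REDO side can probably reuse the update path.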
>>> LOW-LEVEL DESIGN:
>
> <cut>
>
>>> Storing of transid (Trid):
>>>
>>> Trid is max 6 bytes long
>>>
>>> First, Trid is converted to a smaller number by using
>>> trid= trid - create_trid.
>
> G> I think this needs a note:
> G> Right now maria_create() executes with dummy_transaction_object as far
> G> as I know. Either this can be changed to a real non-dummy TRN, or
> G> maria_create() can just fetch the current value of the global TrID
> G> generator and use that as create_trid.
>
> What's wrong with using a dummy_transaction_object?
It does not have a proper transaction id (it's just 0), which would have
been a problem if you had used its id to fill create_trid. But
you're not, so no problem.
> Don't we need this to log the create of the table?
We log the CREATE, using dummy_transaction_object.
> For create_trid, we use trnman_get_min_safe_trid() which basically
> fetches the safest value to use for a TrId.
Ok.
> We can't use the current value as a long running transaction may want
> to write to the newly created table.
Indeed.
> <cut>
>
>>> Prefix bytes 244 to 249 are reserved for negative transid, that can be used
>>> when we pack transid relative to each other on a key block.
>>>
>>> We have to store transid in high-byte-first order to be able to do a
>>> fast byte-per-byte comparison of them without packing them up.
>
> G> you mean "without unpacking them" ?
>
> Yes.
>
> G> I don't understand the sentence, could you please explain the scenario
> G> why this high-byte-first helps?
>
> High byte first allows us to compare byte per byte. The first byte
> that differs tells us which TrID is larger. (larger byte is larger)
Ok
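
For the record, a small Python sketch of why high-byte-first works. The helper names are mine and the fixed 6-byte width just follows "Trid is max 6 bytes long"; the real key code packs ids to shorter lengths.

```python
def trid_to_bytes(trid: int, length: int = 6) -> bytes:
    """Store a transaction id high-byte-first (big-endian)."""
    return trid.to_bytes(length, "big")


def compare_stored(a: bytes, b: bytes) -> int:
    """Compare equal-length stored ids byte by byte, without unpacking.

    The first byte that differs decides: the larger byte is the larger id.
    """
    for x, y in zip(a, b):
        if x != y:
            return 1 if x > y else -1
    return 0


# High-byte-first: byte order agrees with numeric order.
high_first = compare_stored(trid_to_bytes(255), trid_to_bytes(256))

# Low-byte-first: the same loop gives the wrong answer for 255 vs 256,
# because the first byte compared is the least significant one.
low_first = compare_stored((255).to_bytes(6, "little"),
                           (256).to_bytes(6, "little"))

print(high_first, low_first)
```

With low-byte-first storage we would have to unpack both ids into integers before comparing them.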
>>> ------------
>>>
>>> For example, assuming we have the following data:
>>>
>>> key_data: 1 (4 byte integer)
>>> pointer_to_row: (2 << 8) + 3 = 515 (page 2, row 3)
>>> table_create_transid 1000 Defined at create table time
>>> transid 1010 Transaction that created row
>>> delete_transid 2011 Transaction that deleted row
>>>
>>> In addition we assume the table is created with a data pointer length
>>> of 4 bytes (this is automatically calculated based on the average length of
>>> rows and the given max number of rows)
>>>
>>> The binary data for the key would then look like this in hex:
>>>
>>> 00 00 00 01 Key
>>> 00 00 04 07 (515 << 1) + 1 ; The last 1 is marker that key cont.
>>> 15 ((1000-1010) << 1) + 1 ; The last 1 is marker that key cont.
> G> you mean 1000-1010 or 1010-1000 ?
>
> Sorry, meant 1010-1000
>
>>> FB 07 E6 length byte and ((2011 - 1000) << 1) = 07 E6
>
> G> Could you please add "07E6 is 2 bytes and so 249 + 2 = 251 = FB" ?
>
> Can do.
... nomination.
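
To double-check the arithmetic of the example, here is a Python sketch that packs the same values. The function names are mine. The one-byte limit of 244 is an assumption inferred from "prefix bytes 244 to 249 are reserved" (the exact rule is in the part that was cut), and the sketch uses 1010-1000 for the create transid, as corrected above. Note that (515 << 1) + 1 = 1031 = 0x407, so the 4-byte pointer comes out as 00 00 04 07.

```python
PACK_OFFSET = 249     # from the thread: length byte = 249 + number of bytes
ONE_BYTE_LIMIT = 244  # assumption: smaller values are stored in a single byte


def pack_transid(trid: int, create_trid: int, more_follows: bool) -> bytes:
    """Pack (trid - create_trid), shifted left once; low bit 1 = key continues."""
    value = ((trid - create_trid) << 1) | (1 if more_follows else 0)
    if value < ONE_BYTE_LIMIT:
        return bytes([value])
    nbytes = (value.bit_length() + 7) // 8
    # Length byte first, then the value high-byte-first.
    return bytes([PACK_OFFSET + nbytes]) + value.to_bytes(nbytes, "big")


def pack_key(key_value, row_pointer, create_trid, trid, delete_trid) -> bytes:
    out = key_value.to_bytes(4, "big")                  # 4-byte integer key
    out += ((row_pointer << 1) | 1).to_bytes(4, "big")  # 4-byte pointer + marker
    out += pack_transid(trid, create_trid, more_follows=True)
    out += pack_transid(delete_trid, create_trid, more_follows=False)
    return out


row_pointer = (2 << 8) + 3  # page 2, row 3 = 515
packed = pack_key(1, row_pointer, 1000, 1010, 2011)
print(packed.hex(" "))  # 00 00 00 01 00 00 04 07 15 fb 07 e6
```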
>
> G> Do you need to specifically update maria_chk, maria_pack, so that they
> G> don't fail when finding a delete_transid?
>
> maria_pack should never see a delete_transid.
> maria_chk will ignore any rows with a delete_transid.
> (No code changes needed for this)
Why does maria_chk need no code change? Where's the magic?
> G> What about "zerofill" code, does it need an update?
>
> No, as zerofill will never see a row with delete_transid.
ok
> (This is because you never run zerofill while there is something that
> is not purged).
>
> What needs to be done is to not allow one to run
> repair/optimize/zerofill while there are old transactions that may see
> any of the deleted rows.
Could you please mention this in the WL?