List: Internals
From: Mats Kindahl
Date: January 31 2013 8:39am
Subject: Re: reducing fsyncs during handlerton->prepare and handlerton->commit in 5.6

On 01/30/2013 10:43 PM, Zardosht Kasheff wrote:
> Hello all,
>
> As I understand it, for transactional storage engines to be in sync
> with the binary log after recovery, the storage engine must support
> two phase commit (aka XA). In MySQL 5.5 and MariaDB 5.5, the engine
> must fsync its log when a transaction prepares and when a transaction
> commits. So, for each transaction, there are three fsyncs, one for
> prepare, one for the binary log, and one for commit.
>
> I also understand that this requirement has changed in MySQL 5.6 and
> MariaDB 10.0. With those releases, there is a way for storage engines
> to reduce their fsyncs. I also believe that MariaDB 5.5 still requires
> all of these fsyncs.
>
> My questions are:
>  - in MySQL 5.6 and MariaDB 10.0, what are the fsyncing requirements
> for storage engines during prepare and commit?

In MySQL 5.6, the storage engine has to sync on prepare (or make the
prepared state durable some other way).

On commit, the storage engine can either sync or not. If the storage
engine decides to not sync and there is a crash, crash recovery will
commit the prepared transaction if it was written to the binary log and
roll it back otherwise.
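
As a rough illustration of that recovery rule, here is a minimal sketch.
It is not the server's code: resolve_prepared, RecoveryAction and
RecoveryDecision are hypothetical names, and plain integers stand in for
transaction ids. It only shows the decision: a prepared transaction is
committed if its id appears in the binary log and rolled back otherwise.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical outcome for one prepared transaction during crash recovery.
enum class RecoveryAction { Commit, Rollback };

struct RecoveryDecision {
  std::uint64_t xid;      // transaction id (plain integer in this sketch)
  RecoveryAction action;  // what recovery should do with it
};

// prepared_xids: transactions the engine reports as prepared, not committed.
// binlog_xids:   transaction ids found while scanning the binary log.
std::vector<RecoveryDecision> resolve_prepared(
    const std::vector<std::uint64_t>& prepared_xids,
    const std::set<std::uint64_t>& binlog_xids) {
  std::vector<RecoveryDecision> decisions;
  decisions.reserve(prepared_xids.size());
  for (std::uint64_t xid : prepared_xids) {
    // Written to the binary log => commit; otherwise roll back.
    const bool in_binlog = binlog_xids.count(xid) != 0;
    decisions.push_back(
        {xid, in_binlog ? RecoveryAction::Commit : RecoveryAction::Rollback});
  }
  return decisions;
}
```

This is why the prepare has to be durable while the commit need not be:
the decision can be recomputed from the prepared state plus the binary
log.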

>  - are there new APIs that the storage engine must comply with in
> order to get these benefits?

In MySQL 5.6, there are no new APIs that you *have* to comply with.

However, the commit procedure sets the thd->durability_properties flag
to HA_IGNORE_DURABILITY when you can safely ignore the durability
requirements for the commit and leave the committing and/or aborting of
the transactions to the recovery procedure.
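
To make this concrete, here is a hedged sketch of how an engine's commit
path might react to that flag. Everything in it is a stand-in: the enum
is declared locally (only the name HA_IGNORE_DURABILITY comes from the
text above; HA_REGULAR_DURABILITY is assumed as its counterpart), and
ToyEngineLog and toy_engine_commit are hypothetical, not part of any
handlerton interface.

```cpp
#include <cstddef>

// Local stand-in for the server's durability setting.
enum DurabilityProperty { HA_REGULAR_DURABILITY, HA_IGNORE_DURABILITY };

// Hypothetical engine log that only counts what was done to it.
struct ToyEngineLog {
  std::size_t commit_records = 0;  // commit records written to the log
  std::size_t syncs = 0;           // number of (simulated) fsync calls
  void write_commit_record() { ++commit_records; }
  void sync() { ++syncs; }
};

// Hypothetical commit path: always write the commit record, but skip the
// sync when the server says durability can be ignored for this commit.
void toy_engine_commit(ToyEngineLog& log, DurabilityProperty durability) {
  log.write_commit_record();
  if (durability != HA_IGNORE_DURABILITY) {
    log.sync();
  }
}
```

With the flag set, a crash before the next sync leaves the transaction
in the prepared state, and recovery resolves it against the binary log.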

>  - in MySQL 5.6, I see new handlerton methods commit_low and
> prepare_low. What do these APIs do? What is their contract?

Are you referring to the ha_{commit,prepare}_low() functions? These are
used to do the actual commit or rollback call into all engines
participating in the transaction.

The higher-level commit functions (e.g., ha_commit_trans) call into the
transaction coordinator. The binary log, acting as the coordinator,
batches its writes to the binary log, so it has to batch these low-level
commits as well.

These functions are intended to be used by anybody implementing a
transaction coordinator (the binary log is one example), but should not
be necessary for any storage engine writer to use.
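
For anyone curious what such batching looks like in principle, here is a
toy coordinator. It is only a sketch under my own assumptions, not the
binary log implementation: ToyCoordinator, enqueue and flush_batch are
hypothetical names, there is no threading, and a vector push stands in
for the low-level commit call into the engines.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy transaction coordinator that groups commits into batches.
struct ToyCoordinator {
  std::vector<std::uint64_t> pending;    // transactions waiting to commit
  std::vector<std::uint64_t> log;        // stand-in for the binary log
  std::vector<std::uint64_t> committed;  // order of low-level commits
  std::size_t log_syncs = 0;             // simulated fsyncs of the log

  void enqueue(std::uint64_t xid) { pending.push_back(xid); }

  // Write every queued transaction to the log, sync the log once, and
  // only then run the low-level commit for each transaction in the batch.
  // Returns the number of transactions in the batch.
  std::size_t flush_batch() {
    if (pending.empty()) return 0;
    for (std::uint64_t xid : pending) log.push_back(xid);
    ++log_syncs;  // one sync covers the whole batch
    for (std::uint64_t xid : pending) committed.push_back(xid);
    const std::size_t batch_size = pending.size();
    pending.clear();
    return batch_size;
  }
};
```

The point is the ratio: three transactions in one batch cost one sync of
the log rather than three, and the low-level commits follow in the same
order as the log writes.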

Just my few cents,
Mats Kindahl

>
> Thanks
> -Zardosht
>

-- 
Senior Principal Software Developer
Oracle, MySQL Department
