List: Internals
From: Paul McCullagh
Date: June 20 2006 9:25am
Subject: Re: how replication handles commit fails
Hi Sergei,

On Jun 20, 2006, at 11:00 AM, Sergei Golubchik wrote:

> XA standard allows both. xa_commit() has many different return values;
> I quote the two most relevant here:
>
> [XA_RETRY]
>    The resource manager is not able to commit the transaction branch at
>    this time.  This value may be returned when a blocking condition
>    exists and TMNOWAIT was set. Note, however, that this value may also
>    be returned even when TMNOWAIT is not set (for example, if the
>    necessary stable storage is currently unavailable). This value cannot
>    be returned if TMONEPHASE is set in flags. All resources held on
>    behalf of xid remain in a prepared state until commitment is
>    possible. The transaction manager should reissue xa_commit() at a
>    later time.
>
> [XAER_RMERR]
>    An error occurred in committing the work performed on behalf of the
>    transaction branch and the branch's work has been rolled back. Note
>    that returning this error signals a catastrophic event to a
>    transaction manager since other resource managers may successfully
>    commit their work on behalf of this branch.  This error should be
>    returned only when a resource manager concludes that it can never
>    commit the branch and that it cannot hold the branch's resources in a
>    prepared state. Otherwise, [XA_RETRY] should be returned.
>
>> So my question is: since recovery is only done on startup, if a
>> commit() call fails, doesn't this mean that the data server should
>> actually shutdown immediately and automatically restart (or some
>> equivalent operation)?
>
> According to the XA standard - no, it only means MySQL should keep
> retrying the commit as long as it is getting XA_RETRY back.
> But I don't think it is a practical solution.  There must be some
> timeouts or whatever limits to prevent MySQL from retrying forever.
>
> Also, the XA standard does not specify when recovery happens. And if I
> were given a choice between crashing MySQL when a commit fails and
> adding support for recovery at times other than startup, I'd rather
> fix recovery :)

OK, this makes a lot of sense.

And as Stewart Smith says, the engine could provide read-only access to 
the affected tables during the retry cycle until the commit succeeds.

I guess the engine should also print an error to the log each time the 
retry fails so that the operator can fix things if required.
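
To make the retry cycle concrete, here is a minimal Python sketch of what I have in mind. It is an illustration under assumptions, not how MySQL or any engine actually does it: the names commit_with_retry, XaResult, CommitGaveUp and CommitRolledBack are hypothetical, the enum members are only symbolic stand-ins for the XA return codes quoted above, and the attempt limit and delay are arbitrary placeholders for the "timeouts or whatever limits" Sergei mentions.

```python
import enum
import logging
import time

log = logging.getLogger("engine")


class XaResult(enum.Enum):
    # Symbolic stand-ins for the XA return codes quoted above (hypothetical).
    XA_OK = "XA_OK"
    XA_RETRY = "XA_RETRY"
    XAER_RMERR = "XAER_RMERR"


class CommitGaveUp(Exception):
    """Retry limit reached; the branch is still in the prepared state."""


class CommitRolledBack(Exception):
    """The resource manager reported XAER_RMERR; the branch was rolled back."""


def commit_with_retry(xa_commit, xid, max_attempts=5, delay_seconds=1.0,
                      sleep=time.sleep):
    """Call xa_commit(xid) until it succeeds or the attempt limit is hit.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        result = xa_commit(xid)
        if result is XaResult.XA_OK:
            return attempt
        if result is XaResult.XAER_RMERR:
            # Catastrophic case: retrying cannot help any more.
            raise CommitRolledBack(xid)
        # XA_RETRY: log every failed attempt so the operator can step in.
        log.error("commit of %s failed (attempt %d of %d); "
                  "branch remains prepared", xid, attempt, max_attempts)
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise CommitGaveUp(xid)
```

When the limit is reached, the caller gets CommitGaveUp and the branch is still prepared, so it could be handed over to whatever recovery mechanism exists rather than forcing a restart.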

Thanks!

Paul

__  _______
\ \/ _   _/     Paul McCullagh (MSc)
  \  / | |       CTO
  /  \ | |       SNAP Innovation GmbH
/ /\ \| |       Altonaer Poststr 9a
------------    22767 Hamburg, Germany
PrimeBase XT    www.primebase.com/xt


