Hi Sergei,
On Jun 20, 2006, at 11:00 AM, Sergei Golubchik wrote:
> The XA standard allows both. xa_commit() has many different return
> values; I quote the two most relevant here:
>
> [XA_RETRY]
> The resource manager is not able to commit the transaction branch at
> this time. This value may be returned when a blocking condition
> exists and TMNOWAIT was set. Note, however, that this value may also
> be returned even when TMNOWAIT is not set (for example, if the
> necessary stable storage is currently unavailable). This value cannot
> be returned if TMONEPHASE is set in flags. All resources held on
> behalf of xid remain in a prepared state until commitment is
> possible. The transaction manager should reissue xa_commit() at a
> later time.
>
> [XAER_RMERR]
> An error occurred in committing the work performed on behalf of the
> transaction branch and the branch's work has been rolled back. Note
> that returning this error signals a catastrophic event to a
> transaction manager since other resource managers may successfully
> commit their work on behalf of this branch. This error should be
> returned only when a resource manager concludes that it can never
> commit the branch and that it cannot hold the branch's resources in a
> prepared state. Otherwise, [XA_RETRY] should be returned.
>
>> So my question is: since recovery is only done on startup, if a
>> commit() call fails, doesn't this mean that the data server should
>> actually shut down immediately and automatically restart (or some
>> equivalent operation)?
>
> According to the XA standard - no, it only means MySQL should keep
> retrying the commit as long as it is getting XA_RETRY back.
> But I don't think it is a practical solution. There must be some
> timeouts or whatever limits to prevent MySQL from retrying forever.
>
> Also, the XA standard does not specify when recovery happens. And if I
> were given a choice between crashing MySQL when a commit fails and
> adding support for recovery at times other than startup, I'd rather
> fix recovery :)
OK, this makes a lot of sense.
And as Stewart Smith says, the engine could provide read-only access to
the affected tables during the retry cycle until the commit succeeds.
I guess the engine should also print an error to the log each time the
retry fails so that the operator can fix things if required.
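To make the retry idea concrete, here is a rough sketch of a bounded retry loop that logs every failed attempt. It is only an illustration, not how MySQL or any engine actually does it: commit_with_retry, its parameters, and the xa_commit callable passed in are all hypothetical names. The numeric values for XA_OK, XA_RETRY and XAER_RMERR are the ones from the standard xa.h header.

```python
import logging
import time

# Return codes as defined in the X/Open XA header (xa.h).
XA_OK = 0
XA_RETRY = 4
XAER_RMERR = -3

log = logging.getLogger("xa_commit_retry")


def commit_with_retry(xa_commit, xid, max_attempts=5, delay_s=0.0,
                      sleep=time.sleep):
    """Call xa_commit(xid) until it stops returning XA_RETRY.

    xa_commit is a hypothetical callable standing in for the resource
    manager's commit entry point; it must return an XA return code.
    Returns (last_return_code, attempts_made). The attempt limit is the
    "timeout or whatever limit" that prevents retrying forever.
    """
    rc = XA_RETRY
    for attempt in range(1, max_attempts + 1):
        rc = xa_commit(xid)
        if rc != XA_RETRY:
            # Success, or an error that retrying cannot fix
            # (for example XAER_RMERR): hand it back to the caller.
            return rc, attempt
        # The branch stays prepared; log so the operator can step in.
        log.error("xa_commit(%s) returned XA_RETRY (attempt %d of %d)",
                  xid, attempt, max_attempts)
        if attempt < max_attempts:
            sleep(delay_s)
    # Limit reached while the resource manager still asks for a retry.
    return rc, max_attempts
```

If the limit is reached the caller still sees XA_RETRY, so it can decide what to do next (keep the affected tables read-only, alert the operator, or schedule another round) rather than crash the server.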
Thanks!
Paul
__ _______
\ \/ _ _/ Paul McCullagh (MSc)
\ / | | CTO
/ \ | | SNAP Innovation GmbH
/ /\ \| | Altonaer Poststr 9a
------------ 22767 Hamburg, Germany
PrimeBase XT www.primebase.com/xt