List:Commits« Previous MessageNext Message »
From:Sergei Golubchik Date:October 20 2008 4:01pm
Subject:Re: bzr commit into mysql-5.0-bugteam branch (davi:2707) Bug#28323
View as plain text  
Hi, Davi!

On Oct 16, Davi Arnaut wrote:
>  2707 Davi Arnaut	2008-10-16
>       Bug#28323: Server crashed in xid cache operations
>       
>       The problem was that the server did not robustly handle a
>       unilateral roll back issued by the Resource Manager (RM)
>       due to a resource deadlock within the transaction branch.
>       By not acknowledging the roll back, the server (TM) would
>       eventually corrupt the XA transaction state and crash.
>       
>       The solution is to mark the transaction as rollback-only
>       if the RM indicates that it rolled back its branch of the
>       transaction.

> === modified file 'sql/sql_parse.cc'
> --- a/sql/sql_parse.cc	2008-10-02 11:57:52 +0000
> +++ b/sql/sql_parse.cc	2008-10-16 14:50:38 +0000
> @@ -90,9 +90,54 @@ const char *command_name[]={
>  };
>  
>  const char *xa_state_names[]={
> -  "NON-EXISTING", "ACTIVE", "IDLE", "PREPARED"
> +  "NON-EXISTING", "ACTIVE", "IDLE", "PREPARED", "ROLLBACK ONLY"
>  };
>  
> +/**
> +  Mark a XA transaction as rollback-only if the RM unilaterally
> +  rolled back the transaction branch.
> +
> +  @note If a rollback was requested by the RM, this function sets
> +        the appropriate rollback error code and transits the state
> +        to XA_ROLLBACK_ONLY.

please explain also the return values here.

> +*/
> +static bool xa_trans_rolled_back(THD *thd, XID_STATE *xid_state)

Do you need THD in here ?

> +{
> +  if (xid_state->rm_error)
> +  {
> +    switch (xid_state->rm_error) {
> +    case ER_LOCK_WAIT_TIMEOUT:
> +      my_error(ER_XA_RBTIMEOUT, MYF(0));
> +      break;
> +    case ER_LOCK_DEADLOCK:
> +      my_error(ER_XA_RBDEADLOCK, MYF(0));
> +      break;
> +    default:
> +      my_error(ER_XA_RBROLLBACK, MYF(0));
> +    }
> +    xid_state->xa_state= XA_ROLLBACK_ONLY;
> +  }
> +
> +  return (xid_state->xa_state == XA_ROLLBACK_ONLY);
> +}
> +
> +/**
> +  Rollback work done on behalf of at ransaction branch.
> +*/
> +static bool xa_trans_rollback(THD *thd)
> +{
> +  bool status= test(ha_rollback(thd));
> +
> +  thd->options&= ~(ulong) OPTION_BEGIN;
> +  thd->transaction.all.modified_non_trans_table= FALSE;
> +  thd->server_status&= ~SERVER_STATUS_IN_TRANS;
> +  xid_cache_delete(&thd->transaction.xid_state);
> +  thd->transaction.xid_state.xa_state= XA_NOTR;
> +  thd->transaction.xid_state.rm_error= 0;
> +
> +  return status;
> +}
> +
>  #ifndef EMBEDDED_LIBRARY
>  static bool do_command(THD *thd);
>  #endif // EMBEDDED_LIBRARY
> @@ -5017,6 +5062,8 @@ create_sp_error:
>      if (thd->transaction.xid_state.xa_state == XA_IDLE &&
>          thd->lex->xa_opt == XA_RESUME)
>      {
> +      if (xa_trans_rolled_back(thd, &thd->transaction.xid_state))
> +        break;

Move this check down, after the check for xid.eq(...).

I don't think you should return ER_XA_RBROLLBACK if someone did XA START
for some arbitrary, unrelated XID.

On the second thought... Can you reach XA_IDLE state at all with
non-zero xid.rm_error ? Doesn't xid.rm_error guarantee that the state
will be XA_ROLLBACK_ONLY ?

>        if (! thd->transaction.xid_state.xid.eq(thd->lex->xid))
>        {
>          my_error(ER_XAER_NOTA, MYF(0));
> @@ -5074,6 +5122,8 @@ create_sp_error:
>        my_error(ER_XAER_NOTA, MYF(0));
>        break;
>      }
> +    if (xa_trans_rolled_back(thd, &thd->transaction.xid_state))
> +      break;

That's in case SQLCOM_XA_END as far as I understand. Ok.

>      thd->transaction.xid_state.xa_state=XA_IDLE;
>      send_ok(thd);
>      break;
> @@ -5105,6 +5155,11 @@ create_sp_error:
>        XID_STATE *xs=xid_cache_search(thd->lex->xid);

This is SQLCOM_XA_COMMIT. Don't you need to add this if(xa_trans_rolled_back
also to case SQLCOM_XA_PREPARE ?

>        if (!xs || xs->in_thd)
>          my_error(ER_XAER_NOTA, MYF(0));
> +      else if (xa_trans_rolled_back(thd, xs))
> +      {
> +        xa_trans_rollback(thd);
> +        break;
> +      }

Hmm, you're inside

  if (!thd->transaction.xid_state.xid.eq(thd->lex->xid))

that is, for the case when one does XA COMMIT for another connection.
You cannot use xa_trans_rollback() here, you need to use
ha_commit_or_rollback_by_xid().

>        else
>        {
>          ha_commit_or_rollback_by_xid(thd->lex->xid, 1);
> @@ -5157,6 +5217,11 @@ create_sp_error:
>        XID_STATE *xs=xid_cache_search(thd->lex->xid);
>        if (!xs || xs->in_thd)
>          my_error(ER_XAER_NOTA, MYF(0));
> +      else if (xa_trans_rolled_back(thd, xs))
> +      {
> +        xa_trans_rollback(thd);
> +        break;

Same here about ha_commit_or_rollback_by_xid()

> +      }
>        else
>        {
>          ha_commit_or_rollback_by_xid(thd->lex->xid, 0);

Regards / Mit vielen Grüßen,
Sergei

-- 
   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <serg@stripped>
 / /|_/ / // /\ \/ /_/ / /__  Principal Software Engineer/Server Architect
/_/  /_/\_, /___/\___\_\___/  Sun Microsystems GmbH, HRB München 161028
       <___/                  Sonnenallee 1, 85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Häring
Thread
bzr commit into mysql-5.0-bugteam branch (davi:2707) Bug#28323Davi Arnaut16 Oct
  • Re: bzr commit into mysql-5.0-bugteam branch (davi:2707) Bug#28323Konstantin Osipov16 Oct
  • Re: bzr commit into mysql-5.0-bugteam branch (davi:2707) Bug#28323Sergei Golubchik20 Oct
    • Re: bzr commit into mysql-5.0-bugteam branch (davi:2707) Bug#28323Davi Arnaut20 Oct