MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:He Zhenxing Date:September 23 2009 6:04am
Subject:Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)
Bug#37148
View as plain text  
Andrei Elkin wrote:
> Zhen Xing, hello.
> 
> > Andrei Elkin wrote:
> >> Zhen Xing, hello.
> >> 
> >> I have some concerns, questions, comments and remarks.
> >> Besides, I believe this work rather big patch would have less chances
> >> to break anything if we had a test. I suggest to add one an error
> >> simulation inside MYSQL_BIN_LOG::write() and activate it in the test.
> >> 
> >
> > As we have agree, we'll not add such a test in this patch because it
> > would be a very big test to cover all the cases, but will create a
> > WL/BUG to do that.
> 
> It's not obvious such test would be enormous.
> I find it actually quite moderate. What is enough from my point of
> view is, 
> 
> 1. as said above, to simulate an error

This is easy. and there is already such a mechanism.

> 2. to execute one by one all or most or even some affected query cases
>    that easy to identify 
> 

Actually this is the part that I think difficult to do. I think it's
hard (and implementation dependent) to check if the return value of
mysql_bin_log.write() is correctly propagated or not. Could you please
provide an example on how to test this?


> Such test can serve as the sceleton for a WL in case it really could
> not cover all.
> I am in firm opinion that we must have a test associated with fixes
> like the current.
> Another compromise could be to wait for the WL be ready and push both
> item simultaneously.
> 
> 
> >> >
> >> > Yes, I think it's good to react on the error sooner, for example if we
> >> > are doing a LOAD DATA statement to load a big hunk of data and if
> >> > binlogging fails at the beginning, it would be much better to check
> for
> >> > this error and quit the statement early then wait until the end of the
> >> > statement. And I also think it would make the code clearer and less
> >> > error prone.
> >> >
> >> 
> >> Your comments confirm the bug is rather about optimization.
> >> The existing logics of examining thd->main_da->is_error() at the end
> >> of query execution does not seem to be compromised.
> >> Correct me if it is said wrong.
> >> 
> >
> > I don't agree it's only about optimization, fail to check the return
> > value early can result in problem. For example, if we got an error in
> > mysql_bin_log.write(), it will set thd->main_da->is_error(), and if we
> > continue the execution and then encounter another error and try to call
> > my_error() to report it, this will cause the assertion in
> > Diagnostics_area::set_error_status() to fail because the error status
> > has already been set. This problem can be avoided by checking the error
> > and quite early.
> 
> I must confess some of my standing points have been connected to 5.4
> code features.
> What you have written is actually exclusive 5.1 issue, not 5.4. 
> Multiple my_error() per query is okay in the latter version.
> So more to the optimization the patch is trying to circumevent this
> 5.1 limitation as well.
> 

I don't think so, I run the following test and it crashes on 5.4 too:

CREATE USER test_user;
SET GLOBAL debug="d,injecting_fault_writing";
REVOKE ALL PRIVILEGES, GRANT OPTION FROM test_user;

And there is still assertion in Diagnostics_area::set_error_status():
 DBUG_ASSERT(! is_set() || can_overwrite_status);

> >
> >> Considering the current patch, I have some concerns, comments and
> >> idea, please read on.
> >> 
> >> CONCERNS:
> >> 
> >> I have noticed some dangerous places.
> >> Your patch forces early return from some functions like
> >> 
> >> @@ -2029,8 +2030,9 @@ ndb_binlog_thread_handle_schema_event(TH
> >> 
> ...
> >
> > Yes, Jonas has already review the NDB part of the fix and he is OK with
> > that.
> >
> 
> good.
> 
> 
> >> -
> >> +      if (thd.binlog_query(THD::ROW_QUERY_TYPE,
> >> +                           row->query.str, row->query.length,
> >> +                           FALSE, FALSE, errcode))
> >> +        goto err;
> >> +      
> >> 
> >> where `goto err' bypasses
> >> 
> >>     if (table->s->blob_fields)
> >>       free_delayed_insert_blobs(table);
> >> 
> >> that could be a part of a cleanup, as I read it.
> >> 
> >
> > this cleanup is also done at the end of the function:
> >
> >  err:
> > #ifndef DBUG_OFF
> >   max_rows= 0;                                  // For DBUG output
> > #endif
> >   /* Remove all not used rows */
> >   while ((row=rows.get()))
> >   {
> >     if (table->s->blob_fields)
> >     {
> >       memcpy(table->record[0],row->record,table->s->reclength);
> >       free_delayed_insert_blobs(table);
> 
> ok
> 
> >> -    thd->binlog_query(THD::ROW_QUERY_TYPE,
> >> -                      thd->query, thd->query_length,
> >> -                      trans_table, FALSE, errcode);
> >> +    if (thd->binlog_query(THD::ROW_QUERY_TYPE,
> >> +                          thd->query, thd->query_length,
> >> +                          trans_table, FALSE, errcode))
> >> +    {
> >> +      table->file->ha_release_auto_increment();
> >> +      DBUG_RETURN(1);
> >> +    }
> >> 
> >> But the resulting new code does not look pretty:
> >> 
> >
> > Agree :)
> 
> Why not to make it 
> 
> >
> >> if (thd->binlog_query(..)
> >> {
> >>      table->file->ha_release_auto_increment();
> >>      DBUG_RETURN(1);
> >> }
> >> table->file->ha_release_auto_increment();
> >> 
> 

The above should be:

error= <possible handler errors>

if (!error && mysql_bin_log.is_open())
{
   if (thd->binlog_query())
   {
      table->file->ha_release_auto_increment();
      DBUG_RETURN(1);
   }
}
table->file->ha_release_auto_increment();

if (error)
{
  table->file->print_error(error,MYF(0));
  DBUG_RETURN(1);
}

> error= thd->binlog_query(..)
> table->file->ha_release_auto_increment();
> if (error)
>    DBUG_RETURN(error)
> 
> one line less then?
> 

I'm afraid this is not correct, please check the code above.

If we want to only have one ha_release_auto_increment() here, then we
need a new variable like 'binlog_error', and change the code like this:

error= <possible handler errors>

int binlog_error= 0;
if (!error && mysql_bin_log.is_open())
{
   ...
   binlog_error= thd->binlog_query();
}
table->file->ha_release_auto_increment();

if (binlog_error)
  DBUG_RETURN(1);

if (error)
{
  table->file->print_error(error,MYF(0));
  DBUG_RETURN(1);
}

> 
> 
> >> 
> >> Some more findings.
> >> 
> >>    @@ -6004,15 +6005,17 @@ bool mysql_revoke_all(THD *thd,  List <L
> >> 
> >>    VOID(pthread_mutex_unlock(&acl_cache->lock));
> >> 
> >>    -  write_bin_log(thd, FALSE, thd->query, thd->query_length);
> >>    +  int binlog_error=
> >>    +    write_bin_log(thd, FALSE, thd->query, thd->query_length);
> >> 
> >>    rw_unlock(&LOCK_grant);
> >>    close_thread_tables(thd);
> >> 
> >>    -  if (result)
> >>    +  /* error for writing binary log has already been reported */
> >>    +  if (result && !binlog_error)
> >>    my_message(ER_REVOKE_GRANTS, ER(ER_REVOKE_GRANTS), MYF(0));
> >> 
> >>    -  DBUG_RETURN(result);
> >>    +  DBUG_RETURN(result || binlog_error);
> >>    }
> >>  
> >> 
> >> Obviously, the old code is okay:
> >> 
> >
> > No, actually the old code will cause problem, because if there is
> > already error in write_bin_log, then try to report another error by
> > my_message will cause assertion failure.
> 
> well, in 5.1 only.
> 

See above comments.

> >
> >>    if (result)
> >>       my_message()
> >> 
> >> because the returned combined result
> >>    +  DBUG_RETURN(result || binlog_error);
> >> is not yet merged at that point.
> >> 
> >> 
> >> === modified file 'sql/sql_base.cc'
> >> --- a/sql/sql_base.cc	2009-06-19 08:24:43 +0000
> >> +++ b/sql/sql_base.cc	2009-08-20 09:53:53 +0000
> >> @@ -1333,7 +1333,8 @@ void close_thread_tables(THD *thd)
> >>        handled either before writing a query log event (inside
> >>        binlog_query()) or when preparing a pending event.
> >>       */
> >> -    thd->binlog_flush_pending_rows_event(TRUE);
> >> +    /* TODO: I (hezx) think the error here should not be ignored */
> >> +    (void)thd->binlog_flush_pending_rows_event(TRUE);
> >>      mysql_unlock_tables(thd, thd->lock);
> >>      thd->lock=0;
> >>    }
> >> 
> >> I don't think you need such TODO here. The error is not ignored but
> >> rather its check-out delayed till net_end_statement().
> >> 
> >
> > I mean the code should check the error and return early.
> >
> 
> We can say should in 5.1. But you want leave these lines in 5.4.
> 
> 
> 
> >> Similar hunk follows:
> >> 
> >> === modified file 'sql/sql_delete.cc'
> >> --- a/sql/sql_delete.cc	2009-06-19 08:24:43 +0000
> >> +++ b/sql/sql_delete.cc	2009-08-20 09:53:53 +0000
> >> @@ -841,9 +841,10 @@ void multi_delete::abort()
> >>      if (mysql_bin_log.is_open())
> >>      {
> >>        int errcode= query_error_code(thd, thd->killed ==
> THD::NOT_KILLED);
> >> -      thd->binlog_query(THD::ROW_QUERY_TYPE,
> >> -                        thd->query, thd->query_length,
> >> -                        transactional_tables, FALSE, errcode);
> >> +      /* possible error of writing binary log is ignored deliberately */
> >> +      (void) thd->binlog_query(THD::ROW_QUERY_TYPE,
> >> +                               thd->query, thd->query_length,
> >> +                               transactional_tables, FALSE, errcode);
> >> 
> >> Yet another with the same comment:
> >> 
> >
> > This error is ignored deliberately, and is OK.
> >
> >> @@ -3784,7 +3794,8 @@ void select_create::abort()
> >>    select_insert::abort();
> >>    thd->transaction.stmt.modified_non_trans_table= FALSE;
> >>    reenable_binlog(thd);
> >> -  thd->binlog_flush_pending_rows_event(TRUE);
> >> +  /* possible error of writing binary log is ignored deliberately */
> >> +  (void)thd->binlog_flush_pending_rows_event(TRUE);
> >> 
> >> And this one:
> >> 
> >
> > here too.
> >
> >> 
> >> === modified file 'sql/sql_update.cc'
> >> --- a/sql/sql_update.cc	2009-06-18 14:16:14 +0000
> >> +++ b/sql/sql_update.cc	2009-08-20 09:53:53 +0000
> >> @@ -1851,9 +1851,10 @@ void multi_update::abort()
> >>          into repl event.
> >>        */
> >>        int errcode= query_error_code(thd, thd->killed ==
> THD::NOT_KILLED);
> >> -      thd->binlog_query(THD::ROW_QUERY_TYPE,
> >> -                        thd->query, thd->query_length,
> >> -                        transactional_tables, FALSE, errcode);
> >> +      /* the error of binary logging is ignored */
> >> +      (void)thd->binlog_query(THD::ROW_QUERY_TYPE,
> 
> ok
> 
> >> 
> >> 
> >>
> ===============================================================================
> >> COMMENTS:
> >> 
> >> This hunk seems to be okay
> >> 
> >> @@ -1543,7 +1544,11 @@ void close_temporary_tables(THD *thd)
> >>                              s_query.length() - 1 /* to remove trailing ','
> */,
> >>                              0, FALSE, 0);
> >>        thd->variables.character_set_client= cs_save;
> >> -      mysql_bin_log.write(&qinfo);
> >> +      if (mysql_bin_log.write(&qinfo))
> >> +      {
> >> +        push_warning(thd, MYSQL_ERROR::WARN_LEVEL_ERROR, MYF(0),
> >> +                     "Failed to write the DROP statement for temporary
> tables to binary log");
> >> +      }
> >>        thd->variables.pseudo_thread_id= save_pseudo_thread_id;
> >>      }
> >>      else
> >> 
> >> though my_error() must have been done inside mysql_bin_log.write().
> >> I wonder if the warning is really necessary.
> >> 
> >
> > Agree, and since close_temporary_tables() is called when closing the
> > current connection, so I think the warning is no necessary, will be
> > removed.
> >
> 
> And moreover, this subsequent to the write's  my_error()
> push_warning() should lead to the assertion, should it not?
> 

No, because push_warning will not set error status if it's already set.
The check is done in THD::raise_condition().

> 
> >>
> ============================================================================
> >> QUESTIONS:
> >> 
> >> === modified file 'sql/sql_repl.cc'
> >> --- a/sql/sql_repl.cc	2009-05-15 12:57:51 +0000
> >> +++ b/sql/sql_repl.cc	2009-08-20 09:53:53 +0000
> >> @@ -1007,8 +1007,8 @@ int reset_slave(THD *thd, Master_info* m
> >>    MY_STAT stat_area;
> >>    char fname[FN_REFLEN];
> >>    int thread_mask= 0, error= 0;
> >> -  uint sql_errno=0;
> >> -  const char* errmsg=0;
> >> +  uint sql_errno=ER_UNKNOWN_ERROR;
> >> +  const char* errmsg= "Unknown error occured while reseting slave";
> >>    DBUG_ENTER("reset_slave");
> >> 
> >> why these changes?
> >> 
> >
> > This change is not related to this bug, errmsg can be uninitialized when
> > the call to my_stat() fails.
> 
> I think it is worth to note one has to explain details that do not
> connect immediately to fixes in cset comments. I believe you're going
> to do that.
> 

OK, I'll add per file comment for this

> >
> >> =========================================================================
> >> REMARKS
> >> 
> >> The base code mysql_grant() calls
> >> 
> >>   if (!result)
> >>   {
> >>     write_bin_log(thd, TRUE, thd->query, thd->query_length);
> >>   }
> >> 
> >>   rw_unlock(&LOCK_grant);
> >>   close_thread_tables(thd);
> >> 
> >>   if (!result)
> >>     my_ok(thd);
> >> 
> >> my_ok() not after examining main_da->is_error(). Sounds like an issue.
> >> 
> >> Similarly in mysql_create_db() and some more places.
> >> 
> >
> > if (!result) means there is no error, and main_da->is_error() must be
> > false in this case.
> 
> It's just a hope. net_end_statement() does it right to separate
> failure from ok via testing main_da.
> Just one little argument about the error propagation via integer
> return value. It just can't work with functions that return an integer
> informal results. Like random(). if that could call my_error() inside
> we would have to examine main_da not -1 of the return value.
> 

I think there is a difference between the return value of a function and
the error in main_da, main_da->is_error() means there is an error that
causes current statement to fail, while the return value of a function
(if the return value indicates success or failure) only means if there
is a problem when executing that function or not. it may or may not set
main_da->is_error() when the function fails. But on the other hand, if
the function succeeded, it should never set main_da->is_error(). I think
this must be followed, otherwise a lot of the code would be break on
this.

And I think check the return value of the function and do error handling
as early as possible should be a coding style that we should follow.

For functions like random(), the return value does not indicate failure
or success, I think this is another issue.

And let's suppose random() would call set is_error() when it failed,
then I think you should check is_error() **immediately** after calling
random(), not postpone the check to the end of the execution.

------- I think this is correct ----------------
val= random();
if (is_error())
  // do error handling
else
  // use the 'val' to continue
----------------------------------------------------

--------- I think this is not ----------------------
val= random();
// use the 'val' to continue
....
if (is_error())
  // do error handling


> 
> cheers,
> 
> Andrei
> 

Thread
bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973) Bug#37148He Zhenxing20 Aug
  • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin17 Sep
    • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing19 Sep
      • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Alfranio Correia20 Sep
        • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing21 Sep
      • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin21 Sep
        • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing22 Sep
          • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin22 Sep
            • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing23 Sep
              • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin23 Sep
                • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin23 Sep
                  • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing24 Sep
        • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Alfranio Correia22 Sep
          • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing22 Sep