MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:Andrei Elkin Date:September 22 2009 10:41am
Subject:Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)
Bug#37148
View as plain text  
Zhen Xing, hello.

> Andrei Elkin wrote:
>> Zhen Xing, hello.
>> 
>> I have some concerns, questions, comments and remarks.
>> Besides, I believe this work rather big patch would have less chances
>> to break anything if we had a test. I suggest to add one an error
>> simulation inside MYSQL_BIN_LOG::write() and activate it in the test.
>> 
>
> As we have agree, we'll not add such a test in this patch because it
> would be a very big test to cover all the cases, but will create a
> WL/BUG to do that.

It's not obvious such test would be enormous.
I find it actually quite moderate. What is enough from my point of
view is, 

1. as said above, to simulate an error
2. to execute one by one all or most or even some affected query cases
   that easy to identify 

Such test can serve as the sceleton for a WL in case it really could
not cover all.
I am in firm opinion that we must have a test associated with fixes
like the current.
Another compromise could be to wait for the WL be ready and push both
item simultaneously.


>> >
>> > Yes, I think it's good to react on the error sooner, for example if we
>> > are doing a LOAD DATA statement to load a big hunk of data and if
>> > binlogging fails at the beginning, it would be much better to check for
>> > this error and quit the statement early then wait until the end of the
>> > statement. And I also think it would make the code clearer and less
>> > error prone.
>> >
>> 
>> Your comments confirm the bug is rather about optimization.
>> The existing logics of examining thd->main_da->is_error() at the end
>> of query execution does not seem to be compromised.
>> Correct me if it is said wrong.
>> 
>
> I don't agree it's only about optimization, fail to check the return
> value early can result in problem. For example, if we got an error in
> mysql_bin_log.write(), it will set thd->main_da->is_error(), and if we
> continue the execution and then encounter another error and try to call
> my_error() to report it, this will cause the assertion in
> Diagnostics_area::set_error_status() to fail because the error status
> has already been set. This problem can be avoided by checking the error
> and quite early.

I must confess some of my standing points have been connected to 5.4
code features.
What you have written is actually exclusive 5.1 issue, not 5.4. 
Multiple my_error() per query is okay in the latter version.
So more to the optimization the patch is trying to circumevent this
5.1 limitation as well.

>
>> Considering the current patch, I have some concerns, comments and
>> idea, please read on.
>> 
>> CONCERNS:
>> 
>> I have noticed some dangerous places.
>> Your patch forces early return from some functions like
>> 
>> @@ -2029,8 +2030,9 @@ ndb_binlog_thread_handle_schema_event(TH
>> 
...
>
> Yes, Jonas has already review the NDB part of the fix and he is OK with
> that.
>

good.


>> -
>> +      if (thd.binlog_query(THD::ROW_QUERY_TYPE,
>> +                           row->query.str, row->query.length,
>> +                           FALSE, FALSE, errcode))
>> +        goto err;
>> +      
>> 
>> where `goto err' bypasses
>> 
>>     if (table->s->blob_fields)
>>       free_delayed_insert_blobs(table);
>> 
>> that could be a part of a cleanup, as I read it.
>> 
>
> this cleanup is also done at the end of the function:
>
>  err:
> #ifndef DBUG_OFF
>   max_rows= 0;                                  // For DBUG output
> #endif
>   /* Remove all not used rows */
>   while ((row=rows.get()))
>   {
>     if (table->s->blob_fields)
>     {
>       memcpy(table->record[0],row->record,table->s->reclength);
>       free_delayed_insert_blobs(table);

ok

>> -    thd->binlog_query(THD::ROW_QUERY_TYPE,
>> -                      thd->query, thd->query_length,
>> -                      trans_table, FALSE, errcode);
>> +    if (thd->binlog_query(THD::ROW_QUERY_TYPE,
>> +                          thd->query, thd->query_length,
>> +                          trans_table, FALSE, errcode))
>> +    {
>> +      table->file->ha_release_auto_increment();
>> +      DBUG_RETURN(1);
>> +    }
>> 
>> But the resulting new code does not look pretty:
>> 
>
> Agree :)

Why not to make it 

>
>> if (thd->binlog_query(..)
>> {
>>      table->file->ha_release_auto_increment();
>>      DBUG_RETURN(1);
>> }
>> table->file->ha_release_auto_increment();
>> 

error= thd->binlog_query(..)
table->file->ha_release_auto_increment();
if (error)
   DBUG_RETURN(error)

one line less then?



>> 
>> Some more findings.
>> 
>>    @@ -6004,15 +6005,17 @@ bool mysql_revoke_all(THD *thd,  List <L
>> 
>>    VOID(pthread_mutex_unlock(&acl_cache->lock));
>> 
>>    -  write_bin_log(thd, FALSE, thd->query, thd->query_length);
>>    +  int binlog_error=
>>    +    write_bin_log(thd, FALSE, thd->query, thd->query_length);
>> 
>>    rw_unlock(&LOCK_grant);
>>    close_thread_tables(thd);
>> 
>>    -  if (result)
>>    +  /* error for writing binary log has already been reported */
>>    +  if (result && !binlog_error)
>>    my_message(ER_REVOKE_GRANTS, ER(ER_REVOKE_GRANTS), MYF(0));
>> 
>>    -  DBUG_RETURN(result);
>>    +  DBUG_RETURN(result || binlog_error);
>>    }
>>  
>> 
>> Obviously, the old code is okay:
>> 
>
> No, actually the old code will cause problem, because if there is
> already error in write_bin_log, then try to report another error by
> my_message will cause assertion failure.

well, in 5.1 only.

>
>>    if (result)
>>       my_message()
>> 
>> because the returned combined result
>>    +  DBUG_RETURN(result || binlog_error);
>> is not yet merged at that point.
>> 
>> 
>> === modified file 'sql/sql_base.cc'
>> --- a/sql/sql_base.cc	2009-06-19 08:24:43 +0000
>> +++ b/sql/sql_base.cc	2009-08-20 09:53:53 +0000
>> @@ -1333,7 +1333,8 @@ void close_thread_tables(THD *thd)
>>        handled either before writing a query log event (inside
>>        binlog_query()) or when preparing a pending event.
>>       */
>> -    thd->binlog_flush_pending_rows_event(TRUE);
>> +    /* TODO: I (hezx) think the error here should not be ignored */
>> +    (void)thd->binlog_flush_pending_rows_event(TRUE);
>>      mysql_unlock_tables(thd, thd->lock);
>>      thd->lock=0;
>>    }
>> 
>> I don't think you need such TODO here. The error is not ignored but
>> rather its check-out delayed till net_end_statement().
>> 
>
> I mean the code should check the error and return early.
>

We can say should in 5.1. But you want leave these lines in 5.4.



>> Similar hunk follows:
>> 
>> === modified file 'sql/sql_delete.cc'
>> --- a/sql/sql_delete.cc	2009-06-19 08:24:43 +0000
>> +++ b/sql/sql_delete.cc	2009-08-20 09:53:53 +0000
>> @@ -841,9 +841,10 @@ void multi_delete::abort()
>>      if (mysql_bin_log.is_open())
>>      {
>>        int errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
>> -      thd->binlog_query(THD::ROW_QUERY_TYPE,
>> -                        thd->query, thd->query_length,
>> -                        transactional_tables, FALSE, errcode);
>> +      /* possible error of writing binary log is ignored deliberately */
>> +      (void) thd->binlog_query(THD::ROW_QUERY_TYPE,
>> +                               thd->query, thd->query_length,
>> +                               transactional_tables, FALSE, errcode);
>> 
>> Yet another with the same comment:
>> 
>
> This error is ignored deliberately, and is OK.
>
>> @@ -3784,7 +3794,8 @@ void select_create::abort()
>>    select_insert::abort();
>>    thd->transaction.stmt.modified_non_trans_table= FALSE;
>>    reenable_binlog(thd);
>> -  thd->binlog_flush_pending_rows_event(TRUE);
>> +  /* possible error of writing binary log is ignored deliberately */
>> +  (void)thd->binlog_flush_pending_rows_event(TRUE);
>> 
>> And this one:
>> 
>
> here too.
>
>> 
>> === modified file 'sql/sql_update.cc'
>> --- a/sql/sql_update.cc	2009-06-18 14:16:14 +0000
>> +++ b/sql/sql_update.cc	2009-08-20 09:53:53 +0000
>> @@ -1851,9 +1851,10 @@ void multi_update::abort()
>>          into repl event.
>>        */
>>        int errcode= query_error_code(thd, thd->killed == THD::NOT_KILLED);
>> -      thd->binlog_query(THD::ROW_QUERY_TYPE,
>> -                        thd->query, thd->query_length,
>> -                        transactional_tables, FALSE, errcode);
>> +      /* the error of binary logging is ignored */
>> +      (void)thd->binlog_query(THD::ROW_QUERY_TYPE,

ok

>> 
>> 
>> ===============================================================================
>> COMMENTS:
>> 
>> This hunk seems to be okay
>> 
>> @@ -1543,7 +1544,11 @@ void close_temporary_tables(THD *thd)
>>                              s_query.length() - 1 /* to remove trailing ',' */,
>>                              0, FALSE, 0);
>>        thd->variables.character_set_client= cs_save;
>> -      mysql_bin_log.write(&qinfo);
>> +      if (mysql_bin_log.write(&qinfo))
>> +      {
>> +        push_warning(thd, MYSQL_ERROR::WARN_LEVEL_ERROR, MYF(0),
>> +                     "Failed to write the DROP statement for temporary tables to
> binary log");
>> +      }
>>        thd->variables.pseudo_thread_id= save_pseudo_thread_id;
>>      }
>>      else
>> 
>> though my_error() must have been done inside mysql_bin_log.write().
>> I wonder if the warning is really necessary.
>> 
>
> Agree, and since close_temporary_tables() is called when closing the
> current connection, so I think the warning is no necessary, will be
> removed.
>

And moreover, this subsequent to the write's  my_error()
push_warning() should lead to the assertion, should it not?


>> ============================================================================
>> QUESTIONS:
>> 
>> === modified file 'sql/sql_repl.cc'
>> --- a/sql/sql_repl.cc	2009-05-15 12:57:51 +0000
>> +++ b/sql/sql_repl.cc	2009-08-20 09:53:53 +0000
>> @@ -1007,8 +1007,8 @@ int reset_slave(THD *thd, Master_info* m
>>    MY_STAT stat_area;
>>    char fname[FN_REFLEN];
>>    int thread_mask= 0, error= 0;
>> -  uint sql_errno=0;
>> -  const char* errmsg=0;
>> +  uint sql_errno=ER_UNKNOWN_ERROR;
>> +  const char* errmsg= "Unknown error occured while reseting slave";
>>    DBUG_ENTER("reset_slave");
>> 
>> why these changes?
>> 
>
> This change is not related to this bug, errmsg can be uninitialized when
> the call to my_stat() fails.

I think it is worth to note one has to explain details that do not
connect immediately to fixes in cset comments. I believe you're going
to do that.

>
>> =========================================================================
>> REMARKS
>> 
>> The base code mysql_grant() calls
>> 
>>   if (!result)
>>   {
>>     write_bin_log(thd, TRUE, thd->query, thd->query_length);
>>   }
>> 
>>   rw_unlock(&LOCK_grant);
>>   close_thread_tables(thd);
>> 
>>   if (!result)
>>     my_ok(thd);
>> 
>> my_ok() not after examining main_da->is_error(). Sounds like an issue.
>> 
>> Similarly in mysql_create_db() and some more places.
>> 
>
> if (!result) means there is no error, and main_da->is_error() must be
> false in this case.

It's just a hope. net_end_statement() does it right to separate
failure from ok via testing main_da.
Just one little argument about the error propagation via integer
return value. It just can't work with functions that return an integer
informal results. Like random(). if that could call my_error() inside
we would have to examine main_da not -1 of the return value.


cheers,

Andrei
Thread
bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973) Bug#37148He Zhenxing20 Aug
  • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin17 Sep
    • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing19 Sep
      • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Alfranio Correia20 Sep
        • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing21 Sep
      • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin21 Sep
        • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing22 Sep
          • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin22 Sep
            • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing23 Sep
              • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin23 Sep
                • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Andrei Elkin23 Sep
                  • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing24 Sep
        • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148Alfranio Correia22 Sep
          • Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973)Bug#37148He Zhenxing22 Sep