List: | Commits | « Previous MessageNext Message » | |
From: | He Zhenxing | Date: | September 23 2009 6:04am |
Subject: | Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:2973) Bug#37148 | ||
View as plain text |
Andrei Elkin wrote: > Zhen Xing, hello. > > > Andrei Elkin wrote: > >> Zhen Xing, hello. > >> > >> I have some concerns, questions, comments and remarks. > >> Besides, I believe this work rather big patch would have less chances > >> to break anything if we had a test. I suggest to add one an error > >> simulation inside MYSQL_BIN_LOG::write() and activate it in the test. > >> > > > > As we have agree, we'll not add such a test in this patch because it > > would be a very big test to cover all the cases, but will create a > > WL/BUG to do that. > > It's not obvious such test would be enormous. > I find it actually quite moderate. What is enough from my point of > view is, > > 1. as said above, to simulate an error This is easy. and there is already such a mechanism. > 2. to execute one by one all or most or even some affected query cases > that easy to identify > Actually this is the part that I think difficult to do. I think it's hard (and implementation dependent) to check if the return value of mysql_bin_log.write() is correctly propagated or not. Could you please provide an example on how to test this? > Such test can serve as the sceleton for a WL in case it really could > not cover all. > I am in firm opinion that we must have a test associated with fixes > like the current. > Another compromise could be to wait for the WL be ready and push both > item simultaneously. > > > >> > > >> > Yes, I think it's good to react on the error sooner, for example if we > >> > are doing a LOAD DATA statement to load a big hunk of data and if > >> > binlogging fails at the beginning, it would be much better to check > for > >> > this error and quit the statement early then wait until the end of the > >> > statement. And I also think it would make the code clearer and less > >> > error prone. > >> > > >> > >> Your comments confirm the bug is rather about optimization. > >> The existing logics of examining thd->main_da->is_error() at the end > >> of query execution does not seem to be compromised. > >> Correct me if it is said wrong. > >> > > > > I don't agree it's only about optimization, fail to check the return > > value early can result in problem. For example, if we got an error in > > mysql_bin_log.write(), it will set thd->main_da->is_error(), and if we > > continue the execution and then encounter another error and try to call > > my_error() to report it, this will cause the assertion in > > Diagnostics_area::set_error_status() to fail because the error status > > has already been set. This problem can be avoided by checking the error > > and quite early. > > I must confess some of my standing points have been connected to 5.4 > code features. > What you have written is actually exclusive 5.1 issue, not 5.4. > Multiple my_error() per query is okay in the latter version. > So more to the optimization the patch is trying to circumevent this > 5.1 limitation as well. > I don't think so, I run the following test and it crashes on 5.4 too: CREATE USER test_user; SET GLOBAL debug="d,injecting_fault_writing"; REVOKE ALL PRIVILEGES, GRANT OPTION FROM test_user; And there is still assertion in Diagnostics_area::set_error_status(): DBUG_ASSERT(! is_set() || can_overwrite_status); > > > >> Considering the current patch, I have some concerns, comments and > >> idea, please read on. > >> > >> CONCERNS: > >> > >> I have noticed some dangerous places. > >> Your patch forces early return from some functions like > >> > >> @@ -2029,8 +2030,9 @@ ndb_binlog_thread_handle_schema_event(TH > >> > ... > > > > Yes, Jonas has already review the NDB part of the fix and he is OK with > > that. > > > > good. > > > >> - > >> + if (thd.binlog_query(THD::ROW_QUERY_TYPE, > >> + row->query.str, row->query.length, > >> + FALSE, FALSE, errcode)) > >> + goto err; > >> + > >> > >> where `goto err' bypasses > >> > >> if (table->s->blob_fields) > >> free_delayed_insert_blobs(table); > >> > >> that could be a part of a cleanup, as I read it. > >> > > > > this cleanup is also done at the end of the function: > > > > err: > > #ifndef DBUG_OFF > > max_rows= 0; // For DBUG output > > #endif > > /* Remove all not used rows */ > > while ((row=rows.get())) > > { > > if (table->s->blob_fields) > > { > > memcpy(table->record[0],row->record,table->s->reclength); > > free_delayed_insert_blobs(table); > > ok > > >> - thd->binlog_query(THD::ROW_QUERY_TYPE, > >> - thd->query, thd->query_length, > >> - trans_table, FALSE, errcode); > >> + if (thd->binlog_query(THD::ROW_QUERY_TYPE, > >> + thd->query, thd->query_length, > >> + trans_table, FALSE, errcode)) > >> + { > >> + table->file->ha_release_auto_increment(); > >> + DBUG_RETURN(1); > >> + } > >> > >> But the resulting new code does not look pretty: > >> > > > > Agree :) > > Why not to make it > > > > >> if (thd->binlog_query(..) > >> { > >> table->file->ha_release_auto_increment(); > >> DBUG_RETURN(1); > >> } > >> table->file->ha_release_auto_increment(); > >> > The above should be: error= <possible handler errors> if (!error && mysql_bin_log.is_open()) { if (thd->binlog_query()) { table->file->ha_release_auto_increment(); DBUG_RETURN(1); } } table->file->ha_release_auto_increment(); if (error) { table->file->print_error(error,MYF(0)); DBUG_RETURN(1); } > error= thd->binlog_query(..) > table->file->ha_release_auto_increment(); > if (error) > DBUG_RETURN(error) > > one line less then? > I'm afraid this is not correct, please check the code above. If we want to only have one ha_release_auto_increment() here, then we need a new variable like 'binlog_error', and change the code like this: error= <possible handler errors> int binlog_error= 0; if (!error && mysql_bin_log.is_open()) { ... binlog_error= thd->binlog_query(); } table->file->ha_release_auto_increment(); if (binlog_error) DBUG_RETURN(1); if (error) { table->file->print_error(error,MYF(0)); DBUG_RETURN(1); } > > > >> > >> Some more findings. > >> > >> @@ -6004,15 +6005,17 @@ bool mysql_revoke_all(THD *thd, List <L > >> > >> VOID(pthread_mutex_unlock(&acl_cache->lock)); > >> > >> - write_bin_log(thd, FALSE, thd->query, thd->query_length); > >> + int binlog_error= > >> + write_bin_log(thd, FALSE, thd->query, thd->query_length); > >> > >> rw_unlock(&LOCK_grant); > >> close_thread_tables(thd); > >> > >> - if (result) > >> + /* error for writing binary log has already been reported */ > >> + if (result && !binlog_error) > >> my_message(ER_REVOKE_GRANTS, ER(ER_REVOKE_GRANTS), MYF(0)); > >> > >> - DBUG_RETURN(result); > >> + DBUG_RETURN(result || binlog_error); > >> } > >> > >> > >> Obviously, the old code is okay: > >> > > > > No, actually the old code will cause problem, because if there is > > already error in write_bin_log, then try to report another error by > > my_message will cause assertion failure. > > well, in 5.1 only. > See above comments. > > > >> if (result) > >> my_message() > >> > >> because the returned combined result > >> + DBUG_RETURN(result || binlog_error); > >> is not yet merged at that point. > >> > >> > >> === modified file 'sql/sql_base.cc' > >> --- a/sql/sql_base.cc 2009-06-19 08:24:43 +0000 > >> +++ b/sql/sql_base.cc 2009-08-20 09:53:53 +0000 > >> @@ -1333,7 +1333,8 @@ void close_thread_tables(THD *thd) > >> handled either before writing a query log event (inside > >> binlog_query()) or when preparing a pending event. > >> */ > >> - thd->binlog_flush_pending_rows_event(TRUE); > >> + /* TODO: I (hezx) think the error here should not be ignored */ > >> + (void)thd->binlog_flush_pending_rows_event(TRUE); > >> mysql_unlock_tables(thd, thd->lock); > >> thd->lock=0; > >> } > >> > >> I don't think you need such TODO here. The error is not ignored but > >> rather its check-out delayed till net_end_statement(). > >> > > > > I mean the code should check the error and return early. > > > > We can say should in 5.1. But you want leave these lines in 5.4. > > > > >> Similar hunk follows: > >> > >> === modified file 'sql/sql_delete.cc' > >> --- a/sql/sql_delete.cc 2009-06-19 08:24:43 +0000 > >> +++ b/sql/sql_delete.cc 2009-08-20 09:53:53 +0000 > >> @@ -841,9 +841,10 @@ void multi_delete::abort() > >> if (mysql_bin_log.is_open()) > >> { > >> int errcode= query_error_code(thd, thd->killed == > THD::NOT_KILLED); > >> - thd->binlog_query(THD::ROW_QUERY_TYPE, > >> - thd->query, thd->query_length, > >> - transactional_tables, FALSE, errcode); > >> + /* possible error of writing binary log is ignored deliberately */ > >> + (void) thd->binlog_query(THD::ROW_QUERY_TYPE, > >> + thd->query, thd->query_length, > >> + transactional_tables, FALSE, errcode); > >> > >> Yet another with the same comment: > >> > > > > This error is ignored deliberately, and is OK. > > > >> @@ -3784,7 +3794,8 @@ void select_create::abort() > >> select_insert::abort(); > >> thd->transaction.stmt.modified_non_trans_table= FALSE; > >> reenable_binlog(thd); > >> - thd->binlog_flush_pending_rows_event(TRUE); > >> + /* possible error of writing binary log is ignored deliberately */ > >> + (void)thd->binlog_flush_pending_rows_event(TRUE); > >> > >> And this one: > >> > > > > here too. > > > >> > >> === modified file 'sql/sql_update.cc' > >> --- a/sql/sql_update.cc 2009-06-18 14:16:14 +0000 > >> +++ b/sql/sql_update.cc 2009-08-20 09:53:53 +0000 > >> @@ -1851,9 +1851,10 @@ void multi_update::abort() > >> into repl event. > >> */ > >> int errcode= query_error_code(thd, thd->killed == > THD::NOT_KILLED); > >> - thd->binlog_query(THD::ROW_QUERY_TYPE, > >> - thd->query, thd->query_length, > >> - transactional_tables, FALSE, errcode); > >> + /* the error of binary logging is ignored */ > >> + (void)thd->binlog_query(THD::ROW_QUERY_TYPE, > > ok > > >> > >> > >> > =============================================================================== > >> COMMENTS: > >> > >> This hunk seems to be okay > >> > >> @@ -1543,7 +1544,11 @@ void close_temporary_tables(THD *thd) > >> s_query.length() - 1 /* to remove trailing ',' > */, > >> 0, FALSE, 0); > >> thd->variables.character_set_client= cs_save; > >> - mysql_bin_log.write(&qinfo); > >> + if (mysql_bin_log.write(&qinfo)) > >> + { > >> + push_warning(thd, MYSQL_ERROR::WARN_LEVEL_ERROR, MYF(0), > >> + "Failed to write the DROP statement for temporary > tables to binary log"); > >> + } > >> thd->variables.pseudo_thread_id= save_pseudo_thread_id; > >> } > >> else > >> > >> though my_error() must have been done inside mysql_bin_log.write(). > >> I wonder if the warning is really necessary. > >> > > > > Agree, and since close_temporary_tables() is called when closing the > > current connection, so I think the warning is no necessary, will be > > removed. > > > > And moreover, this subsequent to the write's my_error() > push_warning() should lead to the assertion, should it not? > No, because push_warning will not set error status if it's already set. The check is done in THD::raise_condition(). > > >> > ============================================================================ > >> QUESTIONS: > >> > >> === modified file 'sql/sql_repl.cc' > >> --- a/sql/sql_repl.cc 2009-05-15 12:57:51 +0000 > >> +++ b/sql/sql_repl.cc 2009-08-20 09:53:53 +0000 > >> @@ -1007,8 +1007,8 @@ int reset_slave(THD *thd, Master_info* m > >> MY_STAT stat_area; > >> char fname[FN_REFLEN]; > >> int thread_mask= 0, error= 0; > >> - uint sql_errno=0; > >> - const char* errmsg=0; > >> + uint sql_errno=ER_UNKNOWN_ERROR; > >> + const char* errmsg= "Unknown error occured while reseting slave"; > >> DBUG_ENTER("reset_slave"); > >> > >> why these changes? > >> > > > > This change is not related to this bug, errmsg can be uninitialized when > > the call to my_stat() fails. > > I think it is worth to note one has to explain details that do not > connect immediately to fixes in cset comments. I believe you're going > to do that. > OK, I'll add per file comment for this > > > >> ========================================================================= > >> REMARKS > >> > >> The base code mysql_grant() calls > >> > >> if (!result) > >> { > >> write_bin_log(thd, TRUE, thd->query, thd->query_length); > >> } > >> > >> rw_unlock(&LOCK_grant); > >> close_thread_tables(thd); > >> > >> if (!result) > >> my_ok(thd); > >> > >> my_ok() not after examining main_da->is_error(). Sounds like an issue. > >> > >> Similarly in mysql_create_db() and some more places. > >> > > > > if (!result) means there is no error, and main_da->is_error() must be > > false in this case. > > It's just a hope. net_end_statement() does it right to separate > failure from ok via testing main_da. > Just one little argument about the error propagation via integer > return value. It just can't work with functions that return an integer > informal results. Like random(). if that could call my_error() inside > we would have to examine main_da not -1 of the return value. > I think there is a difference between the return value of a function and the error in main_da, main_da->is_error() means there is an error that causes current statement to fail, while the return value of a function (if the return value indicates success or failure) only means if there is a problem when executing that function or not. it may or may not set main_da->is_error() when the function fails. But on the other hand, if the function succeeded, it should never set main_da->is_error(). I think this must be followed, otherwise a lot of the code would be break on this. And I think check the return value of the function and do error handling as early as possible should be a coding style that we should follow. For functions like random(), the return value does not indicate failure or success, I think this is another issue. And let's suppose random() would call set is_error() when it failed, then I think you should check is_error() **immediately** after calling random(), not postpone the check to the end of the execution. ------- I think this is correct ---------------- val= random(); if (is_error()) // do error handling else // use the 'val' to continue ---------------------------------------------------- --------- I think this is not ---------------------- val= random(); // use the 'val' to continue .... if (is_error()) // do error handling > > cheers, > > Andrei >