List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:November 27 2008 8:42pm
Subject:Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)
Bug#38133
View as plain text  
Hello Ingo,

Ingo Strüwing a écrit, Le 11/21/2008 05:59 PM:
> Hi Guilhem,
> 
> thank you for your thoughts. I made a new patch that includes most of
> your wishes, as I understood them. I did also make it a bit smaller,
> leaving out debug print I don't need any more.
> 
> But you write a lot of "could" and "can". Could you please be a bit
> clearer, how much you would like me to make the respective changes?

Normally my "could" or "can" are optional. They propose something, which 
you use if you like.

> Please see below for comments.
> 
> Guilhem Bichot, 14.11.2008 15:21:
> ...
>> Ingo Struewing a écrit, Le 11/13/2008 09:55 AM:
> ...
>>>  2730 Ingo Struewing    2008-11-13
>>>       Bug#38133 - Myisamlog test fails on Windows
>>>             Updating of a table from the myisam.log file was not
>>> possible in most cases.
>>>       - The initialization of myisamlog set the number of usable file
>>> descriptors
>>>         to one (1!) on Windows.
>>>       - Every log command to a different file required a close of the
>>> previous file.
>>>         After a 'close' command the closing of the previous file
>>> failed because
>>>         all files were closed already and myisamlog stopped.
>>>       - File locking on Windows failed for the sequence "exclusive
>>> lock, shared lock,
>>>         unlock, exclusive lock". It blocked on the last try to acquire
>>> an exclusive lock.
>>>             Fixed by
>>>       - requesting a reasonable number of file descriptors,
>>>       - before re-open, close a file only if all file descriptors are
>>> in use,
>>>       - do not re-open a file, if the command is 'close',
>>>       - unlock a file before every lock.
>>> per-file messages:
>>>   mysys/my_lock.c
>>>     Bug#38133 - Myisamlog test fails on Windows
>>>     Added unlocking before locking to prevent multiple locks by the
>>> same process.
>>>   sql/handler.cc
>>>     Bug#38133 - Myisamlog test fails on Windows
>>>     Changed abort() to DBUG_ABORT().
>> you could say why this change was needed.
> 
> IMHO file comments should mainly tell what was changed. This can be
> valuable information for difficult merges.
> 
> The reason for changes is usually better placed in the revision comment,
> in the code (if it can be written as a reason for that implementation,
> not for a change), or perhaps in the bug report.
> 
> For a small change like this, none of these seem appropriate. Especially
> as I don't find a written reason for the same change in all the other
> uses within sql/handler.cc.

But you did the change for a good reason (popup under Windows), and it 
is good to tell people about this reason, in case they wonder (at review 
time, or later when they study revision history); for me the file's 
revision comment is appropriate.

> Anyway, the change was unrelated to the bug. So I reverted it.

ok. It would have been fine to keep it, it's so small and unquestionable.

> ...
>>> === modified file 'mysys/my_lock.c'
>>> --- a/mysys/my_lock.c    2008-09-12 12:57:19 +0000
>>> +++ b/mysys/my_lock.c    2008-11-13 08:55:37 +0000
>>> @@ -71,13 +71,23 @@ static int win_lock(File fd, int locktyp
>>>      /* write lock is mapped to an exclusive lock. */
>>>      dwFlags= LOCKFILE_EXCLUSIVE_LOCK;
>>>  
>>> +  /*
>>> +    Drop old lock first to avoid double locking.
>>> +    During analyze of Bug#38133 (Myisamlog test fails on Windows)
>>> +    I met the situation that the program myisamlog locked the file
>>> +    exclusively, then additionally shared, then did one unlock, and
>>> +    then blocked on an attempt to lock it exclusively again.
>>> +    Unlocking before every lock fixed the problem.
>>> +  */
>>> +  (void) UnlockFileEx(hFile, 0, liLength.LowPart, liLength.HighPart,
>>> &ov);
>> I'm anxious to have a fix like this which will in theory affect the
>> entire server, only to fix something exhibited by myisamlog, without
>> having a clear understanding of what is actually wrong (what module is
>> guilty).
>> When I read the comment above, my impression is that myisamlog is doing
>> something wrong:
>> - why lock shared if already has an exclusive lock?
>> - why re-lock exclusive if already has an exclusive lock?
>> It is "expected" that myisamlog does weird things: it's code which was
>> dead for >6 years, is very little tested in mysql-test, it may have old
>> bugs or new bugs due to other code's evolution. So I'd prefer if we dug
>> into myisamlog and fixed myisamlog than the server-wide locking code.
>>
>> In other words, I believe the bug could be in the user of mysys
>> (myisamlog) more than in mysys. If the server didn't have any problem
>> with mysys of before-the-patch, given that the server is much more used
>> than myisamlog, I'd trust mysys more than myisamlog.
> 
> Here is a trace of the situation with statements from the test case:
> create table t1 (a int, b varchar(100)) engine=myisam;
> insert into t1 values(1,'life'), (2,'file');
> - open and lock for write (exclusive on Windows)
> - write (two times)
> - unlock
> insert into t1 select a*5, concat("A ",b) from t1;
> - open and lock for write (exclusive on Windows)
> - open second instance and lock for read
>   (lock suppressed by mi_lock_database: stronger lock exists)
> - write (two times)
> - unlock first instance (mi_lock_database wants to turn write lock
>   to read lock and thus adds a shared lock on Windows. It expects
>   a new lock to replace an old lock)
> - unlock second instance (final unlock by mi_lock_database. This
>   unlocks the latest lock on Windows: the shared lock. The exclusive
>   lock still persists.)
> update t1 set b="A knife" where a=5;
> - lock (first instance) for write (exclusive on Windows, this one
>   hangs as another exclusive lock still exists.)

If I understand well, all the locks which you describe above happen in 
the Server when running with --external-locking?
So you mean that the hang happens in the Server too, if using 
--external-locking? It's not a myisamlog-specific problem?

> My initial proposal was to fix it in my_lock() so that its behavior is
> consistent over platforms: A new lock replaces an old one.
> 
> The second best solution would be to make mi_lock_database() aware of
> the different behavior on the platforms IMHO. In case of lock type
> changes on Windows, it would first unlock.
> 
> But since this does also affect the server, I implemented kind of a
> "behavior prediction engine" in myisamlog. Depending on the current
> locks on the share, and the platform, I release a lock before calling
> mi_lock_database().

Now I'm confused. If the hang can also exist in the Server, why fix only 
myisamlog?


> But I wonder how you can be so sure that "the server didn't have any
> problem with mysys of before-the-patch", given that the server is much
> more used than myisamlog". How much do we test with --external-locking
> on Windows?

I had not understood that the bug was affecting the Server too.

> ...
>>> @@ -186,6 +194,14 @@ int mi_examine_log(MI_EXAMINE_LOG_PARAM    (void)
>>> init_key_cache(dflt_key_cache,KEY_CACHE_BLOCK_SIZE,KEY_CACHE_SIZE,
>>>                          0, 0);
>>>  
>>> +  /*
>>> +    Initialize members of file_info that are used for pointing to
>>> +    allocated memory. At the error labels we want to be able to free it.
>>> +  */
>>> +  file_info.name= NULL;
>>> +  file_info.show_name= NULL;
>>> +  file_info.record= NULL;
>> A super-critical comment: you could have put all inits on one line:
>> file_info.name= file_info.show_name= file_info.record= NULL;
> 
> Here's one of these "could"s. And yes, you are right. I could have
> written it on one line. And I did know that before. It is just so that
> for my old eyes the one-line writing is worse to parse than the three
> line writing. When reading in the code below and raising the question
> "has file_info.show_name been initialized above?", I would find it among
> the one-assignment-per-line faster than among the
> as-many-assignments-as-you-can-per-line.

Ok.

> So if you "could" is weaker than "it is our coding style", then I'd just
> ignore it here.

It's weaker, it's a suggestion. It's really up to you. Ok for 3 lines.

>>> +    /*
>>> +      'command' is a number that is used to index arrays. Better check
>>> +      it for range.
>>> +    */
>>> +    if (command >= MI_LOG_END_SENTINEL)
>>> +    {
>>> +      my_errno= HA_ERR_WRONG_COMMAND;
>>> +      fprintf(stderr,"Unknown command %u in logfile at position %s\n",
>>> +              command, llstr(isamlog_filepos, llbuff));
>>> +      fflush(stderr);
>>> +      goto end;
>> After "end:" we don't make use of my_errno, so no need to set it.
>> Or, set it, but use "goto err", in that case you can remove the
>> fflush(stderr).
> 
> Ok, removed. I trust you that nobody will ever expect
> mi_examine_log() to set my_errno on error.

I believe no one should expect this, as it's not announced by the 
Doxygen comment of mi_examine_log(). For me, setting my_errno is part of 
the API. If it's not documented, it does not have to be...
I looked at the two only callers of mi_examine_log() (myisamlog.c and 
myisam_backup_engine.cc and they don't read my_errno.

But, I'm fine if you make mi_examine_log() set my_errno, and document 
this in the function's Doxygen comment.

> 
> The other option would print a second, confusing error message.
> 
> ...
>>> === modified file 'storage/myisam/mi_locking.c'
>>> --- a/storage/myisam/mi_locking.c    2008-09-09 19:02:38 +0000
>>> +++ b/storage/myisam/mi_locking.c    2008-11-13 08:55:37 +0000
>>> @@ -104,7 +104,11 @@ int mi_lock_database(MI_INFO *info, int
>>>            mi_print_error(share, HA_ERR_CRASHED);
>>>            mi_mark_crashed(info);
>>>          }
>>> -    if (info->lock_type != F_EXTRA_LCK)
>>> +        /*
>>> +          If we have a pseudo lock (F_EXTRA_LCK) or a temporary table,
>>> +          skip file locking.
>>> +        */
>>> +        if ((info->lock_type != F_EXTRA_LCK) &&
> !share->temporary)
>>>      {
>>>        if (share->r_locks)
>>>        {                    /* Only read locks left */
>> You explained that this change's reason was:
>> "I believe, I have seen cases where a temporary table did not take a
>> F_EXTRA_LCK lock. I want to prevent locking in this case too."
>>
>> It is a desirable goal, but more of an optimization than a bug fix.
>> Also, normally, there should not be such cases: temporary tables should
>> always use F_EXTRA_LCK, like here:
>>
>> int ha_myisam::external_lock(THD *thd, int lock_type)
>> {
>>   file->in_use.data= thd;
>>   return mi_lock_database(file, !table->s->tmp_table ?
>>               lock_type : ((lock_type == F_UNLCK) ?
>>                        F_UNLCK : F_EXTRA_LCK));
>> }
>>
>> There is this code at start of mi_lock_database():
>>   if (lock_type == F_EXTRA_LCK)                 /* Used by TMP tables */
>>   {
>>     ++share->w_locks;
>>     ++share->tot_locks;
>>     info->lock_type= lock_type;
>>     info->s->in_use= list_add(info->s->in_use,
> &info->in_use);
>>     DBUG_RETURN(0);
>>   }
>> I believe this is the optimized code which is intended for temporary
>> tables. If myisamlog does not pass F_EXTRA_LCK, it may be because it's a
>> too old program which needs some fixing. By the way, myisamlog should
>> not log anything about temporary tables at all.
>>
>> I suggest to add
>>
>> else
>>   DBUG_ASSERT(!share->temporary);
>>
>> to the if() block just above, run the tests and see what happens.
> 
> You mean the following?
> 
>   if (lock_type == F_EXTRA_LCK)                 /* Used by TMP tables */
>   {
>     ++share->w_locks;
>     ++share->tot_locks;
>     info->lock_type= lock_type;
>     info->s->in_use= list_add(info->s->in_use, &info->in_use);
>     DBUG_RETURN(0);
>   }
>   else if (lock_type != F_UNLOCK)
>     DBUG_ASSERT(!share->temporary);

yes, that would be the correct assertion; I had missed the F_UNLOCK case.

>> Another note: isn't it worrying that if short of file descriptors,
>> close_some_file() closes a file which may be locked? I hope that the
>> mi_close() code is ready for such awkward situation, but I'm not sure.
>> Maybe close_some_file() should, if appropriate, unlock the table before
>> closing it.
> 
> mi_close() contains:
> 
>   if (share->reopen == 1 && share->kfile >= 0)
>     _mi_decrement_open_count(info);
> 
>   if (info->lock_type != F_UNLCK)
>   {
>     if (mi_lock_database(info,F_UNLCK))
>       error=my_errno;
>   }
> 
> _mi_decrement_open_count() calls mi_lock_database(info,F_WRLCK), updates
> open_count and calls mi_lock_database(info,old_lock).
> 
> This means:
> 
> Even for a temporary table, a write lock is taken, regardless, what the
> previous lock was. Thereafter the old lock is taken back, regardless,
> what it was. This could make problems on Windows.

Ok, and this would extend beyond your bugfix.
In the code above I see that, if the table is still locked after 
_mi_decrement_open_count(), mi_lock_database(info,F_UNLCK) is called. So 
when the file is finally closed it has been unlocked already, which I 
was not sure about.

> 
> At least I need to remove DBUG_ASSERT(!share->temporary). It crashed in
> alter_table.test, mix2_myisam.test, temp_table.test, and
> rpl_temp_table.test.

It's excellent that you tested the idea. So indeed, it proves that 
F_WRLCK is used for temp table sometimes... Then could you please change 
this comment:
   if (lock_type == F_EXTRA_LCK)                 /* Used by TMP tables */
by adding "sometimes" in the comment?

> The updated patch: http://lists.mysql.com/commits/59568

Could you please tell me against what revision id this patch is, so that 
I can apply it?
I recently changed our commit emailer so that such information is 
automatically included in commit mails.
Thread
bzr commit into mysql-6.0-backup branch (ingo.struewing:2730) Bug#38133Ingo Struewing13 Nov
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot14 Nov
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing21 Nov
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot27 Nov
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing29 Nov
          • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot4 Dec
            • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing5 Dec