List:Commits« Previous MessageNext Message »
From:Ingo Strüwing Date:November 21 2008 4:59pm
Subject:Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)
Bug#38133
View as plain text  
Hi Guilhem,

thank you for your thoughts. I made a new patch that includes most of
your wishes, as I understood them. I did also make it a bit smaller,
leaving out debug print I don't need any more.

But you write a lot of "could" and "can". Could you please be a bit
clearer, how much you would like me to make the respective changes?
Please see below for comments.

Guilhem Bichot, 14.11.2008 15:21:
...
> Ingo Struewing a écrit, Le 11/13/2008 09:55 AM:
...
>>  2730 Ingo Struewing    2008-11-13
>>       Bug#38133 - Myisamlog test fails on Windows
>>             Updating of a table from the myisam.log file was not
>> possible in most cases.
>>       - The initialization of myisamlog set the number of usable file
>> descriptors
>>         to one (1!) on Windows.
>>       - Every log command to a different file required a close of the
>> previous file.
>>         After a 'close' command the closing of the previous file
>> failed because
>>         all files were closed already and myisamlog stopped.
>>       - File locking on Windows failed for the sequence "exclusive
>> lock, shared lock,
>>         unlock, exclusive lock". It blocked on the last try to acquire
>> an exclusive lock.
>>             Fixed by
>>       - requesting a reasonable number of file descriptors,
>>       - before re-open, close a file only if all file descriptors are
>> in use,
>>       - do not re-open a file, if the command is 'close',
>>       - unlock a file before every lock.
> 
>> per-file messages:
>>   mysys/my_lock.c
>>     Bug#38133 - Myisamlog test fails on Windows
>>     Added unlocking before locking to prevent multiple locks by the
>> same process.
>>   sql/handler.cc
>>     Bug#38133 - Myisamlog test fails on Windows
>>     Changed abort() to DBUG_ABORT().
> 
> you could say why this change was needed.

IMHO file comments should mainly tell what was changed. This can be
valuable information for difficult merges.

The reason for changes is usually better placed in the revision comment,
in the code (if it can be written as a reason for that implementation,
not for a change), or perhaps in the bug report.

For a small change like this, none of these seem appropriate. Especially
as I don't find a written reason for the same change in all the other
uses within sql/handler.cc.

Anyway, the change was unrelated to the bug. So I reverted it.

...
>> === modified file 'mysys/my_lock.c'
>> --- a/mysys/my_lock.c    2008-09-12 12:57:19 +0000
>> +++ b/mysys/my_lock.c    2008-11-13 08:55:37 +0000
>> @@ -71,13 +71,23 @@ static int win_lock(File fd, int locktyp
>>      /* write lock is mapped to an exclusive lock. */
>>      dwFlags= LOCKFILE_EXCLUSIVE_LOCK;
>>  
>> +  /*
>> +    Drop old lock first to avoid double locking.
>> +    During analyze of Bug#38133 (Myisamlog test fails on Windows)
>> +    I met the situation that the program myisamlog locked the file
>> +    exclusively, then additionally shared, then did one unlock, and
>> +    then blocked on an attempt to lock it exclusively again.
>> +    Unlocking before every lock fixed the problem.
>> +  */
>> +  (void) UnlockFileEx(hFile, 0, liLength.LowPart, liLength.HighPart,
>> &ov);
> 
> I'm anxious to have a fix like this which will in theory affect the
> entire server, only to fix something exhibited by myisamlog, without
> having a clear understanding of what is actually wrong (what module is
> guilty).
> When I read the comment above, my impression is that myisamlog is doing
> something wrong:
> - why lock shared if already has an exclusive lock?
> - why re-lock exclusive if already has an exclusive lock?
> It is "expected" that myisamlog does weird things: it's code which was
> dead for >6 years, is very little tested in mysql-test, it may have old
> bugs or new bugs due to other code's evolution. So I'd prefer if we dug
> into myisamlog and fixed myisamlog than the server-wide locking code.
> 
> In other words, I believe the bug could be in the user of mysys
> (myisamlog) more than in mysys. If the server didn't have any problem
> with mysys of before-the-patch, given that the server is much more used
> than myisamlog, I'd trust mysys more than myisamlog.

Here is a trace of the situation with statements from the test case:
create table t1 (a int, b varchar(100)) engine=myisam;
insert into t1 values(1,'life'), (2,'file');
- open and lock for write (exclusive on Windows)
- write (two times)
- unlock
insert into t1 select a*5, concat("A ",b) from t1;
- open and lock for write (exclusive on Windows)
- open second instance and lock for read
  (lock suppressed by mi_lock_database: stronger lock exists)
- write (two times)
- unlock first instance (mi_lock_database wants to turn write lock
  to read lock and thus adds a shared lock on Windows. It expects
  a new lock to replace an old lock)
- unlock second instance (final unlock by mi_lock_database. This
  unlocks the latest lock on Windows: the shared lock. The exclusive
  lock still persists.)
update t1 set b="A knife" where a=5;
- lock (first instance) for write (exclusive on Windows, this one
  hangs as another exclusive lock still exists.)

My initial proposal was to fix it in my_lock() so that its behavior is
consistent over platforms: A new lock replaces an old one.

The second best solution would be to make mi_lock_database() aware of
the different behavior on the platforms IMHO. In case of lock type
changes on Windows, it would first unlock.

But since this does also affect the server, I implemented kind of a
"behavior prediction engine" in myisamlog. Depending on the current
locks on the share, and the platform, I release a lock before calling
mi_lock_database().

But I wonder how you can be so sure that "the server didn't have any
problem with mysys of before-the-patch", given that the server is much
more used than myisamlog". How much do we test with --external-locking
on Windows?

...
>> @@ -186,6 +194,14 @@ int mi_examine_log(MI_EXAMINE_LOG_PARAM    (void)
>> init_key_cache(dflt_key_cache,KEY_CACHE_BLOCK_SIZE,KEY_CACHE_SIZE,
>>                          0, 0);
>>  
>> +  /*
>> +    Initialize members of file_info that are used for pointing to
>> +    allocated memory. At the error labels we want to be able to free it.
>> +  */
>> +  file_info.name= NULL;
>> +  file_info.show_name= NULL;
>> +  file_info.record= NULL;
> 
> A super-critical comment: you could have put all inits on one line:
> file_info.name= file_info.show_name= file_info.record= NULL;

Here's one of these "could"s. And yes, you are right. I could have
written it on one line. And I did know that before. It is just so that
for my old eyes the one-line writing is worse to parse than the three
line writing. When reading in the code below and raising the question
"has file_info.show_name been initialized above?", I would find it among
the one-assignment-per-line faster than among the
as-many-assignments-as-you-can-per-line.

So if you "could" is weaker than "it is our coding style", then I'd just
ignore it here.

...
>> +    /*
>> +      'command' is a number that is used to index arrays. Better check
>> +      it for range.
>> +    */
>> +    if (command >= MI_LOG_END_SENTINEL)
>> +    {
>> +      my_errno= HA_ERR_WRONG_COMMAND;
>> +      fprintf(stderr,"Unknown command %u in logfile at position %s\n",
>> +              command, llstr(isamlog_filepos, llbuff));
>> +      fflush(stderr);
>> +      goto end;
> 
> After "end:" we don't make use of my_errno, so no need to set it.
> Or, set it, but use "goto err", in that case you can remove the
> fflush(stderr).

Ok, removed. I trust you that nobody will ever expect
mi_examine_log() to set my_errno on error.

The other option would print a second, confusing error message.

...
>> === modified file 'storage/myisam/mi_locking.c'
>> --- a/storage/myisam/mi_locking.c    2008-09-09 19:02:38 +0000
>> +++ b/storage/myisam/mi_locking.c    2008-11-13 08:55:37 +0000
>> @@ -104,7 +104,11 @@ int mi_lock_database(MI_INFO *info, int
>>            mi_print_error(share, HA_ERR_CRASHED);
>>            mi_mark_crashed(info);
>>          }
>> -    if (info->lock_type != F_EXTRA_LCK)
>> +        /*
>> +          If we have a pseudo lock (F_EXTRA_LCK) or a temporary table,
>> +          skip file locking.
>> +        */
>> +        if ((info->lock_type != F_EXTRA_LCK) &&
> !share->temporary)
>>      {
>>        if (share->r_locks)
>>        {                    /* Only read locks left */
> 
> You explained that this change's reason was:
> "I believe, I have seen cases where a temporary table did not take a
> F_EXTRA_LCK lock. I want to prevent locking in this case too."
> 
> It is a desirable goal, but more of an optimization than a bug fix.
> Also, normally, there should not be such cases: temporary tables should
> always use F_EXTRA_LCK, like here:
> 
> int ha_myisam::external_lock(THD *thd, int lock_type)
> {
>   file->in_use.data= thd;
>   return mi_lock_database(file, !table->s->tmp_table ?
>               lock_type : ((lock_type == F_UNLCK) ?
>                        F_UNLCK : F_EXTRA_LCK));
> }
> 
> There is this code at start of mi_lock_database():
>   if (lock_type == F_EXTRA_LCK)                 /* Used by TMP tables */
>   {
>     ++share->w_locks;
>     ++share->tot_locks;
>     info->lock_type= lock_type;
>     info->s->in_use= list_add(info->s->in_use, &info->in_use);
>     DBUG_RETURN(0);
>   }
> I believe this is the optimized code which is intended for temporary
> tables. If myisamlog does not pass F_EXTRA_LCK, it may be because it's a
> too old program which needs some fixing. By the way, myisamlog should
> not log anything about temporary tables at all.
> 
> I suggest to add
> 
> else
>   DBUG_ASSERT(!share->temporary);
> 
> to the if() block just above, run the tests and see what happens.

You mean the following?

  if (lock_type == F_EXTRA_LCK)                 /* Used by TMP tables */
  {
    ++share->w_locks;
    ++share->tot_locks;
    info->lock_type= lock_type;
    info->s->in_use= list_add(info->s->in_use, &info->in_use);
    DBUG_RETURN(0);
  }
  else if (lock_type != F_UNLOCK)
    DBUG_ASSERT(!share->temporary);

> 
> Another note: isn't it worrying that if short of file descriptors,
> close_some_file() closes a file which may be locked? I hope that the
> mi_close() code is ready for such awkward situation, but I'm not sure.
> Maybe close_some_file() should, if appropriate, unlock the table before
> closing it.

mi_close() contains:

  if (share->reopen == 1 && share->kfile >= 0)
    _mi_decrement_open_count(info);

  if (info->lock_type != F_UNLCK)
  {
    if (mi_lock_database(info,F_UNLCK))
      error=my_errno;
  }

_mi_decrement_open_count() calls mi_lock_database(info,F_WRLCK), updates
open_count and calls mi_lock_database(info,old_lock).

This means:

Even for a temporary table, a write lock is taken, regardless, what the
previous lock was. Thereafter the old lock is taken back, regardless,
what it was. This could make problems on Windows.

At least I need to remove DBUG_ASSERT(!share->temporary). It crashed in
alter_table.test, mix2_myisam.test, temp_table.test, and
rpl_temp_table.test.

The updated patch: http://lists.mysql.com/commits/59568

Regards
Ingo
-- 
Ingo Strüwing, Database Group
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schröder, Wolfgang Engels, Dr. Roland Bömer
Vorsitzender des Aufsichtsrates: Martin Häring   HRB München 161028





Thread
bzr commit into mysql-6.0-backup branch (ingo.struewing:2730) Bug#38133Ingo Struewing13 Nov
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot14 Nov
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing21 Nov
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot27 Nov
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing29 Nov
          • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot4 Dec
            • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing5 Dec