List:Commits« Previous MessageNext Message »
From:Ingo Strüwing Date:December 5 2008 12:40pm
Subject:Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)
Bug#38133
View as plain text  
Hi Guilhem,

Guilhem Bichot, 04.12.2008 23:05:
...
> Ingo Strüwing a écrit, Le 11/29/2008 04:07 PM:
...
>> Guilhem Bichot, 27.11.2008 21:42:
>> ...
>>> Ingo Strüwing a écrit, Le 11/21/2008 05:59 PM:
>> ...
>>>> Guilhem Bichot, 14.11.2008 15:21:
>>>> ...
>>>>> Ingo Struewing a écrit, Le 11/13/2008 09:55 AM:
>>>> ...
>>>>>>  2730 Ingo Struewing    2008-11-13
>>>>>>       Bug#38133 - Myisamlog test fails on Windows
...
> Given the new facts which you explained... I am sorry... I am going to
> ask you to instead exhume your fix of my_lock.c as in
> http://lists.mysql.com/commits/58606 .

Ok. I will take that and add the changes, we agreed upon.

> 
> By the way, that fix to my_lock.c which unlocks before locking: this
> opens the possibility of a race: does the existing usage of my_lock
> expect that putting a 2nd lock atomically replaces the first lock (with
> no window for any other process acquiring the lock during replacement)?
> You mentioned that problem in a comment in
> http://bugs.mysql.com/bug.php?id=41124
> Imagine that process P1 does my_lock to change from write to read lock:
> it first unlocks the file (in your patch), and before it can re-lock it
> for read, then another process P2 comes and locks the file and reads its
> state for example. When P2 is done, P1 can finish its my_lock (get a
> read lock) and then unlock it in mi_lock_database(F_UNLCK), and what
> worries me is that things happen in that last call: it may write the
> state, but this state is out-of-date (P2 has done some changes to the
> table) so P2's state is going to be overwritten by P1's old state.
> Perhaps a way to test this is to have a test with external-locking,
> multiple clients, and a DBUG_EXECUTE_IF sleep between unlock and re-lock
> in my_lock, to see if a client going through the window hurts.

I can think of a much worse scenario. While server has the write lock,
myisampack is started on the table. It blocks, waiting for its own write
lock. When server unlocks before read lock, myisampack gets the lock
immediately...

There cannot be a doubt that the race exists. Unfortunately, neither you
nor me know of a way to avoid it. Unless, on Windows, we stay with the
write lock until the final unlock. But that would require a change in
mi_lock_database() and perhaps other uses of my_lock().

My point is the assumed low probability for the race to happen. The use
of --external-locking with the server on Windows is broken, regardless
if we unlock before locking or not. In one case we risk a corrupted
table, which should be repairable. In the other case we risk a hang of
the server after certain statements, which requires a kill with Windows
mechanism (and which could also cause corrupted tables). Anyway, who
will use --external-locking at all, and especially on Windows? There are
two possible reasons to use it. 1. Two or more servers co-working on
common tables. 2. Use of myisamchk while the server is running. For 1),
I don't see a use case these days, 2) is avoidable with administrative
precautions, but then --external-locking isn't required anyway.

In the clients, external locking is always used implicitly. But here I
don't expect the race to be effective. Unless one runs two clients on
the same table at the same time.

> 
> You also showed this scenario:
> 
> <quote>
> create table t1 (a int, b varchar(100)) engine=myisam;
> insert into t1 values(1,'life'), (2,'file');
> - open and lock for write (exclusive on Windows)
> - write (two times)
> - unlock
> insert into t1 select a*5, concat("A ",b) from t1;
> - open and lock for write (exclusive on Windows)
> - open second instance and lock for read
>   (lock suppressed by mi_lock_database: stronger lock exists)
> - write (two times)
> - unlock first instance (mi_lock_database wants to turn write lock
>   to read lock and thus adds a shared lock on Windows. It expects
>   a new lock to replace an old lock)
> - unlock second instance (final unlock by mi_lock_database. This
>   unlocks the latest lock on Windows: the shared lock. The exclusive
>   lock still persists.)
> </quote>
> 
> And this confused me: "(lock suppressed by mi_lock_database: stronger
> lock exists)": I see in mi_lock_database() that if info->lock_type is
> write-lock, and a read-lock is requested, mi_lock_database() does not
> suppress that read-lock request, it rather downgrades write-lock to
> read-lock.

This happens when the same instance of the table (the same MI_INFO) has
a write lock and wants a read lock. And if its write lock is the only
lock on the share (the MYISAM_SHARE).

Here it is the "second instance" (a different MI_INFO), that wants the
read lock. It didn't have a lock yet, so it skips that if clause. And it
finds the share having a write lock already (from the first instance),
so it skips that if clause too. No call to my_lock() is done.

> 
> And there is a comment in mi_lock_database() which says "mysqld does not
> turn write locks to read locks", has it missed external-locking?

No. The decision, if to do external locking or not, is done inside
my_lock(). mi_lock_database() just calls it when locking would be
required. I guess, you missed the "second instance" thing.

> 
> I wondered if fcntl(), which my_lock.c uses, made this guarantee of
> atomicity when replacing an old lock, but "man fcntl" is unclear, and so
> is "man lockf".

I would very much guess that it is safe. The man page talks of lock type
"conversion". But I agree that the documentation is not strong enough to
be fully convinced. And we didn't even check the man pages of all
platforms, we port to...

In the opengroup documentation, I found a paragraph that could give us
more confidence: http://www.opengroup.org/onlinepubs/009695399/
"
There shall be at most one type of lock set for each byte in the file.
Before a successful return from an F_SETLK or an F_SETLKW request when
the calling process has previously existing locks on bytes in the region
specified by the request, the previous lock type for each byte in the
specified region shall be replaced by the new lock type. As specified
above under the descriptions of shared locks and exclusive locks, an
F_SETLK or an F_SETLKW request (respectively) shall fail or block when
another process has existing locks on bytes in the specified region and
the type of any of those locks conflicts with the type specified in the
request.
"
But there is no guarantee that all our target systems follow it.

...
>>> I believe no one should expect this, as it's not announced by the
>>> Doxygen comment of mi_examine_log(). For me, setting my_errno is part of
>>> the API. If it's not documented, it does not have to be...
>>
>> It's always amazing to hear an experienced MySQL developer say that. I
>> have a limited comprehension for Monty's standpoint: "The source code is
>> the ultimate documentation". But Doxygen? How many MyISAM functions have
>> Doxygen comments? Not even mysys has complete Doxygen docs. At least I
>> don't know about them. It's often unclear, what to expect from mysys
>> functions. See our problem with my_lock().
> 
> Ah wait, you removed a piece of my reply, which was "I looked at the two
> only callers of mi_examine_log() (myisamlog.c and
> myisam_backup_engine.cc and they don't read my_errno."
> That is:
> - in code currently nobody reads my_errno after mi_examine_log()
> - so practical use of the API does not require it
> - and theoretical use of the API (=comments) neither
> - so we are free to declare that mi_examine_log() does not need to set
> my_errno: doesn't break code and API.

Absolutely agree. It was my quote (again, sigh) that was confusing. My
comment aimed at "If it's not documented, it does not have to be..."
This is a commonly raised sentence, I'm not happy with.

...
>> BTW, I wonder why _mi_decrement_open_count() needs to lock temporary
>> tables at all?
> 
> It does not need to lock the file, but who knows, mi_lock_database() is
> more than a file lock, maybe it does important things; I'd stay away
> from optimizations.

Yes, I don't want to change it now. But what I mean: temporary tables
are not locked with lock_tables(), nor with mysql_lock_tables(), nor
with lock_external(), nor with handler::ha_external_lock(), and hence
not with mi_lock_database(). So it looks dubious to use
mi_lock_database() on temporary tables, just for decrementing the open
count.

OTOH, the heading comment in lock.cc suggests that there is a difference
between temporary tables and "temporary non transactional" tables.
Perhaps MyISAM cannot distinguish them.

Regards
Ingo
-- 
Ingo Strüwing, Database Group
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schröder, Wolfgang Engels, Dr. Roland Bömer
Vorsitzender des Aufsichtsrates: Martin Häring   HRB München 161028
Thread
bzr commit into mysql-6.0-backup branch (ingo.struewing:2730) Bug#38133Ingo Struewing13 Nov
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot14 Nov
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing21 Nov
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot27 Nov
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing29 Nov
          • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Guilhem Bichot4 Dec
            • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2730)Bug#38133Ingo Strüwing5 Dec