List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:June 10 2009 8:26am
Subject:Re: review of Performance Schema WL#2360: thread-safety
View as plain text  
Hello Marc,

Marc Alff a écrit, Le 10.06.2009 02:26:
> Hi Guilhem
> Guilhem Bichot wrote:

>> There are several protections:
>> a) pfs->m_lock.is_populated()
>> b1) pfs->m_lock.begin and b2) end_optimist_lock()
>> c) dest->m_wait_class= NO_WAIT_CLASS (copy_events_waits() for history
>> tables, and get_thread_*_locker_v1() for current events).
>> This is fine but a bit more than usual straightforward protection
>> methods like mutex/rwlock, so I think it would deserve a descriptive
>> comment in some central place, saying:
>> - what each of them protects against
>> - what kind of badness is allowed in the returned data.
>> There *are* comments spread in different places, maybe it's just a
>> matter of reorganizing them a bit? Here is some raw input for the
>> comment if you like. Looks like the logic is, when reading a row:
>> - first, in rnd_*, with (a), see if this row is filled - if no, get out
>> - then in make_row() remember with (b1) the version of this row
>> - then read the row into some private buffer (m_row)
>> - then check with (b2) if the version changed (i.e. row was freed and
>> is being reused) - if yes, mark row as phantom
>> - then in read_row_values(), if row is phantom, don't return it,
>> otherwise transform it into a row readable by the SQL layer.
>> And all this is allowed (due to non-atomic reads, out-of-order
>> execution) to occasionally return bad data ("bad" is to be defined) to
>> SELECT.
>> (a) cannot fail
>> (b) can fail because it uses non-atomic reads in end_optimist_lock()
>> when reading the current m_version (could be reading from an old CPU
>> cache value in theory)
>> (c) can fail because of out-of-order execution.
>> -> and that's why occasional garbage can be seen.
>>
>> I guess that above we could collapse "see if this row is filled" with
>> "remember the version of this row" in one single call. Something like
>> (my function has a bad name):
>> bool pfs_lock::is_populated_and_begin_optimistic_locking(struct pfs_lock
>> *copy)
>> {
>>   int32 copy_state= m_state; /* non volatile copy, and dirty read */
>>   /* snapshot record's states to later catch changes under out feet */
>>   copy->m_version= m_version;
>>   copy->m_state= copy_state;
>>   /* tell if record contains readable content */
>>   return ((copy_state == PFS_LOCK_ALLOCATED) || (copy_state ==
>> PFS_LOCK_RECYCLE));
>> }
>> This way won't work because pfs_lock (passed as "copy" parameter to
>> the function above) is currently local to
>> table_events_waits_common::make_row() so not visible to rnd_*();
>> maybe it can instead be local to rnd_*() and passed in parameter to
>> make_row().
>> To me, collapsing makes sense. If I am a human doing the job, I'd
>> rather look at m_state and memorize its value at the same time, than
>> do two trips at two different moments.
>> After doing this, this piece of end_optimistic_locking() can be
>> removed:
>>           ((copy->m_state == PFS_LOCK_ALLOCATED) ||
>>            (copy->m_state == PFS_LOCK_RECYCLE)));
>> because it is already checked by
>> is_populated_and_begin_optimistic_locking(). That is, we avoid
>> checking this state condition twice (right now we check it in
>> is_populated() and in end_optimist_lock()).
>> The only objection I see to doing this change is that we currently
>> don't do optimistic locking if "!thread_own_wait" in
>> table_events_waits_common::make_row(). However, that second step is
>> only a copy of two members, it may be worth losing time here if it
>> makes the implementation more simple.
> 
> In other words, you want to change the design and the code structure so
> that:
> - begin_optimistic_lock()
> - end_optimistic_lock()
> are not in the same function anymore (which I see as bad), just to save
> two copy of an int.

If that was the reason, that wouldn't be good.
I was not thinking about performance reasons. I find it natural to read 
the state and record the version of this state in one go, then do 
something else (build m_row), then check the new version without having 
to look at the state again. More natural than reading the state, doing 
something else (entering in another function), recording the version, 
doing something else (building m_row), then checking the new version and 
the new state.

> No change.

As you like.

>> In
>> unlock_rwlock_v1(): see my two "//" comments below
>>   if (pfs_rwlock->m_writer)
>>   {
>>     if (pfs_rwlock->m_class->m_timed)
>>     {
>>       locked_time= get_timer_value(wait_timer) -
>> pfs_rwlock->m_last_written;
>>       aggregate_single_stat_chain(& pfs_rwlock->m_write_lock_stat,
>> locked_time);
>>     }
>>     pfs_rwlock->m_writer= NULL;
>>     // if we come here we had a write-lock so nobody has a read-lock
>>     // so m_readers is already 0, no need to do it below, same for
>>     // m_last_read. Can be removed or replaced by assertions.

So here, my comment was ignored as I still see:
     /* Reset the readers stats, they could be off */
     pfs_rwlock->m_readers= 0;
     pfs_rwlock->m_last_read= 0;
How could they possibly be off?

>>     pfs_rwlock->m_readers= 0;
>>     pfs_rwlock->m_last_written= 0;
>>     pfs_rwlock->m_last_read= 0;
>>   }
>>   else if (pfs_rwlock->m_readers == 1)
>>   {
>>     if (pfs_rwlock->m_class->m_timed)
>>     {
>>       locked_time= get_timer_value(wait_timer) - pfs_rwlock->m_last_read;
>>       aggregate_single_stat_chain(& pfs_rwlock->m_read_lock_stat,
>> locked_time);
>>     }
>>     // if we come here, nobody has a write-lock so pfs_rwlock->m_writer
>>     // is already NULL (was always NULL or was set to NULL by
>>     // the last rwlock write-unlock), so no need to do it below; same for
>>     // m_last_written. Can be removed or replaced by assertions.

And here I still see
      pfs_rwlock->m_last_written= 0;
why?

>>     pfs_rwlock->m_writer= NULL;
>>     pfs_rwlock->m_readers= 0;
>>     pfs_rwlock->m_last_written= 0;
>>     pfs_rwlock->m_last_read= 0;

>> In end_rwlock_rdwait_v1(), "rwlock->m_writer= NULL;" is not needed:
>> either the rwlock was never locked then this is NULL, or it was
>> write-locked and then unlocked which also set this to NULL.

I still see
       rwlock->m_writer= NULL;
in end_rwlock_rdwait_v1(), why?

>> In end_rwlock_wrwait_v1(), "rwlock->m_readers= 0; rwlock->m_last_read=
>> 0;" is not needed: nobody has a read-lock so those variables are
>> already 0.

I still see
       rwlock->m_readers= 0;
       rwlock->m_last_read= 0;
in end_rwlock_wrwait_v1(), why?

> Rewrote end_rwlock_rdwait_v1().
> 

>> If mysqld runs with --myisam-use-mmap, most read/write operations to
>> data files of MyISAM tables happen through mmap()-ed memory instead of
>> traditional file operations; for example: mi_mmap_pread(). Those
>> mmap()-ed operations are not yet recorded by P_S; that may skew
>> statistics and confuse the user. Could you please make a note of this
>> in the manual (for users of P_S) and in the function comments of
>> mi_mmap_pread() and mi_mmap_pwrite()? Btw I hadn't noticed it, but
>> found it while reading Kay's comment on Apr 1st in
>> http://bugs.mysql.com/bug.php?id=44013 .
> 
> [Paul:]
> 
> I will let the documentation team decide.
> 
> My opinion is that the manual for the performance schema is not the
> place to document the internal behavior of the entire server code.
> The role of the performance schema is to expose that behavior, not
> justify or explain it.

You mean that as P_S does not instrument memory, and --myisam-mmap uses 
memory for I/O, then it's logical that MyISAM I/O does not show up in 
P_S. I agree, though I also let the docs team decide whether this 
deserves a note.

>> Functions which have no input parameter could be specified as f(void)
>> instead of f(), for example reset_events_waits_history_long(); I
>> remember you only fixed some places in a previous patch.
>> cd storage/perfschema; egrep "\(\)\s*$" *.h *.cc
>> catches most of them.
> 
> Changed, not fixed.

Thanks for doing the change anyway. Regarding why it is a fix,
as I wrote in a previous mail:
if decl.h contains
  int f();
and def.c contains
  int f(char *x) { return *x;}
and use.c contains
  #include "decl.h"
  printf("%d",f(3,26));
and we compile this, gcc will not complain (because "f()" is K&R valid 
style), and so the mismatch between definition and use will not be 
caught, and the program will behave strangely at runtime (segfault 
here). Of course such bad code would not be written this way in the 
beginning, but assume that def.c used to read:
int f(int x, int y) { return x+y;}
i.e. was consistent with use.c; then someone changed def.c, ran the 
compiler in the hope to catch all to-be-fixed invokation places 
(thinking that they would generate compiler errors)... and didn't find 
places, and the bug comes in. I hit this for real in Maria.
This is true only for C; in C++ the mismatch is detected. That's why I 
wrote "could" in my comment.

-- 
Mr. Guilhem Bichot <guilhem@stripped>
Sun Microsystems / MySQL, Lead Software Engineer
Bordeaux, France
www.sun.com / www.mysql.com
Thread
review of Performance Schema WL#2360: thread-safetyGuilhem Bichot3 Jun
  • Re: review of Performance Schema WL#2360: thread-safetyMarc Alff10 Jun
    • Re: review of Performance Schema WL#2360: thread-safetyGuilhem Bichot10 Jun