List:Commits« Previous MessageNext Message »
From:Dmitry Lenev Date:October 1 2010 10:33am
Subject:Re: bzr commit into mysql-5.5-bugfixing branch (vvaintroub:3204)
View as plain text  
Hello Vlad!

Here are my comments about your patch:

* Vladislav Vaintroub <vvaintroub@stripped> [10/09/27 23:43]:
> #At file:///H:/bzr-new/55bf/ based on
> revid:mats.kindahl@stripped
> 
>  3204 Vladislav Vaintroub	2010-09-27
>       Use native windows conditions and rwlocks i mysys if available.
>       
                                                  ^^
1)                                                in

>       This patch is intended to be applied  on top of 
>       dlenev's prlock patch.
>       
>       rwlocks used here do not implement extra properties required by runtime 
>       team, commonly knows as "readers preference" ( which in detail means
>       starving writer, recursive read acquires, serialization of unlock 
>       calls and perhaps more).
>       
>       Also , since this patch is using pthread_once heavily, optimize it to save
>       and interlocked operation in case once routine has already finished.

2) I think it makes sense to reference in the ChangeSet comment bugs
   with which this fix is associated. For example by starting it with
   something like:

   "
   A follow-up to the patch for bug #56405 "Deadlock in the MDL deadlock
   detector". This patch addresses performance regression in OLTP_RO/MyISAM
   test on Windows introduced by the fix for bug #56405. Thus it makes
   original patch acceptable as a solution for bug #56585 "Slowdown of
   readonly sysbench benchmarks (e.g point_select) on Windows 5.5".
   "

   I also think that is worth to mention that this patch doesn't introduce
   other regressions in SysBench tests (as you have checked, right?).
   And even maybe mention scale of great improvements it provides
   (AFAIU up to 35% compared to situation before prlock patch in OLTP_RO/
   MyISAM test for 256 threads).

> === modified file 'include/my_pthread.h'
> --- a/include/my_pthread.h	2010-08-10 21:12:01 +0000
> +++ b/include/my_pthread.h	2010-09-27 19:38:44 +0000
> @@ -49,6 +49,7 @@ typedef struct st_pthread_link {
>  } pthread_link;
>  
>  typedef struct {
> +  CONDITION_VARIABLE native_cond; /*native condition (Vista and later) */

3) Is CONDITION_VARIABLE type defined on Windows versions prior to
   Vista/2008? Does this mean that one no longer will be able to build
   server on Windows 2000/XP which we still support according to
   http://dev.mysql.com/doc/refman/5.5/en/windows-source-build.html?
   Maybe some #ifdef's are required here?

4) IMO it makes sense to have a short comment explaining before this
   structure explaining that we use native implementation of condition
   variable if it is available and resort to our own implementation
   otherwise.

5) You have told me that CONDITION_VARIABLE is a small object so
   it won't increase size of pthread_cond_t much. Still since it
   is never used together with the rest of structure maybe it
   makes sense to put it in union with the rest of the structure
   to save some space?

>    uint32 waiting;
>    CRITICAL_SECTION lock_waiting;
> 
> @@ -651,6 +652,10 @@ extern int rw_pr_init(struct st_my_rw_lo
>    read/write locks which prefer readers we have to use own implementation.
>  */
>  typedef struct st_my_rw_lock_t {
> +#ifdef _WIN32
> +	SRWLOCK srwlock;            /* native rwlock (Vista and later) */
> +	BOOL have_exclusive_srwlock;   /* used for unlock */
> +#endif
>  	pthread_mutex_t lock;		/* lock for structure		*/
>  	pthread_cond_t	readers;	/* waiting readers		*/
>  	pthread_cond_t	writers;	/* waiting writers		*/
> 

Comments 3, 4 and 5) apply to the above code as well.

> === modified file 'mysys/my_wincond.c'
> --- a/mysys/my_wincond.c	2009-11-23 17:08:37 +0000
> +++ b/mysys/my_wincond.c	2010-09-27 19:38:44 +0000
> @@ -24,7 +24,107 @@
>  #include <process.h>
>  #include <sys/timeb.h>
>  
> -int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr)
> +
> +/* Windows native condition variables. We use runtime loading / function 
> +  pointers, because they are not available on XP */
> +

6) Please fix formatting of the above comment to adhere to our coding
   guidelines (see:
http://forge.mysql.com/wiki/MySQL_Internals_Coding_Guidelines#Commenting_Code).

> +/* Prototypes and function pointers for condition variable functions */
> +typedef VOID (WINAPI * InitializeConditionVariableProc) 
> +  (PCONDITION_VARIABLE ConditionVariable);
> +static InitializeConditionVariableProc my_InitializeConditionVariable;
> +
> +typedef BOOL (WINAPI * SleepConditionVariableCSProc)
> +  (PCONDITION_VARIABLE ConditionVariable,
> +  PCRITICAL_SECTION CriticalSection, 
> +  DWORD dwMilliseconds);
> +static SleepConditionVariableCSProc my_SleepConditionVariableCS;
> +
> +typedef VOID (WINAPI * WakeAllConditionVariableProc)
> + (PCONDITION_VARIABLE ConditionVariable);
> +static WakeAllConditionVariableProc my_WakeAllConditionVariable;
> +
> +typedef VOID (WINAPI * WakeConditionVariableProc)
> +  (PCONDITION_VARIABLE ConditionVariable);
> +static WakeConditionVariableProc my_WakeConditionVariable;

7) I concur with Davi, it makes sense to separate typedefs and
   variable declarations to make code more readable.

> +
> +
> +/**
> + Indicates if we have native condition variables,
> + initialized first time pthread_cond_init is called.
> +*/
> +static BOOL have_native_conditions= FALSE;
> +
> +/* Old condition implementation (pre-Vista) */
> +static int legacy_cond_init(pthread_cond_t *, const pthread_condattr_t *);
> +static int legacy_cond_destroy(pthread_cond_t *);
> +static int legacy_cond_signal(pthread_cond_t *);
> +static int legacy_cond_broadcast(pthread_cond_t *);
> +static int legacy_cond_timedwait(pthread_cond_t *, pthread_mutex_t *, 
> +  struct timespec *);

8) Do we need these forward declarations at all?
   Maybe it makes sense to remove them?

> +
> +
> +/**
> +  Check if native conditions can be used, load function pointers 
> +*/
> +static void check_native_cond_availability(void)

9) AFAIR rhere should be an empty line between function comment
   and function implementation according to our coding guidelines.
   Please adhere to them here and in other places.

> +{
> +  HMODULE module= GetModuleHandle("kernel32");
> +
> +  my_InitializeConditionVariable = (InitializeConditionVariableProc)
> +    GetProcAddress(module, "InitializeConditionVariable");

10) According to coding guidelines there should be no ' ' before '='.
    Please fix this and other places.

> +  my_SleepConditionVariableCS= (SleepConditionVariableCSProc)
> +    GetProcAddress(module, "SleepConditionVariableCS");
> +  my_WakeAllConditionVariable= (WakeAllConditionVariableProc)
> +    GetProcAddress(module, "WakeAllConditionVariable");
> +  my_WakeConditionVariable= (WakeConditionVariableProc)
> +    GetProcAddress(module, "WakeConditionVariable");
> +
> +  if(my_InitializeConditionVariable)
> +    have_native_conditions = TRUE;
> +}
> +
> +/**

11) Our coding guidelines say - "To separate two functions, use three line
    breaks (two empty lines)." Please adhere to them here and in other
    places.

> +  Convert abstime to milliseconds
> +*/
> +static DWORD get_milliseconds(const struct timespec *abstime)
> +{
> +  long long millis; 
> +  union ft64 now;
> +
> +  if(abstime == NULL)
> +   return INFINITE;

12) Again according to coding guidelines there should be ' '
    between 'if' and '('. Please fix this and other places
    accordingly.

> +
> +  GetSystemTimeAsFileTime(&now.ft);
> +
> +  /*
> +    Calculate time left to abstime
> +    - subtract start time from current time(values are in 100ns units)
> +    - convert to millisec by dividing with 10000
> +  */
> +  millis = (abstime->tv.i64 - now.i64) / 10000;
> +  
> +  /* Don't allow the timeout to be negative */
> +  if (millis < 0)
> +    return 0;
> +
> +  /*
> +    Make sure the calucated timeout does not exceed original timeout
> +    value which could cause "wait for ever" if system time changes
> +  */
> +  if (millis > abstime->max_timeout_msec)
> +    millis= abstime->max_timeout_msec;
> +  
> +  if (millis > UINT_MAX)
> +    millis= UINT_MAX;
> +
> +  return (DWORD)millis;
> +}
> +
> +
> +/*
> +  Old (pre-vista) implementation using events
> +*/
> +static int legacy_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr)
>  {
>    cond->waiting= 0;
>    InitializeCriticalSection(&cond->lock_waiting);
> @@ -53,7 +153,8 @@ int pthread_cond_init(pthread_cond_t *co
>    return 0;
>  }
>  
> -int pthread_cond_destroy(pthread_cond_t *cond)
> +
> +static int legacy_cond_destroy(pthread_cond_t *cond)
>  {
>    DeleteCriticalSection(&cond->lock_waiting);
>  
> @@ -65,48 +166,13 @@ int pthread_cond_destroy(pthread_cond_t 
>  }
>  
>  
> -int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)
> -{
> -  return pthread_cond_timedwait(cond,mutex,NULL);
> -}
> -
> -
> -int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex,
> +static int legacy_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex,
>                             struct timespec *abstime)
>  {
>    int result;
>    long timeout; 
> -  union ft64 now;
> -

13) May be we should make 'timeout' a DWORD variable as well?

> -  if( abstime != NULL )
> -  {
> -    GetSystemTimeAsFileTime(&now.ft);
> -
> -    /*
> -      Calculate time left to abstime
> -      - subtract start time from current time(values are in 100ns units)
> -      - convert to millisec by dividing with 10000
> -    */
> -    timeout= (long)((abstime->tv.i64 - now.i64) / 10000);
> -    
> -    /* Don't allow the timeout to be negative */
> -    if (timeout < 0)
> -      timeout= 0L;
> -
> -    /*
> -      Make sure the calucated timeout does not exceed original timeout
> -      value which could cause "wait for ever" if system time changes
> -    */
> -    if (timeout > abstime->max_timeout_msec)
> -      timeout= abstime->max_timeout_msec;
> -
> -  }
> -  else
> -  {
> -    /* No time specified; don't expire */
> -    timeout= INFINITE;
> -  }
>  
> +  timeout= get_milliseconds(abstime);
>    /* 
>      Block access if previous broadcast hasn't finished.
>      This is just for safety and should normally not
> @@ -142,7 +208,7 @@ int pthread_cond_timedwait(pthread_cond_
>    return result == WAIT_TIMEOUT ? ETIMEDOUT : 0;
>  }
>  
> -int pthread_cond_signal(pthread_cond_t *cond)
> +static int legacy_cond_signal(pthread_cond_t *cond)
>  {
>    EnterCriticalSection(&cond->lock_waiting);
>    
> @@ -155,7 +221,7 @@ int pthread_cond_signal(pthread_cond_t *
>  }
>  
>  
> -int pthread_cond_broadcast(pthread_cond_t *cond)
> +static int legacy_cond_broadcast(pthread_cond_t *cond)
>  {
>    EnterCriticalSection(&cond->lock_waiting);
>    /*
> @@ -177,6 +243,80 @@ int pthread_cond_broadcast(pthread_cond_
>  }
>  
>  
> +/* 
> + Posix API functions just chose between native and legacy implementation 
> +*/
> +
> +int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr)
> +{
> +  /* check if native conditions are available on the first call */
> +  static my_pthread_once_t once_control = MY_PTHREAD_ONCE_INIT;
> +  my_pthread_once(&once_control, check_native_cond_availability);
> +

14) As Davi I would have preferred if this one-time initialization
    would have happened somewhere in my_init() or my_basic_init().
    This would have allowed to avoid overhead of extra interlocked
    exchange/access to volatile variable for each pthread_cond_init.

    If you are concerned with a safety of doing this initialization in
    my_basic_init() we could add assert which will fail if someone tries
    to init condition variable before check_native_cond_availability()
    is called.

    In any case, if there are reasons to do this one time initialization
    using my_pthread_once() in this call they should be explained in the
    above comment.

...

> +  if(have_native_conditions)
> +  {
> +	my_InitializeConditionVariable(&cond->native_cond);
> +	return 0;
> +  }
> +  else 
> +    return legacy_cond_init(cond, attr);
> +}
> +
> +
> +int pthread_cond_destroy(pthread_cond_t *cond)
> +{
> +  if(have_native_conditions)
> +    return 0; /* no destroy function */
> +  else
> +    return legacy_cond_destroy(cond);
> +}
> +
> +
> +int pthread_cond_broadcast(pthread_cond_t *cond)
> +{
> +  if(have_native_conditions)
> +  {
> +    my_WakeAllConditionVariable(&cond->native_cond);
> +    return 0;
> +  }
> +  else
> +    return legacy_cond_broadcast(cond);
> +}
> +
> +
> +int pthread_cond_signal(pthread_cond_t *cond)
> +{
> +  if(have_native_conditions)
> +  {
> +    my_WakeConditionVariable(&cond->native_cond);
> +    return 0;
> +  }
> +  else
> +    return legacy_cond_signal(cond);
> +}
> +
> +
> +int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex,
> +  struct timespec *abstime)
> +{
> +  if(have_native_conditions)
> +  {
> +    DWORD timeout= get_milliseconds(abstime);
> +    if(!my_SleepConditionVariableCS(&cond->native_cond, mutex, timeout))
> +      return ETIMEDOUT;
> +    return 0;  
> +  }
> +  else
> +    return legacy_cond_timedwait(cond, mutex, abstime);
> +}
> +
> +
> +int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)
> +{
> +  return pthread_cond_timedwait(cond, mutex, NULL);
> +}
> +
> +
>  int pthread_attr_init(pthread_attr_t *connect_att)
>  {
>    connect_att->dwStackSize	= 0;
> 
> === modified file 'mysys/my_winthread.c'
> --- a/mysys/my_winthread.c	2009-12-19 13:11:48 +0000
> +++ b/mysys/my_winthread.c	2010-09-27 19:38:44 +0000
> @@ -155,6 +155,13 @@ int pthread_cancel(pthread_t thread)
>  int my_pthread_once(my_pthread_once_t *once_control, 
>      void (*init_routine)(void))
>  {
> +  /*
> +    Do "dirty" read to find out if initialization is already done, to
> +    save an interlocked operation in common case.
> +  */
> +  if (*once_control == MY_PTHREAD_ONCE_DONE)
> +    return 0;
> +
>    LONG state= InterlockedCompareExchange(once_control, MY_PTHREAD_ONCE_INPROGRESS,
>                                            MY_PTHREAD_ONCE_INIT);

15) I concur with Davi that this change should not be necessary if you
    accept our comment 14).

    If you still want to keep it, then IMO there is some additional work to do:

    This is so called Double Checked Locking pattern or rather anti-pattern.
    (See for example http://en.wikipedia.org/wiki/Double-checked_locking).
    Since it is known not to work in general case you have to explain (in the
    comment describing this optimization) why it works in your particular case.

    E.g., because Visual C++ with version >=5.0 provides guarantee that read
    of volatile variable has Acquire semantics, so any reads of global/static
    memory which follow it won't be reordered to happen before volatile is
    read.

    Such explanation immediately raises two questions:

    Q1) Do we support any other compilers or older versions of Visual C++?
    A1) OK, from http://dev.mysql.com/doc/refman/5.5/en/windows-source-build.html
        looks like we don't. Still I think it is worth to mention this fact
        in the comment for this optimization and even add compile-time assert
        which will abort compilation if some other, probably non-compliant,
        compiler is used.
    Q2) Given that read of volatile variable involves memory barrier is
        it really cheaper than reading variable through Interlocked API?
        In other words does this optimization really makes sense?
    A2) Googling says that it is. I think this fact is worth mentioning in
        the comment as well.

>    switch(state)
> 
> === modified file 'mysys/thr_rwlock.c'
> --- a/mysys/thr_rwlock.c	2010-08-12 13:50:23 +0000
> +++ b/mysys/thr_rwlock.c	2010-09-27 19:38:44 +0000
> @@ -20,6 +20,131 @@
>  #if defined(NEED_MY_RW_LOCK)
>  #include <errno.h>
>  
> +#ifdef _WIN32
> +
> +/* 
> +  Use native windows slim reader writer lock if possible (Vista and later).
> +  Check presence of rwlock functions at runtime, and load them dynamically.
> +  Fallback to portable implementation on older Windows.
> +*/
> +
> +static BOOL have_srwlock = TRUE;
> +/* Prototypes and function pointers for windows  functions */
> +typedef VOID (WINAPI* srw_func) (PSRWLOCK SRWLock);
> +typedef BOOL (WINAPI* srw_bool_func) (PSRWLOCK SRWLock);
> +
> +static srw_func my_InitializeSRWLock;
> +static srw_func my_AcquireSRWLockExclusive;
> +static srw_func my_ReleaseSRWLockExclusive;
> +static srw_func my_AcquireSRWLockShared;
> +static srw_func my_ReleaseSRWLockShared;
> +
> +static srw_bool_func my_TryAcquireSRWLockExclusive;
> +static srw_bool_func my_TryAcquireSRWLockShared;
> +
> +/**
> +  Check for presence of Windows slim reader writer lock function
> +  (available in Vista and later). Load function pointers.
> +*/
> +static void check_srwlock_availability(void)
> +{
> +  HMODULE module= GetModuleHandle("kernel32");
> +
> +  my_InitializeSRWLock= (srw_func) GetProcAddress(module,
> +    "InitializeSRWLock");
> +  my_AcquireSRWLockExclusive= (srw_func) GetProcAddress(module,
> +    "AcquireSRWLockExclusive");
> +  my_AcquireSRWLockShared= (srw_func) GetProcAddress(module,
> +    "AcquireSRWLockShared");
> +  my_ReleaseSRWLockExclusive= (srw_func) GetProcAddress(module,
> +    "ReleaseSRWLockExclusive");
> +  my_ReleaseSRWLockShared= (srw_func) GetProcAddress(module,
> +    "ReleaseSRWLockShared");
> +  my_TryAcquireSRWLockExclusive=  (srw_bool_func) GetProcAddress(module,
> +    "TryAcquireSRWLockExclusive");
> +  my_TryAcquireSRWLockShared=  (srw_bool_func) GetProcAddress(module,
> +    "TryAcquireSRWLockShared");
> +  if(my_InitializeSRWLock)
> +    have_srwlock= TRUE;
> +}
> +
> +
> +static int srw_init(my_rw_lock_t *rwp)
> +{
> +  my_InitializeSRWLock(&rwp->srwlock);
> +  rwp->have_exclusive_srwlock = FALSE;
> +  return 0;
> +}
> +
> +
> +static int srw_rdlock(my_rw_lock_t *rwp)
> +{
> +  my_AcquireSRWLockShared(&rwp->srwlock);
> +  DBUG_ASSERT(!rwp->have_exclusive_srwlock);
> +  return 0;
> +}
> +
> +
> +static int srw_tryrdlock(my_rw_lock_t *rwp)
> +{
> +  /* TryAcquire is new in Win7, use if available */
> +  if(my_TryAcquireSRWLockShared)
> +  {
> +    if(!my_TryAcquireSRWLockShared(&rwp->srwlock))
> +      return EBUSY;
> +  }
> +  else
> +  {
> +    /* Try harder, actually acquire the lock */
> +    my_AcquireSRWLockShared(&rwp->srwlock);
> +  }
> +  DBUG_ASSERT(!rwp->have_exclusive_srwlock);

16) Hmm... Sorry but this won't cut the mustard :)
    Caller has right to assume that my_rw_rdlock() won't block
    when called for locked rwlock.

    This probably means that we:
    a) either have to support native rwlocks only on systems which
       support TryAcquire operations
    b) or implement support of attribute parameter to my_rwlock_init
       which specify whether support for trylock operation is needed.
    c) or get rid of code using try operations.

    I personally tend to favor approach b).

> +  return 0;
> +}
> +
> +
> +static int srw_wrlock(my_rw_lock_t *rwp)
> +{
> +  my_AcquireSRWLockExclusive(&rwp->srwlock);
> +  rwp->have_exclusive_srwlock= TRUE;
> +  return 0;
> +}
> +
> +
> +static int srw_trywrlock(my_rw_lock_t *rwp)
> +{
> +  /* TryAcquire is new in Win7, use it if available */
> +  if(my_TryAcquireSRWLockExclusive)
> +  {
> +    if(!my_TryAcquireSRWLockExclusive(&rwp->srwlock))
> +      return EBUSY;
> +  }
> +  else
> +  {
> +     /* Try harder, actually acquire the lock */
> +     my_AcquireSRWLockExclusive(&rwp->srwlock);
> +  }
> +  rwp->have_exclusive_srwlock= TRUE;
> +  return 0;
> +}

17) Same as 16).

> +
> +
> +static int srw_unlock(my_rw_lock_t *rwp)
> +{
> +  if(rwp->have_exclusive_srwlock)
> +  {
> +    rwp->have_exclusive_srwlock= FALSE;
> +    my_ReleaseSRWLockExclusive(&rwp->srwlock);
> +  }
> +  else
> +  {
> +    my_ReleaseSRWLockShared(&rwp->srwlock);
> +  }
> +  return 0;
> +}
> +
> +#endif
> +
>  /*
>    Source base from Sun Microsystems SPILT, simplified for MySQL use
>    -- Joshua Chamas
> @@ -63,6 +188,18 @@ int my_rw_init(my_rw_lock_t *rwp, my_boo
>  {
>    pthread_condattr_t	cond_attr;
>  
> +#ifdef _WIN32
> +  /* 
> +    Check if slim reader writer lock available on the first call
> +    to init.
> + */
> +  static my_pthread_once_t once_control= MY_PTHREAD_ONCE_INIT;
> +  my_pthread_once(&once_control, check_srwlock_availability);
> +
> +  if(have_srwlock)
> +    return srw_init(rwp);
> +#endif

18) Same as 14).

...

Otherwise, I am OK with your patch and think that it can be pushed
after discussing/addressing the above points.

Thank you for working on this!

-- 
Dmitry Lenev, Software Developer
Oracle Development SPB/MySQL, www.mysql.com

Are you MySQL certified?  http://www.mysql.com/certification
Thread
bzr commit into mysql-5.5-bugfixing branch (vvaintroub:3204) Vladislav Vaintroub27 Sep
Re: bzr commit into mysql-5.5-bugfixing branch (vvaintroub:3204)Davi Arnaut1 Oct
Re: bzr commit into mysql-5.5-bugfixing branch (vvaintroub:3204)Dmitry Lenev1 Oct
  • RE: bzr commit into mysql-5.5-bugfixing branch (vvaintroub:3204)Vladislav Vaintroub1 Oct
    • Re: bzr commit into mysql-5.5-bugfixing branch (vvaintroub:3204)Dmitry Lenev1 Oct