List:Commits« Previous MessageNext Message »
From:Marc Alff Date:April 1 2010 8:35am
Subject:Re: bzr commit into mysql-next-mr-bugfixing branch (marc.alff:3144)
Bug#52502
View as plain text  
Hi Guilhem,

On 3/31/10 2:27 PM, Guilhem Bichot wrote:
> Hello Marc,
>
> Marc Alff a écrit, Le 31.03.2010 17:48:
>> #At file:///Users/malff/BZR_TREE/mysql-next-mr-bugfixing-52502/ based
>> on revid:alik@stripped
>>
>> 3144 Marc Alff 2010-03-31
>> Bug#52502 Performance schema does not start with large mutex_instance
>> buffers
>> This is a performance issue affecting scalability.
>> Prior to this fix, the performance schema code did not scale well
>> when a very large number of instruments are created in the code.
>> The root cause was create_mutex() spinning "forever",
>> trying to find an empty slot in a full mutex_array.
>> With this fix:
>> - functions like create_mutex() abort sooner,
>> when used with a very large array which is full or almost full.
>> - randomize_index() has been revised to improve
>> the statistical spread of values in the mutex_array. See the comments
>> in the code for details.
>> Other allocation functions similar to create_mutex()
>> have been also fixed.
>> No MTR test case provided (performance scalability issue).
>> Fix tested manually.
>>
>> modified:
>> storage/perfschema/pfs_global.h
>> storage/perfschema/pfs_instr.cc
>> storage/perfschema/pfs_instr.h
>> === modified file 'storage/perfschema/pfs_global.h'
>> --- a/storage/perfschema/pfs_global.h 2010-01-12 01:47:27 +0000
>> +++ b/storage/perfschema/pfs_global.h 2010-03-31 15:48:47 +0000
>> @@ -1,4 +1,4 @@
>> -/* Copyright (C) 2008-2009 Sun Microsystems, Inc
>> +/* Copyright (c) 2008, 2010, Oracle and/or its affiliates. All rights
>> reserved.
>>
>> This program is free software; you can redistribute it and/or modify
>> it under the terms of the GNU General Public License as published by
>> @@ -30,27 +30,49 @@ void pfs_free(void *ptr);
>>
>> inline uint randomized_index(const void *ptr, uint max_size)
>> {
>> + static uint seed1= 0;
>> + static uint seed2= 0;
>> + uint result;
>> + register intptr value;
>> +
>> if (unlikely(max_size == 0))
>> return 0;
>>
>> /*
>> - ptr is typically an aligned structure,
>> - so the last bits are not really random, but this has no effect.
>> - Apply a factor A*x to spread
>> - close values of ptr further apart (which helps with arrays),
>> - and to spread values way beyond a typical max_size.
>> - Then, apply a modulo to end within [0, max_size - 1].
>> - A is big prime numbers, to avoid resonating with max_size,
>> - to have a uniform distribution in [0, max_size - 1].
>> - The value of A is chosen so that index(ptr) and index(ptr + N) (for
>> arrays)
>> - are likely to be not similar for typical values of max_size
>> - (50, 100, 1000, etc).
>> - In other words, (sizeof(T)*A % max_size) should not be a small number,
>> - to avoid that with 'T array[max_size]', index(array[i])
>> - and index(array[i + 1]) end up pointing in the same area in [0,
>> max_size - 1].
>> + ptr is typically an aligned structure, and can be in an array.
>> + - The last bits are not random because of alignment,
>> + so we divide by 8.
>> + - The high bits are mostly constant, especially with 64 bits
>> architectures,
>> + but we keep most of them anyway, by doing computation in intptr.
>> + The high bits are significant depending on where the data is
>> + stored (the data segment, the stack, the heap, ...).
>> + - To spread consecutive cells in an array further, we multiply by
>> + a factor A. This factor should not be too high, which would cause
>> + an overflow and cause loss of randomness (droping the top high bits).
>> + The factor is a prime number, to help spread the distribution.
>> + - To add more noise, and to be more robust if the calling code is
>> + passing a constant value instead of a random identity,
>> + we add the previous results, for hysteresys, with a degree 2 polynom,
>> + X^2 + X + 1.
>> + - Last, a modulo is applied to be within the [0, max_size - 1] range.
>> + Note that seed1 and seed2 are static, and are *not* thread safe,
>> + which is even better.
>
> In spite of the copious details (thanks!), this is a bit too complicated
> for me but it all looks plausible and I'll admit it. I wish I had read
> this Knuth book about pseudo-random numbers, which is closed on my shelf
> since too long.
>
>> + Effect with arrays: T array[N]
>> + - ptr(i) = & array[i] = & array[0] + i * sizeof(T)
>> + - ptr(i+1) = ptr(i) + sizeof(T).
>> + What we want here, is to have index(i) and index(i+1) fall into
>> + very different areas in [0, max_size - 1], to avoid locality.
>> */
>> - return static_cast<uint>
>> - (((reinterpret_cast<intptr> (ptr)) * 2166179) % max_size);
>> + value= (reinterpret_cast<intptr> (ptr)) >> 3;
>> + value*= 1789;
>> + value+= seed2 + seed1 + 1;
>> + value%= max_size;
>> + + result= static_cast<uint> (value);
>
> I would have dropped the "value" variable and done:
> result= static_cast<uint>((((reinterpret_cast<intptr> (ptr)) >> 3)
> * 1789 + seed2 + seed1 + 1 ) % max_size;
> If the compiler really puts "value" in a register as requested in the
> declaration, it probably doesn't make a difference.
> Up to you.

I am sure the compiled binary will be the same.
I prefer to keep this form, for clarity.

>
>> + seed2= seed1*seed1;
>> + seed1= result;
>> +
>> + return result;
>> }
>>
>> void pfs_print_error(const char *format, ...);
>>
>> === modified file 'storage/perfschema/pfs_instr.cc'
>> --- a/storage/perfschema/pfs_instr.cc 2010-03-22 12:48:18 +0000
>> +++ b/storage/perfschema/pfs_instr.cc 2010-03-31 15:48:47 +0000
>> @@ -454,6 +454,7 @@ void cleanup_file_hash(void)
>> PFS_mutex* create_mutex(PFS_mutex_class *klass, const void *identity)
>> {
>> int pass;
>> + uint attempt= 0;
>> uint i= randomized_index(identity, mutex_max);
>>
>> /*
>> @@ -464,7 +465,9 @@ PFS_mutex* create_mutex(PFS_mutex_class {
>> PFS_mutex *pfs= mutex_array + i;
>> PFS_mutex *pfs_last= mutex_array + mutex_max;
>> - for ( ; pfs < pfs_last; pfs++)
>> + for ( ;
>> + (pfs < pfs_last) && (attempt < PFS_MAX_ALLOC_RETRY);
>> + pfs++, attempt++)
>
> This scans at most the first PFS_MAX_ALLOC_RETRY elements of [pfs,
> pfs_last[. I suggest to save one comparison and increment in each
> iteration by setting a single bound before the for(), like:
> PFS_mutex *pfs= mutex_array + i;
> PFS_mutex *pfs_last= mutex_array + mutex_max;
> PFS_mutex *bound= pfs + PFS_MAX_ALLOC_RETRY;
> if (pfs_last > bound)
> pfs_last= bound;
> for ( ; pfs < pfs_last; pfs++)

The change you suggested does PFS_MAX_ALLOC_RETRY for each pass, not 
total, which is a problem because it now saturates the [0, 
PFS_MAX_ALLOC_RETRY -1] area, so the real logic is a bit more complex ...

I agree however that we should preserve this loop:
   for ( ; pfs < pfs_last; pfs++)
after all the work we did to get to this simple form.

See the next patch, I hope you will like how the new iterator works ...

>
>> {
>> if (pfs->m_lock.is_free())
>> {
>
>> === modified file 'storage/perfschema/pfs_instr.h'
>> --- a/storage/perfschema/pfs_instr.h 2010-01-12 01:47:27 +0000
>> +++ b/storage/perfschema/pfs_instr.h 2010-03-31 15:48:47 +0000
>> @@ -136,6 +136,13 @@ struct PFS_table : public PFS_instr
>> */
>> #define LOCKER_STACK_SIZE 3
>>
>> +/**
>> + @def PFS_MAX_ALLOC_RETRY
>> + Maximun number of times the code attempts to allocate an item
>
> s/Maximun/Maximum
>

Fixed.

Regards,
-- Marc

Thread
bzr commit into mysql-next-mr-bugfixing branch (marc.alff:3144)Bug#52502Marc Alff31 Mar
  • Re: bzr commit into mysql-next-mr-bugfixing branch (marc.alff:3144)Bug#52502Guilhem Bichot31 Mar
    • Re: bzr commit into mysql-next-mr-bugfixing branch (marc.alff:3144)Bug#52502Marc Alff1 Apr