MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:July 10 2009 3:09pm
Subject:Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)
View as plain text  
Hello Marc,

Marc Alff a écrit, Le 01.07.2009 19:38:
> Hi Guilhem
> Guilhem Bichot wrote:
>> Hello Marc,
>> Marc Alff a écrit, Le 26.05.2009 23:53:
>>> Hi Guilhem
>>> Guilhem Bichot wrote:
>>>> Bonjour Marc,
>>>> Marc Alff a écrit, Le 21.05.2009 21:42:
>>>>> #At file:///home/malff/BZR-TREE/mysql-6.0-perfschema/ based on
>>>>> revid:marc.alff@stripped
>>>>>  3150 Marc Alff    2009-05-21
>>>>>       Implemented code review comments, continued.
>>>>>             Reworked all the thread instrumentation:
>>>>>       - reverted the THD::store_global(bool) change implemented
>>>>> earlier,
>>>>>       - removed the init_and_bind_psi_thread calls
>>>>>             Instead, instrumented threads by intercepting the
>>>>> pthread_create()
>>>>>       call with an instrumented mysql_thread_create().

>>>>> === modified file 'sql/'
>>>>> --- a/sql/    2009-05-12 03:15:02 +0000
>>>>> +++ b/sql/    2009-05-21 19:42:38 +0000
>>>>> @@ -228,10 +228,7 @@ event_scheduler_thread(void *arg)
>>>>>    thd->thread_stack= (char *)&thd;              // remember
> where our
>>>>> stack is
>>>>> -#ifdef HAVE_PSI_INTERFACE
>>>>> -  thd->m_psi=
>>>>> init_and_bind_psi_thread_with_id(key_thread_event_scheduler,
>>>>> -                                               thd,
> thd->thread_id);
>>>>> -#endif
>>>>> +  mysql_thread_set_id(thd->thread_id);
>>>> Why not move the call to mysql_thread_set_id() into store_globals()?
>>>> Currently those two calls, mysql_thread_set_id() and store_globals(),
>>>> happen very close to each other; I have the impression that by doing the
>>>> move, all mysql_thread_set_id() calls (all: in sql/, sql/backup,
>>>> storage/myisam, storage/maria) would be gone. It also feels natural;
>>>> store_globals() is expected to do whatever is needed to get a functional
>>>> "just works"-THD. It would reduce the number of calls (everything would
>>>> happen in the existing store_globals() calls); there are already so many
>>>> "boilerplate" operations that we copy-paste every time we want to write
>>>> code to create a new utility thread (MyISAM backup driver, online backup
>>>> kernel, NDB binlog, replication, event scheduler...). Last, it would
>>>> reduce the presence of P_S calls in non-P_S code (like of events,
>>>> backup, engines), which generally makes things simpler to understand
>>>> (when one is reading code of feeature X it's better if she/he does not
>>>> meet code bits of feature Y, makes less to parse).
>>> Every thread needs to be instrumented, and having the instrumentation in
>>> the start_routine function for this thread seems natural to me.
>> But adding a P_S function which must be called in every thread function,
>> seems like a burden to me.
>> If mysql_thread_set_id(thd->thread_id) was done by THD::store_globals():
>> - every thread having a THD wouldn't need an explicit hard-coded call to
>> mysql_thread_set_id()
>> - those few threads which don't have a THD would still need such
>> hard-coded call in their thread function, yes (there are no such threads
>> in the patch right now).
>> mysql_thread_set_id(thd->thread_id) is about setting some
>> posix-thread-specific data by using some data from THD.
>> This existing call: "my_pthread_setspecific_ptr(THR_THD,  this)"
>> is also about setting some posix-thread-specific data (THR_THD key) by
>> using some data from THD ("this" is the THD pointer here), and it
>> happens in... THD::store_globals().
> I think you don't understand some parts here.
> THD::store_global() is to bind a physical thread running (pthread) to a
> logical thread (THD), using thread local storage.
> This is required because the server code does not pass a THD object to
> every code that needs it, so the code relies on _current_thd to get the
> THD back.
> This association has to be made again every time a thread from the
> thread pool picks up a THD and execute it, so to simplify at every query.
> mysql_thread_set_id(thd->thread_id) is to give an identifier to the
> instrumentation, for the PSI_thread associated with the connection.
> A THD is given an identifier only once, and the associated PSI_thread is
> given the same identifier only once, when a connection is created,
> typically.
> So:
> - THD::store_global() needs to be done for every query
> - mysql_thread_set_id(thd->thread_id) needs to be done once for a thread

A PSI_thread is created by create_thread(), via one of two routes:
a) pfs_spawn_thread(), which happens when creating an OS thread, so in 
this case a PSI_thread seems to represent an OS thread
b) PSI_server->new_thread(), which happens in libevent_add_connection(), 
so in this case a PSI_thread seems to represent a THD (as that moment we 
have a THD but no OS thread bound to it).

I try hard, but find some inconsistency between a) and b), and it's 
probably what makes me not understand explanations.
Big question for you: what does a PSI_thread represent? What is it?
I guess that the Performance Schema PROCESSLIST (to be named THREADS) 
table will be a list of PSI_thread's, and WL#2360 says
"The original idea was that this would correspond to what SHOW 
PROCESSLIST shows. However, we monitor more threads, and must supply 
different data. Therefore the implementor will add a new 
PERFORMANCE_SCHEMA.PROCESSLIST table which shows all threads."
So it seems to be about "threads". OS threads? or THD-like jobs?
If about OS threads, it means the table really lists OS threads, one per 
row (THREAD_ID is the primary key somehow), and gives the SELECTer, via 
an informative column (ID), a hint of what job (more or less THD) this 
OS thread is handling now.
But if it's about OS threads, (b) above does not make a lot of sense: 
PSI_server->new_thread() should not be called in 
libevent_add_connection(): a PSI_thread has already been created for the 
OS thread (of the threads pool), it is already a row in PROCESSLIST, and 
we should just adjust ID when the OS thread picks up a THD for processing.
On the other hand, if PSI_thread is about a THD-like job (or similar 
jobs done by the system threads which don't have a THD), then it should 
not be created in pfs_spawn_thread(), but rather when we create the job 
object (THD::THD() or equivalent for system threads which don't have a THD).

Could you please shed some light on my confusion, after which I could 
hopefully make more useful comments?

Defining PSI_thread clearly is worth doing it now, independently of the 
fact that pool-of-threads is not in the tree which Performance Schema is 
now based on.

> and as a result, it makes sense that these two lines of code are
> actually in different places. Having the 2 calls in the same place would
> be a design/code bug.
> Calling mysql_thread_set_id(thd->thread_id) in the code that spawn a new
> thread, in the start_routine code, is not a burden: it is exactly how
> things should be done.
> No change.
>>> Instrumented threads do not have to have an associated THD.
>> I was unclear, I meant the calls to
>> mysql_thread_set_id(*thd->thread_id*) only.
>> Those without a THD need an explicit call for sure.

>>>>> === modified file 'sql/'
>>>>> --- a/sql/    2009-05-16 00:41:08 +0000
>>>>> +++ b/sql/    2009-05-21 19:42:38 +0000
>>>>> @@ -513,11 +516,7 @@ static void libevent_add_connection(THD
>>>>>      DBUG_VOID_RETURN;
>>>>>    }
>>>>> -#ifdef HAVE_PSI_INTERFACE
>>>>> -  thd->m_psi= init_psi_thread_with_id(key_thread_one_connection,
> thd,
>>>>> thd->thread_id);
>>>>> -#else
>>>>> -  thd->m_psi= NULL;
>>>>> -#endif
>>>>> +  mysql_thread_prepare(key_thread_one_connection, thd,
>>>>> thd->thread_id, & thd->m_psi);
>>>> All calls to this function are:
>>>> mysql_thread_prepare(x, y, z, & thd->m_psi);
>>>> So, if P_S is not compiled in, they expand to
>>>> thd->m_psi= 0;
>>>> - I still don't understand why there has to be a m_psi in THD if P_S is
>>>> not compiled in (maybe I'll know when I read your previous replies)
>>> See the comments in thd_scheduler::m_psi
>> The comment says that this is "because doing so will change the binary
>> layout of THD, which is exposed to plugin code that may be compiled
>> differently.".
>> That is, it says that THD is part of the plugin API.
>> However Serg wrote, in a previous email:
>> "Marc is right, no ifdefs in the interface structures.
>> THD is not part of the interface, though.
>> handler is."
>> So I don't understand well.
>> However, it's not a very important issue to me.
> THD is used by some storage engines, therefore it is an interface.
> Whether this was by design or not is a different topic.
> I consider this point closed then.


>>>>> === modified file 'sql/'
>>>>> --- a/sql/    2009-05-18 15:36:08 +0000
>>>>> +++ b/sql/    2009-05-21 19:42:38 +0000
>>>>> @@ -461,10 +461,7 @@ pthread_handler_t handle_bootstrap(void
>>>>>  {
>>>>>    THD *thd= (THD*) arg;
>>>>> -#ifdef HAVE_PSI_INTERFACE
>>>>> -  thd->m_psi=
> init_and_bind_psi_thread_with_id(key_thread_bootstrap,
>>>>> -                                               thd,
> thd->thread_id);
>>>>> -#endif
>>>>> +  mysql_thread_set_id(thd->thread_id);
>> That's why I would name it mysql_thread_set_psi_id(); the ID which is
>> set is not the thread's ID (pthread_self()) it's a P_S-specific ID which
>> thus be mentioned in the name. I don't find that this violates the
>> naming convention at all. And I don't find that the problem is only due
>> to a bad naming convention in existing code either.
>> If you still disagree, let's keep this as a "disagreement item" for
>> later processing.
> Renamed to mysql_thread_set_psi_id().

So good.

>>>>>    do_handle_bootstrap(thd);
>>>>>    return 0;
>>>>> === modified file 'storage/perfschema/'
>>>>> --- a/storage/perfschema/    2009-05-19 15:56:07 +0000
>>>>> +++ b/storage/perfschema/    2009-05-21 19:42:38 +0000
>>>>> @@ -977,15 +977,65 @@ static void create_file_v1(PSI_file_key
>>>>>      file_lost++;
>>>>>  }
>>>>> +struct PFS_spawn_thread_arg
>>>>> +{
>>>>> +  PFS_thread *m_parent_thread;
>>>>> +  PSI_thread_key m_child_key;
>>>>> +  const void *m_child_identity;
>>>>> +  void *(*m_user_start_routine)(void*);
>>>> Use type pthread_handler_t here.
>>> I don't see pthread_handler_t on my platform, nor a pthread_handler.
>>> This comes from the prototype of pthread_create().
>> pthread_handler_t is defined in my_pthread.h. It's used like here:
>> pthread_handler_t handle_slave_io(void *arg)
>> {
> Ok, so using pthread_handler_t means that the performance schema
> instrumentation headers now depends on the server implementation.
> In turn, this means that a storage engine that only uses plain,
> unmodified pthread, would be polluted with the macros from my_pthread.h
> Implementors of storage engines using plain pthread, for example Falcon,
> should be able to instrument their code.
> No change.


Mr. Guilhem Bichot <guilhem@stripped>
Sun Microsystems / MySQL, Lead Software Engineer
Bordeaux, France /

bzr commit into mysql-6.0-perfschema branch (marc.alff:3150) Marc Alff21 May
  • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot26 May
    • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff26 May
      • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot28 May
        • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff1 Jul
          • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot10 Jul
            • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff10 Jul
              • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot27 Jul
                • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff28 Jul
                  • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot28 Jul