MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:May 26 2009 12:48pm
Subject:Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)
View as plain text  
Bonjour Marc,

Marc Alff a écrit, Le 21.05.2009 21:42:
> #At file:///home/malff/BZR-TREE/mysql-6.0-perfschema/ based on
> revid:marc.alff@stripped
> 
>  3150 Marc Alff	2009-05-21
>       Implemented code review comments, continued.
>       
>       Reworked all the thread instrumentation:
>       - reverted the THD::store_global(bool) change implemented earlier,
>       - removed the init_and_bind_psi_thread calls
>       
>       Instead, instrumented threads by intercepting the pthread_create()
>       call with an instrumented mysql_thread_create().

> === modified file 'include/mysql/psi/mysql_mutex.h'
> --- a/include/mysql/psi/mysql_mutex.h	2009-05-12 03:15:02 +0000
> +++ b/include/mysql/psi/mysql_mutex.h	2009-05-21 19:42:38 +0000
> @@ -738,6 +738,65 @@ static inline int inline_mysql_cond_broa
>    return result;
>  }
>  
> +#ifdef HAVE_PSI_INTERFACE
> +  #define mysql_thread_create(K,P1,P2,P3,P4)
> inline_mysql_thread_create(K,P1,P2,P3,P4)
> +#else
> +  #define mysql_thread_create(K,P1,P2,P3,P4) pthread_create(P1,P2,P3,P4)
> +#endif
> +
> +#ifdef HAVE_PSI_INTERFACE
> +  #define mysql_thread_set_id(I) inline_mysql_thread_set_id(I)
> +#else
> +  #define mysql_thread_set_id(I) do while (0)
> +#endif
> +
> +#ifdef HAVE_PSI_INTERFACE
> +  #define mysql_thread_prepare(P1,P2,P3,P4)
> inline_mysql_thread_prepare(P1,P2,P3,P4)
> +#else
> +  #define mysql_thread_prepare(P1,P2,P3,P4) inline_mysql_thread_prepare(P4)
> +#endif

Those macros need doxygen comments like mysql_mutex_init(), as well as a 
description of the typical usage pattern (it's not obvious to know that 
prepare and create are mutually exclusive; one could rather wrongly 
guess that one should first prepare then create).

> === modified file 'include/mysql/psi/psi.h'
> --- a/include/mysql/psi/psi.h	2009-05-13 16:44:00 +0000
> +++ b/include/mysql/psi/psi.h	2009-05-21 19:42:38 +0000
> @@ -563,14 +563,18 @@ typedef void (*close_table_v1_t)(struct
>  */
>  typedef void (*create_file_v1_t)(PSI_file_key key, const char *name, File file);
>  
> +typedef int (*spawn_thread_v1_t)(PSI_thread_key key,
> +                                 pthread_t *thread, const pthread_attr_t *attr,
> +                                 void *(*start_routine)(void*), void *arg);
> +

Could this new one have a doxygen comment like others?

>  /**
>    Create instrumentation for a thread.
>    @param key the registered key
> -  @param identity the address of the rwlock itself
> +  @param identity an address typical of the thread

Have you considered using, as "identity", what's returned by 
pthread_self() (it's in THD::real_id)?
I see the following benefits:
- code writer needn't find what would be a typical address for his 
thread, it's spawn_thread_v1_t which would internally call 
pthread_self() or THD::real_id which would be reused.
- assuming two threads of the same class are created, it's not always 
easy for the code writer to find two different addresses, whereas 
pthread_self() will give different values
- this could help make connections between what one would see in P_S and 
what gdb prints (assume a developer making use of gdb and P_S at the 
same time, to troubleshoot a problem): gdb tells number of threads, 
which happen to be what pthread_self() returns, like this:
[Switching to Thread 0x418a3950 (LWP 8591)]
(gdb) p pthread_self()
$1 = 1099577680 # this is 0x418a3950

Here is another thought about P_S threads and debugging.
In pushbuild2, there are stress tests run by the Random Query Generator; 
when this tool finds that more than 3 threads haven't returned from 
running their statement since N seconds, it declares a deadlock and 
prints, via a debugger like gdb, all stack traces of all threads of the 
system; gdb prints thread ids like this:
Thread 9 (process 592335    ):
#0  0xff040094 in __lwp_park () from /lib/libc.so.1
and 592335 is also what pthread_self() returns for this thread.
Currently it's difficult, from the output of pushbuild2, to know what 
thread is stalled (because the tool prints all stack traces of 30 
running threads whereas there may only be 3 stalled threads).
If before printing stack traces, the tool printed the list of all 
threads' pthread_self() (if it were available from P_S), as well as the 
content of SHOW PROCESSLIST and EVENTS_WAITS_CURRENT and P_S.PROCESSLIST 
then:
- with SHOW PROCESSLIST or EVENTS_WAITS_CURRENT I could see what threads 
IDs (usual MySQL thread IDs of P_S thread IDs) have a long wait time 
(==stalled)
- from somewhere in P_S I could hopefully infer their pthread_self()
- and so I would know the pthread_self() of stalled threads
- and so I would know, in gdb's output, what are the stalled threads, 
which should get my interest.
Pushbuild2's debug information from gdb is post-mortem: when you learn 
about the deadlock, the process is dead, only a core is available 
somewhere. It is surely possible to log into the proper pushbuild2 
machine, look at the core file, and find out the above info by hand. 
However, I suspect that having it dumped by the tool automatically would 
reduce the burden. If P_S provided pthread_self() values (the text above 
is not to ask you to change the random query generator but only to give 
a use of having pthread_self() in P_S).
On the other hand, maybe this is something to consider rather for 
WL#2515 "Performance statements" (I discovered this shiny WL today).

>    @return an instrumented thread
>  */
>  typedef struct PSI_thread* (*new_thread_v1_t)
> -  (PSI_thread_key key, const void *identity);
> +  (PSI_thread_key key, const void *identity, ulong thread_id);

> === modified file 'mysql-test/suite/perfschema/r/no_threads.result'
> --- a/mysql-test/suite/perfschema/r/no_threads.result	2009-05-02 07:15:38 +0000
> +++ b/mysql-test/suite/perfschema/r/no_threads.result	2009-05-21 19:42:38 +0000
> @@ -24,7 +24,7 @@ count(*)
>  select count(*) from performance_schema.PROCESSLIST
>  where name like "thread/sql/OneConnection";
>  count(*)
> -1
> +2

Is it a normal change?

> === modified file 'sql/event_scheduler.cc'
> --- a/sql/event_scheduler.cc	2009-05-12 03:15:02 +0000
> +++ b/sql/event_scheduler.cc	2009-05-21 19:42:38 +0000
> @@ -228,10 +228,7 @@ event_scheduler_thread(void *arg)
>  
>    thd->thread_stack= (char *)&thd;              // remember where our stack
> is
>  
> -#ifdef HAVE_PSI_INTERFACE
> -  thd->m_psi= init_and_bind_psi_thread_with_id(key_thread_event_scheduler,
> -                                               thd, thd->thread_id);
> -#endif
> +  mysql_thread_set_id(thd->thread_id);

Why not move the call to mysql_thread_set_id() into store_globals()? 
Currently those two calls, mysql_thread_set_id() and store_globals(), 
happen very close to each other; I have the impression that by doing the 
move, all mysql_thread_set_id() calls (all: in sql/, sql/backup, 
storage/myisam, storage/maria) would be gone. It also feels natural; 
store_globals() is expected to do whatever is needed to get a functional 
"just works"-THD. It would reduce the number of calls (everything would 
happen in the existing store_globals() calls); there are already so many 
"boilerplate" operations that we copy-paste every time we want to write 
code to create a new utility thread (MyISAM backup driver, online backup 
kernel, NDB binlog, replication, event scheduler...). Last, it would 
reduce the presence of P_S calls in non-P_S code (like of events, 
backup, engines), which generally makes things simpler to understand 
(when one is reading code of feeature X it's better if she/he does not 
meet code bits of feature Y, makes less to parse).

> === modified file 'sql/ha_ndbcluster.cc'
> --- a/sql/ha_ndbcluster.cc	2009-05-12 03:15:02 +0000
> +++ b/sql/ha_ndbcluster.cc	2009-05-21 19:42:38 +0000
> @@ -10166,10 +10167,7 @@ pthread_handler_t ndb_util_thread_func(v
>    }
>  
>  /* FIXME: waiting for Bug#40899 */
> -#ifdef HAVE_PSI_INTERFACE
> -  thd->m_psi= init_and_bind_psi_thread_with_id(key_thread_ndb_util,
> -                                               thd, thd->thread_id);
> -#endif
> +  mysql_thread_set_id(thd->thread_id);

Actually, thd->thread_id is 0 here so the call does nothing (m_thread_id 
was already 0 before it).

> === modified file 'sql/mysql_priv.h'
> --- a/sql/mysql_priv.h	2009-05-18 15:36:08 +0000
> +++ b/sql/mysql_priv.h	2009-05-21 19:42:38 +0000
> @@ -2248,36 +2248,6 @@ extern PSI_file_key key_file_global_ddl_
>    Initialise all the performance schema instrumentation points used by the server.
>  */
>  void init_server_psi_keys();
> -
> -/**
> -  Create an instrumented thread.
> -  @param key the instrumentation thread key
> -  @param identity the thread identity
> -  @param id the thread identifier
> -  @return an instrumented thread, or NULL.
> -*/
> -PSI_thread* init_psi_thread_with_id(PSI_thread_key key,
> -                                    void* identity,
> -                                    ulong id);

By the way I like the new way much better than the previous one using 
store_globals(real_thread). Thanks!

> === modified file 'sql/scheduler.cc'
> --- a/sql/scheduler.cc	2009-05-16 00:41:08 +0000
> +++ b/sql/scheduler.cc	2009-05-21 19:42:38 +0000
> @@ -210,19 +210,21 @@ bool thd_scheduler::thread_attach()
>  {
>    DBUG_ASSERT(!thread_attached);
>  
> +  THD* thd = (THD*)list.data;
> +
>  #ifdef HAVE_PSI_INTERFACE
>    {
>      /*
>        Until now, this thread is running under the libevent instrumentation.
> -      Save this instrumentation before binding to the job instrumentation,

"job" -> "user job"
(as you use in another comment in the file)

> -      THD::m_psi, in setup_connection_thread_globals().
> +      Save this instrumentation, and bind to the job instrumentation, THD::m_psi.
>      */

> @@ -513,11 +516,7 @@ static void libevent_add_connection(THD
>      DBUG_VOID_RETURN;
>    }
>  
> -#ifdef HAVE_PSI_INTERFACE
> -  thd->m_psi= init_psi_thread_with_id(key_thread_one_connection, thd,
> thd->thread_id);
> -#else
> -  thd->m_psi= NULL;
> -#endif
> +  mysql_thread_prepare(key_thread_one_connection, thd, thd->thread_id, &
> thd->m_psi);

All calls to this function are:
mysql_thread_prepare(x, y, z, & thd->m_psi);
So, if P_S is not compiled in, they expand to
thd->m_psi= 0;
- I still don't understand why there has to be a m_psi in THD if P_S is 
not compiled in (maybe I'll know when I read your previous replies)
- I still don't understand why it has to be zeroed (as it shouldn't be 
used if P_S is not compiled in).

> === modified file 'sql/sql_parse.cc'
> --- a/sql/sql_parse.cc	2009-05-18 15:36:08 +0000
> +++ b/sql/sql_parse.cc	2009-05-21 19:42:38 +0000
> @@ -461,10 +461,7 @@ pthread_handler_t handle_bootstrap(void
>  {
>    THD *thd= (THD*) arg;
>  
> -#ifdef HAVE_PSI_INTERFACE
> -  thd->m_psi= init_and_bind_psi_thread_with_id(key_thread_bootstrap,
> -                                               thd, thd->thread_id);
> -#endif
> +  mysql_thread_set_id(thd->thread_id);

So here it feels like a misleading (too vague) function name. The line 
above suggests that this sets the thread's id. There are several thread 
ids (THD::thread_id, the one of P_S), the function's name does not 
suggest that this is about the P_S one. By default, when reading 
sql_parse, one may even guess wrongly (being more used to thd->thread_id 
than to P_S').

>  
>    do_handle_bootstrap(thd);
>    return 0;

> === modified file 'storage/perfschema/pfs.cc'
> --- a/storage/perfschema/pfs.cc	2009-05-19 15:56:07 +0000
> +++ b/storage/perfschema/pfs.cc	2009-05-21 19:42:38 +0000
> @@ -977,15 +977,65 @@ static void create_file_v1(PSI_file_key
>      file_lost++;
>  }
>  
> +struct PFS_spawn_thread_arg
> +{
> +  PFS_thread *m_parent_thread;
> +  PSI_thread_key m_child_key;
> +  const void *m_child_identity;
> +  void *(*m_user_start_routine)(void*);

Use type pthread_handler_t here.

> +  void *m_user_arg;
> +};
> +

> +static int spawn_thread_v1(PSI_thread_key key,
> +                           pthread_t *thread, const pthread_attr_t *attr,
> +                           void *(*start_routine)(void*), void *arg)
> +{
> +  PFS_spawn_thread_arg *psi_arg;
> +
> +  /* psi_arg can not be global, and can not be a local variable. */
> +  psi_arg= (PFS_spawn_thread_arg*) malloc(sizeof(PFS_spawn_thread_arg));

Any problem with using my_malloc()/my_free()?

> +  if (psi_arg == NULL)
> +    return EAGAIN;

As this is very unlikely, I'd use unlikely().

> === modified file 'storage/perfschema/pfs_instr_class.h'
> --- a/storage/perfschema/pfs_instr_class.h	2009-05-19 14:50:00 +0000
> +++ b/storage/perfschema/pfs_instr_class.h	2009-05-21 19:42:38 +0000
> @@ -125,8 +125,6 @@ struct PFS_thread_class
>    uint m_name_length;
>    /** True if this thread instrument is enabled. */
>    bool m_enabled;
> -  /** True if this thread instrument is timed. */
> -  bool m_timed;

Seems to be an unrelated change... what was it about?

Remove white lines added in ha_ndbcluster_binlog.cc and
ha_ndbcluster_connection.*. And in kill_server_thread(), 
handle_connections_namedpipes(), thd_scheduler::thread_attach() (I'm 
looking at the diff vs 6.0-main, I know these lines came accidentally 
through merges and merges).

I looked at all places where THD::m_psi is read/written, in the entire 
tree (I renamed it to m_psi1 and recompiled, to catch them all):
sql_class.h:2452:  PSI_thread *m_psi1;
mysqld.cc:5110:  mysql_thread_prepare(key_thread_one_connection, thd, 
thd->thread_id, & thd->m_psi1);
mysqld.cc:5128:    mysql_thread_prepare(key_thread_one_connection, thd, 
thd->thread_id, & thd->m_psi1);
scheduler.cc:224:      PSI_server->set_thread(thd->m_psi1);
scheduler.cc:519:  mysql_thread_prepare(key_thread_one_connection, thd, 
thd->thread_id, & thd->m_psi1);
So, the only place where it's read is
scheduler.cc:224:      PSI_server->set_thread(thd->m_psi1);
which is in thd_scheduler::thread_attach() i.e. code specific to 
thread-pooling: THD::m_psi serves only for thread-pooling.
Thus, it looks like the calls to mysql_thread_prepare() in
handle_connection_in_main_thread() and 
create_thread_to_handle_connection() should be removed.
Roughly, you now always rely on pthread-specific data for 
instrumentation (THR_PFS) except for thread-pooling where THD::m_psi is 
used.
Could you please strongly emphasize the very "niche" usage of THD::m_psi 
(thread-pooling) in a doxygen comment of that member in sql_class.h, and 
even rename it to "m_psi_for_thread_pooling", and define it only if
#if HAVE_POOL_OF_THREADS == 1
(see how this is used in mysqld.cc)? I even wonder if the right place 
for m_psi would be in THD::scheduler rather than in THD; I haven't 
investigated this a lot but, it looks like THD::scheduler is about 
per-THD scheduler-specific information, so could be good... For example, 
THD::scheduler has a member named "set_explain" which also fiddles with 
attach/detach THD/pthread (see the comment starting with "If during the 
session" in thd_scheduler::thread_detach()).
mysql_thread_prepare() is a function which should be rarely used (after 
the requested changes are implemented), sort of high-magic thing, its 
use should be clarified in a comment in its definition. Even, if it's 
used in only one place, I'd remove the function (which will remove 
issues with its naming) and just invoke PSI_server->new_thread() (as 
thd_scheduler::thread_detach() already uses PSI_server->set_thread()).

-- 
Mr. Guilhem Bichot <guilhem@stripped>
Sun Microsystems / MySQL, Lead Software Engineer
Bordeaux, France
www.sun.com / www.mysql.com
Thread
bzr commit into mysql-6.0-perfschema branch (marc.alff:3150) Marc Alff21 May
  • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot26 May
    • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff26 May
      • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot28 May
        • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff1 Jul
          • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot10 Jul
            • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff10 Jul
              • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot27 Jul
                • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Marc Alff28 Jul
                  • Re: bzr commit into mysql-6.0-perfschema branch (marc.alff:3150)Guilhem Bichot28 Jul