| List: | Commits | « Previous MessageNext Message » | |
| From: | Marc Alff | Date: | March 19 2009 11:53pm |
| Subject: | [Fwd: Review of the performance schema instrumentation interface (was: moins de remplacements)] | ||
| View as plain text | |||
Forwarding to the commits list. -- Marc -------- Original Message -------- Subject: Review of the performance schema instrumentation interface (was: moins de remplacements) Date: Thu, 19 Mar 2009 11:07:20 -0600 From: Marc Alff <marc.alff@stripped> To: Guilhem Bichot <guilhem@stripped> CC: Peter Gulutzan <peterg@stripped>, Staale Deraas <Staale.Deraas@stripped> References: <49BFC0B2.2010505@stripped> Hi Guilhem Guilhem Bichot wrote: > Salut Marc, > > ceci s'est perdu je crois: > > Mar 16 19:32:26 <guilhem|pickkid> Dis, je suis un peu chiffonné par > les centaines de changements my_pread -> MYSQL_FILE_PREAD etc; > Mar 16 19:32:50 <guilhem|pickkid> as-tu évalué la > possibilité de > #define un peu différents qui te permettraient de ne plus avoir ces > changements; > Mar 16 19:33:06 <guilhem|pickkid> si oui, pour quelles raisons as-tu > rejeté cette idée alternative? > Mar 16 19:33:23 <guilhem|pickkid> Je préfère demander avant > d'étudier > moi-même la question et de te proposer une alternative :) > Replying in English for the other readers The problem: In include/mysql/psi.h today, the proposed change is to use macros like #ifdef HAVE_PERFORMANCE_SCHEMA #define MYSQL_FILE_PREAD <one way> #else #define MYSQL_FILE_PREAD <another way> #endif (Let's call this solution [1]) and change the code from my_pread(...) to MYSQL_FILE_PREAD(...) This cause a lot of changes in the performance schema patch. The question: Is it possible to define a different macro instead, that avoid changing the original code, as in: #ifdef HAVE_PERFORMANCE_SCHEMA #define my_pread(...) <one way> #else #define my_pread(...) <another way> #endif (Let's call this proposal [2]) and leave the original line of code: my_pread(...) unchanged. The short answer: no, that does not work. The long answer: There are several issues here. First, my_pread() is a too simple case, because it actually takes the same number of parameters with and without the performance schema instrumentation. While the instrumentation try to not add new parameters to *most* functions to make the life of the user easier, there are *some* functions that still needs extra data, because the performance schema just needs that data and can not make it up. In particular, every time an object is created or opened, it must be correlated to a name, and to a configuration setting in the P_S.SETUP_* tables. In case of a file, file instruments define something like: PSI_file_key key_X where key_X correspond to a "wait/io/file/sql/X" row in SETUP_INSTRUMENTS, with ENABLED and TIMED attributes. To open a file, the code needs to provide key_X as an extra parameter to MYSQL_OPEN (in solution [1]). file= MYSQL_OPEN(key_file_frm, path, O_RDONLY, MYF(0)) With proposal [2], the code will have to be changed anyway, to: #ifdef HAVE_PERFORMANCE_SCHEMA file= my_open(key_file_frm, path, O_RDONLY, MYF(0)) #else file= my_open(path, O_RDONLY, MYF(0)) #endif Obviously, this is unacceptable. The other possibility with [2] is: file= my_open(key_file_frm, path, O_RDONLY, MYF(0)) where the file parameter is sometimes used, sometimes not. This is a code change in the caller anyway. This would reduce the number of lines changed in the caller, but not eliminate them all (and this proposal still does not work, see below). Secondly, not only the parameters to functions can change, but the type of the parameters or return values also needs to change. In case of files, what is returned by my_fopen and used by my_fread is a FILE*, from the C library. What is returned by MYSQL_FOPEN and used by MYSQL_FILE_PREAD is a MYSQL_FILE*, which is a proxy structure used by the instrumentation. This means that the prototype of my_open and my_pread with [2] will be different depending on how the code is compiled. To accommodate for that, of course one could try: #ifdef HAVE_PERFORMANCE_SCHEMA #define FILE MYSQL_FILE #endif so that the following code always 'works': FILE *file; file= my_fopen(key_file_frm, ...) my_fread(file, ...); but this is when everything breaks loose: this proposal still does not work, see below. Third, a deployment using instrumented code is never 'homogeneous' but can be 'heterogeneous'. Let me explain here. 'homogeneous' means that *every* line of code compiled, linked, and loaded into a ./mysqld process is compiled with the same compiling option: either all the code is instrumented with the performance schema, or none. 'heterogeneous' means that some code compiled, linked, and loaded into a ./mysqld process is compiled with the performance schema instrumentation, and some code is not. Now, the boundary between 'with' and 'without' code can be on a library basis: - the server compiled one way - a plugin compiled another way or it can be on a file basis: - file_X.cc compiled one way - file_Y.cc compiled another way or it can be in the same compilation unit: - file_X.cc compiled with instrumentation for some code, without for some other code. Every combination is required: please consider very carefully consequences before dismissing this as 'too complex: a) same compilation unit If the server compiles: - sql/sql_foo.cc + header.h one way and if a plugin compiles: - plugin/bar.cc + header.h another way, the header.h in question better not contain any FILE, pthread_mutex_t, etc macros that expand sometime to one thing and other times to another, or the resulting binary will just be garbage. b) same file If a plugin compiles: - plugin/foo.cc + third_party_header.h the fact that FILE, pthread_mutex_t, pthread_cond_t, etc got re defined by mysql headers will break the content of the third_party_header.h An example of 'third party header' is stdio.h: things will go VERY bad if the mysql code defines (silently, to add insult to injury) FILE to be something different. Also, it's legitimate for a plugin or any code to want to instrument some code deemed important or critical, and not instrument other code. This is critical for debug code embedded in the product (yes, there is some), as well as unit tests which are a mix of test code and production code. One should not *have* to instrument all the unit test code that will end up *not* inside a ./mysqld process to ensure that the unit test binary won't crash. c) same library In theory, we should have clean interfaces between the core server and storage engines or other plugins. In practice, this is far from being the case. As long as some storage engines insist in using internal server objects that can be instrumented, such as LOCK_open or LOCK_thread_count, linking code compiled with #define pthread_mutex_t <something> and #define pthread_mutex_t <something else> is bound to fail. Lastly, to anticipate the question "is supporting heterogeneous deployments really necessary" ? Absolutely yes. If we drop that, we might as well cancel the performance schema project right now. From an organization point of view, there is no way to make a patch that cross *all* storage engines and can be reviewed and approved by every single team at once as a monolithic change, so any change that involves that level or coordination is bound to fail. If a given team maintaining a given storage engine does not have the willingness or the resources to instrument their code, this team should not prevent others to have instrumentation for the performance schema. From a customer point of view, there is no way we can tell a customer that, in order to use a feature we deliver with the server and all *our* storage engines, the customer also has to port their code first to instrument their storage engine implementation, if any. Obviously, the same logic applies to third parties implementing and distributing storage engines (PBXT). [more similar arguments omitted] I think this should summarize pretty well why we need the instrumentation interface as designed in the mysql-6.0-perf branch. I don't see how a "simpler" instrumentation can be used. Regards, -- Marc
