Hello Tor,
Here is an update.
I am done with fixing the friendship and encapsulation problems.
So I think I left none of your comments with neither a reply nor an
action, and you can resume your review. When it comes to
opt_trace.[hc]*, given that I reorganized lots of code, it may be easier
to directly look at the new files rather than at the diff.
I just merged the latest opt-backporting into the feature tree.
All that is pushed in mysql-next-mr-opt-backporting-wl4800. Phew...
Guilhem Bichot a écrit, Le 03.02.2011 17:36:
> Hello Tor,
>
> Apart from:
> - friendship (ongoing work, see below)
> - not initializing all of Opt_trace_disable_I_S (here adopting a
> stubborn attitude but with arguments :-) see below),
> I think I have now acted on everything, please find my comments below.
>
> Tor Didriksen a écrit, Le 31.01.2011 14:58:
>> On 2011-01-26 11:57, Guilhem Bichot wrote:
>>> Hello Tor,
>>>
>>> Ok with incorporating almost everything in this commit. I have pulled
>>> your revision verbatim, and done additional edits/reverts (commit email
>>> is on its way). I am doing this in a frozen tree, based on the same
>>> revision as you used; not on the latest wl4800 tree because the latest
>>> tree is slightly ahead of what you used.
>>> When we are more or less in agreement about my handling of review
>>> comments, I'll merge the frozen tree in the latest wl4800 tree. And then
>>> the latest trunk into the latest wl4800 tree.
>>>
>>> I didn't incorporate a few points, mentioned below. Also commenting on
>>> some incorporated points which deserved it.
>>> Thanks for all the reformatting which you did. And the SCOPED_TRACE()
>>> trick. And the composition/inheritance suggestion.
>>
>> I Guilhem.
>> Here are comments on the comments.
>>
>> -- didrik
>>
>>>
>>> Tor Didriksen a écrit, Le 11.01.2011 09:15:
>>>> #At file:///export/home/didrik/repo/next-mr-opt-backporting-wl4800/
>>>> based on revid:jorgen.loland@stripped
>>>>
>>>> 3243 Tor Didriksen 2011-01-11
>>>> WL#5257 Review comments.
>>>> More comments in followup email ...
>>>> @ sql/opt_range.cc
>>>> @ sql/opt_trace.cc
>>>>
>>>> added:
>>>> unittest/gunit/opt_notrace-t.cc
>>>> modified:
>>>> sql/opt_range.cc
>>>> sql/opt_trace.cc
>>>> sql/opt_trace.h
>>>> sql/opt_trace2server.cc
>>>> sql/sys_vars.cc
>>>> unittest/gunit/CMakeLists.txt
>>>> unittest/gunit/opt_trace-t.cc
>>>> === modified file 'sql/opt_range.cc'
>>>> --- a/sql/opt_range.cc 2010-12-21 07:34:33 +0000
>>>> +++ b/sql/opt_range.cc 2011-01-11 08:15:46 +0000
>>>> @@ -793,8 +793,10 @@ static
>>>> +// print_sel_tree declared 'static' but never defined
>>>> static void print_sel_tree(PARAM *param, SEL_TREE *tree, key_map
>>>> *tree_map,
>>>> const char *msg);
>>>> +// print_ror_scans_arr declared 'static' but never defined
>>>> static void print_ror_scans_arr(TABLE *table, const char *msg,
>>>> struct st_ror_scan_info **start,
>>>> struct st_ror_scan_info **end);
>>>> @@ -12393,6 +12395,7 @@ print_multiple_key_values(KEY_PART *key_
>>>> dbug_tmp_restore_column_maps(table->read_set, table->write_set,
>
>>>> old_sets);
>>>> }
>>>>
>>>> +// print_quick defined but not used.
>>>> static void print_quick(QUICK_SELECT_I *quick, const key_map
>>>> *needed_reg)
>>>
>>> Jorgen fixed those three since then.
>>>
>>>> === modified file 'sql/opt_trace.cc'
>>>> @@ -178,18 +177,18 @@ void Opt_trace_struct::add_struct(const }
>>>>
>>>>
>>>> -void Opt_trace_struct::do_destruct(void)
>>>> +void Opt_trace_struct::do_destruct()
>>>
>>> Yes, C++ does not need (void).
>>> I once got caught in C by an accidental:
>>>
>>> file.h:
>>> func(); /* when one meant (void) but didn't write it */
>>>
>>> file.c:
>>> func(int x);
>>>
>>> and a run-time surprise in the end.
>>> So I took the habit of always writing (void). Even in C++ because it
>>> makes the rule simpler (one rule for both languages).
>>> However, this is not a big deal, I incorporated those (void) removals.
>>>
>>>> @@ -236,7 +235,7 @@ Opt_trace_struct& Opt_trace_struct::do_a
>>>> ("select" and other language keywords). If we directly store
>>>> that mix in a
>>>> UTF8 column, the query_charset piece causes an issue:
>>>> Field_blob::store() will truncate the trace at first
>>>> unrecognized
>>>> - character. At worse, a single bad character in the expanded
>>>> query makes
>>>> + character. Even worse, a single bad character in the expanded
>>>> query makes
>>>
>>> Changed to "So just a single bad character".
>>>
>>>> @@ -281,6 +279,10 @@ Opt_trace_struct& Opt_trace_struct::do_a
>>>> +namespace {
>>>> +LEX_CSTRING readables[]= { { STRING_WITH_LEN("false") },
>>>> + { STRING_WITH_LEN("true") } };
>>>> +}
>>>
>>> why not use "static" instead of a namespace?
>>
>> a matter of taste perhaps.
>> An un-named namespace is the modern "C++" way of doing this,
>> since the word 'static' is overloaded (has too many meanings)
>
> I checked Meyers' books and he thinks the same. Eliminated all static-s
> which were out of classes.
>
>>>> @@ -606,6 +607,7 @@ bool Opt_trace_context::start(bool suppo
>>>> {
>>>> DBUG_PRINT("opt_trace", ("disabled: since_offset_0(%llu) >="
>>>> " offset(%lu) + limit(%lu)",
>>>> + // Both offset and limit are signed.
>>>
>>> changed %lu to %ld
>>>
>>>> (ulonglong)since_offset_0, offset,
>>>> limit));
>>>> new_stmt_support_I_S= false;
>>>> }
>>>> @@ -616,6 +618,7 @@ bool Opt_trace_context::start(bool suppo
>>>> new_stmt_support_I_S);
>>>> DBUG_PRINT("opt_trace",("new stmt %p support_I_S %d", stmt,
>>>> (int)new_stmt_support_I_S));
>>>> + // new cannot return NULL.
>>>
>>> Wow.
>>> Our code has such
>>> if (!(ptr= new someclass()))
>>> return error;
>>> where new is the normal C++ "new" (not a custom "new" declared in the
>>> class). For example with Event_queue_element (events.cc:351). Or
>>> hostname.h with class hash_filo.
>>>
>>> So those are all wrong, thanks for enlightening me.
>>> Looks like we are in the situation described at
>>> http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=170
>>>
>>> And also in the situation of:
>>> http://blog.mozilla.com/nnethercote/2011/01/18/the-dangers-of-fno-exceptions/
>
>>>
>>> which I just tested... Added a class containing an array of 100GB;
>>> did "new" for this class in mysqld.cc:main(), and it did crash, both
>>> debug and non-debug binary. new() calling __cxa_throw() calling
>>> std::terminate(), ending up in abort().
>>>
>>> I guess I should file a bug report about all those "graceful handling
>>> of OOM" which are not.
>>> I also found code which creates a THD (events.cc, event_scheduler.cc)
>>> like this
>>> if (!(thd= new THD())) do something.
>>> and THD's "new" is actually a custom one (from parent class "ilink")
>>> which itself calls
>>> my_malloc(..., MYF(MY_FAE...));
>>> and MY_FAE is designed to exit(1) if malloc() fails. So there is no
>>> chance to coming to "do something".
>>> Same for class Log_event. I didn't look for more examples. I should
>>> mention those in the bug report too.
>>>
>>> For my patch, I guess I have two choices:
>>> - keep my new() unchanged, delete all if(new returned NULL) and
>>> accept that if OOM it will just crash.
>>> - change my new() calls to new(std::nothrow) (I prefer that second
>>> solution)
>>
>> I would prefer new(nothrow) as well.
>
> changed all my new to new(nothrow).
>
>> Note that some of our classes have their own 'operator new' which
>> actually *can* return NULL.
>> If they do so, I guess the server will crash, since we are not using
>> the new(nothrow) syntax,
>> and the C++ runtime will run constructors on un-allocated memory.
>>
>>>
>>>> if (stmt == NULL)
>>>> DBUG_RETURN(true);
>>>>
>>>
>>>> @@ -683,26 +686,30 @@ void Opt_trace_context::end(void)
>>>> }
>>>>
>>>>
>>>> -size_t Opt_trace_context::allowed_mem_size(const Opt_trace_stmt
>>>> *for_stmt)
>>>> - const
>>>> +namespace {
>>>> +size_t allowed_mem_size(const Opt_trace_stmt *stmt_start,
>>>> + const Opt_trace_stmt *for_stmt)
>>>> {
>>>
>>> why not use "static" instead of a namespace?
>>
>> Quoting Bjarne Stroustrup, 1997:
>> In C and older C++ programs, the keyword 'static' is (confusingly)
>> used to mean
>> "use internal linkage". Don't use static except inside functions and
>> classes.
>>
>> (note that he said older C++ programs, in 1997 ....)
>
> done
>
>>>> - DBUG_ENTER("Opt_trace_context::allowed_mem_size");
>>>
>>>> @@ -750,20 +757,20 @@ void Opt_trace_iterator::operator++(int)
>>>> /**
>>>> - @note because allocation is done in big chunks,
>>>> buffer->Ptr[str_length]
>>>> + @note Because allocation is done in big chunks,
>>>> buffer->Ptr[str_length]
>>>> may be uninitialized while buffer->Ptr[allocated length] is 0,
>>>> so we
>>>> must use c_ptr_safe() as we want a NUL-terminated string (which
>>>> is easier
>>>> to manipulate in a debugger, or to compare in unit tests with
>>>> - ASSERT_STREQ).
>>>> + EXPECT_STREQ).
>>>> c_ptr_safe() may realloc an empty String from 0 bytes to 8 bytes,
>>>> - when it adds the closing NUL.
>>>> + when it adds the closing NULL.
>>>
>>> I meant NUL (the name of ASCII character '\0'). Should I write '\0'
>>> instead?
>>
>> I read that as a typo for NULL, so maybe '\0' is better, yes.
>
> I changed:
> "NUL-terminated" to "0-terminated"
> and "NUL" to \0, except in one place about charset subtleties where I
> had to distinguish: I wrote "if we see a byte equal to 0 it is really
> the UTF8 u0000 character (a.k.a. ASCII NUL)".
>
>>>> @@ -780,12 +787,9 @@ Opt_trace_info Opt_trace_iterator::opera
>>>> }
>>>>
>>>>
>>>> -Opt_trace_iterator Opt_trace_iterator::end;
>>>> -
>>>> -
>>>> /* Opt_trace_stmt class */
>>>>
>>>> -const size_t Opt_trace_stmt::empty_trace_len= 4;
>>>> +const size_t Opt_trace_stmt::empty_trace_len= 4; // magic number
>>>
>>> Incorporated all the "magic number" comments, but what is their
>>> intention?
>>
>> Ahh, my intention when commenting like this is usually:
>> "please give this magic number a name".
>>
>> This one actually *has* a name (empty_trace_len)
>> but there is no comment about why the value is four.
>
> I deleted empty_trace_len; a trace cannot be empty nowadays: either the
> statement is not traced then we don't even create an Opt_trace_stmt, or
> the statement is traced then we always add "steps" to the trace.
> For other "magic numbers", I added comments.
>
>>>> @@ -825,22 +829,30 @@ void Opt_trace_stmt::end(void)
>>>> -void Opt_trace_stmt::next_line(void)
>>>> +static const char my_spaces[] =
>>>> + " "
>>>> + " "
>>>> + " "
>>>> + " "
>>>> + " "
>>>> + " "
>>>> + " "
>>>> + ; // Extend as desired, to max supported indentation level.
>>>
>>> This is an elegant solution. However there is no max supported
>>> indentation level. It can get quite deep, when tracing greedy search (if
>>> there are many tables). So I propose
>>>
>>> int remnent= 2 * depth;
>>> const int spaces_len= sizeof(my_spaces) - 1;
>>> while (remnent > spaces_len)
>>> {
>>> buffer.append(&my_spaces[0]);
>>> remnent-= spaces_len;
>>> }
>>> if (remnent > 0)
>>> buffer.append(&my_spaces[spaces_len - remnent]);
>>
>> "remnent" is a strange name.
>
> Without spelling errors, it should have been "remnant".
> I changed to "to_be_printed".
>
>> If you want to combine my/your approach, I guess that is OK.
>> Please add a unit test then, since it is very easy to get
>> off-by-one errors in code like this.
>
> done.
> I also simplified the code, no 'index from the end' trick is needed
> because String::append() supports passing a length argument (so does not
> need a 0-terminated char* so we don't need to end "at the end of
> my_spaces"). This simplification makes a bug in this code unlikely but I
> still test it in the unit test (new test TEST_F(TraceContentTest, Indent)).
>
>>>> +void Opt_trace_stmt::next_line()
>>>> {
>>>> if (ctx->one_line)
>>>> return;
>>>> buffer.append('\n');
>>>> - char spaces[]= " ";
>>>> - // 2 spaces per nesting level
>>>> - const int nb= depth * 2, spaces_len= sizeof(spaces) - 1;
>>>> - // add spaces in chunks of spaces_len, better than many append(' ')
>>>> - for (int i= 0; i < nb/spaces_len; i++)
>>>> - buffer.append(spaces, spaces_len);
>>>> - buffer.append(spaces, (nb % spaces_len));
>>>> +
>>>> + // You can get away with a single append here:
>>>> + int ix= sizeof(my_spaces) - 2*depth - 1;
>>>> + if (ix < 0)
>>>> + ix= 0;
>>>> + buffer.append(&my_spaces[ix]);
>>>> }
>>>
>>>> @@ -1056,11 +1066,13 @@ bool Opt_trace_stmt::Buffer::append(char
>>>>
>>>> Opt_trace_disable_I_S::Opt_trace_disable_I_S(Opt_trace_context
>>>> *ctx_arg,
>>>> bool disable_arg) :
>>>> + ctx(ctx_arg),
>>>> + stmt(ctx ? ctx->current_stmt_in_gen : NULL),
>>>
>>> If 'disable_arg' is false, no other member variable than 'disable' will
>>> be used. So I would prefer to not initialize other members in that case.
>>> The idea all around the file is to initialize as least as possible.
>>
>> Which is a *very* bad practice.
>> A constructor is a contract: it turns raw bytes into an object.
>> Your constructor fails to do so.
>
> The constructor of Opt_trace_struct works the same: it initializes
> variable 'started' to 'false', and apart from that, the rest of
> initializations is conditioned by
> if (ctx_arg != NULL && ctx_arg->is_started()).
> So if the condition above is false, only 'started' is initialized (to
> 'false'), and all functions of this class (like add()) make sure to do
> nothing if 'started' is false.
>
> Back to Opt_trace_disable_I_S: if disable_arg is false, the caller is
> asking the object to not disable anything: the object is asked to have
> no action. Then it does not need to be fully initialized: it only needs
> to remember (by initializing 'disable' to false) that it has nothing to do.
> The object is functional from the point of view of the user of it: it
> does not disable anything; it's working and it's not raw bytes. The
> unset properties will not be used under any circumstances. Contracts are
> fine: the class exposes only a constructor and a destructor; the
> destructor works well on an instance where 'disabled' is false and the
> rest is unset. The user has no risk to running into a problem.
>
> The fact that we can have a non-acting object like this is a consequence
> of me following RAII. I cannot do
>
> if (some cond)
> disable the trace;
>
> because, with RAII, it would mean
> if (some cond)
> Opt_trace_disable_I_S();
> which would be deleted at the same code line when it's instantiated.
>
>>>> disable(disable_arg)
>>>> {
>>>> if (likely(!disable))
>>>
>>> I removed this likely(). Due to some past code changes, there is no more
>>> foreseenable likelihood of this.
>>>
>>>> return;
>>>> - ctx= ctx_arg;
>>>> +
>>>> if (ctx != NULL)
>>>> {
>>>> /* Disable in to-be-created future traces */
>>>
>>>> === modified file 'sql/opt_trace.h'
>>>> --- a/sql/opt_trace.h 2010-12-19 14:24:03 +0000
>>>> +++ b/sql/opt_trace.h 2011-01-11 08:15:46 +0000
>>>> @@ -290,8 +292,10 @@ class THD;
>>>> @li Use only simple characters for key names: a-ZA-Z_# (to denote a
>>>> number), and no space.
>>>>
>>>> +??? to denote a number ???
>>>
>>> I meant that '#' serves to denote a number (like "select#": number of a
>>> SELECT). Clarified to
>>>
>>> @li Use only simple characters for key names: a-ZA-Z_#, and no space.
>>> '#'
>>> serves to denote a number, like in "select#" .
>>>
>>>> class Opt_trace_context
>>>> {
>>>> /** Flags' numeric values for @@@@optimizer_trace variable */
>>>> + // the style guide says enum enum_flag_value { ... }
>>>
>>> Removed the enum's name.
>>>
>>>> + // the style guide does *not* mandate uppercase enum values .....
>>>
>>> Right. It does not forbid it either, and the sql/ code is has many
>>> uppercase enum values...
>>> Would you prefer uppercase or something else?
>>
>> I just think that they look ugly.
>> But you are right, the convention seems to be uppercase.
>
> I left them in uppercase.
>
>>>
>>>> enum flag_value {
>>>> FLAG_DEFAULT= 0, FLAG_ENABLED= 1, FLAG_END_MARKER= 2,
>>>> FLAG_ONE_LINE= 4
>>>> };
>>>
>>>> @@ -375,13 +384,16 @@ public:
>>>> +
>>>> /** Features' numeric values for @@@@optimizer_trace_features
>>>> variable */
>>>> + // These are actually decimal bit values(?)
>>>> + // so maybe write 1 << 0, 1 << 1, and so on???
>>>
>>> done
>>>
>>>> enum feature_value {
>>>> GREEDY_SEARCH= 1,
>>>> RANGE_OPTIMIZER= 2,
>>>> /*
>>>> - if you add here, update feature_value of empty implementation
>>>> - and FEATURES_DEFAULT!
>>>> + If you add here, update feature_value of empty implementation
>>>> + and default_features!
>>>
>>>> @@ -473,6 +488,7 @@ private:
>>>> OFFSET/LIMIT
>>>> */
>>>> ha_rows since_offset_0;
>>>> + // ha_rows seems strange, why not int/long?
>>>
>>> Originally this variable is about count of rows in the
>>> I_S.OPTIMIZER_TRACE table, and count of rows are supposed to be ha_rows.
>>> BUT as this variable gets compared with the values of
>>> @@optimizer_trace_offset/limit which are long, I can change to long.
>>> Changed to long, which eliminates some casts in comparisons.
>>>
>>>> @@ -506,15 +522,17 @@ private:
>>>> - /** Whether the settings above may be changed for a new trace */
>>>> + /** Whether the settings above may be changed for a new trace. */
>>>> + // comment is positive, variable name is negative.
>>>
>>> Changed comment to "may not be changed". There are also high chances
>>> that this variable goes away due to some other patch in preparation
>>> (more about this when Dmitri has reviewed my other patch).
>>>
>>>> bool cannot_change_settings;
>>>
>>>> @@ -523,22 +541,34 @@ private:
>>>> + // Having too many friends is asking for trouble ....
>>>
>>> The words of a misanthrope ;-)
>>>
>>>> + // Opt_trace_stmt needs only a const getter for one_line.
>>>> + // Opt_trace_struct needs a couple of const getters, plus
>>>> current_stmt_in_gen
>>>> + // Opt_trace_disable_I_S actually changes the state of the
>>>> context object!
>>>> + // Opt_trace_iterator needs only oldest_stmt_to_show and limit
>>>> + // and finally: the free function does not need friendship.
>>>
>>> Removed friendship from the free function.
>>> Before removing the rest of friendship, let's discuss. My intention with
>>> 'friend' was:
>>> - it's a nice party in opt_trace.cc, everybody knows and trusts
>>> everybody
>>> - but from outside of opt_trace.cc, nothing happening inside is visible
>>> (black windows).
>>>
>>> If I add getters to Opt_trace_context, they will be public, and so the
>>> outside will be able to know the value of Opt_trace_context::one_line. I
>>
>> Well, a public getter can do very little harm.
>>
>>> don't like it so much. Even worse, if Opt_trace_disable_I_S needs to
>>> call a setter Opt_trace_context::set_support_I_S() in order to change
>>> Opt_trace_context::support_I_S, this setter has to be
>>> public, so everyone including opt_trace.h can call it, and I don't find
>>> this isolated.
>>> Do you have in your C++ bag of tricks something which solves this: put
>>> order in the party in opt_trace.cc, but keep it opaque from the outside?
>>
>> As I said, "most of these friends should go away".
>> You can argue on a case-by-case basis, so e.g. Opt_trace_disable_I_S
>> might stay as friend, but isn't it more natural that the context class
>> is the one who knows what has to be saved and restored?
>>
>> To me, it seems to have one (private) setter/getter function handling
>> this.
>> Input argument: a struct containing new settings, output: the old
>> setting.
>
> Today I have tried eliminating "friend" declarations one by one, adding
> getters or setters each time I got a compilation error. After an hour of
> this, it compiles and I have:
> - getters which return int/bool or pointer-to-const (=safe getters): 9
> of them
> - getters which return pointer but not pointer-to-const (=dangerous
> getters): 4
> - setters: 13
>
> Damage in Opt_trace_stmt:
> - The constructor is now public (so a user can now create an
> Opt_trace_stmt; previously, only Opt_trace_context and other classes in
> opt_trace.h could).
> - end() is now public.
> - Class Buffer, and members query_buffer and trace_buffer are public
> (could use a getter and a setter instead, but would have same result).
> - push()/pop()/next_line()/separator() are now public
> - set_query() idem
>
> The code looks horrible. Frankly this is a regression :-)
> Either it's revealing a clear, profound design problem in my code, or
> the get/set thing is not adapted.
> The design with friends allows big mess in opt_trace.cc (but as
> maintainer of opt_trace.cc I find I keep it non-messy), but prevents
> anybody from the outside from doing wrong.
> The design with getters and setters introduces dangerous getters,
> setters, and public members, which exposes a lot to the outside of
> opt_trace.cc, which can now easily corrupt the structures by calling
> some public setter for example. So I'm not going to push it, rather
> going to reconsider all that.
> It already happened in the past of this development that I stumbled on a
> problem and had to go through a re-think and re-design to reach a
> cleaner organization in the end. Maybe this is happening again.
>
>> Re: bag of tricks:
>> Look up the 'pimpl idiom' which is perfect for hiding implementation
>> details to the outside world.
>
> I'll think about it. Expect more news.
>
>>>
>>>> +
>>>> friend class Opt_trace_stmt;
>>>> friend class Opt_trace_struct;
>>>> friend class Opt_trace_disable_I_S;
>>>> friend class Opt_trace_iterator;
>>>> - friend void opt_trace_print_expanded_query(THD *thd);
>>>> + //friend void opt_trace_print_expanded_query(THD *thd);
>>>> };
>>>
>>>
>>>> + class Buffer
>>>> {
>>>> -private:
>>>> + private:
>>>> size_t allowed_mem_size; ///< allowed memory size for this
> String
>>>> size_t missing_bytes; ///< how many bytes could not be added
>>>> bool malloc_error; ///< whether there was a
>>>> malloc/realloc() error
>>>> -public:
>>>> + String string_buf;
>>>> }
>>>> - inline const char *ptr(void) const { return String::ptr(); }
>>>> + const char *ptr() const { return string_buf.ptr(); }
>>>> +
>>>> + CHARSET_INFO *charset() { return string_buf.charset(); }
>>>
>>> ok, and I added "const" to it, like String has.
>>>
>>>> @@ -622,6 +653,7 @@ public:
>>>> friend class Opt_trace_iterator;
>>>> };
>>>>
>>>> + // Suggest renaming to trace_buffer: much easier to search for it.
>>>> Buffer buffer; ///< Where the trace is accumulated
>>>
>>> renamed to trace_buffer
>>>
>>>> Buffer query_buffer; ///< Where the query is put
>>>>
>>>> @@ -629,19 +661,21 @@ public:
>>>> Nesting level of the current structure.
>>>> The more nested ("deep"), the more indentation spaces we add
>>>> on the left.
>>>> */
>>>> + // how about current_depth: much easier to search for it.
>>>> int depth;
>>>
>>> renamed to current_depth
>>>
>>>> @@ -649,11 +683,18 @@ public:
>>>> Opt_trace_stmt *prev, *next; ///< list management
>>>>
>>>> /** By how much we should increase buffer's size when it's
>>>> becoming full */
>>>> + // At first sight it looks strange that Buffer doesn't handle
>>>> this itself.
>>>
>>> Ok, I moved the code using buffer_alloc_increment from
>>> Opt_trace_struct::do_construct() to a function in Buffer, named
>>> prealloc().
>>> I call this function from Opt_trace_struct::do_construct().
>>> So it's only called when opening a structure, not too often, which is
>>> what I want (don't want to do it in Buffer::append() for that reason).
>>>
>>>> + // Do we add to the length, or do we multiply (which is more
>>>> common I think)
>>>
>>> Right now we add. With multiplying I fear this would grow fast and waste
>>> memory (we already keep the whole trace in memory, so need to be
>>> cautious). The increment is 1kB.
>>> I could use multiplication instead, but then with some limit (like:
>>> double each time, but when trace is 10MB, then stop multiplying, and
>>> switch to adding 10MB each time). Let me know.
>>
>> I think that the common trick is to multiply by sqrt(2) or something.
>> If you multiply, the amortized cost of an append is constant.
>> If you add, the cost of append will grow as a function of string length.
>
> Assuming ::realloc() works like
>
> new_ptr=malloc(new_size);
> memcpy(new_ptr, old_ptr, old_size);
> free(old_ptr);
>
> then I agree with what you say above about the amortized cost. If we
> multiply by a constant 'x' then the amortized cost of such ::realloc()
> is constant and proportional to x/(x-1), from what I computed.
> I saw *2 and *1.5 (~sqrt(2)) for 'x' on internet.
>
> ::realloc() may not always be as "inefficient" as above though.
>
>> I suggest writing a unit test, and measure how it behaves.
>> If you run your unit test with --disable-tap-output
>> you will get the standard google output, and see millisecond execution
>> times for each test.
>
> Thanks. I just did this:
>
> int maxs=100000000; // max allocated size
> int loop=100; // number of iterations (to get times greater than 1ms)
> TEST_F(TraceContentTest, add)
> {
> int inc=1024;
> for (int y=0; y<loop; y++)
> {
> char *p= (char*)malloc(0);
> int s= 0;
> while (s < maxs)
> {
> int news= s+inc;
> p= (char*)realloc(p, news);
> memset(p+s, 25, news-s); // simulate adding trace content
> s= news;
> }
> free(p);
> }
> }
> TEST_F(TraceContentTest, mult15)
> {
> for (int y= 0; y<loop; y++)
> {
> char *p= (char*)malloc(0);
> int s=0;
> while (s < maxs)
> {
> int news= s*15/10+1;
> p= (char*)realloc(p, news);
> memset(p+s, 25, news-s);
> s= news;
> }
> free(p);
> }
> }
> TEST_F(TraceContentTest, mult20)
> {
> for (int y= 0; y<loop; y++)
> {
> char *p= (char*)malloc(0);
> int s=0;
> while (s < maxs)
> {
> int news= s*2+1;
> p= (char*)realloc(p, news);
> memset(p+s, 25, news-s);
> s= news;
> }
> free(p);
> }
> }
> Which tests the "add" method (increment=1024, as in opt_trace.cc) then
> *1.5 then *2. I varied max_allocated_size (and also the number of
> iterations so that the overall time stays reasonable). The machine has
> 12GB RAM in total, Linux64. Here are the times in ms:
>
> max_allocated_size | add | mult1.5 | mult2.0 | iterations
>
> 10k 800 1800 1200 1M
> 100k 700 1200 600 100k
> 1M 900 700 700 10k
> 10M 1400 1100 1700 1000
> 100M 6500 6900 6800 100
>
> "add" wins in two cases. mult1.5 wins in two cases; mult2 wins in two
> cases.
>
> Then I did a server benchmark: 20-table plan search, limiting trace size
> to 10MB (--optimizer-trace-max-mem-size=10MB, with no limit the machine
> would not have enough memory :-).
> - mult2.0: 32.5 secs (does 10s of ::realloc calls)
> - add: 33.3 secs (does 1000s of ::realloc calls)
> On the other hand, 10MB is unusually big. The biggest trace in
> mysql-test/r/optimizer_trace*.result is 10k long. I think that the
> typical traces will be of size 10k-100k, where I'm not sure realloc()
> has an impact. Even with 10MB, the impact is 1 second, not much a
> problem for us developers when debugging a query.
>
> So I don't really have an opinion. I can do *2 if you like. Or I can be
> lazy and keep "add". I don't really care :-) Let me know.
>
>>>> static const size_t buffer_alloc_increment;
>>>> /** Length of an empty trace */
>>>> static const size_t empty_trace_len;
>>>>
>>>> public:
>>>> + const Opt_trace_stmt *prev_() const { return prev; }
>>>> +
>>>> + size_t alloced_length() const
>>>> + { return buffer.alloced_length() + query_buffer.alloced_length(); }
>>>
>>> Do we need Doxygen comments for the two functions above?
>>>
>>>> @@ -671,7 +712,7 @@ public:
>>>> struct Opt_trace_info
>>>> {
>>>> /**
>>>> - String containing trace. It is NUL-terminated, only to aid
>>>> debugging or
>>>> + String containing trace. It is NULL-terminated, only to aid
>>>> debugging or
>>>
>>> I meant NUL
>>>
>>>> @@ -708,20 +749,23 @@ public:
>>>> '*' operator.
>>>> */
>>>> Opt_trace_iterator(Opt_trace_context *ctx) :
>>>> - cursor(ctx->oldest_stmt_to_show), row_count(1),
>>>> limit(ctx->limit) {}
>>>> - void operator++(int); ///< advances iterator to next trace
>>>> + cursor(ctx->oldest_stmt_to_show), row_count(1),
> limit(ctx->limit)
>>>> + {}
>>>> +
>>>> + ///< Advances iterator to next trace.
>>>
>>> If ///< is not on the same line as what it describes, doxygen does not
>>> pick it up. I put it back on the line.
>>>
>>>> + // A bit strange to implement postfix iterator, rather than prefix.
>>>
>>> renamed it to next().
>>
>> thanks
>>
>>>
>>>> + void operator++(int);
>>>> +
>>>> /**
>>>> Returns information about the trace on which the iterator is
>>>> positioned.
>>>> + @note Returns object by value!!!
>>>> */
>>>> Opt_trace_info operator*();
>>>
>>> renamed it to get_value()
>>
>> good!
>>
>>>
>>>> - /** Needed to compare the iterator with the list's end */
>>>> - bool operator!=(const Opt_trace_iterator &other) const
>>>> - { return cursor != other.cursor; }
>>>> - static Opt_trace_iterator end; ///< List's end
>>>> +
>>>> + bool at_end() const { return cursor == NULL; }
>>>> +
>>>> private:
>>>> - /** Only to construct the 'end' iterator */
>>>> - Opt_trace_iterator(void) : cursor(NULL) {}
>>>> Opt_trace_stmt *cursor; ///< trace which the iterator is
>>>> positioned on
>>>> ha_rows row_count; ///< how many traces yielded so far
>>>> ha_rows limit; ///< yield "empty" after yielding that
>>>> many traces
>>>> @@ -734,13 +778,13 @@ private:
>>>> When you want to add a structure to the trace, you create an
>>>> instance of
>>>> Opt_trace_object or Opt_trace_array, then you add information to
>>>> it with
>>>> add(), then the destructor closes the OOA (we use RAII, Resource
>>>> + ^^^ ???
>>>
>>> OOA used to mean Object or Array, I changed it to Structure everywhere
>>> (Jorgen preferred), had forgotten this one... Fixed.
>>>
>>>> Acquisition Is Initialization).
>>>> */
>>>> class Opt_trace_struct
>>>> {
>>>> protected:
>>>> /**
>>>> - Constructor.
>>>
>>> Why remove this? Useless?
>>
>> Yes, I think that is a useless comment.
>> We assume that the reader of this code recognizes a constructor when
>> he sees one :-)
>
> deleted a few such "Constructor" words.
>
>>>> @param ctx_arg Optimizer trace context for this structure
>>>> @param requires_key_arg whether this structure
>>>> requires/forbids keys
>>>> for values put inside it (an object requires
>>>> them, an
>>>
>>>> @@ -807,13 +851,18 @@ public:
>>>> @verbatim <some Opt_trace_struct>.add().add().add()
> @endverbatim
>>>> which have 3 if()s (one per @c add()), it has been measured
>>>> that wrapping
>>>> that inside a single if(), like
>>>> - @verbatim if(started) { <some
>>>> Opt_trace_struct>.add().add().add() } @endverbatim
>>>> + @verbatim if(started) { <some
>>>> Opt_trace_struct>.add().add().add() }
>>>> + @endverbatim
>>>> degraded performance (from 2.0 to 2.1 secs). Adding if() in
>>>> one single
>>>> - code line caused de-inlining of add() methods in several
>>>> unrelated places
>>>> + code line caused de-inlining of add() functions in several
>>>> unrelated places
>>>> and the performance degraded. So we don't do it. The
>>>> performance impact of
>>>> multiple chained add() seems comparable to a single add() anyway.
>>>> Removing @c likely() also increases from 2.0 to 2.1, so we leave
>>>> them.
>>>> +
>>>> + // I don't need to know all these details about milliseconds
>>>> here ....
>>>> + // This should be a documentation about the interface.
>>>
>>> I deleted this block, and put that in front of do_add() in opt_trace.cc:
>>> /**
>>> @note add() has an up-front if(), hopefully inlined, so that in the
>>> common
>>> case - tracing run-time disabled - we have no function call. If
>>> tracing is
>>> enabled, we
>>> call do_add().
>>> */
>>>
>>>> @@ -1029,6 +1085,7 @@ private:
>>>> When adding a value (or array or object) to an array, or a
>>>> key/value pair
>>>> to an object, we need this outer array or object.
>>>>
>>>> +Sentence below is very hard to read, too many ()()()
>>>
>>> Changed to
>>> It would be possible, when trying to add a key to an array, which
>>> is wrong
>>> in JSON, or similarly when trying to add a value without any key
>>> to an
>>> object, to catch it at compilation time, if the adder received, as
>>> function
>>> parameter, the type of the structure (like @c Opt_trace_array*).
>>> Then
>>> the @c add(key,val) call would not compile as Opt_trace_array
>>> wouldn't
>>> feature it.
>>>
>>>
>>>> It would be possible, when trying to add a key to an array
>>>> (which is wrong
>>>> in JSON) (or similarly when trying to add a value without any
>>>> key to an
>>>> object), to catch it at compilation time, if the outer object
>>>> was passed an
>>>
>>>> @@ -1063,22 +1120,34 @@ private:
>>>> Opt_trace_struct& operator=(const Opt_trace_struct&);
>>>>
>>>> bool started; ///< Whether the structure does tracing
>>>> +
>>>> /**
>>>> Whether the structure requires/forbids keys for values inside it.
>>>> 1: this is an object. 0: this is an array. Other: incorrect.
>>>>
>>>> + it is unclear at this point why it is int8 rather than bool
>>>
>>> It used to be bool. It was and is still used in code like
>>> 'some_array[the_variable]'.
>>> And Jorgen didn't like booleans as array subscripts, so I changed to
>>> int8 to make him happy :-)
>>> What should I do? bool or int8? I prefer bool.
>>> See also a reply to the your other commit, about arrays vs inline
>>> functions.
>>
>> It is a boolean property.
>> int8 looks like we are going to serialize it, and send it over the
>> wire in a package or something.
>> Anyways, we agreed on inline functions like 'const char
>> *open_brace(bool)' didn't we.
>
> We did. But we still need a "requires_key" member in the class, and I
> still use it as array subscript inside open_bracket(). Changed it back
> to bool.
>
>>>> @note The canonical way would be to not have such int8 per
>>>> instance, but
>>>> rather have a pure virtual method
>>>> Opt_trace_struct::requires_key(),
>>>> +C++ has member functions, not methods.
>>>
>>> Ok, ok... in my French OOP courses they were called "méthodes"...
> Thanks
>>> for fixing.
>>>
>>>> overloaded by Opt_trace_object (returning 1) and by
>>>> Opt_trace_array
>>>> (returning 0). But this would have drawbacks:
>>>> @li it would add a vtbl pointer to each instance which takes
>>>> even more
>>>> space than int8
>>>> +-- this depends on alignment (you happen to have a bool as a
>>>> neighbour..)
>>>> @li it would add requires_key() function calls which cost more
>>>> than
>>>> reading one int8
>>>> +-- you don't know that, until measured on different platforms.
>>>> +-- reading an int8 may require masking operations, unless
>>>> +-- your platform is byte addressable
>>>> @li Opt_trace_object::requires_key() would not be accessible from
>>>> Opt_trace_struct::construct() (which would complicate coding),
>>>> whereas the
>>>> int8 is.
>>>> +-- virtual function not accessible to CTOR/DTOR: true
>>>> +-- OK, so I buy this argument, but not the others :-)
>>>
>>> removed the others.
>>>
>>>> */
>>>> + // Hand-coded typechecking is generally a bad idea,
>>>> + // but since you need the information in the base CTOR/DTOR: OK.
>>>
>>> Thanks, so I leave it.
>>>
>>>> int8 requires_key;
>>>
>>>> @@ -1236,13 +1310,18 @@ private:
>>>> };
>>>>
>>>>
>>>> +// I believe the macros below are gone in newer versions of the code.
>>>> +// Anyways: the names aren't very good, and the formatting was bad ...
>>>
>>> Yes, they are gone.
>>>
>>>> @@ -1278,6 +1358,7 @@ struct TABLE_LIST;
>>>> sub-statement in the call chain), and this function decides to trace
>>>> (respectively not trace) the sub-statement, it returns "true"
>>>> (resp. false). Each sub-statement is responsible for ending the
>>>> trace which it
>>>> +()()()()()
>>>
>>> Changed to
>>> @note if tracing was already started by a top statement above the
>>> present
>>> sub-statement in the call chain, and this function decides to trace
>>> (respectively not trace) the sub-statement, it returns "true"
>>> (resp. false). Each sub-statement is responsible for ending the trace
>>> which it
>>> has started.
>>>
>>>> @@ -1452,12 +1536,14 @@ public:
>>>> /**
>>>> Helper to put the database/table name in the trace
>>>> @param t TABLE* pointer
>>>> +why macro, why not an ordinary function?
>>>
>>> If I made it an ordinary function, it would need to know the layout of
>>> TABLE_LIST. But I would like it be inline (don't want to have function
>>> calls when tracing is not enabled; see Opt_trace_struct::do_add/add()
>>> tricks).
>>> If I want it inline, I need to put it in a header file which includes
>>> table.h. I'm not sure I want to include table.h in opt_trace.h, this
>>> would bring in a lot of unwanted structures...
>>> Ideas?
>>
>> OK, that's a good enough reason.
>> Please add comment about this fact.
>
> as discussed elsewhere it changed to an inline member function.
>
>> And since it is a macro, it should be IN_UGLY_ALL_CAPS.
>>> Does it really hurt to have a macro?
>>
>> Macros are generally bad bad bad bad,
>> but can sometimes be justified.
>>
>>> Talking about macros:
>>> - OPT_TRACE and OPT_TRACE2 are gone
>>> - I could delete macros used when OPTIMIZER_TRACE is not defined
>>> (opt_trace_start, opt_trace_end, opt_trace_print_expanded_query(),
>>> opt_trace_add_select_number(), opt_trace_set_query()) and just write
>>> functions like this
>>> opt_trace_start(...)
>>> {
>>> #ifdef OPTIMIZER_TRACE
>>> lots of code
>>> #endif
>>> }
>>
>> as you may have noticed, I am not a big fan of ifdefs....
>> I actually think your current scheme is better than what you suggest
>> above.
>
> ok I'll keep it.
>
>>> so that when OPTIMIZER_TRACE is not defined they are empty functions.
>>> This would limit the number of empty macros which I need to update each
>>> time I add a parameter of opt_trace_start().
>>>
>>> This would leave only two macros in opt_trace.h: add_utf8_table and
>>> OPT_TRACE_TRANSFORM. That's few... maybe they can be let to live...??
>>>
>>>> */
>>>> #define
>>>> add_utf8_table(t) \
>>>> add_utf8("database", (t)->s->db.str,
>>>> (t)->s->db.length). \
>>>> add_utf8("table", (t)->alias)
>>>>
>>>> -#if !defined(DBUG_OFF) && !defined(OPTIMIZER_TRACE)
>>>> +#if !defined(DBUG_OFF) && !defined(OPTIMIZER_TRACE) \
>>>> + && !defined(OPTIMIZER_TRACE_UNITTEST)
>>>>
>>>> /*
>>>> A debug binary without optimizer trace compiled in, will miss some
>>>>
>>>> === modified file 'sql/opt_trace2server.cc'
>>>> @@ -113,16 +111,19 @@ bool opt_trace_start(THD *thd, const TAB
>>>> /*
>>>> We don't allocate it in THD's MEM_ROOT as it must survive until
>>>> a next
>>>> statement (SELECT) reads the trace.
>>>> */
>>>> if (thd->opt_trace == NULL)
>>>> {
>>>> - if ((thd->opt_trace= new Opt_trace_context()) == NULL)
>>>> + // (this version of) operator new can never return NULL.
>>>> + if ((thd->opt_trace= new Opt_trace_context) == NULL)
>>>
>>> Depends on the other "new-related" discussion earlier.
>
> changed to new(nothrow)
>
>>>> @@ -156,7 +158,7 @@ disable:
>>>> if (need_it_for_I_S && thd->opt_trace != NULL &&
>>>> thd->opt_trace->is_started())
>>>> {
>>>> need_it_for_I_S= false;
>>>> - goto start_trace;
>>>> + goto start_trace; // Ouch, criss-cross goto!
>>>
>>> Yes, though there's no bug (no risk of infinite loop).
>>> I changed like this:
>>> replaced "goto disable" with DBUG_RETURN(false) here:
>>>
>>> start_trace:
>>> if (thd->opt_trace->start(need_it_for_I_S,
>>> (var & Opt_trace_context::FLAG_END_MARKER),
>>> (var & Opt_trace_context::FLAG_ONE_LINE),
>>> thd->variables.optimizer_trace_offset,
>>> thd->variables.optimizer_trace_limit,
>>> thd->variables.optimizer_trace_max_mem_size,
>>> thd->variables.optimizer_trace_features))
>>> {
>>> if (allocated_here)
>>> {
>>> delete thd->opt_trace;
>>> thd->opt_trace= NULL;
>>> }
>>> DBUG_RETURN(false);
>>> }
>>> DBUG_RETURN(true); // started all ok
>>>
>>> So we can go from "disable:" to "start_trace:" but once in
>>> "start_trace:" we will leave the function.
>>> If we got an error in "start_trace:", there is no chance that starting
>>> it again in "disable:" would have better success, so DBUG_RETURNing
>>> immediately, as in the new code above, is good enough.
>>
>> I need to have a look at the new version of this code.
>
> I just pushed all changes to mysql-next-mr-guilhem.
> I could not push to mysql-next-mr-opt-backporting-wl4800 as I'm applying
> review comments to an old tree.
>
>>>> @@ -188,15 +191,16 @@ void opt_trace_print_expanded_query(THD
>>>> thd->lex->sql_command is SQLCOM_END in this case. So we do a check
>
>>>> of the
>>>> command:
>>>> */
>>>> +// Too many parens (()) in comment above, hard to read.
>>>
>>> In the latest tree this comment is gone, because the query is not
>>> printed from open_tables() anymore, which eliminates the problem.
>>>
>>>> @@ -265,6 +269,7 @@ ST_FIELD_INFO optimizer_trace_info[]=
>>>> /* name, length, type, value, maybe_null, old_name, open_method */
>>>> {"QUERY", 65535, MYSQL_TYPE_STRING, 0, false, "", SKIP_OPEN_TABLE},
>>>> {"TRACE", 65535, MYSQL_TYPE_STRING, 0, false, 0, SKIP_OPEN_TABLE},
>>>> + // why is this zero, others are "" ^
>>>
>>> This is member ST_FIELD_INFO::old_name. Some declarations in other files
>>> have 0 here, some have "". Changed all in the array above, to NULL.
>>>
>>>> {"MISSING_BYTES_BEYOND_MAX_MEM_SIZE", 20, MYSQL_TYPE_LONG,
>>>> 0, false, "", SKIP_OPEN_TABLE},
>>>> {"OS_MALLOC_ERROR", 1, MYSQL_TYPE_TINY, 0, false, "",
>>>> SKIP_OPEN_TABLE},
>>>
>>>> === added file 'unittest/gunit/opt_notrace-t.cc'
>>>> --- a/unittest/gunit/opt_notrace-t.cc 1970-01-01 00:00:00 +0000
>>>> +++ b/unittest/gunit/opt_notrace-t.cc 2011-01-11 08:15:46 +0000
>>>> @@ -0,0 +1,21 @@
>>>> +#include "my_config.h"
>>>> +#include <gtest/gtest.h>
>>>> +
>>>> +#if defined(OPTIMIZER_TRACE)
>>>> +#undef OPTIMIZER_TRACE
>>>> +#endif
>>>> +#define OPTIMIZER_TRACE_UNITTEST
>>>> +
>>>> +#include <opt_trace.h>
>>>> +
>>>> +namespace {
>>>> +
>>>> +TEST(Foo, Bar)
>>>> +{
>>>> + // Fill in more here, to verify that implementations are in sync!!!
>>>> + Opt_trace_context trace;
>>>> + Opt_trace_object oto(&trace);
>>>> + Opt_trace_array ota(&trace);
>>>> +}
>>>
>>> Ok, what would you want this to test?
>>> Various add() calls (with their different signatures), and in the end we
>>> verify that there is no trace in the iterator?
>>> Note that, if I understood correctly, in our release builds, optimizer
>>> trace will be compiled in (but run-time disabled, until the user changes
>>> the value of @@session.optimizer_trace).
>>> So opt_notrace-t will test a rare case (opt trace not compiled in).
>>
>> What is the point of being able to ifdef it out completely?
>
> - to benchmark the slowdown due to having the optimizer trace code
> compiled in but runtime-disabled
> - to allow easy turning off of the feature, should a problem be
> discovered after the push to trunk and before the public release
> - it's common to do this for brand new features in MySQL
> - it was like this in the prototype which I inherited.
>
> Later, once this feature has been in trunk for a long time, we could
> suppress ifdef. That's what happened with row-based replication, which
> was initially surrounded with many #ifdef HAVE_ROW_BASED in code.
>
> But some features are still compiled-out in libmysqld, even years after
> their implementation, because they are not useful for the embedded
> server (see HAVE_REPLICATION). So their ifdef is still there.
>
> To me the ifdef is useful in this period of development of the feature.
> Later, maybe not.
>
>> You actually implemented this feature (which I think is a bad idea, we
>> have more than enough ifdefs in the code).
>> If you want to keep the ability to ifdef opt-trace away completely,
>> then this is a way to ensure that things do not break if someone does
>> cmake -DOPTIMIZER_TRACE=0
>>
>> Put in enough test code, so that this unit test is likely to break, if
>> the tracing interface changes.
>
> Wait, it's not clear to me.
> I think you want to protect against someone changing the interface but
> forgetting to update the "dummy clone of it", i.e. the part of
> opt_trace.h enabled when OPTIMIZER_TRACE is not defined, where classes
> are mostly empty.
> In that case, a not-updated opt_notrace-t.cc will still compile (it uses
> the old not-updated dummy interface) and pass. So the dev will not even
> notice that he needs to update opt_notrace-t.cc (which in turn would
> make the test fail and tell him he needs to update the dummy interface).
> So I don't understand.
>
> Also, this method does not protect against someone adding a new add()
> call and forgetting to add it to the dummy interface. And it does not
> check whether opt_range.cc, which has #ifdef too, compiles without
> OPTIMIZER_TRACE defined.
>
> There is another approach: let me fix breakages now and once in a while
> in this development phase; later, let future breakages happen (and
> sometimes catch them during review). Should someone build without
> OPTIMIZER_TRACE and it breaks, we'll fix it promptly...
>
> Yet another approach. I would create a target opt_notrace-t which is
> built by compiling the existing *opt_trace-t.cc* but with "#undef
> OPTIMIZER_TRACE". That should still compile (after I tweak
> opt_trace-t.cc with a few #ifdef).
> When a dev updates the non-dummy interface we can hope that he will
> update opt_trace-t.cc (to test his new stuff), which in turn will break
> the compilation of opt_notrace-t if he forgot to update the dummy
> interface.
> opt_trace-t.cc would look like this:
>
> TEST_F(blah)
> {
> #ifndef DUMMY_TEST
> trace.start(...);
> #endif
> Opt_trace_object oto(...);
> oto.add(...);
> #ifndef DUMMY_TEST
> trace.end();
> Opt_trace_iterator it(...);
> EXPECT_EQ(...);
> EXPECT_FALSE(...);
> #endif
> }
> opt_notrace-t would have DUMMY_TEST defined, so only those:
>
> Opt_trace_object oto(...);
> oto.add(...);
>
> would be compiled.
>
>>>> === modified file 'unittest/gunit/opt_trace-t.cc'
>>>> --- a/unittest/gunit/opt_trace-t.cc 2010-11-24 18:54:26 +0000
>>>> +++ b/unittest/gunit/opt_trace-t.cc 2011-01-11 08:15:46 +0000
>>>> @@ -28,8 +28,11 @@
>>>> /* Some minimal working implementations to have a working trace... */
>>>> CHARSET_INFO *system_charset_info= &my_charset_utf8_general_ci;
>>>> /* ... end here. */
>>>> +// I didn't quite understand the comments above ...
>>>
>>> I removed the comment.
>>>
>>>> -const ulonglong all_features= Opt_trace_context::FEATURES_DEFAULT;
>>>> +namespace {
>>>> +
>>>> +const ulonglong all_features= Opt_trace_context::default_features;
>>>
>>> Why not "static"?
>>
>> The un-named namespace here covers the entire rest of the file.
>> This will ensure that no name-clashes can occur, now or in the future,
>> if someone accidentally uses a name in other parts of the code,
>> which you are using in your unit test code.
>
> ok
>
>>>
>>>> @@ -43,11 +46,12 @@ const ulonglong all_features= Opt_trace_
>>>> */
>>>> void check_json_compliance(const char *str, size_t length)
>>>> {
>>>> -#if 0
>>>> + return;
>>>> /*
>>>> Read from stdin, eliminate comments, parse as JSON. If invalid,
>>>> an exception
>>>> is thrown by Python, uncaught, which produces a non-zero error
>>>> code.
>>>> */
>>>> +#ifndef __WIN__
>>>> const char python_cmd[]=
>>>> "python -c \""
>>>> "import json, re, sys;"
>>>> @@ -56,22 +60,34 @@ void check_json_compliance(const char *s
>>>> "json.loads(s, 'utf-8')\"";
>>>> // Send the trace to this new process' stdin:
>>>> FILE *fd= popen(python_cmd, "w");
>>>> - ASSERT_NE((void*)NULL, fd);
>>>> - ASSERT_EQ((size_t)1, fwrite(str, length, 1, fd));
>>>> +// ASSERT makes sense here: if we fail to run python we must abort.
>>>> +// Most of the other uses of ASSERT should be changed to EXPECT,
>>>> +// so that we continue to run the rest of the test even if there
>>>> +// is a failed assumption.
>>>
>>> Ok here; more on this below.
>>>
>>>> + ASSERT_TRUE(NULL != fd);
>>>> + ASSERT_EQ(1U, fwrite(str, length, 1, fd));
>>>
>>> why 1U (ulong) instead of (size_t)1? On my Linux and Windows, fwrite()
>>> returns size_t...
>>> ?
>>> Isn't a comparison between ulong and size_t a promise for some
>>> "possible loss of precision" warning one day with some compiler?
>>
>> A C-style cast is generally very bad.
>
> I agree in principle (and I would like us to use -Wold-style-cast), but
> (size_t)1 is not that deadly. Not much harm can happen with that.
>
>> A C++ style cast is better.
>>
>> An unsiged int will be safely promoted to size_t in this case
>> (we are comparing, rather than assigning, two entities with possibly
>> different size)
>
> Ok. So I changed my (size_t) casts to using 'U', in the test.
>
> An observation: gtest macros "are strict" for some comparisons. I had
> compilation problems with
> ASSERT_EQ(NULL, pointer variable)
> on Solaris, it didn't compile (maybe NULL is a plain 0 there). Replacing
> NULL with (void*)0 worked.
>
>>>> int rc= pclose(fd);
>>>> rc= WEXITSTATUS(rc);
>>>> ASSERT_EQ(0, rc);
>>>> #endif
>>>> }
>>>>
>>>> +class TraceContentTest : public ::testing::Test
>>>> +{
>>>> +public:
>>>> + Opt_trace_context trace;
>>>> +};
>>>> +
>>>> +
>>>> +TEST_F(TraceContentTest, ConstructAndDestruct)
>>>> +{
>>>> +}
>>>> +
>>>>
>>>> /** Test empty trace */
>>>> -TEST(Trace_content_test, empty)
>>>> +TEST_F(TraceContentTest, empty)
>>>> {
>>>> - /* Create a trace */
>>>> - Opt_trace_context trace;
>>>> - ASSERT_EQ(false, trace.start(true, false, false, -1, 1, ULONG_MAX,
>>>> - all_features));
>>>> + ASSERT_FALSE(trace.start(true, false, false, -1, 1, ULONG_MAX,
>>>> all_features));
>>>> /*
>>>> Add at least an object to it. A really empty trace ("") is not
>>>> JSON-compliant, at least Python's JSON module raises an exception.
>>>> @@ -83,26 +99,24 @@ TEST(Trace_content_test, empty)
>>>> trace.end();
>>>> /* And verify trace's content */
>>>> Opt_trace_iterator it(&trace);
>>>> - ASSERT_EQ(true, it != Opt_trace_iterator::end);
>>>> + EXPECT_FALSE(it.at_end());
>>>
>>> This checks that the iterator is not at end. If it is at end, ASSERT
>>> will stop this test portion. EXPECT will rather print a message and
>>> continue execution , so it will execute "*it" at the next line. 'cursor'
>>> being NULL, this will segfault.
>>> I think it's normal that when you reached the end (one past the last),
>>> you don't try to read the iterator's value.
>>
>> OK.
>> I saw only ASSERT, and no EXPECT, in your code, so I changed a few of
>> them.
>> I just wanted you to make a concious decision for each case.
>>>
>>>> const Opt_trace_info info= *it;
>>>> const char expected[]= "{\n}";
>>>> - ASSERT_STREQ(expected, info.trace_ptr);
>>>> - ASSERT_EQ(sizeof(expected) - 1, info.trace_length);
>>>> + EXPECT_STREQ(expected, info.trace_ptr);
>>>> + EXPECT_EQ(sizeof(expected) - 1, info.trace_length);
>>>
>>> Even here, if the content is not as expected, I don't care that the
>>> length is tested. Or the JSON compliance below, or the value of
>>> missing_bytes.
>>> But on the other hand, it does not hurt to test them.
>>> So I changed all ASSERT to EXPECT, except when continuing would cause a
>>> crash (i.e. at_end() tests).
>>>
>>>> /** Test escaping of characters */
>>>> -TEST(Trace_content_test, escaping)
>>>> +TEST_F(TraceContentTest, escaping)
>>>
>>
>
>
--
Mr. Guilhem Bichot <guilhem.bichot@stripped>
Oracle / MySQL / Optimizer team, Lead Software Engineer
Bordeaux, France
www.oracle.com / www.mysql.com