From: Guilhem Bichot
Date: February 3 2011 4:36pm
Subject: Re: bzr commit into mysql-next-mr-bugfixing branch (tor.didriksen:3243) WL#5257
Hello Tor,

Apart from:
- friendship (ongoing work, see below)
- not initializing all of Opt_trace_disable_I_S (here I'm adopting a
stubborn attitude, but with arguments :-), see below),
I think I have now acted on everything; please find my comments below.

Tor Didriksen a écrit, Le 31.01.2011 14:58:
> On 2011-01-26 11:57, Guilhem Bichot wrote:
>> Hello Tor,
>>
>> Ok with incorporating almost everything in this commit. I have pulled
>> your revision verbatim, and done additional edits/reverts (commit email
>> is on its way). I am doing this in a frozen tree, based on the same
>> revision as you used; not on the latest wl4800 tree because the latest
>> tree is slightly ahead of what you used.
>> When we are more or less in agreement about my handling of review
>> comments, I'll merge the frozen tree in the latest wl4800 tree. And then
>> the latest trunk into the latest wl4800 tree.
>>
>> I didn't incorporate a few points, mentioned below. Also commenting on
>> some incorporated points which deserved it.
>> Thanks for all the reformatting which you did. And the SCOPED_TRACE()
>> trick. And the composition/inheritance suggestion.
> 
> Hi Guilhem.
> Here are comments on the comments.
> 
> -- didrik
> 
>>
>> Tor Didriksen a écrit, Le 11.01.2011 09:15:
>>> #At file:///export/home/didrik/repo/next-mr-opt-backporting-wl4800/ 
>>> based on revid:jorgen.loland@stripped
>>>
>>>  3243 Tor Didriksen    2011-01-11
>>>       WL#5257 Review comments.
>>>             More comments in followup email ...
>>>      @ sql/opt_range.cc
>>>              @ sql/opt_trace.cc
>>>
>>>     added:
>>>       unittest/gunit/opt_notrace-t.cc
>>>     modified:
>>>       sql/opt_range.cc
>>>       sql/opt_trace.cc
>>>       sql/opt_trace.h
>>>       sql/opt_trace2server.cc
>>>       sql/sys_vars.cc
>>>       unittest/gunit/CMakeLists.txt
>>>       unittest/gunit/opt_trace-t.cc
>>> === modified file 'sql/opt_range.cc'
>>> --- a/sql/opt_range.cc    2010-12-21 07:34:33 +0000
>>> +++ b/sql/opt_range.cc    2011-01-11 08:15:46 +0000
>>> @@ -793,8 +793,10 @@ static
>>> +// print_sel_tree declared 'static' but never defined
>>>  static void print_sel_tree(PARAM *param, SEL_TREE *tree, key_map 
>>> *tree_map,
>>>                             const char *msg);
>>> +// print_ror_scans_arr declared 'static' but never defined
>>>  static void print_ror_scans_arr(TABLE *table, const char *msg,
>>>                                  struct st_ror_scan_info **start,
>>>                                  struct st_ror_scan_info **end);
>>> @@ -12393,6 +12395,7 @@ print_multiple_key_values(KEY_PART *key_
>>>    dbug_tmp_restore_column_maps(table->read_set, table->write_set, 
>>> old_sets);
>>>  }
>>>
>>> +// print_quick defined but not used.
>>>  static void print_quick(QUICK_SELECT_I *quick, const key_map 
>>> *needed_reg)
>>
>> Jorgen fixed those three since then.
>>
>>> === modified file 'sql/opt_trace.cc'
>>> @@ -178,18 +177,18 @@ void Opt_trace_struct::add_struct(const  }
>>>
>>>
>>> -void Opt_trace_struct::do_destruct(void)
>>> +void Opt_trace_struct::do_destruct()
>>
>> Yes, C++ does not need (void).
>> I once got caught in C by an accidental:
>>
>> file.h:
>> func(); /* when one meant (void) but didn't write it */
>>
>> file.c:
>> func(int x);
>>
>> and a run-time surprise in the end.
>> So I took the habit of always writing (void). Even in C++ because it
>> makes the rule simpler (one rule for both languages).
>> However, this is not a big deal, I incorporated those (void) removals.
>>
>>> @@ -236,7 +235,7 @@ Opt_trace_struct& Opt_trace_struct::do_a
>>>        ("select" and other language keywords). If we directly store 
>>> that mix in a
>>>        UTF8 column, the query_charset piece causes an issue:
>>>        Field_blob::store() will truncate the trace at first unrecognized
>>> -      character. At worse, a single bad character in the expanded 
>>> query makes
>>> +      character. Even worse, a single bad character in the expanded 
>>> query makes
>>
>> Changed to "So just a single bad character".
>>
>>> @@ -281,6 +279,10 @@ Opt_trace_struct& Opt_trace_struct::do_a
>>> +namespace {
>>> +LEX_CSTRING readables[]= { { STRING_WITH_LEN("false") },
>>> +                           { STRING_WITH_LEN("true") } };
>>> +}
>>
>> why not use "static" instead of a namespace?
> 
> a matter of taste perhaps.
> An un-named namespace is the modern "C++" way of doing this,
> since the word 'static' is overloaded (has too many meanings)

I checked Meyers' books and he thinks the same. Eliminated all statics
that were outside of classes.
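
For the record, the pattern I'm converging on is roughly this (illustrative
names only, not lines from the patch):

namespace {                         // unnamed namespace: internal linkage
const int some_constant= 42;        // was: static const int some_constant= 42;
int some_helper() { return some_constant; }  // was: static int some_helper()
}  // namespace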

>>> @@ -606,6 +607,7 @@ bool Opt_trace_context::start(bool suppo
>>>      {
>>>        DBUG_PRINT("opt_trace", ("disabled: since_offset_0(%llu) >="
>>>                                 " offset(%lu) + limit(%lu)",
>>> +                               // Both offset and limit are signed.
>>
>> changed %lu to %ld
>>
>>>                                 (ulonglong)since_offset_0, offset, 
>>> limit));
>>>        new_stmt_support_I_S= false;
>>>      }
>>> @@ -616,6 +618,7 @@ bool Opt_trace_context::start(bool suppo
>>>                                             new_stmt_support_I_S);
>>>    DBUG_PRINT("opt_trace",("new stmt %p support_I_S %d", stmt,
>>>                            (int)new_stmt_support_I_S));
>>> +  // new cannot return NULL.
>>
>> Wow.
>> Our code has such
>> if (!(ptr= new someclass()))
>>   return error;
>> where new is the normal C++ "new" (not a custom "new" declared in the 
>> class). For example with Event_queue_element (events.cc:351). Or 
>> hostname.h with class hash_filo.
>>
>> So those are all wrong, thanks for enlightening me.
>> Looks like we are in the situation described at
>> http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=170
>>
>> And also in the situation of:
>> http://blog.mozilla.com/nnethercote/2011/01/18/the-dangers-of-fno-exceptions/ 
>>
>> which I just tested... Added a class containing an array of 100GB; did 
>> "new" for this class in mysqld.cc:main(), and it did crash, both debug 
>> and non-debug binary. new() calling __cxa_throw() calling 
>> std::terminate(), ending up in abort().
>>
>> I guess I should file a bug report about all those "graceful handling 
>> of OOM" which are not.
>> I also found code which creates a THD (events.cc, event_scheduler.cc) 
>> like this
>> if (!(thd= new THD())) do something.
>> and THD's "new" is actually a custom one (from parent class "ilink") 
>> which itself calls
>> my_malloc(..., MYF(MY_FAE...));
>> and MY_FAE is designed to exit(1) if malloc() fails. So there is no 
>> chance of coming to "do something".
>> Same for class Log_event. I didn't look for more examples. I should 
>> mention those in the bug report too.
>>
>> For my patch, I guess I have two choices:
>> - keep my new() unchanged, delete all if(new returned NULL) and accept 
>> that if OOM it will just crash.
>> - change my new() calls to new(std::nothrow) (I prefer that second 
>> solution)
> 
> I would prefer new(nothrow) as well.

Changed all my new calls to new(nothrow).
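
Concretely, the pattern now looks like this (a sketch, not the exact patch
lines):

#include <new>                          // for std::nothrow

Opt_trace_context *ctx= new (std::nothrow) Opt_trace_context;
if (ctx == NULL)                        // with nothrow, OOM yields NULL here
  return true;                          // report the error instead of letting
                                        // the runtime call std::terminate()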

> Note that some of our classes have their own 'operator new' which 
> actually *can* return NULL.
> If they do so, I guess the server will crash, since we are not using the 
> new(nothrow) syntax,
> and the C++ runtime will run constructors on un-allocated memory.
> 
>>
>>>    if (stmt == NULL)
>>>      DBUG_RETURN(true);
>>>
>>
>>> @@ -683,26 +686,30 @@ void Opt_trace_context::end(void)
>>>  }
>>>
>>>
>>> -size_t Opt_trace_context::allowed_mem_size(const Opt_trace_stmt 
>>> *for_stmt)
>>> -  const
>>> +namespace {
>>> +size_t allowed_mem_size(const Opt_trace_stmt *stmt_start,
>>> +                        const Opt_trace_stmt *for_stmt)
>>>  {
>>
>> why not use "static" instead of a namespace?
> 
> Quoting Bjarne Stroustrup, 1997:
> In C and older C++ programs, the keyword 'static' is (confusingly) used 
> to mean
> "use internal linkage". Don't use static except inside functions and 
> classes.
> 
> (note that he said older C++ programs, in 1997 ....)

done

>>> -  DBUG_ENTER("Opt_trace_context::allowed_mem_size");
>>
>>> @@ -750,20 +757,20 @@ void Opt_trace_iterator::operator++(int)
>>>  /**
>>> -   @note because allocation is done in big chunks, 
>>> buffer->Ptr[str_length]
>>> +   @note Because allocation is done in big chunks, 
>>> buffer->Ptr[str_length]
>>>     may be uninitialized while buffer->Ptr[allocated length] is 0, so we
>>>     must use c_ptr_safe() as we want a NUL-terminated string (which 
>>> is easier
>>>     to manipulate in a debugger, or to compare in unit tests with
>>> -   ASSERT_STREQ).
>>> +   EXPECT_STREQ).
>>>     c_ptr_safe() may realloc an empty String from 0 bytes to 8 bytes,
>>> -   when it adds the closing NUL.
>>> +   when it adds the closing NULL.
>>
>> I meant NUL (the name of ASCII character '\0'). Should I write '\0' 
>> instead?
> 
> I read that as a typo for NULL, so maybe '\0' is better, yes.

I changed:
"NUL-terminated" to "0-terminated"
and "NUL" to \0, except in one place about charset subtleties where I
had to distinguish: I wrote "if we see a byte equal to 0 it is really
the UTF8 u0000 character (a.k.a. ASCII NUL)".

>>> @@ -780,12 +787,9 @@ Opt_trace_info Opt_trace_iterator::opera
>>>  }
>>>
>>>
>>> -Opt_trace_iterator Opt_trace_iterator::end;
>>> -
>>> -
>>>  /* Opt_trace_stmt class */
>>>
>>> -const size_t Opt_trace_stmt::empty_trace_len= 4;
>>> +const size_t Opt_trace_stmt::empty_trace_len= 4; // magic number
>>
>> Incorporated all the "magic number" comments, but what is their 
>> intention?
> 
> Ahh, my intention when commenting like this is usually:
> "please give this magic number a name".
> 
> This one actually *has* a name (empty_trace_len)
> but there is no comment about why the value is four.

I deleted empty_trace_len; a trace cannot be empty nowadays: either the
statement is not traced, in which case we don't even create an
Opt_trace_stmt, or it is traced, in which case we always add "steps" to
the trace. For the other "magic numbers", I added comments.

>>> @@ -825,22 +829,30 @@ void Opt_trace_stmt::end(void)
>>> -void Opt_trace_stmt::next_line(void)
>>> +static const char my_spaces[] =
>>> +  "                                                                "
>>> +  "                                                                "
>>> +  "                                                                "
>>> +  "                                                                "
>>> +  "                                                                "
>>> +  "                                                                "
>>> +  "                                                                "
>>> +  ; // Extend as desired, to max supported indentation level.
>>
>> This is an elegant solution. However there is no max supported
>> indentation level. It can get quite deep, when tracing greedy search (if
>> there are many tables). So I propose
>>
>> int remnent= 2 * depth;
>> const int spaces_len= sizeof(my_spaces) - 1;
>> while (remnent > spaces_len)
>> {
>>   buffer.append(&my_spaces[0]);
>>   remnent-= spaces_len;
>> }
>> if (remnent > 0)
>>   buffer.append(&my_spaces[spaces_len - remnent]);
> 
> "remnent" is a strange name.

Without spelling errors, it should have been "remnant".
I changed to "to_be_printed".

> If you want to combine my/your approach, I guess that is OK.
> Please add a unit test then, since it is very easy to get
> off-by-one errors in code like this.

done.
I also simplified the code: no 'index from the end' trick is needed
because String::append() accepts a length argument (so it does not need a
0-terminated char*, and we don't need to end "at the end of my_spaces").
This simplification makes a bug in this code unlikely, but I still test it
in the unit test (new test TEST_F(TraceContentTest, Indent)).
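
For reference, the simplified code looks roughly like this (a sketch;
print_indentation is an illustrative name, the real code sits in
Opt_trace_stmt::next_line() and uses current_depth):

namespace {
const char my_spaces[]=
  "                                                                ";
}

void print_indentation(String *buffer, int current_depth)
{
  int to_be_printed= 2 * current_depth;            // 2 spaces per nesting level
  const int chunk_len= sizeof(my_spaces) - 1;
  while (to_be_printed > 0)
  {
    const int len= to_be_printed < chunk_len ? to_be_printed : chunk_len;
    buffer->append(my_spaces, len);                // explicit length: no need
    to_be_printed-= len;                           // for a 0-terminated chunk
  }
}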

>>> +void Opt_trace_stmt::next_line()
>>>  {
>>>    if (ctx->one_line)
>>>      return;
>>>    buffer.append('\n');
>>> -  char spaces[]= "                               ";
>>> -  // 2 spaces per nesting level
>>> -  const int nb= depth * 2, spaces_len= sizeof(spaces) - 1;
>>> -  // add spaces in chunks of spaces_len, better than many append(' ')
>>> -  for (int i= 0; i < nb/spaces_len; i++)
>>> -    buffer.append(spaces, spaces_len);
>>> -  buffer.append(spaces, (nb % spaces_len));
>>> +
>>> +  // You can get away with a single append here:
>>> +  int ix= sizeof(my_spaces) - 2*depth - 1;
>>> +  if (ix < 0)
>>> +    ix= 0;
>>> +  buffer.append(&my_spaces[ix]);
>>>  }
>>
>>> @@ -1056,11 +1066,13 @@ bool Opt_trace_stmt::Buffer::append(char
>>>
>>>  Opt_trace_disable_I_S::Opt_trace_disable_I_S(Opt_trace_context 
>>> *ctx_arg,
>>>                                               bool disable_arg) :
>>> +  ctx(ctx_arg),
>>> +  stmt(ctx ? ctx->current_stmt_in_gen : NULL),
>>
>> If 'disable_arg' is false, no other member variable than 'disable' will
>> be used. So I would prefer to not initialize other members in that case.
>> The idea all around the file is to initialize as least as possible.
> 
> Which is a *very* bad practice.
> A constructor is a contract: it turns raw bytes into an object.
> Your constructor fails to do so.

The constructor of Opt_trace_struct works the same way: it initializes the
variable 'started' to 'false', and apart from that, the rest of the
initialization is conditional on
if (ctx_arg != NULL && ctx_arg->is_started()).
So if that condition is false, only 'started' is initialized (to 'false'),
and all functions of this class (like add()) make sure to do nothing if
'started' is false.

Back to Opt_trace_disable_I_S: if disable_arg is false, the caller is
asking the object not to disable anything: the object is asked to have no
effect. Then it does not need to be fully initialized: it only needs to
remember (by initializing 'disable' to false) that it has nothing to do.
The object is functional from the point of view of its user: it does not
disable anything; it is working and it is not raw bytes. The unset members
will not be used under any circumstances. Contracts are fine: the class
exposes only a constructor and a destructor; the destructor works well on
an instance where 'disable' is false and the rest is unset. The user runs
no risk of hitting a problem.

The fact that we can have a non-acting object like this is a consequence
of my following RAII. I cannot write

if (some cond)
   disable the trace;

because, with RAII, that would mean
if (some cond)
   Opt_trace_disable_I_S();
which would be destroyed on the same code line where it is instantiated.
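
To make that concrete, the shape of the class is roughly (a simplified
sketch, not the exact member list):

class Opt_trace_disable_I_S
{
public:
  Opt_trace_disable_I_S(Opt_trace_context *ctx_arg, bool disable_arg)
    : disable(disable_arg)
  {
    if (!disable)
      return;                  // asked to do nothing: stay a no-op guard
    ctx= ctx_arg;
    // ... save state in ctx and disable I_S output for future traces ...
  }
  ~Opt_trace_disable_I_S()
  {
    if (!disable)
      return;                  // the destructor only ever reads 'disable'
    // ... restore the saved state ...
  }
private:
  Opt_trace_context *ctx;      // deliberately left unset when disable == false
  bool disable;
};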

>>>    disable(disable_arg)
>>>  {
>>>    if (likely(!disable))
>>
>> I removed this likely(). Due to some past code changes, there is no more
>> foreseenable likelihood of this.
>>
>>>      return;
>>> -  ctx= ctx_arg;
>>> +
>>>    if (ctx != NULL)
>>>    {
>>>      /* Disable in to-be-created future traces */
>>
>>> === modified file 'sql/opt_trace.h'
>>> --- a/sql/opt_trace.h    2010-12-19 14:24:03 +0000
>>> +++ b/sql/opt_trace.h    2011-01-11 08:15:46 +0000
>>> @@ -290,8 +292,10 @@ class THD;
>>>    @li Use only simple characters for key names: a-ZA-Z_# (to denote a
>>>    number), and no space.
>>>
>>> +??? to denote a number ???
>>
>> I meant that '#' serves to denote a number (like "select#": number of a
>> SELECT). Clarified to
>>
>>   @li Use only simple characters for key names: a-ZA-Z_#, and no space.
>> '#'
>>   serves to denote a number, like in "select#" .
>>
>>>  class Opt_trace_context
>>>  {
>>>    /** Flags' numeric values for @@@@optimizer_trace variable */
>>> +  // the style guide says enum enum_flag_value { ... }
>>
>> Removed the enum's name.
>>
>>> +  // the style guide does *not* mandate uppercase enum values .....
>>
>> Right. It does not forbid it either, and the sql/ code has many
>> uppercase enum values...
>> Would you prefer uppercase or something else?
> 
> I just think that they look ugly.
> But you are right, the convention seems to be uppercase.

I left them in uppercase.

>>
>>>    enum flag_value {
>>>      FLAG_DEFAULT= 0, FLAG_ENABLED= 1, FLAG_END_MARKER= 2, 
>>> FLAG_ONE_LINE= 4
>>>    };
>>
>>> @@ -375,13 +384,16 @@ public:
>>> +
>>>    /** Features' numeric values for @@@@optimizer_trace_features 
>>> variable */
>>> +  // These are actually decimal bit values(?)
>>> +  // so maybe write 1 << 0, 1 << 1, and so on???
>>
>> done
>>
>>>    enum feature_value {
>>>      GREEDY_SEARCH= 1,
>>>      RANGE_OPTIMIZER= 2,
>>>      /*
>>> -      if you add here, update feature_value of empty implementation
>>> -      and FEATURES_DEFAULT!
>>> +      If you add here, update feature_value of empty implementation
>>> +      and default_features!
>>
>>> @@ -473,6 +488,7 @@ private:
>>>       OFFSET/LIMIT
>>>    */
>>>    ha_rows since_offset_0;
>>> +  // ha_rows seems strange, why not int/long?
>>
>> Originally this variable is about count of rows in the
>> I_S.OPTIMIZER_TRACE table, and count of rows are supposed to be ha_rows.
>> BUT as this variable gets compared with the values of
>> @@optimizer_trace_offset/limit which are long, I can change to long.
>> Changed to long, which eliminates some casts in comparisons.
>>
>>> @@ -506,15 +522,17 @@ private:
>>> -  /** Whether the settings above may be changed for a new trace */
>>> +  /** Whether the settings above may be changed for a new trace. */
>>> +  // comment is positive, variable name is negative.
>>
>> Changed comment to "may not be changed". There are also high chances
>> that this variable goes away due to some other patch in preparation
>> (more about this when Dmitri has reviewed my other patch).
>>
>>>    bool cannot_change_settings;
>>
>>> @@ -523,22 +541,34 @@ private:
>>> +  // Having too many friends is asking for trouble ....
>>
>> The words of a misanthrope ;-)
>>
>>> +  // Opt_trace_stmt   needs only a const getter for one_line.
>>> +  // Opt_trace_struct needs a couple of const getters, plus 
>>> current_stmt_in_gen
>>> +  // Opt_trace_disable_I_S actually changes the state of the context 
>>> object!
>>> +  // Opt_trace_iterator needs only oldest_stmt_to_show and limit
>>> +  // and finally: the free function does not need friendship.
>>
>> Removed friendship from the free function.
>> Before removing the rest of friendship, let's discuss. My intention with
>> 'friend' was:
>> - it's a nice party in opt_trace.cc, everybody knows and trusts everybody
>> - but from outside of opt_trace.cc, nothing happening inside is visible
>> (black windows).
>>
>> If I add getters to Opt_trace_context, they will be public, and so the
>> outside will be able to know the value of Opt_trace_context::one_line. I
> 
> Well, a public getter can do very little harm.
> 
>> don't like it so much. Even worse, if Opt_trace_disable_I_S needs to
>> call a setter Opt_trace_context::set_support_I_S() in order to change
>> Opt_trace_context::support_I_S, this setter has to be
>> public, so everyone including opt_trace.h can call it, and I don't find
>> this isolated.
>> Do you have in your C++ bag of tricks something which solves this: put
>> order in the party in opt_trace.cc, but keep it opaque from the outside?
> 
> As I said, "most of these friends should go away".
> You can argue on a case-by-case basis, so e.g. Opt_trace_disable_I_S
> might stay as friend, but isn't it more natural that the context class
> is the one who knows what has to be saved and restored?
> 
> To me, it seems to have one (private) setter/getter function handling this.
> Input argument: a struct containing new settings, output: the old setting.

Today I have tried eliminating "friend" declarations one by one, adding 
getters or setters each time I got a compilation error. After an hour 
of this, it compiles and I have:
- getters which return int/bool or pointer-to-const (=safe getters): 9 
of them
- getters which return pointer but not pointer-to-const (=dangerous 
getters): 4
- setters: 13

Damage in Opt_trace_stmt:
- The constructor is now public (so a user can now create an 
Opt_trace_stmt; previously, only Opt_trace_context and other classes in 
opt_trace.h could).
- end() is now public.
- Class Buffer, and members query_buffer and trace_buffer are public 
(could use a getter and a setter instead, but would have same result).
- push()/pop()/next_line()/separator() are now public
- set_query() idem

The code looks horrible. Frankly, this is a regression :-)
Either it's revealing a clear, profound design problem in my code, or
the get/set approach is not well suited here.
The design with friends allows a big mess inside opt_trace.cc (though as
maintainer of opt_trace.cc I find I keep it non-messy), but prevents
anybody on the outside from doing anything wrong.
The design with getters and setters introduces dangerous getters,
setters, and public members, which exposes a lot to the outside of
opt_trace.cc, which can now easily corrupt the structures by calling
some public setter, for example. So I'm not going to push it; rather, I'm
going to reconsider all of this.
It has already happened during this development that I stumbled on a
problem and had to go through a re-think and re-design to reach a cleaner
organization in the end. Maybe this is happening again.

> Re: bag of tricks:
> Look up the 'pimpl idiom' which is perfect for hiding implementation 
> details to the outside world.

I'll think about it. Expect more news.
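
For the archive, the idiom Tor refers to looks roughly like this
(hypothetical names; not a decision yet):

/* opt_trace.h: only a forward declaration is exposed */
class Opt_trace_context_impl;

class Opt_trace_context
{
public:
  Opt_trace_context();
  ~Opt_trace_context();
private:
  Opt_trace_context_impl *pimpl;    // all state lives in opt_trace.cc
};

/* opt_trace.cc: the "party" stays invisible from the outside */
class Opt_trace_context_impl
{
public:                             // public, but unreachable outside this file
  bool one_line;
  Opt_trace_stmt *current_stmt_in_gen;
};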

>>
>>> +
>>>    friend class Opt_trace_stmt;
>>>    friend class Opt_trace_struct;
>>>    friend class Opt_trace_disable_I_S;
>>>    friend class Opt_trace_iterator;
>>> -  friend void opt_trace_print_expanded_query(THD *thd);
>>> +  //friend void opt_trace_print_expanded_query(THD *thd);
>>>  };
>>
>>
>>> +  class Buffer
>>>    {
>>> -private:
>>> +  private:
>>>      size_t allowed_mem_size; ///< allowed memory size for this String
>>>      size_t missing_bytes;    ///< how many bytes could not be added
>>>      bool   malloc_error;     ///< whether there was a 
>>> malloc/realloc() error
>>> -public:
>>> +    String string_buf;
>>>      }
>>> -    inline const char *ptr(void) const { return String::ptr(); }
>>> +    const char *ptr() const { return string_buf.ptr(); }
>>> +
>>> +    CHARSET_INFO *charset() { return string_buf.charset(); }
>>
>> ok, and I added "const" to it, like String has.
>>
>>> @@ -622,6 +653,7 @@ public:
>>>      friend class Opt_trace_iterator;
>>>    };
>>>
>>> +  // Suggest renaming to trace_buffer: much easier to search for it.
>>>    Buffer buffer;         ///< Where the trace is accumulated
>>
>> renamed to trace_buffer
>>
>>>    Buffer query_buffer;   ///< Where the query is put
>>>
>>> @@ -629,19 +661,21 @@ public:
>>>       Nesting level of the current structure.
>>>       The more nested ("deep"), the more indentation spaces we add on 
>>> the left.
>>>    */
>>> +  // how about current_depth: much easier to search for it.
>>>    int depth;
>>
>> renamed to current_depth
>>
>>> @@ -649,11 +683,18 @@ public:
>>>    Opt_trace_stmt *prev, *next; ///< list management
>>>
>>>    /** By how much we should increase buffer's size when it's 
>>> becoming full */
>>> +  // At first sight it looks strange that Buffer doesn't handle this 
>>> itself.
>>
>> Ok, I moved the code using buffer_alloc_increment from 
>> Opt_trace_struct::do_construct() to a function in Buffer, named 
>> prealloc().
>> I call this function from Opt_trace_struct::do_construct().
>> So it's only called when opening a structure, not too often, which is 
>> what I want (don't want to do it in Buffer::append() for that reason).
>>
>>> +  // Do we add to the length, or do we multiply (which is more 
>>> common I think)
>>
>> Right now we add. With multiplying I fear this would grow fast and waste
>> memory (we already keep the whole trace in memory, so need to be
>> cautious). The increment is 1kB.
>> I could use multiplication instead, but then with some limit (like:
>> double each time, but when trace is 10MB, then stop multiplying, and
>> switch to adding 10MB each time). Let me know.
> 
> I think that the common trick is to multiply by sqrt(2) or something.
> If you multiply, the amortized cost of an append is constant.
> If you add, the cost of append will grow as a function of string length.

Assuming ::realloc() works like

new_ptr=malloc(new_size);
memcpy(new_ptr, old_ptr, old_size);
free(old_ptr);

then I agree with what you say above about the amortized cost. If we
multiply by a constant 'x', then the amortized cost of such ::realloc()
is constant and proportional to x/(x-1), from what I computed.
I saw *2 and *1.5 (~sqrt(2)) for 'x' on the internet.

::realloc() may not always be as "inefficient" as above though.

> I suggest writing a unit test, and measure how it behaves.
> If you run your unit test with --disable-tap-output
> you will get the standard google output, and see millisecond execution 
> times for each test.

Thanks. I just did this:

int maxs=100000000; // max allocated size
int loop=100; // number of iterations (to get times greater than 1ms)
TEST_F(TraceContentTest, add)
{
   int inc=1024;
   for (int y=0; y<loop; y++)
   {
     char *p= (char*)malloc(0);
     int s= 0;
     while (s < maxs)
     {
       int news= s+inc;
       p= (char*)realloc(p, news);
       memset(p+s, 25, news-s); // simulate adding trace content
       s= news;
     }
     free(p);
   }
}
TEST_F(TraceContentTest, mult15)
{
   for (int y= 0; y<loop; y++)
   {
     char *p= (char*)malloc(0);
     int s=0;
     while (s < maxs)
     {
       int news= s*15/10+1;
       p= (char*)realloc(p, news);
       memset(p+s, 25, news-s);
       s= news;
     }
     free(p);
   }
}
TEST_F(TraceContentTest, mult20)
{
   for (int y= 0; y<loop; y++)
   {
     char *p= (char*)malloc(0);
     int s=0;
     while (s < maxs)
     {
       int news= s*2+1;
       p= (char*)realloc(p, news);
       memset(p+s, 25, news-s);
       s= news;
     }
     free(p);
   }
}
Which tests the "add" method (increment=1024, as in opt_trace.cc) then
*1.5 then *2. I varied max_allocated_size (and also the number of
iterations so that the overall time stays reasonable). The machine has
12GB RAM in total, Linux64. Here are the times in ms:

max_allocated_size | add  | mult1.5 | mult2.0 | iterations
10k                |  800 |    1800 |    1200 | 1M
100k               |  700 |    1200 |     600 | 100k
1M                 |  900 |     700 |     700 | 10k
10M                | 1400 |    1100 |    1700 | 1000
100M               | 6500 |    6900 |    6800 | 100

"add" wins in two cases. mult1.5 wins in two cases; mult2 wins in two cases.

Then I did a server benchmark: 20-table plan search, limiting trace size 
to 10MB (--optimizer-trace-max-mem-size=10MB, with no limit the machine 
would not have enough memory :-).
- mult2.0: 32.5 secs (does tens of ::realloc() calls)
- add: 33.3 secs (does thousands of ::realloc() calls)
On the other hand, 10MB is unusually big. The biggest trace in 
mysql-test/r/optimizer_trace*.result is 10k long. I think that the 
typical traces will be of size 10k-100k, where I'm not sure realloc() 
has an impact. Even with 10MB, the impact is 1 second, not much of a 
problem for us developers when debugging a query.

So I don't really have an opinion. I can do *2 if you like. Or I can be 
lazy and keep "add". I don't really care :-) Let me know.

>>>    static const size_t buffer_alloc_increment;
>>>    /** Length of an empty trace */
>>>    static const size_t empty_trace_len;
>>>
>>>  public:
>>> +  const Opt_trace_stmt *prev_() const { return prev; }
>>> +
>>> +  size_t alloced_length() const
>>> +  { return buffer.alloced_length() + query_buffer.alloced_length(); }
>>
>> Do we need Doxygen comments for the two functions above?
>>
>>> @@ -671,7 +712,7 @@ public:
>>>  struct Opt_trace_info
>>>  {
>>>    /**
>>> -     String containing trace. It is NUL-terminated, only to aid 
>>> debugging or
>>> +     String containing trace. It is NULL-terminated, only to aid 
>>> debugging or
>>
>> I meant NUL
>>
>>> @@ -708,20 +749,23 @@ public:
>>>     '*' operator.
>>>    */
>>>    Opt_trace_iterator(Opt_trace_context *ctx) :
>>> -    cursor(ctx->oldest_stmt_to_show), row_count(1), 
>>> limit(ctx->limit) {}
>>> -  void operator++(int); ///< advances iterator to next trace
>>> +    cursor(ctx->oldest_stmt_to_show), row_count(1), limit(ctx->limit)
>>> +  {}
>>> +
>>> +  ///< Advances iterator to next trace.
>>
>> If ///< is not on the same line as what it describes, doxygen does not
>> pick it up. I put it back on the line.
>>
>>> +  // A bit strange to implement postfix iterator, rather than prefix.
>>
>> renamed it to next().
> 
> thanks
> 
>>
>>> +  void operator++(int);
>>> +
>>>    /**
>>>       Returns information about the trace on which the iterator is
>>>       positioned.
>>> +     @note Returns object by value!!!
>>>    */
>>>    Opt_trace_info operator*();
>>
>> renamed it to get_value()
> 
> good!
> 
>>
>>> -  /** Needed to compare the iterator with the list's end */
>>> -  bool operator!=(const Opt_trace_iterator &other) const
>>> -  { return cursor != other.cursor; }
>>> -  static Opt_trace_iterator end;   ///< List's end
>>> +
>>> +  bool at_end() const { return cursor == NULL; }
>>> +
>>>  private:
>>> -  /** Only to construct the 'end' iterator */
>>> -  Opt_trace_iterator(void) : cursor(NULL) {}
>>>    Opt_trace_stmt *cursor; ///< trace which the iterator is 
>>> positioned on
>>>    ha_rows row_count;      ///< how many traces yielded so far
>>>    ha_rows limit;          ///< yield "empty" after yielding that 
>>> many traces
>>> @@ -734,13 +778,13 @@ private:
>>>     When you want to add a structure to the trace, you create an 
>>> instance of
>>>     Opt_trace_object or Opt_trace_array, then you add information to 
>>> it with
>>>     add(), then the destructor closes the OOA (we use RAII, Resource
>>> +                                         ^^^ ???
>>
>> OOA used to mean Object or Array, I changed it to Structure everywhere
>> (Jorgen preferred), had forgotten this one... Fixed.
>>
>>>     Acquisition Is Initialization).
>>>  */
>>>  class Opt_trace_struct
>>>  {
>>>  protected:
>>>    /**
>>> -     Constructor.
>>
>> Why remove this? Useless?
> 
> Yes, I think that is a useless comment.
> We assume that the reader of this code recognizes a constructor when he 
> sees one :-)

deleted a few such "Constructor" words.

>>>       @param  ctx_arg  Optimizer trace context for this structure
>>>       @param  requires_key_arg  whether this structure 
>>> requires/forbids keys
>>>                        for values put inside it (an object requires 
>>> them, an
>>
>>> @@ -807,13 +851,18 @@ public:
>>>       @verbatim <some Opt_trace_struct>.add().add().add() @endverbatim
>>>       which have 3 if()s (one per @c add()), it has been measured 
>>> that wrapping
>>>       that inside a single if(), like
>>> -     @verbatim if(started) { <some 
>>> Opt_trace_struct>.add().add().add() } @endverbatim
>>> +     @verbatim if(started) { <some 
>>> Opt_trace_struct>.add().add().add() }
>>> +     @endverbatim
>>>       degraded performance (from 2.0 to 2.1 secs). Adding if() in one 
>>> single
>>> -     code line caused de-inlining of add() methods in several 
>>> unrelated places
>>> +     code line caused de-inlining of add() functions in several 
>>> unrelated places
>>>       and the performance degraded. So we don't do it. The 
>>> performance impact of
>>>       multiple chained add() seems comparable to a single add() anyway.
>>>       Removing @c likely() also increases from 2.0 to 2.1, so we leave
>>>       them.
>>> +
>>> +     // I don't need to know all these details about milliseconds 
>>> here ....
>>> +     // This should be a documentation about the interface.
>>
>> I deleted this block, and put that in front of do_add() in opt_trace.cc:
>> /**
>>    @note add() has an up-front if(), hopefully inlined, so that in the
>> common
>>    case - tracing run-time disabled - we have no function call. If
>> tracing is
>> enabled, we
>>    call do_add().
>> */
>>
>>> @@ -1029,6 +1085,7 @@ private:
>>>       When adding a value (or array or object) to an array, or a 
>>> key/value pair
>>>       to an object, we need this outer array or object.
>>>
>>> +Sentence below is very hard to read, too many ()()()
>>
>> Changed to
>>      It would be possible, when trying to add a key to an array, which
>> is wrong
>>      in JSON, or similarly when trying to add a value without any key 
>> to an
>>      object, to catch it at compilation time, if the adder received, as
>> function
>>      parameter, the type of the structure (like @c Opt_trace_array*). 
>> Then
>>      the @c add(key,val) call would not compile as Opt_trace_array 
>> wouldn't
>>      feature it.
>>
>>
>>>       It would be possible, when trying to add a key to an array 
>>> (which is wrong
>>>       in JSON) (or similarly when trying to add a value without any 
>>> key to an
>>>       object), to catch it at compilation time, if the outer object 
>>> was passed an
>>
>>> @@ -1063,22 +1120,34 @@ private:
>>>    Opt_trace_struct& operator=(const Opt_trace_struct&);
>>>
>>>    bool started; ///< Whether the structure does tracing
>>> +
>>>    /**
>>>       Whether the structure requires/forbids keys for values inside it.
>>>       1: this is an object. 0: this is an array. Other: incorrect.
>>>
>>> +     it is unclear at this point why it is int8 rather than bool
>>
>> It used to be bool. It was and is still used in code like
>> 'some_array[the_variable]'.
>> And Jorgen didn't like booleans as array subscripts, so I changed to
>> int8 to make him happy :-)
>> What should I do? bool or int8? I prefer bool.
>> See also a reply to the your other commit, about arrays vs inline 
>> functions.
> 
> It is a boolean property.
> int8 looks like we are going to serialize it, and send it over the wire 
> in a package or something.
> Anyways, we agreed on inline functions like 'const char 
> *open_brace(bool)' didn't we.

We did. But we still need a "requires_key" member in the class, and I
still use it as an array subscript inside open_bracket(). Changed it back
to bool.
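
(For clarity, "array subscript inside open_bracket()" means roughly this;
the body is illustrative, not the exact patch:)

inline char open_bracket(bool requires_key)
{
  static const char brackets[]= { '[', '{' };   // array vs. object opener
  return brackets[requires_key];                // false -> '[', true -> '{'
}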

>>>       @note The canonical way would be to not have such int8 per 
>>> instance, but
>>>       rather have a pure virtual method 
>>> Opt_trace_struct::requires_key(),
>>> +C++ has member functions, not methods.
>>
>> Ok, ok... in my French OOP courses they were called "méthodes"... Thanks
>> for fixing.
>>
>>>       overloaded by Opt_trace_object (returning 1) and by 
>>> Opt_trace_array
>>>       (returning 0). But this would have drawbacks:
>>>       @li it would add a vtbl pointer to each instance which takes 
>>> even more
>>>       space than int8
>>> +-- this depends on alignment (you happen to have a bool as a 
>>> neighbour..)
>>>       @li it would add requires_key() function calls which cost more 
>>> than
>>>       reading one int8
>>> +-- you don't know that, until measured on different platforms.
>>> +-- reading an int8 may require masking operations, unless
>>> +-- your platform is byte addressable
>>>       @li Opt_trace_object::requires_key() would not be accessible from
>>>       Opt_trace_struct::construct() (which would complicate coding), 
>>> whereas the
>>>       int8 is.
>>> +-- virtual function not accessible to CTOR/DTOR: true
>>> +-- OK, so I buy this argument, but not the others :-)
>>
>> removed the others.
>>
>>>    */
>>> +  // Hand-coded typechecking is generally a bad idea,
>>> +  // but since you need the information in the base CTOR/DTOR: OK.
>>
>> Thanks, so I leave it.
>>
>>>    int8 requires_key;
>>
>>> @@ -1236,13 +1310,18 @@ private:
>>>  };
>>>
>>>
>>> +// I believe the macros below are gone in newer versions of the code.
>>> +// Anyways: the names aren't very good, and the formatting was bad ...
>>
>> Yes, they are gone.
>>
>>> @@ -1278,6 +1358,7 @@ struct TABLE_LIST;
>>>    sub-statement in the call chain), and this function decides to trace
>>>    (respectively not trace) the sub-statement, it returns "true"
>>>    (resp. false). Each sub-statement is responsible for ending the 
>>> trace which it
>>> +()()()()()
>>
>> Changed to
>>   @note if tracing was already started by a top statement above the 
>> present
>>   sub-statement in the call chain, and this function decides to trace
>>   (respectively not trace) the sub-statement, it returns "true"
>>   (resp. false). Each sub-statement is responsible for ending the trace
>> which it
>>   has started.
>>
>>> @@ -1452,12 +1536,14 @@ public:
>>>  /**
>>>     Helper to put the database/table name in the trace
>>>     @param  t  TABLE* pointer
>>> +why macro, why not an ordinary function?
>>
>> If I made it an ordinary function, it would need to know the layout of
>> TABLE_LIST. But I would like it be inline (don't want to have function
>> calls when tracing is not enabled; see Opt_trace_struct::do_add/add()
>> tricks).
>> If I want it inline, I need to put it in a header file which includes
>> table.h. I'm not sure I want to include table.h in opt_trace.h, this
>> would bring in a lot of unwanted structures...
>> Ideas?
> 
> OK, that's a good enough reason.
> Please add comment about this fact.

As discussed elsewhere, it changed to an inline member function.
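
(Roughly, the inline member mirrors the macro body quoted further down;
this is only a sketch, the exact signature may differ:)

inline Opt_trace_struct&
Opt_trace_struct::add_utf8_table(const TABLE *tab)
{
  return add_utf8("database", tab->s->db.str, tab->s->db.length).
         add_utf8("table", tab->alias);
}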

> And since it is a macro, it should be IN_UGLY_ALL_CAPS.
>> Does it really hurt to have a macro?
> 
> Macros are generally bad bad bad bad,
> but can sometimes be justified.
> 
>> Talking about macros:
>> - OPT_TRACE and OPT_TRACE2 are gone
>> - I could delete macros used when OPTIMIZER_TRACE is not defined
>> (opt_trace_start, opt_trace_end, opt_trace_print_expanded_query(),
>> opt_trace_add_select_number(), opt_trace_set_query()) and just write
>> functions like this
>> opt_trace_start(...)
>> {
>> #ifdef OPTIMIZER_TRACE
>> lots of code
>> #endif
>> }
> 
> as you may have noticed, I am not a big fan of ifdefs....
> I actually think your current scheme is better than what you suggest above.

ok I'll keep it.

>> so that when OPTIMIZER_TRACE is not defined they are empty functions.
>> This would limit the number of empty macros which I need to update each
>> time I add a parameter of opt_trace_start().
>>
>> This would leave only two macros in opt_trace.h: add_utf8_table and
>> OPT_TRACE_TRANSFORM. That's few... maybe they can be let to live...??
>>
>>>  */
>>>  #define 
>>> add_utf8_table(t)                                               \
>>>    add_utf8("database", (t)->s->db.str, 
>>> (t)->s->db.length).              \
>>>    add_utf8("table", (t)->alias)
>>>
>>> -#if !defined(DBUG_OFF) && !defined(OPTIMIZER_TRACE)
>>> +#if !defined(DBUG_OFF) && !defined(OPTIMIZER_TRACE) \
>>> + && !defined(OPTIMIZER_TRACE_UNITTEST)
>>>
>>>  /*
>>>    A debug binary without optimizer trace compiled in, will miss some
>>>
>>> === modified file 'sql/opt_trace2server.cc'
>>> @@ -113,16 +111,19 @@ bool opt_trace_start(THD *thd, const TAB
>>>    /*
>>>      We don't allocate it in THD's MEM_ROOT as it must survive until 
>>> a next
>>>      statement (SELECT) reads the trace.
>>>    */
>>>    if (thd->opt_trace == NULL)
>>>    {
>>> -    if ((thd->opt_trace= new Opt_trace_context()) == NULL)
>>> +    // (this version of) operator new can never return NULL.
>>> +    if ((thd->opt_trace= new Opt_trace_context) == NULL)
>>
>> Depends on the other "new-related" discussion earlier.

changed to new(nothrow)

>>> @@ -156,7 +158,7 @@ disable:
>>>    if (need_it_for_I_S && thd->opt_trace != NULL && 
>>> thd->opt_trace->is_started())
>>>    {
>>>      need_it_for_I_S= false;
>>> -    goto start_trace;
>>> +    goto start_trace; // Ouch, criss-cross goto!
>>
>> Yes, though there's no bug (no risk of infinite loop).
>> I changed like this:
>> replaced "goto disable" with DBUG_RETURN(false) here:
>>
>> start_trace:
>>   if (thd->opt_trace->start(need_it_for_I_S,
>>                             (var & Opt_trace_context::FLAG_END_MARKER),
>>                             (var & Opt_trace_context::FLAG_ONE_LINE),
>>                             thd->variables.optimizer_trace_offset,
>>                             thd->variables.optimizer_trace_limit,
>>                             thd->variables.optimizer_trace_max_mem_size,
>>                             thd->variables.optimizer_trace_features))
>>   {
>>     if (allocated_here)
>>     {
>>       delete thd->opt_trace;
>>       thd->opt_trace= NULL;
>>     }
>>     DBUG_RETURN(false);
>>   }
>>   DBUG_RETURN(true); // started all ok
>>
>> So we can go from "disable:" to "start_trace:" but once in 
>> "start_trace:" we will leave the function.
>> If we got an error in "start_trace:", there is no chance that starting
>> it again in "disable:" would have better success, so DBUG_RETURNing
>> immediately, as in the new code above, is good enough.
> 
> I need to have a look at the new version of this code.

I just pushed all changes to mysql-next-mr-guilhem.
I could not push to mysql-next-mr-opt-backporting-wl4800 as I'm applying 
review comments to an old tree.

>>> @@ -188,15 +191,16 @@ void opt_trace_print_expanded_query(THD      
>>> thd->lex->sql_command is SQLCOM_END in this case. So we do a check of 
>>> the
>>>      command:
>>>    */
>>> +// Too many parens (()) in comment above, hard to read.
>>
>> In the latest tree this comment is gone, because the query is not
>> printed from open_tables() anymore, which eliminates the problem.
>>
>>> @@ -265,6 +269,7 @@ ST_FIELD_INFO optimizer_trace_info[]=
>>>    /* name, length, type, value, maybe_null, old_name, open_method */
>>>    {"QUERY", 65535, MYSQL_TYPE_STRING, 0, false, "", SKIP_OPEN_TABLE},
>>>    {"TRACE", 65535, MYSQL_TYPE_STRING, 0, false, 0, SKIP_OPEN_TABLE},
>>> +  // why is this zero, others are ""            ^
>>
>> This is member ST_FIELD_INFO::old_name. Some declarations in other files
>> have 0 here, some have "". Changed all in the array above, to NULL.
>>
>>>    {"MISSING_BYTES_BEYOND_MAX_MEM_SIZE", 20, MYSQL_TYPE_LONG,
>>>     0, false, "", SKIP_OPEN_TABLE},
>>>    {"OS_MALLOC_ERROR", 1, MYSQL_TYPE_TINY, 0, false, "", 
>>> SKIP_OPEN_TABLE},
>>
>>> === added file 'unittest/gunit/opt_notrace-t.cc'
>>> --- a/unittest/gunit/opt_notrace-t.cc    1970-01-01 00:00:00 +0000
>>> +++ b/unittest/gunit/opt_notrace-t.cc    2011-01-11 08:15:46 +0000
>>> @@ -0,0 +1,21 @@
>>> +#include "my_config.h"
>>> +#include <gtest/gtest.h>
>>> +
>>> +#if defined(OPTIMIZER_TRACE)
>>> +#undef OPTIMIZER_TRACE
>>> +#endif
>>> +#define OPTIMIZER_TRACE_UNITTEST
>>> +
>>> +#include <opt_trace.h>
>>> +
>>> +namespace {
>>> +
>>> +TEST(Foo, Bar)
>>> +{
>>> +  // Fill in more here, to verify that implementations are in sync!!!
>>> +  Opt_trace_context trace;
>>> +  Opt_trace_object oto(&trace);
>>> +  Opt_trace_array  ota(&trace);
>>> +}
>>
>> Ok, what would you want this to test?
>> Various add() calls (with their different signatures), and in the end we
>> verify that there is no trace in the iterator?
>> Note that, if I understood correctly, in our release builds, optimizer
>> trace will be compiled in (but run-time disabled, until the user changes
>> the value of @@session.optimizer_trace).
>> So opt_notrace-t will test a rare case (opt trace not compiled in).
> 
> What is the point of being able to ifdef it out completely?

- to benchmark the slowdown due to having the optimizer trace code
compiled in but runtime-disabled
- to allow easy turning off of the feature, should a problem be
discovered after the push to trunk and before the public release
- it's common to do this for brand new features in MySQL
- it was like this in the prototype which I inherited.

Later, once this feature has been in trunk for a long time, we could
remove the ifdef. That's what happened with row-based replication, which
was initially surrounded by many #ifdef HAVE_ROW_BASED in the code.

But some features are still compiled-out in libmysqld, even years after
their implementation, because they are not useful for the embedded
server (see HAVE_REPLICATION). So their ifdef is still there.

To me the ifdef is useful in this period of development of the feature. 
Later, maybe not.

> You actually implemented this feature (which I think is a bad idea, we 
> have more than enough ifdefs in the code).
> If you want to keep the ability to ifdef opt-trace away completely,
> then this is a way to ensure that things do not break if someone does
>   cmake -DOPTIMIZER_TRACE=0
> 
> Put in enough test code, so that this unit test is likely to break, if 
> the tracing interface changes.

Wait, it's not clear to me.
I think you want to protect against someone changing the interface but
forgetting to update the "dummy clone of it", i.e. the part of
opt_trace.h enabled when OPTIMIZER_TRACE is not defined, where classes
are mostly empty.
In that case, an un-updated opt_notrace-t.cc will still compile (it uses
the old, un-updated dummy interface) and pass. So the dev will not even
notice that he needs to update opt_notrace-t.cc (and updating it is
precisely what would make the test fail and tell him to update the dummy
interface). So I don't understand.

Also, this method does not protect against someone adding a new add()
call and forgetting to add it to the dummy interface. And it does not
check whether opt_range.cc, which has #ifdef too, compiles without 
OPTIMIZER_TRACE defined.

There is another approach: let me fix breakages now and once in a while 
in this development phase; later, let future breakages happen (and 
sometimes catch them during review). Should someone build without 
OPTIMIZER_TRACE and it breaks, we'll fix it promptly...

Yet another approach. I would create a target opt_notrace-t which is 
built by compiling the existing *opt_trace-t.cc* but with "#undef 
OPTIMIZER_TRACE". That should still compile (after I tweak 
opt_trace-t.cc with a few #ifdef).
When a dev updates the non-dummy interface we can hope that he will 
update opt_trace-t.cc (to test his new stuff), which in turn will break 
the compilation of opt_notrace-t if he forgot to update the dummy interface.
opt_trace-t.cc would look like this:

TEST_F(blah)
{
   #ifndef DUMMY_TEST
   trace.start(...);
   #endif
   Opt_trace_object oto(...);
   oto.add(...);
   #ifndef DUMMY_TEST
   trace.end();
   Opt_trace_iterator it(...);
   EXPECT_EQ(...);
   EXPECT_FALSE(...);
   #endif
}
opt_notrace-t would have DUMMY_TEST defined, so only those:

Opt_trace_object oto(...);
oto.add(...);

would be compiled.

>>> === modified file 'unittest/gunit/opt_trace-t.cc'
>>> --- a/unittest/gunit/opt_trace-t.cc    2010-11-24 18:54:26 +0000
>>> +++ b/unittest/gunit/opt_trace-t.cc    2011-01-11 08:15:46 +0000
>>> @@ -28,8 +28,11 @@
>>>  /* Some minimal working implementations to have a working trace... */
>>>  CHARSET_INFO *system_charset_info= &my_charset_utf8_general_ci;
>>>  /* ... end here. */
>>> +// I didn't quite understand the comments above ...
>>
>> I removed the comment.
>>
>>> -const ulonglong all_features= Opt_trace_context::FEATURES_DEFAULT;
>>> +namespace {
>>> +
>>> +const ulonglong all_features= Opt_trace_context::default_features;
>>
>> Why not "static"?
> 
> The un-named namespace here covers the entire rest of the file.
> This will ensure that no name-clashes can occur, now or in the future,
> if someone accidentally uses a name in other parts of the code,
> which you are using in your unit test code.

ok

>>
>>> @@ -43,11 +46,12 @@ const ulonglong all_features= Opt_trace_
>>>  */
>>>  void check_json_compliance(const char *str, size_t length)
>>>  {
>>> -#if 0
>>> +  return;
>>>    /*
>>>      Read from stdin, eliminate comments, parse as JSON. If invalid, 
>>> an exception
>>>      is thrown by Python, uncaught, which produces a non-zero error 
>>> code.
>>>    */
>>> +#ifndef __WIN__
>>>    const char python_cmd[]=
>>>      "python -c \""
>>>      "import json, re, sys;"
>>> @@ -56,22 +60,34 @@ void check_json_compliance(const char *s
>>>      "json.loads(s, 'utf-8')\"";
>>>    // Send the trace to this new process' stdin:
>>>    FILE *fd= popen(python_cmd, "w");
>>> -  ASSERT_NE((void*)NULL, fd);
>>> -  ASSERT_EQ((size_t)1, fwrite(str, length, 1, fd));
>>> +// ASSERT makes sense here: if we fail to run python we must abort.
>>> +// Most of the other uses of ASSERT should be changed to EXPECT,
>>> +// so that we continue to run the rest of the test even if there
>>> +// is a failed assumption.
>>
>> Ok here; more on this below.
>>
>>> +  ASSERT_TRUE(NULL != fd);
>>> +  ASSERT_EQ(1U, fwrite(str, length, 1, fd));
>>
>> why 1U (ulong) instead of (size_t)1? On my Linux and Windows, fwrite()
>> returns size_t...
>> ?
>> Isn't a comparison between ulong and size_t a promise for some
>> "possible loss of precision" warning one day with some compiler?
> 
> A C-style cast is generally very bad.

I agree in principle (and I would like us to use -Wold-style-cast), but 
(size_t)1 is not that deadly. Not much harm can happen with that.

> A C++ style cast is better.
> 
> An unsiged int will be safely promoted to size_t in this case
> (we are comparing, rather than assigning, two entities with possibly 
> different size)

OK. So I changed my (size_t) casts to 'U' suffixes in the test.

An observation: gtest macros are strict about some comparisons. I had
compilation problems with
ASSERT_EQ(NULL, <pointer variable>)
on Solaris; it didn't compile (maybe NULL is a plain 0 there). Replacing 
NULL with (void*)0 worked.
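
For example (re-using the fd case from above; only an illustration):

FILE *fd= popen(python_cmd, "w");
// ASSERT_EQ(NULL, fd);           // did not compile there (NULL is a plain 0)
ASSERT_TRUE(NULL != fd);          // the form the patch uses now
ASSERT_EQ((void*)0, (void*)fd);   // replacing NULL with (void*)0 also works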

>>>    int rc= pclose(fd);
>>>    rc= WEXITSTATUS(rc);
>>>    ASSERT_EQ(0, rc);
>>>  #endif
>>>  }
>>>
>>> +class TraceContentTest : public ::testing::Test
>>> +{
>>> +public:
>>> +  Opt_trace_context trace;
>>> +};
>>> +
>>> +
>>> +TEST_F(TraceContentTest, ConstructAndDestruct)
>>> +{
>>> +}
>>> +
>>>
>>>  /** Test empty trace */
>>> -TEST(Trace_content_test, empty)
>>> +TEST_F(TraceContentTest, empty)
>>>  {
>>> -  /* Create a trace */
>>> -  Opt_trace_context trace;
>>> -  ASSERT_EQ(false, trace.start(true, false, false, -1, 1, ULONG_MAX,
>>> -                               all_features));
>>> +  ASSERT_FALSE(trace.start(true, false, false, -1, 1, ULONG_MAX, 
>>> all_features));
>>>    /*
>>>      Add at least an object to it. A really empty trace ("") is not
>>>      JSON-compliant, at least Python's JSON module raises an exception.
>>> @@ -83,26 +99,24 @@ TEST(Trace_content_test, empty)
>>>    trace.end();
>>>    /* And verify trace's content */
>>>    Opt_trace_iterator it(&trace);
>>> -  ASSERT_EQ(true, it != Opt_trace_iterator::end);
>>> +  EXPECT_FALSE(it.at_end());
>>
>> This checks that the iterator is not at end. If it is at end, ASSERT
>> will stop this test portion. EXPECT will rather print a message and
>> continue execution , so it will execute "*it" at the next line. 'cursor'
>> being NULL, this will segfault.
>> I think it's normal that when you reached the end (one past the last),
>> you don't try to read the iterator's value.
> 
> OK.
> I saw only ASSERT, and no EXPECT, in your code, so I changed a few of them.
> I just wanted you to make a concious decision for each case.
>>
>>>    const Opt_trace_info info= *it;
>>>    const char expected[]= "{\n}";
>>> -  ASSERT_STREQ(expected, info.trace_ptr);
>>> -  ASSERT_EQ(sizeof(expected) - 1, info.trace_length);
>>> +  EXPECT_STREQ(expected, info.trace_ptr);
>>> +  EXPECT_EQ(sizeof(expected) - 1, info.trace_length);
>>
>> Even here, if the content is not as expected, I don't care that the
>> length is tested. Or the JSON compliance below, or the value of
>> missing_bytes.
>> But on the other hand, it does not hurt to test them.
>> So I changed all ASSERT to EXPECT, except when continuing would cause a
>> crash (i.e. at_end() tests).
>>
>>>  /** Test escaping of characters */
>>> -TEST(Trace_content_test, escaping)
>>> +TEST_F(TraceContentTest, escaping)
>>
> 


-- 
Mr. Guilhem Bichot <guilhem.bichot@stripped>
Oracle / MySQL / Optimizer team, Lead Software Engineer
Bordeaux, France
www.oracle.com / www.mysql.com

