List:Commits« Previous MessageNext Message »
From:Rafal Somla Date:February 27 2008 9:25pm
Subject:Re: your review of WL#4212
View as plain text  
Hi Chuck,

Here I explain what I have and what I have not changed (and why) in my latest 
patch with regard to your review comments.

Chuck Bell wrote:
> Rafal,
> 
> My response and rebuttal. Explanations and proposed resolutions not
> commented on are fully agreed.
> 
> Chuck
>  
>> To be implemented
>> =================
>>
>> Handling duplicate database names when populating backup catalogue.
>> -------------------------------------------------------------------
(cut)
>>
>> I'll add the has_db() method to Image_info class.
>>
> 
> ok. Good alternative.
> 

The new method is added.

>> Fix index types
>> ---------------
(cut)
>> Yes, need to fix that. Thanks for spotting the problem. My 
>> thinking here is as 
>> follows:
>> - there can be a lots of tables, even more that 1M;
>> - the possible number of databases is smaller and will never 
>> exceed 64K;
>> - the possible number of backup/restore drivers used is even 
>> smaller - it is 
>> also limited to 256 by the backup stream format which uses 1 
>> byte for storing it.
>>
>> Therefore, the following types can be used for indexing these 
>> kinds of objects:
>> - ulong for tables (and other per database objects),
>> - uint for databases,
>> - ushort for snapshots,
>>
>> I'll make code consistent wrt. using these types for indexes.
> 
> Ok. Please use #define or something for these values when used inline, e.g.
> array[MAX_SNAP] or something.
> 

I fixed the index types as described. As explained in the previous email, the 
256 limit on the number of snapshot is not a configurable parameter of the code 
but a fixed limit imposed by the format of the backup image. Therefore I left it 
  inline and added appropriate comments.

>> Remove SHOW_ARCHIVE command
>> ---------------------------
>>
>> Good point - I'll wipe it out from the code.
>>

The command is removed

>> Remove TEST_ERROR_IF macros
>> ---------------------------
>>

The macro is removed

>> Names for IStream and OStream classes
>> -------------------------------------
>>
>>> * Please add a note that your OStream and IStream are not 
>> to be confused
>>> with ostream and istream respectfully. Also, 'OStream' 
>> violates our coding
>>> guidelines. It should be 'Ostream'.
>> I'll rename them to backup::Input_stream and 
>> backup::Output_stream, respectively.
>>
> 
> Thank you!!! :))
> 

Renamed.

> 
>> Use object services API for DDL blocker
>> ---------------------------------------
>>> * Although I do not like the reorganization of the code WRT calls to
>>> progress tables functions and DDL operations, I must 
>> request that you change
>>> the DDL calls to use the new si_objects abstractions for 
>> DDL blocker.
>>
>> Good point. We are not using the si_objects abstractions in 
>> the current code and 
>> this is a good occasion to change that - I'll add this to the patch.
>>

I changed code to use the si_objects.cc abstractons.

>> Fixes in documentation
>> ----------------------
>>

I did my best to improve the documentation and to correct spelling. Apparently 
I'm unable to spell every word correct :( And yes, I am using a spell checker.

>>
>> Misc issues
>> -----------
>>
>>> * Why implement this function? Allow it to be virtual so 
>> that we do not need
>>> to define a 'base' operation. The compiler is our friend 
>> (sometimes).  ;) 
>>> +const char* Snapshot_info::name() const
>>> +{ return "<Unknown>"; }
>> I'll make it pure virtual so that each derived class must 
>> define it explicitly.
>>

Now it is a pure virtual function.

>>> * Why was this line changed?
>>> -  bzero(m_snap, sizeof(m_snap));
>>> +  bzero(m_snap,sizeof(m_snap));
>>> * More spacing issues.
>>> +  if (info.snap_count()==0 || info.table_count()==0) // 
>> nothing to backup
>>
>>> * Spacing violations for operators.
>>> +  return no+1;
>> Right, I'll revise spacing and also cut the too long lines.
>>

I hope I removed most of such incorrect layouts. I also fixed layout for class 
constructors and cut some long lines.

>>> (...) I would request that 'no' be changed to something
>>> more meaningful ('num' perhaps) -- 'no' is not a normal 
>> abbreviation for
>>> number -- 'no.' is the correct abbreviation. Likewise, 
>> every m_no should be
>>> changed.
>> Would it be correct to replace 'no' with 'nr'? E.g. 
>> Buffer::m_no will become 
>> Buffer::m_nr.
> 
> Actually, what about num? m_num, your_num, my_num, etc.?
> 

I used num.

>> Explanations
>> ============
>>
>> Order of selecting backup drivers.
>> ------------------------------------
>>
(cut)
>> Backup_info::find_backup_engine()). Thus the order 
>> of Snaphsot_info objects on the snapshots list is as follows:
>>
>> - snapshots for native drivers,
>> - snapshot for the CS driver,
>> - snapshot for the default driver.
>>
>> This guarantees that CS driver will be used before the 
>> default driver, and 
>> native drivers (if exist) before the CS driver.
>>
> 
> Ok, just checking. If not already there, can you add some comment to the
> code supporting this so that 8 months from now we don't second guess
> ourselves?
> 

I added some comments in the find_backup_engine() function and in the 
constructor of Backup_info class.

>> Timing of DDL blocker calls
>> ---------------------------
>>
(cut)
>>
>> The thd->DDL_exception flag is set up inside the object 
>> services code. I can 
>> agree that it is not the best place to do this. Makes more 
>> sense to me if the 
>> caller of object services sets this up, so that object 
>> services code doesn't 
>> make arbitrary decision about ignoring the DDL blocker. I'll 
>> chagne the code to 
>> a) not turn on thd->DDL_exception inside si_objects.cc and b) 
>> turn it on in 
>> Backup_restore_ctx::prepare() just after the DDL_blocker 
>> being activated.
> 
> Again, I think this information is vital to our mainainability. I think
> these comments need to be in the code -- perhaps even in the documentation.
> Please add them where you feel it is appropriate.
> 

I added some comments in the kernel.cc

(cut)
>> Backup drivers and engines
>> --------------------------
(cut)
>>
>> However, backup drivers and backup engines are still two 
>> different things and it 
>> is not possible to ignore this difference in the code 
>> documentation. To recap, 
>> backup driver is an object which is created to backup data of 
>> a given list of 
>> tables. Similar, restore driver is an object used to restore 
>> data for given 
>> tables. Backup engine, is an object which provides 
>> backup/restore drivers when 
>> needed.
>>
>> One should think that backup engine is a static object which 
>> exists as long as 
>> the server is running, while drivers are temporary objects 
>> created for the 
>> duration of a single backup/restore operation. Also, each 
>> backup/restore driver 
>> instance is tied to a particular list of tables it is going 
>> to process. This 
>> list of tables is specified when driver is created using 
>> get_backup() or 
>> get_restore() method of backup driver. To talk about a driver 
>> we should have a 
>> list of tables in mind (because a driver exists only to 
>> backup/restore 
>> particular list of tables). When the list of tables is not in 
>> the context, it is 
>> better to talk about backup engine, as no drivers have been 
>> created yet.
>>
>> What is used to backup table data - backup driver or backup 
>> engine? The most 
>> precise answer is that this is done by backup engine using a 
>> backup driver. At 
>> the time when backup method is selected for a given table, no 
>> backup drivers are 
>> created yet because table lists are not completed yet. Thus 
>> saying that we 
>> search for a backup driver for a table is not entirely 
>> correct - we don't have 
>> any drivers to choose from at that time. We rather search for 
>> a backup engine 
>> which we will use to backup given table. When we decide which 
>> backup engine is 
>> used for each of the tables, then we will know the list of 
>> tables handled by 
>> each of the backup engines. Only then backup engines will 
>> create backup drivers 
>> to handle their lists.
> 
> I think the distinction is splitting hairs and not worth the struggle to
> maintain that an engine exists. The boundary is artificial and induces
> confusion if not used properly -- much like 'object' vs. 'class' in UML. I
> request that we drop the concept of engine altogether in the comments and
> documentation. In my mind it is an unimportant detail (for users) a small
> technical detail of how a driver instance is created.
> 

I disagree. The documentation of the code should accurately describe the code as 
it is. And in the code we do have backup engines and backup drivers as two 
different entities. Calling backup engine a backup driver will not make things 
clearer but, on the contrary, it will introduce more confusion.

>> Define constants for values used in the code
>> --------------------------------------------
>>> * Please do not use hard coded values in code. Please 
>> change all such
>>> references to #defines or other tunable mechanisms. Also, why 256? 
>>> +   Snapshot_info *m_snap[256];
>>> +  if (info.snap_count() > 256)
>>> * Please make version number a #define or something similar 
>> which can be
>>> changed without respect to hard coded numbers in the code itself.
>> I don't think it is a good idea in these particular cases. 
>> The way I see it, 
>> defining a constant like this
>>
>>   #define SOME_PARAM   47
>>
>> means that the code is parametrized by this value. Thus one 
>> can change the value 
>> of the parameter, recompile the code and get a fully 
>> functional program which 
>> uses the new value.
>>
>> But this is not the case with the format version number. The 
>> code which is 
>> written now supports only version 1 of the format and no 
>> other. Having definition
>>
>>   #define FORMAT_VERSION 1
>>
>> if someone changes it to 2 and recompiles, he will get an 
>> incorrect code! This 
>> code will write backup image in format 1 but store in it 
>> information that it is 
>> using format version 2 - that is very bad.
>>
>> Therefore I think the format version number should be hard 
>> coded. If we want to 
>> change the format we must change the code anyway.
>>
>>> * Here's another hard coded seemingly arbitrary value. 
>> Please use #define or
>>> similar.
>>> +  for (uint no=0; no<256; ++no)
>> Similar, the maximal number of snapshots supported by the 
>> kernel is not a 
>> tunable parameter. It is determined by the format of backup 
>> image which uses one 
>> byte for storing this number. Therefore the 256 limit is hard 
>> coded and should 
>> not be changed.
>>
>>> * Here also.
>>> +  char buf[255];                        // buffer for llstr
>> Above code is taken from send_reply() function. It declares a 
>> buffer which is 
>> local to that function and used only for converting a long 
>> int number into its 
>> decimal representation. I think it is an over kill to 
>> parametrize the size of 
>> this buffer. It is ok to fix the size of a temporary buffer 
>> like this since it 
>> is well known how big it must be (to accommodate a decimal 
>> representation of a 
>> long int number) and there is no reason for changing its size.
>>
> 
> I will agree with the temporary buffer statements you made, but I must
> insist on #defines for the other values. It is the same argument you used
> for the one-line functions. It's a maintainability issue. In this case, it
> also improves readibility.
> 

In my argumentation I explained why these values are not configurable parameters 
and what bad consequences could result from changing them. I don't think 
improving readability is a good enough reason to take the risks. I think keeping 
the values inlined, as now, actually helps maintaining our code by forcing us to 
be very conscious when changing backup image format.

Thus I left the values inlined and added explanatory comments.

>> Other issues
>> ------------
>>
>>>> * What is this? "Use base POD structure fd_stream..."
>> POD means "Plain Old Datastructure" as opposed to C++ classes 
>> which can have 
>> virtual tables. In plain C it is common to cast a pointer to 
>> re-interpret it as 
>> a pointer to a different structure. The backup stream library 
>> uses this a lot as 
>> a replacement of abstraction mechanisms absent in plain C. 
>> But when we have a 
>> class with virtual members, we cast it down to a POD type and 
>> then cast back, 
>> the result can be incorrect because C casts ignore virtual 
>> table pointers. To 
>> avoid this problem, I intraduced a POD base class for the 
>> stream classes. 
>> Casting back to the POD base is avoiding the problems with 
>> virtual table.
>>
> 
> Ok, but please spell out POD. It isn't clear what that means without an
> explanation.
> 

This is in cset comments which I can't change now.

>>> * Does the CREATED state replace the READY and INIT states for the
>>> Backup_info class?
>> Yes, Backup_info has fewer states now because preparations 
>> for backup have been 
>> moved to Backup_restore_ctx class. When Backup_instance is 
>> created inside 
>> Backup_restore_ctx::prepare_for_backup() it is ready for 
>> selecting databases to 
>> be backed up. Then one needs to call Backup_info::close() and 
>> the object moves 
>> to CLOSED state and becomes ready for backup operation.
>>
>>> * Please remove references to online backup being a plugin 
>> until such time
>>> as it is a plugin.
>>> +  @note This function is called when server shuts down its plugins.
>> The note doesn't refer to online backup as a plugin. It only 
>> informs that the 
>> function initializing the online backup system is called 
>> during the server 
>> initialization sequence at the time when server loads its 
>> plugins (such as 
>> storage engines and such). This is important to know since 
>> the time at which the 
>> function is called determines what can be done inside it (how 
>> much of the server 
>> is already (de-)initialized). Do you see a way of 
>> re-formulating the note in a 
>> less ambiguous way?
>>
> 
> Take out 'it's plugins'.
> 

I reformulated the comment using more words.

>>> * The changes to the cmake file and the makefile do not 
>> agree. Please make
>>> sure all changes to the makefile are reflected in the cmake 
>> file. It appears
>>> that the restore_info is different. I could verify this if 
>> I could compile.
>>
>> I think this is OK. Restore_info is special because it has 
>> only header file -- 
>> no .cc file. This is why it appears in Makefile, where header 
>> files are listed, 
>> but not in the cmake file where only source files are listed.
>>
(cut)
>>> * This is another hard coded value that is sure to bite us 
>> later. Especially
>>> if a third party vendor adds their own driver for their engine or an
>>> alternative for one of ours. Please change this to a 
>> #define or something.
>>> Why eight? Why not 11 or 20 or ...?
>>> +  native_snapshots(8)
>> This value is not as hard coded as it might seem. This is 
>> only the initial size 
>> of the native_snapshots map, internally used to initialize 
>> the HASH structure. 
>> The map can grow dynamically as needed (because HASH does).
>>
> 
> Please add comments to the code to explain this. I really detest using hard
> coded values that are not explained away (as you have above) or are #defines
> (and explained accordingly). I'll spot you the argument for temporary
> buffers though (see above).
> 

Such fixed, arbitrary initializations for dynamically growing data structures 
are all around the code. For example grep for my_init_dynamic_array() in 
sql_acl.cc (also in other files). Maybe we can live with that... ?

>>> * Please combine assertions into one statement. Debugging 
>> code with multiple
>>> assertions together like the following can be annoying to 
>> the extreme when
>>> debugging the code on Windows (not your fault -- it's a 
>> platform issue).
>>> +  DBUG_ASSERT(is_valid());
>>> +  DBUG_ASSERT(m_state == PREPARED_FOR_BACKUP);
>>> +  DBUG_ASSERT(m_thd);
>>> +  DBUG_ASSERT(m_stream);
>>> +  DBUG_ASSERT(m_catalog);
>> There is a big advantage in having asserts organizing that 
>> way. When one of them 
>> fails, the error log of the server will point at the 
>> assertion which failed and 
>> give us more valuable information about the cause of the 
>> failure. Note that we 
>> don't always have access to the core file to examine variable 
>> values with a 
>> debugger. Knowing how hard it is to trace some bugs, I think 
>> that every piece of 
>> additional information is extremely important. Therefore I 
>> disagree to change 
>> the layout here.
>>
> 
> Realize this will cause a lot problems for me should I have to debug the
> code. I respectfully request you reconsider your position and/or limit the
> use of the assertions. There are two questions I did not ask; 1) Are all of
> these are needed? and 2) Are they really assertions or are these exceptions
> that should be handled as gates? Please remove those that are not
> specifically critical to the method (if possible). I won't budge on this one
> because of the tremendous pains I have already suffered working on bugs in
> the server code. Life is not grand on Windows in mysql and using assertions
> like this makes it far worse.
> 

I'm sorry if you have problems with assertions but I don't think the quality of 
the code can be compromised because of that. Not only I think that it is good to 
have as many assertions as possible, but I also heard such statement made 
somewhere and I even had impression that it is a kind of a company wide 
recommendation (you can check that with Lars). If this creates a problem in your 
debug environment, you should solve it somehow. For example, you can consider 
redefining DBUG_ASSERT() macro in your local trees to do nothing.

For your information, my rules of thumb for using assertions include this:

- whenever dereferencing a pointer, assert that it is not NULL,
- all pre-conditions for function calls should be asserted if possible,
- invalid switch() cases should throw assertion
- etc...

All this helps in detecting incorrect execution paths in the code and thus 
control/improve its quality.

> Agreed. I will endeavor to do a better job of listing exactly what I find
> wrong and restrain from blanket statements of the critical nature. I went
> back through the patches and found one that I think spurred my comment:
> 
> +/********************************************************************
> + 
> +   Classes for representing various object types.
> + 
> + ********************************************************************/ 
> 
> Which does violate the coding guidelines.
>

Depends how you read them... For example this point in the coding guidelines:

#  Put a comment in front of its subject. In particular, function and method 
comments should be placed in front of implementation rather than declaration. 
Class comments should be put in front of class declaration.

suggest that these guidelines are concerned about and apply to comments which
stand in front of functions, methods classes and such. And the comment you found 
is not of that type - it is a visual separator between various sections of the 
source file. Then I can think that the coding guidelines don't apply to such 
comments. And you perhaps agree that it is nice if they are visually distinct as 
it helps to navigate inside the source file. Well, it is bending the rules a bit 
but let's be reasonable here!

Rafal
Thread
Re: your review of WL#4212Rafal Somla26 Feb
  • RE: your review of WL#4212Chuck Bell26 Feb
    • Re: your review of WL#4212Rafal Somla27 Feb
      • RE: your review of WL#4212Chuck Bell29 Feb
        • Re: your review of WL#4212Rafal Somla3 Mar
          • RE: your review of WL#4212Chuck Bell3 Mar