List:Internals« Previous MessageNext Message »
From:Mats Kindahl Date:June 16 2009 6:56pm
Subject:Re: RFC: draft of physical structure of server code
View as plain text  
Hi Kostja,

Thank you for your comments. I have some answers to your questions below.

Konstantin Osipov wrote:
> Hi Mats,
> 
> Thank you very much for defining the basis of the server module
> management. 
> 
> * Mats Kindahl <mats@stripped> [09/06/12 00:07]:
> 
>> In order to accomplish this, we want to have a structure that supports
>> hierarchical testing, that is, directly testing individual components of
>> the server as well as testing subsystems of the server.
> 
> I think this property could be explored/described in a bit higher
> detail. Please see below on package dependencies, versions, and
> interaction with the plugin registry and plugin architecture.

This worklog has no interaction with the plugin registry nor the plugin
architecture except when it comes to how the build system will handle it.

Regarding the package dependencies and the versions, I have answers below.

>> Note that this proposal mentions a few coding rules that are needed to
>> support the proposed structure, but the goal of this work is *not* to
>> produce a set of coding guidelines.  The maintenance of coding
>> guidelines is a separate issue and is maintained by the coding
>> guidelines committee (which is currently headed by Kostja).
> 
> This document does suggest several additions to the coding guidelines,
> such as having the header guards, naming convention for
> header guards, use of C++ namespaces, naming convention for
> namespaces.
> These I would love the coding style group to consider. 
> 
> Alternatively, if this proposal is just accepted as a whole
> package, I will unilaterally extend the coding standard to include
> bits that pertain to the coding standard. 

I can supply the coding style issues as separate proposals for the committee to
consider. This might be for the best since that will include a rationale for
each style guide as well.

The coding guidelines in this description is a result of how the code is organized.

>> This draft has been through several revision rounds already, trying to
>> take many aspects into account while still not imposing rules that are
>> to strict to allow future development of the server.  We would very
>> much like to have feedback from the MySQL development community on
>> this draft.
> 
> When and how do we set the bar for the new components and features
> to follow these guidelines?
> 
> We will be transitioning indefinitely unless we make
> sure that all new code adheres to the new standard.

I agree that we need to have a bar for introduction of this structure, but the
time-line for transition is outside the scope of this worklog. It is however the
structure that the reengineering team will base all its work on, and all new
packages should follow this structure.

In order to now lose track of the goal to ensure that we actually transition to
the new structure, I have added a note in the worklog that this need to be
planned for.


>> High-level structure
>> ====================
>> This in turn
>> requires the packages to provide the necessary information so that the
>> build frame can do its job.
> 
> <cut>
> 
>> Components
>> ==========
>>
>> A component consist of a set of header files and a set of associated
>> C/C++ files. The component is the smallest unit of the physical
>> design.
> 
> How does component and package dependencies work?
> 
> Perhaps it is part of a separate document, describing the build
> system, but I think it's necessary to mention the concept
> here: part of adding a new component or package is to specify its
> dependencies.
> 
> What considerations should one have when adding a
> new component to the package, especially when it requires 
> a change in package dependency tree?

An addition of a new package always require changes in the dependency tree in
the sense that any new package have dependencies on other parts of the system.
However, the management of the dependencies is part of the build system.

Basically, each package should only describe what it is dependent on, and
nothing else. These dependencies are visible as an include of a package
interface file from another package: in that case, the package is dependent on
the other package via the package interface file.

In this way, package dependencies can be deduced from the #include directives in
the code.

> Do we allow recursive dependencies?

How do you mean?

> Do we support the notion of component or package
> versions or we assume that all packages have the
> same version? 

We assume that all packages together is for a certain version of the server.

> Perhaps in the first approximation we don't need to bother. But
> long term, packages such as a storage engine or libmysql will have
> a version independent of the server version, and we will need to 
> address the compatibility issue.

Yes, that is correct and it needs to be considered in the future. However, I do
not see that there are any changes necessary to the structure described in this
draft proposal to handle that. The structure is (supposed to be) ignorant of
versions and it is the responsibility of the package authors as well as the
build frame to handle such issues.

> How strict do we want to be in enforcing the structure of a
> package? What if the package has historically different
> conventions for component management?

The goals of this worklog is to allow *any* piece of software to be treated as a
package. The only necessary addition is a configuration file that needs to be
added to provide information about the package to the build system. The exact
name and format of this configuration file will be defined in the build system
worklog (WL#4875).

The other naming conventions apply to MySQL code only (within reason): we do not
want to change third-party distributions in order to allow them to be used with
the MySQL server.

>> Typically, each component consists of a header file and a C/C++ file
>> with a common base name, for example "parser.h" and
>> "parser.cc". However, there are some cases where it makes sense to
>> have multiple header files for a component and cases when it makes
>> sense to have multiple source files.
>>
>> - Using several header files can be used to present multiple
>>   interfaces into a single component.
> 
> Later on you advocate use of forward declarations.
> Additionally you say that all names and declarations should be
> in the component code, not in the user code.
> 
> I think it boils down to requiring a package to export forward
> declaration header files in addition to component headers. 

Hmmm... good point, it is not very clear.

The goal of this part is to ensure that dependencies (as in "linking
dependencies") between components can easily be tracked by observing the
#include directives.

However, a forward declaration of a class does not in itself introduce a
dependency on another component, but it is likely to require the component to be
dependent on another component. Extern declarations, however, do always
introduce a (link-time) dependency.

Consider this somewhat artificial case (removing some stuff to make it easier to
read):

a_comp.h:

	class AClass {
		...
	};

a_comp.cc:


	/* Implementation details */

b_comp.h:

	class AClass;

	extern int perform_magic_on(AClass *);

b_comp.cc:

	int remember_me(AClass *ptr) {
	  memento.push_back(ptr);
	}

In this case, there is not any link-time dependency from b_comp to a_comp even
though the class is mentioned (compile this and check with objdump or nm), so
the lack of an #include of the definition here is proper.

However, should the b_comp.cc file do something with the AClass instance that
requires it to see the definition, it has to include the a_comp.h file and in
that case there is a real (link-time) dependency between the components.

In other words, forward declarations in themselves do no introduce a dependency,
but since the component is referencing the class, it is likely to require
including its package interface file as well.

If the forward declarations are complicated, it is wise to provide a forward
declaration file similar to how the IOStreams library provide the iosfwd file
for this use, but I am not sure if we should explicitly require this from
package authors.

>> Packages
>> ========
>>
>> Packages are collections of components that serves a common purpose.
>> This formulation is deliberately not exact since what actually makes
>> sense to turn into a package vary from case to case. However, the
>> following issues should be considered when deciding whether a
>> candidate package makes sense as a package:
>>
>> - Can the candidate package be released independently of the rest of
>>   the server? If not, i.e., changes to this package is likely to
>>   require changes to other packages, then maybe it should not be a
>>   package.
>>
>>   Releasing here does *not* mean distributing the code in isolation,
>>   it means releasing, e.g., a new version of the package for use with
>>   the rest of the server.
> 
> Here you mention the concept of package version without having
> introduced it.

Bad choice of words. In this case, I just mean a set of patches forming an
improvement of functionality in some sense, and not really a proper "version" as
in "stepping a version when releasing a package."

The goal is to have a package structure so that normally patches only change a
single package (there will of course always be exceptions, but these should be a
very rare case). To handle this, it is usually good to consider if a "normal"
change will stay within the package. If that is not the case, then it is
probably not a good candidate package.

> In addition to good common sense reasoning here, perhaps it makes
> sense to follow the letter of Lacos here, and introduce a strict
> notation; define the notion of a class, header, component and package
> dependency and specify that the metric of "minimal unnecessary
> class/header/component/package dependencies" as the key criteria
> in splitting and merging functionality.

I don't want to advocate too much for how packages should be formed in this
worklog, just provide a vehicle for organizing and using packages.

I do, however, agree that we need to document and describes how to build and
manage packages, which could include what you describe above, but I am not sure
that it should be part of this worklog.

> 
>> - Is the candidate package very small, e.g., a single component? In
>>   this case it might make sense to group several such candidate
>>   packages with similar purpose into a single package.
>>
>>   A typical example would be support for individual character sets,
>>   that does not make sense to place in a single package each, but is
>>   sensible as a package of "character set information".
>>
>>
>> Package naming and structure
>> ----------------------------
>>
>> Each package is represented as a directory. The basic assumption is
>> that everything related to a package should be placed in the
>> directory. This includes, but is not limited to: header files, source
>> files, and unit tests.
>>
>> Basic goals and assumptions are:
>>
>> - Changes in the package internals should not inadvertently affect
>>   other packages that use the package
>>
>> - It shall be possible to support third-party solutions as package in
>>   the package structure and shall not require re-organization to fit
>>   the package structure
>>
>> The package directories will be placed in a directory alongside the
>> ``sql/`` directory. Apart from that, all packages are placed at the
>> same level. We are placing the packages in a new directory to be able
>> to distinguish between "unorganized" and "organized" code.
>>
>> The following package directories are proposed (some directories
>> already exists and almost have the basic structure proposed):
>>
>>     ========== ==============================================
>>     Package    Purpose
>>     ========== ==============================================
>>     storage/   Storage engines
>>     server/    Server modules
>>     common/    Common utilities
>>     ========== ==============================================
> 
> You refer to sql/ and server/ directories meaning the
> same thing. Perhaps it is easier to just say that normally each
> package is stored at the top level of the directory structure,
> unless it falls into one of the predefined categories, such as 
> a storage engine or a part of the core server.

Yes, that is probably easier. I have got some more comments on this section, so
I will reformulate it.

>> In order to use an interface of a package, the header file is included
>> using the form:
>>
>>     #include "package_name/interface.h"
> 
> In other words, I always say 
> #include "falcon/falcon.h", even if falcon is in storage?

Yes, that was the intention. The storage/, etc. directories are just intended to
make is easier to work with the actual code by grouping packages, but they are
not visible at the "development level." From the view of a developer, the system
consists of a set of packages, and everything else is the responsibility of the
build frame/system to fix so that it can be used as if this was the case.

> I think this document is large enough to justify introduction of a
> summary that only contains the suggested changes and omits the
> rationale and discussion. I'm looking forward to us moving forward
> with this proposal.

Good point, I will compose one.

Best wishes,
Mats Kindahl
-- 
Mats Kindahl
Senior Software Engineer
Database Technology Group
Sun Microsystems
Thread
RFC: draft of physical structure of server codeMats Kindahl11 Jun
  • Re: RFC: draft of physical structure of server codeKonstantin Osipov16 Jun
    • Re: RFC: draft of physical structure of server codeMats Kindahl16 Jun
  • Re: RFC: draft of physical structure of server codeMARK CALLAGHAN16 Jun
    • Re: RFC: draft of physical structure of server codeMats Kindahl16 Jun
      • Re: RFC: draft of physical structure of server codeMARK CALLAGHAN16 Jun