Hi Ingo,
Thank you for your comments and my apologies again for delaying my answer
for so long.
The actions which I'm going (or not going) to take in response to your
comments are:
1. Add to HLS the proposed alternative design where storage modules are not
concerned with storing version numbers, marking backup image data etc.
2. Update specifications of "Get information ..." service to explain what
happens in case location does not contain a valid backup image and
clarifying the meaning of "timestamp".
3. Update specification of "Open stream ..." services which should return a
preferred I/O block size.
4. I would prefer to not include in this WL the idea of module stack but to
have a separate WL if we consider implementing it (see below for arguments).
5. I noted your propositions for error and non-ascii string handling, but I
would like to leave these aspects open for further discussion for a while.
I think some of your questions come from different understanding of the HLS
specifications which I wrote, so I want to explain this first.
My intention with HLS specifications is to give a general design on an
abstract level. This level is far above details such as concrete function
declarations, data types used to pass information etc. On this level I only
want to specify what services a backup module should implement. Also, for
each service I want to specify what information must be provided and what
information is returned from the service.
I intentionally do not specify how the information is passed and how it is
represented. I consider these to be implementation details and I expect them
to be specified in LLD, not in HLS.
The goal is to first have an overview of what needs to be implemented,
before we specify how it is going to be implemented. For example, we can
first realize what kind of data we must pass to/from services, before we
settle what data types we will use to represent it. I hope my intention is
more clear now - I explain it more in replies to your comments.
And here are my replies to your comments/questions.
Ingo Strüwing wrote:
> Hi Rafal,
>
> Rafal Somla, 07.07.2009 17:05:
>
> ...
>> Based on these ideas, I created WL#5046 for designing and possibly
>> implementing such architecture. I put some initial design there. Please
>> have a look at it and let me know your thoughts.
>
> That's a great specification. I agree with most of it. Please let me
> request some changes:
>
> 1. "Responsibilities": The verification if magic number and version
> should not be duplicated in every storage module. It should be done in a
> central place, I'll call it the "core". Same with compression and
> encryption.
I'm not so sure that it is obvious that "it should be done in a central
place". First, as mentioned in the specification, marking of backup image
data does not need to be done by means of a magic number. Magic number is
most obvious option for file storage, but for other storages there could be
better options. For example, in XBSA objects stored on XBSA server have
'resourceType' attribute. This attribute can be used to mark that object
contains a backup image and then there is no need to use magic number on top
of that. The possibility of using such storage-specific means of marking
backup image, which sometimes can be better than the default magic number
method, was the reason for making it one of the responsibilities of a
storage module.
I agree that the disadvantage of this design is that storage module has more
to implement. As for duplication of code, this can be avoided by providing a
"toolkit" library for storage module writers which will provide default
implementation of the service.
I think both options are viable and choice is a matter of weighting
advantages and disadvantages of each. I'll describe the alternative in HLS.
>
> 2. "Note 1": The second sentence should go away. Storage modules should
> just move bytes. No need to do any interpretation, whatsoever.
Depends on chosen design. The note describes the design which I proposed. I
think I understand your alternative and will describe it in HLS.
>
> 3. "Note 2": Let us re-phrase it: The storage module must ensure that it
> does not confuse the backup image with any other information. It must be
> able to identify the backup image uniquely. It must be able to return
> the same backup image unmodified on request.
The intention of my design was to make it a responsibility of the storage
module to mark locations which contain backup images so that when a location
is opened, the module can tell if it is a backup image or something else.
You propose a different design, where recognizing that location contains
backup image data will be done by backup kernel by means of a magic number
(as is done now).
>
> 4. "Note 3": I suggest to drop that. The "core" should send the required
> information together with the image, like we do it with files and pipes
> today. No need for the module to deal with this internal information.
As above, the design which I propose gives a possibility to use a
storage-specific method for associating version number with backup image
data. For example, in XBSA this can be implemented using 'objectInfo'
attribute of an XBSA object (location).
>
> 5. "Session": We should specify, who allocates and frees the context.
> And if the session needs to be freed too, and by whom.
I consider memory allocation issues to be part of lower-level design. I
think it should be clarified in LLD. On this level of specification the
life-cycle of a session looks as follows:
- It is created by "Initialize backup storage session" service.
- It should be closed by "Terminate backup storage session", or
- It should be cancelled by "Abort storage session".
The most obvious implementation would probably do resource allocation and
freeing inside these services - no need to bother the user.
>
> 6. "acknowledgement": The services look like they all return a status.
> Why do we need a second status return here? Can't the return status have
> different values that encode different (non-)problems? Another method
> could be to set algorithm "none" if the preferred one was rejected.
On this level of specification, I only want to say *what* information is
passed in and out of each service. I do not want to say *how* it is done. I
consider it part of LLD.
In this case I want to say that "Set compression algorithm service" informs
its user whether it successfully set-up compression or not. I do not want to
specify yet, how it is done - all the options which you propose are valid. I
expect this to be specified/clarified in LLD.
>
> 7. "image format version number": Since the storage module has no use
> for the version number, but just stores and returns it with the image,
> it would be cleaner to integrate the number in the stream, like we do
> today with files and pipes. Method parameters besides buffer and length
> should be used only if they serve another purpose than to provide data
> to be stored and retrieved. The version number has no semantical meaning
> to the storage module. It is just part of the "payload".
In the design which I have in mind, storing the format version number is a
responsibility of a storage module (thus it does not have to be stored in
the stream). The version number is passed to storage module when location is
opened for writing. When a location is opened for reading, storage module
informs about the version number associated with that location.
>
> 8. "stream offset": I don't see, where it is ever used. Reads and writes
> don't specify a position. Hence it is expected that the module keeps
> track of the position (which I concur with). Hence there is no need that
> the module tells the core, how much extra data it adds.
The reason which I had in mind, is a possible optimizations which takes into
account I/O block size. Suppose that storage module informs backup kernel
that it likes data to be written/read in 4k blocks. Still, the
implementation needs to reserve 10 bytes at the beginning of the first block
for its internal uses. If backup kernel starts to write data in 4k blocks,
being ignorant of the reserved 10 bytes, then the data is miss-aligned and
optimizations might be not possible. As in this picture:
1st block 2nd block 3rd block
data storage: |--+-----------|--------------|--------------| ...
^
'- reserved area (10 bytes)
data writes/reads: |--------------|--------------| ....
Knowing the reserved prefix, backup kernel can align written/read data to
block boundaries.
>
> 9. "amount of data that has been written": What can this information be
> used for (except of stopping backup)?
This is similar to standard POSIX write() function. Client of the service
request writing of 100 bytes. But implementation can write only 10 bytes at
the moment and returns to the caller to avoid waiting. Then client must take
into account that only 10 bytes have been written and send remaining 90
bytes again.
Such specifications allows for greater flexibility in implementing I/O
operations.
>
> 10. "data buffer and its size": This should be two parameters.
>
On this level I only want to say what information must be provided, not how
it is represented. I expect the exact data structures used to pass this
information to be specified as part of LLD. Passing it as two parameters to
a function is a valid option.
> 11. "information about end of stream": Why do you dislike the common
> style to report this with a zero length read? I think this could be
> simpler to implement in the module, and as simple to use in the core. If
> the module reads in blocks, and the last block is full, it would need to
> read ahead to decide if this was the last block. A simple module won't
> do double buffering and thus return an empty buffer at the next call
> anyway. I like simplicity.
I hope it is clear already that I don't dislike any particular
implementation, especially a simple one. On this level, I avoid specifying
how information is represented and focus on realizing what information needs
to be passed around.
>
> 12. "timestamp of the image": It would allow for simpler implementations
> if we allow to return the time of creation *or* time of close after
> creation. This raises the question at which point in time this method
> can be called. After session creation? After open? After close? When
> used in a backup session, it would make sense after close only, I guess.
> For a restore session it might be possible at any time.
>
You are right that there is ambiguity about what "timestamp" means. But
first, I address your questions. The intention of this service was that it
can be called at any time after creating a session. Note that each backup
storage session is associated with a location - the location passed to
"Initialize backup storage" service. This location can be either "empty" or
contain some data. The "Get information about image..." service will be
meaningful only for sessions which are associated with a location which
contains valid backup image. If this is not the case then user should get
information that "there is no backup image at the location". I will add it
to the specification as a possible output of that service.
As for the meaning of "timestamp", I propose that it is implementation
defined. The implementation will choose if this is the time when image was
opened or closed or in the middle. The intended purpose of this service is
to get a rough idea when the image was created. I will describe it in the
specifications. Let me know if you have alternative propositions.
> 13. "canonical name": What will we use the information for?
I was thinking about reporting to the user. For example, in backup logs we
report backup location. I think it makes sense to use a "canonical" name for
such purposes. For example, for file-system storage, the "canonical name" of
a location would be the patch after making it absolute and removing all ".."
etc., as it is done in the current code. For other storage modules,
different clean-ups can be appropriate.
>
> 14. "Negotiating preferred/optimal I/O block size": I suggest the module
> to return its preference together with the session handle. The core can
> use that information, or ignore it.
I think it is better to communicate it when location is opened for
reading/writing. Thus I'll extend specifications of "Open stream for
writing/reading" services to make them return the preferred I/O block size.
If you think that returning it "Initialize backup storage session" is a
better alternative, let me know.
>
> 15. "Error reporting": The methods should return non-zero on error.
> There should be a get_errno() and a get_errmsg() method. Internally the
> modules must be allowed to use include/mysqld_error.h and my_error() or
> my_printf_error() from include/my_sys.h.
>
I really don't like specifying *how* modules should be implemented - for
example that they should include some header files. For me, this breaks
modularity. I would rather like to specify *what* modules should do, without
telling how they should implement this. I would even imagine that a storage
module is implemented in a language different than C, as long as it
implements correct API and can be correctly linked with our code.
This being said, providing an library which should be used by storage
modules for error reporting is a more acceptable option. I'm just afraid
that we do not have such library - error handling code is part of our
monolithic server. Requiring/allowing that backup storage modules link with
the core server and use all its internal functions is something which breaks
modular design and which I would very much like to avoid.
From a more abstract perspective, the most important design decision to
make w.r.t. error reporting is: who reports error to the final user? Is it
the backup storage module or the backup kernel. If I understand correctly,
your suggestion means that it is the storage module which reports error to
the final user. I am afraid of such design because it requires a lot of
knowledge from the module and makes it harder to separate it from the rest
of the system. Among other things, a module which wants to report errors to
the user must know these things:
- How to send error messages to the user: is it server error log, or
connection error stack or backup log or other locations as specified by the
user. For some of these options the module would have to be aware of such
things as active connection between mysql client and server.
- What are user preferences regarding error reporting: for example the
language in which they should be reported, the level of errors which should
be reported etc.
For all these reasons I'm more inclined to a solution where it is the backup
kernel which reports errors and backup storage module only signals errors to
its client. Still there are questions how it should work. I'd like to have
some more time to formulate my proposition more precisely.
Note: using global error numbers is a bit problematic in an open
architecture. The global set of error numbers must fit any possible
implementation of a backup storage module. It is difficult to predict all
possible errors, thus either storage modules will not be able to signal some
errors or they will hack-around, breaking the modularity of the system. In
the ideal world, I'd like to have a system to which an independently
developed backup storage module can be plugged and it will integrate
smoothly, including satisfactory error reporting.
> 16. "non-ascii strings": The interface should define strings in system
> character set.
Using system character set creates, unnecessary in my opinion, dependence
between a storage module and the server code. The module would have to
access this variable. Instead, we could specify "utf8" to be the standard
encoding for strings. This seems to be an obvious option but I have not put
it down yet, because I'm not sure about all the consequences. And perhaps,
we can be fine with just us-ascii strings?
>
> 17. "Set compression algorithm": Here an architecture alternative:
> Instead of having a method for every possible "filtering" of the stream,
> have something like this: create_storage_stack(); push_module("gzip");
> push_module("AES"); push_module("XBSA").
>
This is an interesting idea, and already appeared before. However, I would
say that it is independent from the idea of backup storage modules. The
reason why I think so is because if you create a module stack for processing
backup stream, there must be some final destination where the data goes:
[backup kernel] -> [module1] -> [module2] -> ... -> [final destination]
I see backup storage modules as possible final destinations in such a pipe.
Having infrastructure for backup storage modules does not prevent having
another infrastructure for creating module stacks. However, even without the
module stack, backup storage modules will be a usable and useful concept.
Thus, in the spirit of divide-and-concquer methodology, I would not extend
this WL with providing the module stack but create a separate WL for the later.
Another issue which you signalled earlier, is whether it makes sense to have
compression and/or encryption as a service provided by storage module. You
are afraid of duplicating the same functionality in several modules. I think
this can be avoided because:
1. Storage module is not obliged to implement compression - it can ignore
"set compression algorithm" requests.
2. The design does not exclude the possibility of doing compression
externally, without participation of a storage module. Backup kernel can, if
it chooses, to compress data itself and send a compressed stream to the
location handled by the storage module.
On the other hand, the advantage of having a compression service in backup
storage module interface is that it opens a possibility of using efficient,
storage-specific compression methods. I'm thinking, e.g., about a tape
storage device which does hardware compression/encryption and which could
export such functionality with this interface.
Rafal