List:Backup« Previous MessageNext Message »
From:Rafal Somla Date:October 13 2009 12:51pm
Subject:Re: WL#4056: Pluggable storage modules for backup
View as plain text  
Hi Ingo,

Thank you for your comments and my apologies again for delaying my answer 
for so long.

The actions which I'm going (or not going) to take in response to your 
comments are:

1. Add to HLS the proposed alternative design where storage modules are not 
concerned with storing version numbers, marking backup image data etc.

2. Update specifications of "Get information ..." service to explain what 
happens in case location does not contain a valid backup image and 
clarifying the meaning of "timestamp".

3. Update specification of "Open stream ..." services which should return a 
preferred I/O block size.

4. I would prefer to not include in this WL the idea of module stack but to 
have a separate WL if we consider implementing it (see below for arguments).

5. I noted your propositions for error and non-ascii string handling, but I 
would like to leave these aspects open for further discussion for a while.


I think some of your questions come from different understanding of the HLS 
specifications which I wrote, so I want to explain this first.

My intention with HLS specifications is to give a general design on an 
abstract level. This level is far above details such as concrete function 
declarations, data types used to pass information etc. On this level I only 
want to specify what services a backup module should implement. Also, for 
each service I want to specify what information must be provided and what 
information is returned from the service.

I intentionally do not specify how the information is passed and how it is 
represented. I consider these to be implementation details and I expect them 
to be specified in LLD, not in HLS.

The goal is to first have an overview of what needs to be implemented, 
before we specify how it is going to be implemented. For example, we can 
first realize what kind of data we must pass to/from services, before we 
settle what data types we will use to represent it. I hope my intention is 
more clear now - I explain it more in replies to your comments.

And here are my replies to your comments/questions.

Ingo Strüwing wrote:
> Hi Rafal,
> 
> Rafal Somla, 07.07.2009 17:05:
> 
> ...
>> Based on these ideas, I created WL#5046 for designing and possibly
>> implementing such architecture. I put some initial design there. Please
>> have a look at it and let me know your thoughts.
> 
> That's a great specification. I agree with most of it. Please let me
> request some changes:
> 
> 1. "Responsibilities": The verification if magic number and version
> should not be duplicated in every storage module. It should be done in a
> central place, I'll call it the "core". Same with compression and
> encryption.

I'm not so sure that it is obvious that "it should be done in a central 
place". First, as mentioned in the specification, marking of backup image 
data does not need to be done by means of a magic number. Magic number is 
most obvious option for file storage, but for other storages there could be 
better options. For example, in XBSA objects stored on XBSA server have 
'resourceType' attribute. This attribute can be used to mark that object 
contains a backup image and then there is no need to use magic number on top 
of that. The possibility of using such storage-specific means of marking 
backup image, which sometimes can be better than the default magic number 
method, was the reason for making it one of the responsibilities of a 
storage module.

I agree that the disadvantage of this design is that storage module has more 
to implement. As for duplication of code, this can be avoided by providing a 
  "toolkit" library for storage module writers which will provide default 
implementation of the service.

I think both options are viable and choice is a matter of weighting 
advantages and disadvantages of each. I'll describe the alternative in HLS.

> 
> 2. "Note 1": The second sentence should go away. Storage modules should
> just move bytes. No need to do any interpretation, whatsoever.

Depends on chosen design. The note describes the design which I proposed. I 
think I understand your alternative and will describe it in HLS.

> 
> 3. "Note 2": Let us re-phrase it: The storage module must ensure that it
> does not confuse the backup image with any other information. It must be
> able to identify the backup image uniquely. It must be able to return
> the same backup image unmodified on request.

The intention of my design was to make it a responsibility of the storage 
module to mark locations which contain backup images so that when a location 
is opened, the module can tell if it is a backup image or something else. 
You propose a different design, where recognizing that location contains 
backup image data will be done by backup kernel by means of a magic number 
(as is done now).

> 
> 4. "Note 3": I suggest to drop that. The "core" should send the required
> information together with the image, like we do it with files and pipes
> today. No need for the module to deal with this internal information.

As above, the design which I propose gives a possibility to use a 
storage-specific method for associating version number with backup image 
data. For example, in XBSA this can be implemented using 'objectInfo' 
attribute of an XBSA object (location).

> 
> 5. "Session": We should specify, who allocates and frees the context.
> And if the session needs to be freed too, and by whom.

I consider memory allocation issues to be part of lower-level design. I 
think it should be clarified in LLD. On this level of specification the 
life-cycle of a session looks as follows:

- It is created by "Initialize backup storage session" service.
- It should be closed by "Terminate backup storage session", or
- It should be cancelled by "Abort storage session".

The most obvious implementation would probably do resource allocation and 
freeing inside these services - no need to bother the user.

> 
> 6. "acknowledgement": The services look like they all return a status.
> Why do we need a second status return here? Can't the return status have
> different values that encode different (non-)problems? Another method
> could be to set algorithm "none" if the preferred one was rejected.

On this level of specification, I only want to say *what* information is 
passed in and out of each service. I do not want to say *how* it is done. I 
consider it part of LLD.

In this case I want to say that "Set compression algorithm service" informs 
its user whether it successfully set-up compression or not. I do not want to 
specify yet, how it is done - all the options which you propose are valid. I 
expect this to be specified/clarified in LLD.

> 
> 7. "image format version number": Since the storage module has no use
> for the version number, but just stores and returns it with the image,
> it would be cleaner to integrate the number in the stream, like we do
> today with files and pipes. Method parameters besides buffer and length
> should be used only if they serve another purpose than to provide data
> to be stored and retrieved. The version number has no semantical meaning
> to the storage module. It is just part of the "payload".

In the design which I have in mind, storing the format version number is a 
responsibility of a storage module (thus it does not have to be stored in 
the stream). The version number is passed to storage module when location is 
opened for writing. When a location is opened for reading, storage module 
informs about the version number associated with that location.

> 
> 8. "stream offset": I don't see, where it is ever used. Reads and writes
> don't specify a position. Hence it is expected that the module keeps
> track of the position (which I concur with). Hence there is no need that
> the module tells the core, how much extra data it adds.

The reason which I had in mind, is a possible optimizations which takes into 
account I/O block size. Suppose that storage module informs backup kernel 
that it likes data to be written/read in 4k blocks. Still, the 
implementation needs to reserve 10 bytes at the beginning of the first block 
for its internal uses. If backup kernel starts to write data in 4k blocks, 
being ignorant of the reserved 10 bytes, then the data is miss-aligned and 
optimizations might be not possible. As in this picture:


                     1st block      2nd block      3rd block
data storage:   |--+-----------|--------------|--------------| ...
                  ^
                  '- reserved area (10 bytes)

data writes/reads: |--------------|--------------| ....

Knowing the reserved prefix, backup kernel can align written/read data to 
block boundaries.

> 
> 9. "amount of data that has been written": What can this information be
> used for (except of stopping backup)?

This is similar to standard POSIX write() function. Client of the service 
request writing of 100 bytes. But implementation can write only 10 bytes at 
the moment and returns to the caller to avoid waiting. Then client must take 
into account that only 10 bytes have been written and send remaining 90 
bytes again.

Such specifications allows for greater flexibility in implementing I/O 
operations.

> 
> 10. "data buffer and its size": This should be two parameters.
> 

On this level I only want to say what information must be provided, not how 
it is represented. I expect the exact data structures used to pass this 
information to be specified as part of LLD. Passing it as two parameters to 
a function is a valid option.

> 11. "information about end of stream": Why do you dislike the common
> style to report this with a zero length read? I think this could be
> simpler to implement in the module, and as simple to use in the core. If
> the module reads in blocks, and the last block is full, it would need to
> read ahead to decide if this was the last block. A simple module won't
> do double buffering and thus return an empty buffer at the next call
> anyway. I like simplicity.

I hope it is clear already that I don't dislike any particular 
implementation, especially a simple one. On this level, I avoid specifying 
how information is represented and focus on realizing what information needs 
to be passed around.

> 
> 12. "timestamp of the image": It would allow for simpler implementations
> if we allow to return the time of creation *or* time of close after
> creation. This raises the question at which point in time this method
> can be called. After session creation? After open? After close? When
> used in a backup session, it would make sense after close only, I guess.
> For a restore session it might be possible at any time.
> 

You are right that there is ambiguity about what "timestamp" means. But 
first, I address your questions. The intention of this service was that it 
can be called at any time after creating a session. Note that each backup 
storage session is associated with a location - the location passed to 
"Initialize backup storage" service. This location can be either "empty" or 
contain some data. The "Get information about image..." service will be 
meaningful only for sessions which are associated with a location which 
contains valid backup image. If this is not the case then user should get 
information that "there is no backup image at the location". I will add it 
to the specification as a possible output of that service.

As for the meaning of "timestamp", I propose that it is implementation 
defined. The implementation will choose if this is the time when image was 
opened or closed or in the middle. The intended purpose of this service is 
to get a rough idea when the image was created. I will describe it in the 
specifications. Let me know if you have alternative propositions.

> 13. "canonical name": What will we use the information for?

I was thinking about reporting to the user. For example, in backup logs we 
report backup location. I think it makes sense to use a "canonical" name for 
such purposes. For example, for file-system storage, the "canonical name" of 
a location would be the patch after making it absolute and removing all ".." 
etc., as it is done in the current code. For other storage modules, 
different clean-ups can be appropriate.

> 
> 14. "Negotiating preferred/optimal I/O block size": I suggest the module
> to return its preference together with the session handle. The core can
> use that information, or ignore it.

I think it is better to communicate it when location is opened for 
reading/writing. Thus I'll extend specifications of "Open stream for 
writing/reading" services to make them return the preferred I/O block size. 
If you think that returning it "Initialize backup storage session" is a 
better alternative, let me know.

> 
> 15. "Error reporting": The methods should return non-zero on error.
> There should be a get_errno() and a get_errmsg() method. Internally the
> modules must be allowed to use include/mysqld_error.h and my_error() or
> my_printf_error() from include/my_sys.h.
> 

I really don't like specifying *how* modules should be implemented - for 
example that they should include some header files. For me, this breaks 
modularity. I would rather like to specify *what* modules should do, without 
telling how they should implement this. I would even imagine that a storage 
module is implemented in a language different than C, as long as it 
implements correct API and can be correctly linked with our code.

This being said, providing an library which should be used by storage 
modules for error reporting is a more acceptable option. I'm just afraid 
that we do not have such library - error handling code is part of our 
monolithic server. Requiring/allowing that backup storage modules link with 
the core server and use all its internal functions is something which breaks 
modular design and which I would very much like to avoid.

 From a more abstract perspective, the most important design decision to 
make w.r.t. error reporting is: who reports error to the final user? Is it 
the backup storage module or the backup kernel. If I understand correctly, 
your suggestion means that it is the storage module which reports error to 
the final user. I am afraid of such design because it requires a lot of 
knowledge from the module and makes it harder to separate it from the rest 
of the system. Among other things, a module which wants to report errors to 
the user must know these things:

- How to send error messages to the user: is it server error log, or 
connection error stack or backup log or other locations as specified by the 
user. For some of these options the module would have to be aware of such 
things as active connection between mysql client and server.

- What are user preferences regarding error reporting: for example the 
language in which they should be reported, the level of errors which should 
be reported etc.

For all these reasons I'm more inclined to a solution where it is the backup 
kernel which reports errors and backup storage module only signals errors to 
its client. Still there are questions how it should work. I'd like to have 
some more time to formulate my proposition more precisely.

Note: using global error numbers is a bit problematic in an open 
architecture. The global set of error numbers must fit any possible 
implementation of a backup storage module. It is difficult to predict all 
possible errors, thus either storage modules will not be able to signal some 
errors or they will hack-around, breaking the modularity of the system. In 
the ideal world, I'd like to have a system to which an independently 
developed backup storage module can be plugged and it will integrate 
smoothly, including satisfactory error reporting.

> 16. "non-ascii strings": The interface should define strings in system
> character set.

Using system character set creates, unnecessary in my opinion, dependence 
between a storage module and the server code. The module would have to 
access this variable. Instead, we could specify "utf8" to be the standard 
encoding for strings. This seems to be an obvious option but I have not put 
it down yet, because I'm not sure about all the consequences. And perhaps, 
we can be fine with just us-ascii strings?

> 
> 17. "Set compression algorithm": Here an architecture alternative:
> Instead of having a method for every possible "filtering" of the stream,
> have something like this: create_storage_stack(); push_module("gzip");
> push_module("AES"); push_module("XBSA").
> 

This is an interesting idea, and already appeared before. However, I would 
say that it is independent from the idea of backup storage modules. The 
reason why I think so is because if you create a module stack for processing 
backup stream, there must be some final destination where the data goes:

[backup kernel] -> [module1] -> [module2] -> ... -> [final destination]

I see backup storage modules as possible final destinations in such a pipe. 
Having infrastructure for backup storage modules does not prevent having 
another infrastructure for creating module stacks. However, even without the 
module stack, backup storage modules will be a usable and useful concept. 
Thus, in the spirit of divide-and-concquer methodology, I would not extend 
this WL with providing the module stack but create a separate WL for the later.

Another issue which you signalled earlier, is whether it makes sense to have 
compression and/or encryption as a service provided by storage module. You 
are afraid of duplicating the same functionality in several modules. I think 
   this can be avoided because:

1. Storage module is not obliged to implement compression - it can ignore 
"set compression algorithm" requests.

2. The design does not exclude the possibility of doing compression 
externally, without participation of a storage module. Backup kernel can, if 
it chooses, to compress data itself and send a compressed stream to the 
location handled by the storage module.

On the other hand, the advantage of having a compression service in backup 
storage module interface is that it opens a possibility of using efficient, 
storage-specific compression methods. I'm thinking, e.g., about a tape 
storage device which does hardware compression/encryption and which could 
export such functionality with this interface.

Rafal
Thread
Re: WL#4056: Pluggable storage modules for backupRafal Somla13 Oct
  • Re: WL#4056: Pluggable storage modules for backupIngo Strüwing15 Oct
    • Re: WL#4056: Pluggable storage modules for backupRafal Somla22 Oct
      • Re: WL#4056: Pluggable storage modules for backupIngo Strüwing22 Oct
        • RE: WL#4056: Pluggable storage modules for backupAndreas Almroth22 Oct
        • Re: WL#4056: Pluggable storage modules for backupRafal Somla23 Oct
          • Re: WL#4056: Pluggable storage modules for backupIngo Strüwing23 Oct
            • Re: WL#4056: Pluggable storage modules for backupRafal Somla3 Nov