Replies to your remaining comments follow.
Ingo Strüwing wrote:
> Hi Rafal,
> thanks for the update. Here is the second part of my comments, which
> refer to the new HLS.
> Rafal Somla, 14.10.2009 14:34:
>> Ingo, Andreas,
>> I have updated HLS. You might want to check if I accurately described
>> the alternatives which you suggested in your comments and if I haven't
>> missed something you consider important.
>> Responsibilities of a Backup Storage Module
>> R3 Ensure that locations opened for reading contain backup image data (and
>> not some other kind of data) - warn if this is not the case.
> Not agreed so far.
Yes, this HLS describes my design proposal. The alternatives are listed
in the "Alternatives" section - look there to see if your ideas are
>> 4. Listing and more advanced management of locations is not required from a
>> backup storage module. However, a module can implement such services to be used
>> by other clients. For example, an external application for managing backup
> I'm not convinced that it is a good idea to rely on clients in
> general. This would mean that remote access to a MySQL server is not
> enough to manage it. Management services don't need to be implemented
> right now, IMHO, but we shouldn't exclude them that strict.
I have two arguments for leaving this out of consideration:
1. I think that doing backup location management using (non-standard) SQL
statements is a rather bad idea and I think it is unlikely we will ever go
2. I think that, from a business perspective, it is a good strategy to
implement only the core tools in the free code and then have the
opportunity to sell applications which make managing the system easier.
My vision is that some day "MySQL Administrator" will be extended with
backup functionality. It will have its own infrastructure for managing
backups, including backup locations, backup scheduling etc. It will have
a nice GUI for performing all these tasks. Under the hood it will connect
to the server and use our BACKUP/RESTORE statements to perform the
operations. The tool will keep track of backup locations and provide the
correct location strings to these statements.
A user who does not want to pay for Enterprise MySQL would have to organize
backups himself. I think it is a fair deal.
>> Backup Storage Module Services
>> S4 Set compression algorithm.
>> [IN] backup storage session
>> [IN] optional name of the algorithm
>> [OUT] acknowledgement
>> Request that given compression algorithm will be used when writing to
>> the location. This request should be made before opening location for
>> writing. If no compression algorithm is given then a default one is
>> used. If module does not support compression then it refuses to
>> acknowledge this request but otherwise storage session remains valid and
>> can be used to write uncompressed image.
> If a module provides its own compression, I would prefer to
> specify this with the location string. Otherwise we need SQL syntax to
> request it.
I describe this as alternative A2. Note that we already have SQL syntax for
requesting compression (the WITH COMPRESSION clause). With your proposal we
would remove it.
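
To make the S4 semantics discussed here concrete, below is a minimal Python
sketch. It is only an illustration, not the proposed interface: the class
name StorageSession, the method set_compression and the algorithm names
"gzip" and "lz4" are all made up for this example. It shows the one
property that matters in the description above: a refused request leaves
the session valid and usable for writing an uncompressed image.

```python
class StorageSession:
    """Hypothetical storage session; every name here is illustrative only."""

    def __init__(self, supported_algorithms=(), default_algorithm=None):
        self.supported = tuple(supported_algorithms)
        self.default = default_algorithm
        self.compression = None   # None means "write uncompressed"
        self.valid = True         # a refused request must not change this

    def set_compression(self, algorithm=None):
        """S4 sketch: return True if acknowledged, False if refused."""
        if not self.supported:
            # Module has no compression support: refuse, session stays valid.
            return False
        # No algorithm given: fall back to the module's default one.
        chosen = algorithm if algorithm is not None else self.default
        if chosen not in self.supported:
            return False
        self.compression = chosen
        return True
```

With alternative A2 the set_compression call would disappear and the
choice would have to be encoded somewhere else, for example in the
location string.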
> In other services we do not have something like "acknowledgement". We
> either assume, they cannot fail, or report errors in a non-specified
> way, or return some invalid value. But here we specify a dedicated
> return argument. I suggest to remove it and specify error handling for
> the interface. The latter would specify how to report problems or
> acknowledgements for all services.
Already elaborated on it in my previous email. Hope my view on this is more
>> S5 Open stream for writing.
>> [IN] backup storage session
>> [IN] image format version number
>> [OUT] preferred I/O block size (optional)
>> [OUT] stream offset
>> Prepare session for writing backup image in a given format. The
>> implementation ensures that format version number is stored in the
>> location and will be reported when that location is opened for reading
>> (see below). If location already contains a backup image then error is
>> returned. There is a service request for freeing such occupied location
>> (see below).
> "extra handling of version": See "What do we consider part of the
> backup image?" in the other email.
> Specifying that an error is returned here, but not at all places where
> this should be expected, seems inconsistent. I suggest to specify that
> the service must "fail" if the location exists.
OK, changed to "Service fails if location is occupied".
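
As a rough illustration of the S5 behaviour after this change, here is a
small Python sketch. All names (Location, LocationOccupied,
open_for_write, first_write_size) and the numbers 4096 and 16 are
assumptions made for the example, not part of the HLS. The service fails
if the location is occupied; otherwise it records the format version and
returns the preferred block size together with the stream offset.

```python
class LocationOccupied(Exception):
    """Raised when the location already holds a backup image."""


class Location:
    """Hypothetical in-memory location; all names are illustrative."""

    def __init__(self, reserved=16, block_size=4096):
        self.reserved = reserved        # bytes kept for the module's own data
        self.block_size = block_size    # preferred I/O block size
        self.format_version = None      # set once a stream is opened for writing

    def occupied(self):
        return self.format_version is not None

    def open_for_write(self, format_version):
        """S5 sketch: fail if occupied, else return (block size, offset)."""
        if self.occupied():
            raise LocationOccupied("location already contains a backup image")
        self.format_version = format_version
        return self.block_size, self.reserved

    def open_for_read(self):
        """Report the format version stored when the image was written."""
        return self.format_version


def first_write_size(block_size, offset):
    """Bytes to write first so that later writes start on a block boundary."""
    return block_size - (offset % block_size)
```

For a block size of 4096 and a reserved area of 16 bytes, the client
would write 4080 bytes first and full blocks afterwards.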
>> Depending on implementation, storage module can reserve certain number
>> of bytes at the beginning of the stream for storing internal data. The
>> size of the reserved area is returned as stream offset. It should be
>> taken into account when calculating I/O block boundaries. Note that the
>> reserved area is not accessible by the client. In particular, the write
>> request below will write data after the reserved area.
> Though I do now understand the magic behind "offset", I believe.
> But I'm still not really happy with it. And I'm not convinced about it's
> usefulness. But anyway, it shouldn't hurt.
> However, to emphasize its intended use, I suggest to have a "preferred
> first I/O block size (optional)" parameter instead of "stream offset".
I changed parameter name to "size of a reserved area at the beginning of the
>> Not yet covered
>> * Error reporting.
> I think, the HLS should specify something here. E.g. "Every service
> signals success or failure. In case of failure, information about the
> cause must be provided." In other words, every service should have some
> implicit information like: "[OUT] status (success/failure); [OUT] error
> description (in case of failure)."
I added such explanation.
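
One possible shape of such an implicit status, sketched in Python. The
names ServiceResult, success and failure are hypothetical and only
illustrate the rule "every service signals success or failure, and a
failure carries a description of its cause".

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ServiceResult:
    """Implicit outputs of every service: status plus error description."""
    ok: bool
    value: Any = None
    error: Optional[str] = None


def success(value=None):
    """Build a successful result carrying the service's regular outputs."""
    return ServiceResult(True, value, None)


def failure(description):
    """Build a failed result; a cause must always be provided."""
    if not description:
        raise ValueError("a failure must describe its cause")
    return ServiceResult(False, None, description)
```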
>> * Handling non-ascii strings/internationalization support.
> I think it should be a matter of cause that non-ascii strings can be
> exchanged. So no mention of this seems necessary in the HLS.
Well, I think that people normally pass char* strings around and assume that
everything is in us-ascii (e.g., when counting the characters of a string). I
would rather take a moment to think about the issue and mention it in the
>> A1 Storing meta-info in backup stream.
>> Meta-info about backup image is the backup image's format version number
>> and a "fingerprint" which distinguishes it from other kinds of data. In
>> the above design it is responsibility of a backup storage module to store
>> this meta-info (R3 & R4). The alternative is that meta-information is
>> stored in the backup image and storage module is not aware of it.
>> Advantage: backup modules do not have to implement meta-info storage
>> which makes them simpler. Potential code duplication is avoided.
>> Disadvantage: More efficient, storage-specific implementation of
>> meta-info is not possible. For example using XBSA object attributes.
> Sorry. Disagree. We don't need to prohibit the storage module to add
> meta information. The extra ~10 bytes in the image won't be a major
> annoyance, even if unnecessary for *every* storage module. But as
> mentioned above, it still could serve as another level of consistency check.
A storage module can, of course, store the meta-information in a
storage-specific way - no one will prohibit that. However, that does not
help if the kernel does not use this functionality and instead relies on
the 10-byte prefix stored in the stream. And the kernel cannot use it if
we do not have an API for that.
It is also true that if we have an API to store meta-information out of
the stream, we can still store it in the stream as well. But to me that
would be a strange solution. If we are after detecting data inconsistencies,
we should implement checksums - storing 8 magic bytes at the beginning of
the stream does not really solve this problem.
> I guess it became obvious that I strongly vote for A1.
Yes, very obvious.
>> A2 No native compression support.
>> The alternative is to remove responsibility R5 and corresponding service
>> S4 from the interface. Compression, if any, will be done by backup kernel
>> without participation of a storage module.
>> Advantage: makes the interface simpler.
>> Disadvantage: Closes a possibility to use more efficient,
>> storage-specific compression mechanisms if present (e.g., hardware
> I think, Andreas and I agreed that this could be requested through the
> "location string". The backup kernel can use its compression through the
> SQL syntax. So we have all flexibility without the "Set compression
> algorithm" service.
In the recent update I included this alternative but I misunderstood it
slightly. I need to think about it a bit more before I describe it better.