List: Backup
From: Ingo Strüwing
Date: October 15 2009 8:23pm
Subject: Re: WL#5046 (PLuggable Storage Modules) - HLS updated
Hi Rafal,

thanks for the update. Here is the second part of my comments, which
refer to the new HLS.

Rafal Somla, 14.10.2009 14:34:

> Ingo, Andreas,
> 
> I have updated HLS. You might want to check if I accurately described
> the alternatives which you suggested in your comments and if I haven't
> missed something you consider important.

HLS:

> Responsibilities of a Backup Storage Module
> ===========================================
...
> R3 Ensure that locations opened for reading contain backup image data (and 
>    not some other kind of data) - warn if this is not the case.


Not agreed so far.

> R4 Store and inform about format version number of an image stored in a 
>    given location.


Not agreed so far.

...
> Notes
> -----
> 1. Backup storage modules do not need to understand backup image format to
> implement their functionality. But they understand that they work with backup 
> images and that these are versioned.


See "What do we consider part of the backup image?" in the other email.

> 
> 2. Marking stored backup image so that it can be distinguished from other types 
> of data is a responsibility of storage module. This can be implemented in number 
> of ways: magic number, file extension, external file attributes etc.


See "Marking of backup images" in the other email.

> 
> 3. Storing image format version together with backup image is also a
> responsibility of backup storage module and can be implemented in a way most
> suitable for the underlying media.


See "What do we consider part of the backup image?" in the other email.

> 
> 4. Listing and more advanced management of locations is not required from a
> backup storage module. However, a module can implement such services to be used 
> by other clients. For example, an external application for managing backup 
> locations.


I'm not convinced that it is a good idea to rely on clients in
general. This would mean that remote access to a MySQL server is not
enough to manage it. Management services don't need to be implemented
right now, IMHO, but we shouldn't exclude them that strictly.

...
> Backup Storage Module Services
> ==============================
...
> S4  Set compression algorithm.
>     [IN]  backup storage session
>     [IN]  optional name of the algorithm
>     [OUT] acknowledgement
> 
>     Request that given compression algorithm will be used when writing to 
>     the location. This request should be made before opening location for 
>     writing. If no compression algorithm is given then a default one is 
>     used. If module does not support compression then it refuses to 
>     acknowledge this request but otherwise storage session remains valid and 
>     can be used to write uncompressed image.


If a module provides its own compression, I would prefer to
specify this with the location string. Otherwise we need SQL syntax to
request it.
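
To illustrate the idea, here is a minimal sketch of how a module could
pick up a compression request from the location string. The
"?compression=" syntax, the function name and the "gzip" value are
assumptions made up for this example; nothing in the HLS defines them.

```python
def parse_location(location):
    """Split a hypothetical location string into (path, compression).

    Assumed syntax: <path>[?key=value[&key=value...]]. This syntax is
    an illustration only and is not defined by the HLS.
    """
    path, _, query = location.partition("?")
    # Collect key=value options; items without "=" are ignored.
    options = dict(
        item.split("=", 1) for item in query.split("&") if "=" in item
    )
    # No "compression" option means the module does not compress.
    return path, options.get("compression")
```

With this, "/backups/test.bak?compression=gzip" would request module-side
compression, while a plain "/backups/test.bak" would not, and no extra
SQL syntax is involved.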

In other services we do not have something like "acknowledgement". We
either assume they cannot fail, report errors in an unspecified way,
or return some invalid value. But here we specify a dedicated return
argument. I suggest removing it and specifying error handling for the
interface as a whole. That would define how to report problems or
acknowledgements for all services.

> 
> S5  Open stream for writing.
>     [IN]  backup storage session
>     [IN]  image format version number
>     [OUT] preferred I/O block size (optional)
>     [OUT] stream offset
> 
>     Prepare session for writing backup image in a given format. The 
>     implementation ensures that format version number is stored in the 
>     location and will be reported when that location is opened for reading 
>     (see below). If location already contains a backup image then error is 
>     returned. There is a service request for freeing such occupied location 
>     (see below).


"extra handling of version": See "What do we consider part of the
backup image?" in the other email.

Specifying that an error is returned here, but not in all the other
places where one should be expected, seems inconsistent. I suggest
specifying that the service must "fail" if the location exists.

...

>     Depending on implementation, storage module can reserve certain number 
>     of bytes at the beginning of the stream for storing internal data. The 
>     size of the reserved area is returned as stream offset. It should be 
>     taken into account when calculating I/O block boundaries. Note that the 
>     reserved area is not accessible by the client. In particular, the write 
>     request below will write data after the reserved area.


I believe I now understand the magic behind "offset", but I'm still
not really happy with it, and I'm not convinced of its usefulness.
Anyway, it shouldn't hurt.

However, to emphasize its intended use, I suggest having a "preferred
first I/O block size (optional)" parameter instead of "stream offset".
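
A small sketch of how a client might use such a parameter, assuming the
module reports a shortened first block so that all later writes fall on
block boundaries behind its reserved area. The function name and the
numbers are hypothetical.

```python
def block_sizes(total, block_size, first_block_size=None):
    """Return the sizes of the successive writes for `total` bytes."""
    sizes = []
    remaining = total
    # The first write may be shorter so that later writes are aligned
    # behind the area the module reserved for itself.
    if first_block_size is not None and remaining > 0:
        n = min(first_block_size, remaining)
        sizes.append(n)
        remaining -= n
    # All further writes use the preferred block size.
    while remaining > 0:
        n = min(block_size, remaining)
        sizes.append(n)
        remaining -= n
    return sizes
```

For example, with a preferred block size of 4096 and 16 reserved bytes,
the module would report a first block size of 4080, and
block_sizes(10000, 4096, 4080) gives [4080, 4096, 1824]. The client
never needs to know the offset itself.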

> 
> S6  Open stream for reading.
>     [IN]  backup storage session
>     [OUT] image format version number
>     [OUT] preferred I/O block size (optional)
>     [OUT] stream offset
> 
>     Prepare session for reading. Error is reported if location is empty or 
>     contains invalid data (not a backup image). If it contains a backup 
>     image, the version number of its format is returned.


"version number": See "What do we consider part of the
backup image?" in the other email.

> 
>     The meaning of preferred I/O block size is as for "Open for writing" 
>     service. If there are bytes reserved by the implementation at the 
>     beginning of the stream then stream offset is the size of that reserved 
>     area. However, client does not have access to these reserved bytes - the 
>     read request below will start reading from the position indicated by 
>     stream offset.


Same comment as for "Open stream for writing".

> 
> S7  Write bytes to location.
>     [IN]  backup storage session
>     [IN]  data buffer
>     [IN]  amount of data to be written
>     [OUT] amount of data that has been written
> 
>     Write given amount of bytes to location which was previously opened for 
>     writing. It can happen that less bytes than requested has been written. 
>     The amount of data actually written is returned.


See "Amount of data that has been written" in the other email.

> 
> S8  Read bytes from location.
>     [IN]  backup storage session
>     [IN]  data buffer and its size
>     [OUT] amount of data read
>     [OUT] information about end of stream
> 
>     Read bytes from location which was previously opened for reading. The 
>     amount of bytes read is reported. If there are no more bytes in the 
>     location then end of stream is reported.


See "Information about end of stream" in the other email.

...
> Not yet covered
> ---------------
> * Error reporting. 


I think the HLS should specify something here. E.g. "Every service
signals success or failure. In case of failure, information about the
cause must be provided." In other words, every service should have some
implicit information like: "[OUT] status (success/failure); [OUT] error
description (in case of failure)."
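
As a sketch only, such an implicit status could look like the following.
The type and function names are hypothetical, and the in-memory
dictionary merely stands in for real storage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceResult:
    ok: bool                     # [OUT] status (success/failure)
    error: Optional[str] = None  # [OUT] error description on failure
    value: object = None         # service-specific output, if any


def open_for_reading(locations, name):
    """Hypothetical service returning the uniform result type."""
    if name not in locations:
        return ServiceResult(ok=False, error="location %r is empty" % name)
    return ServiceResult(ok=True, value=locations[name])
```

Every service would return the same kind of result, so no service needs
a dedicated "acknowledgement" argument.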

> * Handling non-ascii strings/internationalization support.


I think it should be a matter of course that non-ascii strings can be
exchanged. So no mention of this seems necessary in the HLS.

> 
> 
> Alternatives
> ============
> 
> A1 Storing meta-info in backup stream.
> --------------------------------------
>    Meta-info about backup image is the backup image's format version number 
>    and a "fingerprint" which distinguishes it from other kinds of data. In 
>    the above design it is responsibility of a backup storage module to store 
>    this meta-info (R3 & R4). The alternative is that meta-information is 
>    stored in the backup image and storage module is not aware of it.
> 
>    Advantage: backup modules do not have to implement meta-info storage 
>    which makes them simpler. Potential code duplication is avoided.
> 
>    Disadvantage: More efficient, storage-specific implementation of 
>    meta-info is not possible. For example using XBSA object attributes.


Sorry. Disagree. We don't need to prohibit the storage module from
adding meta information. The extra ~10 bytes in the image won't be a
major annoyance, even if unnecessary for *every* storage module. But as
mentioned above, it still could serve as another level of consistency check.

I guess it became obvious that I strongly vote for A1.
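
To make the "~10 bytes" concrete, here is a minimal sketch of such an
in-image header. The magic string, the field layout and the function
names are assumptions for illustration, not the actual backup image
format.

```python
import struct

MAGIC = b"MYBKIMG\x00"          # hypothetical 8-byte fingerprint
HEADER = struct.Struct("<8sH")  # magic + 16-bit version = 10 bytes


def write_header(version):
    """Return the header bytes to put at the start of the image."""
    return HEADER.pack(MAGIC, version)


def read_header(data):
    """Check the fingerprint and return the format version."""
    if len(data) < HEADER.size:
        raise ValueError("too short to be a backup image")
    magic, version = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("not a backup image")
    return version
```

The backup kernel would write and check this header itself, so a storage
module only sees opaque bytes. A module is still free to keep its own
meta information in addition.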

> 
> A2 No native compression support.
> ---------------------------------
>    The alternative is to remove responsibility R5 and corresponding service 
>    S4 from the interface. Compression, if any, will be done by backup kernel 
>    without participation of a storage module.
> 
>    Advantage: makes the interface simpler.
> 
>    Disadvantage: Closes a possibility to use more efficient, 
>    storage-specific compression mechanisms if present (e.g., hardware 
>    compression).


I think Andreas and I agreed that this could be requested through the
"location string". The backup kernel can use its own compression through
the SQL syntax. So we have all the flexibility without the "Set
compression algorithm" service.

So I clearly vote for A2, but with a changed title:
A2 No "Set compression algorithm" service.

Regards
Ingo
-- 
Ingo Strüwing, Database Group
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schröder,   Wolfgang Engels,   Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Häring   HRB München 161028