Rafal Somla, 22.10.2009 13:04:
> Ingo Strüwing wrote:
> However, the ultimate solution is to do
> proper consistency checking using checksums or the like. One could argue
> that it is better to rely on a complete solution rather than have a
> false safety feeling based on some simplistic and partial solution.
Checksums don't help. With a wrong version number, we might expect the
checksum at a wrong place. How could we tell the user if the image is
corrupted or the storage module confused the version number? The
difference is relevant. In the latter case the user might recover his
data by fixing the mess in the storage.
>> If we define "magic bytes" + "version number" as part of the image, we
>> would immediately detect a failing storage module.
> ... unless it fails after giving us correct magic bytes. E.g., if
> version number is corrupted we are screwed anyway. So even in this
> variant we must trust the BSM to a large extent.
If the backup storage module is not able to return the blob unmodified,
no solution can work.
Checksums are on the roadmap. There is no point in thinking about
problems caused by not having them. With the current format we cannot
reliably detect image corruption. But with the next format we will.
If the storage module returns a modified blob, we have image corruption.
Nothing else. If the mark is included in the image, we have reduced two
possible failures to one.
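
To make this concrete, here is a minimal sketch of what I mean by the
mark being part of the image. Everything in it is an assumption for
illustration: the magic string, the header layout, the CRC32 trailer and
the version table are hypothetical and not the actual image format.
Version 1 stands in for a format without checksum, version 2 for a
format with one.

```python
import struct
import zlib

MAGIC = b"MYBKIMG\0"             # hypothetical 8-byte mark
HEADER = struct.Struct("<8sH")   # magic, then version as little-endian uint16

# Hypothetical table: length of the checksum trailer per format version.
CHECKSUM_TRAILER_LEN = {1: 0, 2: 4}


def make_image(version, payload):
    """Build an image whose first bytes are the mark and the version."""
    body = HEADER.pack(MAGIC, version) + payload
    if CHECKSUM_TRAILER_LEN[version]:
        body += struct.pack("<I", zlib.crc32(body))
    return body


def check_image(blob):
    """Return a single verdict; a damaged mark is just image corruption."""
    if len(blob) < HEADER.size:
        return "corrupt: too short"
    magic, version = HEADER.unpack_from(blob)
    if magic != MAGIC:
        return "corrupt: bad magic"
    if version not in CHECKSUM_TRAILER_LEN:
        return "corrupt: unknown version"
    trailer = CHECKSUM_TRAILER_LEN[version]
    if trailer == 0:
        return "ok (no checksum in this version)"
    if len(blob) < HEADER.size + trailer:
        return "corrupt: too short"
    # The version tells us where the checksum lives, so it must be read first.
    (stored,) = struct.unpack_from("<I", blob, len(blob) - trailer)
    if zlib.crc32(blob[:-trailer]) != stored:
        return "corrupt: checksum mismatch"
    return "ok"
```

The kernel reads the mark and the version from the blob itself and only
then decides where to look for a checksum. There is no second source for
the version that could disagree with the image.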
>>>> 6. "acknowledgement": The services look like they all return a status.
> I think that for "set compression algorithm" service, reporting error
> and reporting that compression is not supported are two different
> things. In the latter case the service completes successfully, informing
> the user that compression is not supported. This is different from the
> situation where service fails because of some reason.
> So, we have three possibilities here:
> 1. Service fails with error.
> 2. Service completes successfully and reports that compression is on.
> 3. Service completes successfully and reports that compression is not
> supported.
Possibility 3 is not what I would accept as a user. I want compression.
The system tells me: Backup successful, not compressed. Eh?
And we do not have extra notifiers for other things that could be seen
as success, e.g. empty image: Restore successful, nothing restored;
wrong location: Open of location successful, not a backup image. If I
spend more time on it, I can probably come up with more ridiculous examples.
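
A rough sketch of how I read the three possibilities, and of what I
would expect the caller to do with them. All names here
(CompressionStatus, set_compression, start_backup) are made up for the
example; nothing of this is in the HLS.

```python
from enum import Enum


class CompressionStatus(Enum):
    ERROR = "error"
    ON = "compression on"
    NOT_SUPPORTED = "compression not supported"


class CompressionUnavailable(Exception):
    """Raised when the user asked for compression and cannot get it."""


def set_compression(supported_algorithms, requested):
    """Hypothetical service following the three outcomes listed above."""
    if not isinstance(requested, str) or not requested:
        return CompressionStatus.ERROR           # 1. service fails
    if requested in supported_algorithms:
        return CompressionStatus.ON              # 2. success, compression on
    return CompressionStatus.NOT_SUPPORTED       # 3. "success", not supported


def start_backup(supported_algorithms, requested):
    """Caller policy: anything other than ON stops the backup."""
    status = set_compression(supported_algorithms, requested)
    if status is not CompressionStatus.ON:
        raise CompressionUnavailable(status.value)
    return "backup started, compressed with " + requested
```

Even if the service distinguishes outcomes 1 and 3, the caller in this
sketch ends up treating both the same way, which is why I see little
value in calling outcome 3 a success.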
Sorry to become emotional here. Maybe I disqualify myself as a voter for
>> "Amount of data that has been written"
> From this perspective, it is very easy to imagine that a BSM will talk
> to some more or less exotic devices such as tapes, where it can even
> require physical loading of the tape before any bytes can be written.
> The POSIX write() interface allows the application to resume control and
> do something in such situations, while your simplified solution would
> mean that the backup kernel (or one of its threads) will be blocked
> waiting for the device to be ready.
So you suggest asynchronous operation again as a use case.
You exclude management functions from the interface. You say this shall
be implemented outside of the server.
You exclude progress information from the interface. As a user I would
desire this a lot.
But you include an option for asynchronous operation. What could the
backup kernel do with it to serve the user?
I want to say that I see the selection of features as arbitrary. At
least it doesn't follow my taste.
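
For reference, this is how I understand the "amount of data that has
been written" proposal, as a small sketch. SlowDevice, write_all and the
on_idle hook are hypothetical names of mine; the device accepts only a
few bytes per call and may report zero bytes while it is not ready.

```python
class SlowDevice:
    """Hypothetical storage module: accepts at most `chunk` bytes per call
    and reports how many it took (0 means "not ready yet")."""

    def __init__(self, chunk, not_ready_calls=0):
        self.chunk = chunk
        self.not_ready_calls = not_ready_calls
        self.stored = bytearray()

    def write(self, data):
        if self.not_ready_calls > 0:
            self.not_ready_calls -= 1
            return 0
        accepted = bytes(data[:self.chunk])
        self.stored += accepted
        return len(accepted)


def write_all(device, data, on_idle=lambda: None):
    """Keep calling write until everything is accepted.

    on_idle is called whenever the device took nothing; this is the only
    place where the kernel regains control and could do something else.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        n = device.write(view[written:])
        if n == 0:
            on_idle()
        written += n
    return written
```

The only thing the kernel gains is the on_idle slot. Without progress
information in the interface, I do not see what useful work it could do
there for the user.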
>> "Information about end of stream"
>> If you see it this way, why don't you think it could be good that during
>> a stream write the storage module gets some idea about when the stream
>> is at its end?
>> You will probably answer that the SM can assume end of stream when a
>> "close" is requested. And then I will ask: why can the backup kernel
>> not assume end of stream when the SM delivers no more data
>> (zero-length read)?
> Sorry, I don't fully understand your concern(s). For the last sentence,
> the SM may fail to deliver data for reasons other than end of stream.
> Perhaps it knows that it will have to wait very long for next byte from
> the stream and thus it returns control to the caller informing that no
> bytes have arrived yet.
So we are back at asynchronous operation. But for true asynchronous
operation, we need a service, which allows the storage module to wake
the kernel when there is more data available.
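
A sketch of the read side under that reading, with hypothetical names
(ReadStatus, FakeStream, read_all). The read service reports explicitly
whether it delivered data, has nothing yet, or reached end of stream, so
a zero-length result is no longer ambiguous. The wait callback is only a
polling stand-in for the missing wake-up service.

```python
from collections import deque
from enum import Enum


class ReadStatus(Enum):
    DATA = 1
    NO_DATA_YET = 2
    END_OF_STREAM = 3


class FakeStream:
    """Hypothetical storage module stream for the example.

    `events` is a sequence of byte chunks; None means "nothing arrived yet".
    """

    def __init__(self, events):
        self.events = deque(events)

    def read(self):
        if not self.events:
            return ReadStatus.END_OF_STREAM, b""
        item = self.events.popleft()
        if item is None:
            return ReadStatus.NO_DATA_YET, b""
        return ReadStatus.DATA, item


def read_all(stream, wait=lambda: None):
    """Collect the whole stream, calling wait() when no data has arrived."""
    out = bytearray()
    while True:
        status, chunk = stream.read()
        if status is ReadStatus.END_OF_STREAM:
            return bytes(out)
        if status is ReadStatus.NO_DATA_YET:
            wait()  # polling; true async operation would need a wake-up
            continue
        out += chunk
```

Note that the kernel still has to come back and ask again. That is
polling, not asynchronous operation.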
>> In my eyes the "information about end of stream" is an inconsistency in
>> level of detail with other service specifications. I suggest getting rid
>> of it. It is obvious that end of stream needs to be signaled somehow.
> Maybe it is obvious, but in the HLS I try to list all information which
> goes in and out of each service.
I don't believe that. I guess there is a lot of information implicit
because it is so obvious that we don't think about it.
For example, why don't the services report the beginning of the stream?
Or why doesn't the open service report whether the read services can now
be used to get data?
And what if the user intentionally removed a tape during the operation?
From one standpoint, one could see this as a successful abort of the
operation. But there is no "aborted" information.
Sure, your suggestion can work. It's just that I'm unhappy with some
details. I may have to live with that.
Ingo Strüwing, Database Group