The discussion continues...
Ingo Strüwing wrote:
> Hi Rafal,
> Rafal Somla, 27.10.2009 10:08:
>> Ingo Strüwing wrote:
>>> Rafal Somla, 24.10.2009 16:02:
>> But, do you have any concrete propositions what error situations for
>> which services should be distinguishable (across all BSMs)? Or you are
>> only concerned with possible future extensions?
> I am concerned about human fallibility. Often during implementation or
> changes (e.g. bug fixes) problems pop up, which haven't been foreseen
> during specification.
Sure. Working at MySQL I've learned that probably the best way to deal with
this is to:
1. do our best to design as good an interface as we can at the moment,
accepting that we cannot predict everything;
2. when an unforeseen issue arises, rework the interface.
> Hence I'm not in favor of an interface specification, which requires all
> (if only non-fatal) problems to be identified in advance.
After thinking about it for a long while, for me this boils down to the
following design choice (don't ask me why :)):
Currently, when a service is called, only two general outcomes are possible:
1. Service succeeds and provides the specified information.
2. Service fails and this is a fatal error - the whole session is interrupted.
But perhaps we want to have three possible outcomes:
1. Service succeeds and provides the specified information.
2. Service fails with fatal error - the whole session is interrupted.
3. Service fails with non-fatal error - the session can still be used.
I am still not convinced that we really need the third outcome, although I
think I could buy it. The only thing which stops me right now is that I'd
rather keep it simpler if possible.
If we are to go this way, then I think the user of a storage module (the
backup kernel) cannot decide on its own whether a given error is fatal or
non-fatal. In the end, it is the storage module which knows whether the
failure that has happened prevents further operation or not.
The distinction could be made by convention (some error codes are fatal and
some are not) or by other means. But I don't think it needs to be specified
at this level of abstraction. The most important thing is to agree whether a
storage module can report non-fatal errors or, as in the current design, it
must either succeed or treat every error as fatal.
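
To make the three-outcome variant concrete, here is a minimal sketch in C. It
is only an illustration, not a proposal for the specification: the names
bsm_outcome, bsm_session, bsm_report and kernel_may_continue are all made up.
The point it shows is that the module, not the kernel, decides whether the
session stays usable, and the kernel only reads that decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, invented for this sketch only. */
typedef enum {
    BSM_OK = 0,        /* service succeeded                           */
    BSM_ERR_NONFATAL,  /* service failed, session can still be used   */
    BSM_ERR_FATAL      /* service failed, session is interrupted      */
} bsm_outcome;

typedef struct {
    bool usable;       /* cleared by the module on a fatal error */
} bsm_session;

/* Module side: the storage module classifies the failure itself and
 * records in the session whether further operation is possible. */
bsm_outcome bsm_report(bsm_session *s, bsm_outcome outcome)
{
    assert(s != NULL);
    if (outcome == BSM_ERR_FATAL)
        s->usable = false;
    return outcome;
}

/* Kernel side: the kernel does not guess; it only checks what the
 * module has told it. What it does next is its own business. */
bool kernel_may_continue(const bsm_session *s, bsm_outcome outcome)
{
    assert(s != NULL);
    return outcome != BSM_ERR_FATAL && s->usable;
}
```

With this shape, adding a new non-fatal situation later does not change any
service signature; only the set of values the module may report grows.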
> If we want to specify all behavioral aspects, it might be easier to
> adapt the specification to reality by changing a list of error codes
> than by changing a service signature.
I'm trying to imagine how the alternative specification would look then. As
far as I can tell, the difference would be that instead of explicitly saying
that e.g., service S10 (get information about backup image) informs us if
the location is empty, there will be a global error constant like
BSM_LOC_EMPTY with implicit understanding that:
a) whenever some service tries to access backup image at location which is
empty, it should return BSM_LOC_EMPTY error (it took me a while to formulate
b) error BSM_LOC_EMPTY is non-fatal: storage session can be used after it
has been reported.
I prefer to make such choices more explicit. Also, I see such a
specification as more low-level, because it implies a particular
implementation choice (use of error codes). I know this is an obvious and
natural choice, but still I don't think it is necessary or appropriate to
force it at this level of specification.
With my specification, I only state what information must be passed out of
a service; I do not say how this should be done. In particular, it can be
done as described above: an implementation of service S10

    S10 Get information about image stored in the location.
        [IN]  backup storage session
        [OUT] size and timestamp of the image or information that location
              does not contain a backup image.

can be a function:

    int get_image_info(session, ...)

      0              - upon normal termination
      BSM_LOC_EMPTY  - if location is empty
      BSM_WRONG_DATA - if location does not contain backup image
      error          - negative error code if fatal error has been
                       encountered.

This is a valid implementation of the above specification. The information
that the location does not contain a backup image is passed in the form of
a positive return value. This is distinguished from (fatal) errors, which
are signalled via negative return values.
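
For illustration, the convention can be written out as compilable C. This is
a sketch under assumptions, not the real interface: the numeric values of
BSM_LOC_EMPTY and BSM_WRONG_DATA, the fake_session type, and the output
parameters standing in for the "..." above are all invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Assumed values: any positive numbers would do for the sketch. */
enum { BSM_LOC_EMPTY = 1, BSM_WRONG_DATA = 2 };

/* Hypothetical stand-in for a backup storage session. */
typedef struct {
    int    broken;     /* non-zero simulates a fatal failure        */
    int    has_data;   /* zero means the location is empty          */
    int    is_image;   /* zero means the data is not a backup image */
    size_t size;
    time_t timestamp;
} fake_session;

/* Returns 0 on success, a positive code when the location holds no
 * backup image (non-fatal), and a negative code on a fatal error. */
int get_image_info(const fake_session *s, size_t *size, time_t *timestamp)
{
    assert(s != NULL && size != NULL && timestamp != NULL);
    if (s->broken)
        return -1;                 /* fatal: session is unusable */
    if (!s->has_data)
        return BSM_LOC_EMPTY;
    if (!s->is_image)
        return BSM_WRONG_DATA;
    *size = s->size;
    *timestamp = s->timestamp;
    return 0;
}

/* The caller tells fatal from non-fatal by the sign alone. */
int is_fatal(int rc)
{
    return rc < 0;
}
```

A caller would treat BSM_LOC_EMPTY and BSM_WRONG_DATA as ordinary answers
and keep using the session, and would give the session up only when
is_fatal() is true.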
> We can leave it to the backup kernel, which errors to take as fatal, and
> which to work around. Backup kernel could be fixed in this respect,
> without changing the interface.
But first of all, the backup kernel must know whether the backup storage
session is usable after an error or not. This information must be passed
somehow from the module to the kernel - the kernel cannot decide it on its
own. But once it knows what kind of error has happened, then sure, it can
freely decide what to do about that. As far as the interface is concerned,
it is most important to specify what information is passed through it, and
only then how.
>>>> There is no global convention about which error number means what.
>>> This is something, which might bite us one day. If multiple modules
>>> report similar error messages for problems that are pretty different to
>>> handle by the user, then the support team might have a hard time to
>>> figure out, what happened exactly. Sure, the final message contains the
>>> module name, but often the customer doesn't remember the exact text.
>>> Especially if he is no native English speaker. "The backup said no such
>>> tape", but it was "file not found". Perhaps the xbsa: type specifier had
>>> been forgotten. A globally unique error number would help a lot.
>> I don't understand the example. How "The backup said no such tape" could
>> possibly appear if we are using a filesystem BSM and the real problem
>> was "file not found"?
> By plain user error. Users don't read error messages carefully.
> Especially if they don't understand the language well. To the example:
> User wants to restore from tape. He forgets xbsa:. Error message is
> "file not found". User identifies the words "not found". From his school
> English he understands it as "not there". But the tape is there! So
> what's wrong? Let's call support and tell them MySQL is stupid. It
> claims "tape not there", while it is there.
> The user would do better, if the message was in his native language.
> Support would have better chances if there is a unique error number.
And what if the user makes a mistake when reporting the error number? ;)
> Some (management-)applications could also profit from unique error
> numbers. So they won't need to parse the error text.
OK, but I still see an issue: how would global error codes (code ranges) be
assigned to particular BSM implementations? The only solution I can come up
with is that we reserve a certain range for MySQL, and then all BSMs
developed at MySQL use unique error numbers from that range. But any
external implementers will use arbitrarily chosen numbers outside of that
range. Then the error numbers are bound to overlap and the advantage you
have in mind is not going to happen. Thus I consider my proposition to have
local error codes to be more "clean" and not promising something we can not
>>> should be a way to handle internationalization for storage modules.
>> Easier said than done :) Any propositions? If the proposition is that
>> BSMs use my_error() to report errors then I will oppose such solution...
> See for example:
> WL#2940 - plugin service: error reporting
> WL#751 - Error message construction
> But I must admit that this is not solved for us yet. We may not want to
> spend the effort at the moment. So I will not further insist in
> internationalized messages from storage modules for now, but I want to
> have the interface specified so that it can be added later without an
> interface change. This might be doable when the plugin services (here
> WL#2940) are implemented. You have already suggested a way to select the
Thanks for digging these out (I had a vague recollection that such WLs
exist). Indeed WL#2940 does not solve our problems:
- it says nothing about global error codes
- it does not say how locale setting is passed to the module
For the moment, I have updated the HLS as suggested: the locale setting is
passed to the "create storage session" service, and error descriptions
returned by the storage module should be in the appropriate language. I
think this specification should not be strict. That is, a BSM should do its
best to describe errors in the selected language, but if that is not
possible, it can use another language (English).
Any thoughts about how to best specify it?
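
For what it's worth, here is how the non-strict rule could look inside a
module, as a minimal C sketch. Everything in it is hypothetical: the
storage_session type, create_storage_session, describe_loc_empty, and the
two-entry message table exist only for this example. It shows the locale
being fixed at session creation and the description falling back to English
when the module has no text in the requested language.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *locale;
    const char *text;
} message;

/* Hypothetical catalogue for a single module-local error.
 * The first entry is the English fallback. */
static const message loc_empty_msgs[] = {
    { "en", "location is empty" },
    { "de", "Speicherort ist leer" },
};

typedef struct {
    const char *locale;  /* as passed to "create storage session" */
} storage_session;

/* The locale is chosen once, when the session is created. */
storage_session create_storage_session(const char *locale)
{
    storage_session s = { locale != NULL ? locale : "en" };
    return s;
}

/* Best effort: use the session's language if the module has a text
 * for it, otherwise describe the error in English. */
const char *describe_loc_empty(const storage_session *s)
{
    assert(s != NULL);
    size_t n = sizeof loc_empty_msgs / sizeof loc_empty_msgs[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(loc_empty_msgs[i].locale, s->locale) == 0)
            return loc_empty_msgs[i].text;
    }
    return loc_empty_msgs[0].text;
}
```

Nothing here requires my_error() or a global error number; the module keeps
its own texts and the kernel only passes the locale in.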