Rafal,
I agree that the solution is not ideal, and that we should strive to
have well-defined function specifications. However, quite a few things
are broken here and we need to make the best of it. Currently,
"ER_OUT_OF_RESOURCES" will give the user correct information about how
to deal with the situation. I understand your argument that this may
change in the future, but is that a good enough argument for providing
an error message that provide no useful information to the user? What
could a user do about "Could not access the contents of the backup
catalog when writing backup image preamble"? (What is a backup catalog,
by the way? Is that defined in the documentation?)
I think the core problem here is that we have functions that neither is
able to log errors themselves, nor is able to report error information
back to the caller. Missing context for error reporting could be solved
by giving up the distinction between Image_info and Backup_info. Then,
there would be no need for bcat_iterator_get() to guess about errors.
It is pretty clear from the comments I added to get_tablespaces() that
the only expected error is allocation failure, and that callers need to
deal with NULL returns. Hence, I think it is a good chance that people
changing get_tablespaces() will realize that if other error scenarios
are introduced, they need to deal with the code calling get_tablespaces,
too.
--
Øystein
Rafal Somla wrote:
> Hi Oystein,
>
> I noticed that in your new patch you introduced changes like this one:
>
>> @@ -1511,27 +1533,23 @@ void* bcat_iterator_get(st_bstream_image
>> case BSTREAM_IT_TABLESPACE: // table spaces
>> {
>> Iterator *it= info->get_tablespaces();
>> - - if (!it)
>> + if (!it) {
>> - info->m_ctx.fatal_error(ER_BACKUP_CAT_ENUM);
>> - return NULL;
>> + info->m_ctx.fatal_error(ER_OUT_OF_RESOURCES);
>> }
>> -
>> + return it;
>> }
>
> So, after your patch the code will look like this:
>
> case BSTREAM_IT_TABLESPACE: // table spaces
> {
> Iterator *it= info->get_tablespaces();
> if (!it)
> {
> info->m_ctx.fatal_error(ER_OUT_OF_RESOURCES);
> }
>
> return it;
> }
>
>
> Note that this code assumes that if info->get_tablespaces() fails, then
> it failed because of missing resources. Whether this assumption is true
> or not, depends on the current implementation of
> Image_info::get_tablespaces() method.
>
> While I can see the assumption is currently true, I think that such code
> is difficult to maintain. It creates a dependency: whenever someone
> changes implementation of get_tablespaces(), he should check if all
> implicit assumptions made elsewhere in the code are still valid and
> update the code accordingly if this is not the case. This is very
> unlikely to happen. First, how should the author of new implementation
> find all these places? Will he understand all assumptions which are not
> explicitly stated? Will he make no mistakes when updating code which is
> not his primary concern?
>
> I think writing code whose correctness depends on implementation details
> of functions used in this code is a bad practice. Instead, we should
> consider function specifications as the binding contract of how one can
> use given function, not their implementations.
>
> In this case, the specification of Image_info::get_tablespaces() should
> say how this function reports errors. The specification is not precise,
> unfortunately, but I think we agree that intention is that in case of
> errors function returns NULL. Since function's output is binary: either
> we get pointer to the iterator (in case of success) or NULL (in case of
> error), there is no way to know what caused the error.
>
> So, if get_tablespaces() returns NULL we know that there was a problem,
> but we don't know its cause. And this was reflected in the original code
> which reported ER_BACKUP_CAT_ENUM, informing the user that it was not
> possible to enumerate backup catalogue. Since we don't know the exact
> reason of the problem, we don't report it.
>
> Now, if we decide that this information is not enough, then I think the
> way to go is not to give more information based on the knowledge of
> internal implementation of get_tablespaces(). Instead, we should extend
> the interface of get_tablespaces() function so that it gives the caller
> more info about the cause of the problem.
>
> What do you think?
>
> Rafal