Chuck,
Thanks for your review. I agree that error reporting is not satisfactory. But I
see it as a more general problem and a limitation of the current backup kernel
(there are other places in the kernel where errors are reported only on a very
general level). Soon, it should be solved somehow (I have "error reporting" item
on my TODO list). Before that, I'd rather not implement any ad-hoc solutions,
especially since it might be not as trivial as it seems. See my more lengthy
explanation below.
Chuck Bell wrote:
> Rafal,
>
> I see in the code where you are capturing invalid paths for Windows, but it
> still outputs the same error message for both invalid paths and overwrite
> error. The following is what the code does now:
>
> # attempt to backup to an invalid location on Windows:
> mysql> backup database test to 'e:\\source\\c++\\back.bak';
> ERROR 1614 (HY000): Can't write to backup location
> 'e:\source\c++\back.bak' <--- I don't have an e: drive
>
> # attempt to backup to a valid location on Windows:
> mysql> backup database test to 'd:\\source\\c++\\back.bak';
> +-----------+
> | backup_id |
> +-----------+
> | 72 |
> +-----------+
> 1 row in set (3.03 sec)
>
> # attempt to overwrite a backup file on Windows:
> mysql> backup database test to 'd:\\source\\c++\\back.bak';
> ERROR 1614 (HY000): Can't write to backup location
> 'd:\source\c++\back.bak'
>
> I think this is confusing and doesn't clearly state what the problem is.
> Notice the error for the third backup command. The error message is the same
> as the first command and the user doesn't know whether the path or filename
> is wrong or if he is trying to overwrite a file.
>
> To illustrate, I think the above execution should result in the following:
>
> # attempt to backup to an invalid location on Windows:
> mysql> backup database test to 'e:\\source\\c++\\back.bak';
> ERROR 1614 (HY000): Can't write to backup location
> 'e:\source\c++\back.bak'
> ^ Tells me something wrong with path or file ^
>
> # attempt to backup to a valid location on Windows:
> mysql> backup database test to 'd:\\source\\c++\\back.bak';
> +-----------+
> | backup_id |
> +-----------+
> | 72 |
> +-----------+
> 1 row in set (3.03 sec)
>
> # attempt to overwrite a backup file on Windows:
> mysql> backup database test to 'd:\\source\\c++\\back.bak';
> ERROR 1613 (HY000): Attempting to overwrite file 'd:\source\c++\back.bak'
> ^ Tells me I cannot overwrite the file and makes more sense (to me) ^
>
> This gives the user more information to go on. The first invalid command
> tells the user there is something wrong with the file/path and the third one
> tells the user he is trying to overwrite the file (which is not allowed). Do
> you agree this is more appropriate and accurate?
>
> This can be accomplished using:
>
> (kernel.cc)
>
> + if (!loc || !loc->is_valid())
> + {
> + my_error(ER_BACKUP_INVALID_LOC,MYF(0),lex->backup_dir.str);
> + DBUG_RETURN(ER_BACKUP_INVALID_LOC);
> + }
> + if (!my_access(lex->backup_dir.str,F_OK))
> + {
> + my_error(ER_BACKUP_CANNOT_OVERWRITE,MYF(0),lex->backup_dir.str);
> + DBUG_RETURN(ER_BACKUP_CANNOT_OVERWRITE);
> + }
>
> (errmsg.txt)
> + ER_BACKUP_CANNOT_OVERWRITE
> + eng "Attempting to overwrite file '%-.64s'"
>
> Note: This works on both Linux and Windows. :))
>
I agree that the code should give more information to the user. However,
implementing this feature would be more work than you suggest. The reason is
that the kernel was written so that, in the future, it can handle different
types of backup locations - not only files in the local file system. For
example, in the future we would like to support
BACKUP ... TO 'http://backup.server/some/path'
or
BACKUP ... TO 'xbsa://xbsa.server/some/path'
Therefore, kernel doesn't assume that lex->backup_dir.str contains a path to a
local file but, rather, that it is a URL. Function Location::find() is
responsible for parsing this URL string and constructing appropriate instance of
Location class. For each supported type of location there will be a
corresponding class derived from Location. Right now we have only File_location.
An important detail of this design is that Location object is not supposed to
represent an existing object (e.g. a file) but rather a possible location of an
object (i.e. a valid file path). This location can (and in case of BACKUP
operation should) be empty. Thus Location::is_valid() should return FALSE only
if the location is wrong, like if the file path is malformed. This doesn't work
perfect, as you pointed out with your 'e:\path\on\non\existent\drive' example,
but well..., it is as good at detecting problems as the function
check_if_legal_filename() is. We could work on improving it but I'd prefer to
leave it for future work.
Now, because of the above design, backup kernel doesn't make any assumptions
about what type of location the loc object returned by Location::find() is.
Therefore, the kernel can't give precise error description - only the loc object
knows the details of the problem. If there was a problem while opening location
for writing, the only thing which kernel knows is that open_for_write(loc)
returned NULL, i.e., that there was a problem, not what kind of problem it was.
This is why the error information is not as precise as we would like it to be.
I think it is a typical situation which makes good error reporting difficult.
The problem is that the caller, who can easily report an error doesn't know its
details. On the other hand, in the place where error is detected, we don't
always have all the infrastructure for reporting it.
Instead of implementing some some ad hoc solution for each case where error
reporting is not satisfactory, I'd rather try to find some time to design and
implement a general solution for error reporting in the online backup code. The
current solution is simplistic and allows backup kernel to report only the fact
that error has happened without any more precise information about the cause of
the problem (sometimes it can be inferred from the context, but not always).
Thus I think we could easily find much more places where error information given
to the user is not satisfactory. I'm aware of that limitation and this is why I
have error reporting on my TODO list for backup kernel improvements.
Given all this I propose:
- Separate the two issues:
a) ensuring that files are not silently overwritten by BACKUP command (which
is a serious danger);
b) giving user adequate information in case of this and other errors.
- Agree that BUG#33119 is about solving issue a) with as good error reporting as
is available in the current kernel.
- Agree that issue b) should be treated separately as a whole in the "error
reporting" WL.
How about this?
> Otherwise, everything else in the patch looks fine. Depending on who pushes
> first we may need to change my patch for BUG#33024 to add the remove file
> command. But that's a minor thing and easy to do.
>
> How about the other 2 bugs? Is this a "three-fer" where one patch fixes
> three bugs?
>
Yes, luckily, doing proper cleanup in all tests fixes the two other bugs. I put
a note in the bug reports.
Rafal