List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:August 25 2010 4:50pm
Subject:Re: bzr commit into mysql-trunk-bugfixing branch (marc.alff:3197)
Bug#55873
View as plain text  
Hello,

Marc Alff a écrit, Le 24.08.2010 23:25:
> 
> Hi Guilhelm
> 
> On 8/23/10 6:34 AM, Guilhem Bichot wrote:
>> Hello Marc,
>>
>> Marc Alff a écrit, Le 11.08.2010 19:00:
>>> #At file:///Users/malff/BZR_TREE/mysql-trunk-bugfixing-55873/ based on
>>> revid:jonathan.perkin@stripped
>>>
>>> 3197 Marc Alff 2010-08-11
>>> Bug#55873 short startup options do not work in 5.5
>>> Before this fix, the server did not recognize 'short' (as in -a) options
>>> but only 'long' (as in --ansi) options in the startup command line.
>>> The root cause is that handle_options() did not honor the
>>> my_getopt_skip_unknown flag
>>> when parsing 'short' options.
>>> The fix changes handle_options(), so that my_getopt_skip_unknown is
>>> honored in all cases.
>>

>>> === modified file 'mysys/my_getopt.c'
>>> --- a/mysys/my_getopt.c 2010-07-16 21:00:50 +0000
>>> +++ b/mysys/my_getopt.c 2010-08-11 17:00:26 +0000
>>> @@ -464,6 +464,17 @@ int handle_options(int *argc, char ***ar
>>> }
>>> if (!opt_found)
>>> {
>>> + if (my_getopt_skip_unknown)
>>> + {
>>> + /*
>>> + Preserve this unknown short option.
>>> + If there are any parameters associated with this short option,
>>> + as in "-X p1 p2 p3", parameters like p1 p2 p3 will be copied
>>> + in the result by the outer loop.
>>> + */
>>> + (*argv)[argvpos++]= *pos;
>>> + break;
>>> + }
>>
>> The corresponding code for long options has a do...while() instead.
>> After reading handle_options() I cannot see a case where the do...while
>> will do more than one iteration (i.e. where pos!=first). I'm not asking
>> you to change do...while to be a single iteration (like your new code
>> above), it's a detail and I don't want to introduce a subtle bug, just
>> would like to know whether you agree on the observation or I'm missing
>> something.
> 
> This is also my understanding of the code.
> The loop is useless, but does not seem to cause any harm,
> so I avoided touching it to avoid introducing more risk.
> 
>> There is a problem with the new "break;" above: it will break out of
>> for (optend= cur_arg etc)
>> and come to
>> if ((error= setval(optp, value, argument, set_maximum_value)))
>> below, which will do a setval() for a non-identified option.
>>
> 
> Yes, you are correct.

On second thought I re-checked this in a debugger ("how can this code be 
wrong as Marc's tests pass?") and it seems that setval() does no harm 
(does nothing) because optp is full of zeroes and set_maximum_value is 
false (and set_maximum_value cannot be true for short options I believe; 
it is true when a long option is prefixed with "max" or "maximum" I 
don't remember). It would feel better to not go to setval() still, but 
this isn't technically a bug...

> Overall, handle_option() just feels like spaghetti code

Very right; I have a hard time reading it. I wonder why it has to be so 
complex.

>, and the fix was 
> incomplete.
> 
> While analysing this code again after a week of rest, I think there is 
> much more to it, and that the issue is not that easy to resolve.
> 
> In particular, consider the following loop:
> 
>       else  /* must be short option */
>       {
>         for (optend= cur_arg; *optend; optend++)
>         {
> ...
>         }
>       }
> 
> My current understanding of this loop is that the code attempts to find 
> *several* short options in the *same* argv[i] command line argument.

looks so

> For example, "mysqld -alw5" is interpreted as "mysqld -a -l -w5"
> 
> The problem here is that I don't see how the server can parse "-XYZ" 
> without knowing if X, Y and Z are options of not, that takes argument or 
> not, because "-XYZ" could mean:
> -X -Y -Z, where X Y and Z are short options with no arguments
> -X -YZ, where Y is a short option with a value of Z
> -XYZ, where X is a short option with a value of YZ
> 
> My conclusion then is that handle_option() can _NOT_ honor the skip 
> unknown options flag and support this syntax at the same time, for short 
> options all concatenated in the same argv[] argument.

looks so. Good catch.

> Please confirm whether this is also your analysis, or if you find 
> something missing.

I think you are right

> At this point, I think the solutions to resolve the bug are:
> 
> 1) Get rid of short options all together in 5.5
> This is what the bug reporter actually suggested.

Fine with this.

> 2) Continue to support short options in 5.5, but only with the explicit 
> "-X -Y -Z" syntax, and not with the "-XYZ" syntax.
> In that case, supporting my_getopt_skip_unknown in handle_options() for 
> short options seems possible and not too intrusive.
> 
> This will be an incompatible change, forcing users to replace
> "-anCutf8" with "-a -n -Cutf8",
> or better with "--ansi --new --character_set_server=utf8".
> 
> I actually see this as a good thing.

Fine with this.

> Another option I explored was:
> 
> 3) declare every options that has a "short" name as PARSE_EARLY, and 
> reorder the initialization code accordingly, so that every short option 
> is known to handle_options() the first time, so the code does not need 
> to support my_getopt_skip_unknown.
> This is however creating chaos in main_mysqld, and is a fix that I 
> consider way to risky and way to intrusive.
> The problem here is that:
> - reordering options (between PARSE_EARLY and PARSE_NORMAL) based on the 
> option *semantic* is ok: very low levels (performance schema, logging, 
> ...) are ok to parse first.
> - reordering options based on whether there is a short hand notation or 
> not, regardless of the option semantic and what it controls, just 
> completely destabilize this otherwise very fragile initialization 
> sequence. The embedded server initialization is deeply impacted too.
> 
> In short, I don't think 3) is viable.

ok

> Before going further, please let me know if you see another solution to 
> this. My preference so far is 2).

(1) and (2) are fine with me; though I foresee resistance; I suspect it 
falls in the realm of the deprecation policy ("you can't remove this 
behaviour suddently, you need to issue a deprecation warning in the next 
N versions and then you can remove the options on the version after" and 
then it has to be approved in managers' meetings...).
my_getopt serves not only to the server but also to the client; while 
the server's short options are few and I didn't even know about them, 
the client's short options are possibly quite used; I use
./mysql -uroot -h127.0.0.1
daily. So I'm not sure that we can remove client's short options so 
easily. On the other hand, our clients probably have a single call to 
handle_options(), using the default value of my_getopt_skip_unknown 
("false") so they could keep their short options without harm.

If (1) or (2) cannot be done in 5.5 and we need to close this bug in 
5.5, here is a suggestion for 5.5 (a good-enough idea until (1) or (2) 
happen).
In time order, the calls to handle_options in mysqld currently are:
a) my_getopt_skip_unknown is set to true
b) one call for perfschema options (which are only options)
c) one for the general-purpose server options ("ansi", etc, which can be 
short options)
d) one for each plugin (only long options, of the form --plugin-name-XXX)
e) one for the "server-uuid" option (long option)
f) my_getopt_skip_unknown is set to false
g) one for no options (I mean an empty array in the "longopts" argument 
of handle_options(); this is to scan for the remaining 
not-handled-so-far strings on the command line and issue an error message).
So only call (b) may have short options in its "longopts" array.
So if -XYZ is seen in (a), -X is guaranteed unknown, we can treat -XYZ 
as a black box and just push it for the next calls to handle_options() 
to handle.
Then -XYZ is seen in (b). Is -X known?
* no: push -XYZ again (for the next calls to handle)
* yes: so we know whether this is -X YZ (YZ argument of -X) or -X -YZ; 
in the former case we're done; in the latter case, is -Y known?
** no: we treat -YZ as a black box and push it for the next calls (that 
may require crafting a "-YZ" string to put on argv)
** yes: we know whether this is -Y Z (then done) or -Y -Z; in the latter 
case, is -Z known?
*** no: we push "-Z"
*** yes: we handle it and are done.

Then we come to (c) with possibly -XYZ or -YZ or -Z (or even nothing) 
unknown, they will continue being pushed (no short option in c...g) 
until (g) which will properly bail out with "unknown option".
The key is that if -X is unknown in (a), it's ok to push -XYZ blindly 
without worrying that we are not handling a potentially known -Y/-Z: 
there are no short options in (a), so we cannot know -Y/-Z at this early 
stage.
And if -X is unknown in (b), it's ok again to not handle -Y/-Z, even if 
they are known, because we will bail out anyway on unknown -X in (g). 
Ok, it means that if I pass "-xnCutf8" (where 'x' is unknown), the 
server will not put itself in "new" mode, will not switch to "utf8", but 
it's ok to not do it as it will bail out on "x" later anyway. Whereas if 
I pass "-nxCutf8" it will put itself in "new" mode before bailing out. I 
don't think a user could validly request that we parse known options 
before bailing out; especially as no known option can make -x become valid.
Thread
bzr commit into mysql-trunk-bugfixing branch (marc.alff:3197) Bug#55873Marc Alff11 Aug
  • Re: bzr commit into mysql-trunk-bugfixing branch (marc.alff:3197)Bug#55873Guilhem Bichot23 Aug
    • Re: bzr commit into mysql-trunk-bugfixing branch (marc.alff:3197)Bug#55873Marc Alff24 Aug
      • Re: bzr commit into mysql-trunk-bugfixing branch (marc.alff:3197)Bug#55873Guilhem Bichot25 Aug