Hi Sergei
Sergei Golubchik wrote:
> Hi!
>
> On Jun 23, Marc Alff wrote:
>
>>> On Mar 28, marc.alff@stripped wrote:
>>>
>>>
>>>> ChangeSet@stripped, 2008-03-28 11:32:08-06:00,
>>>> malff@stripped. +12 -0
>>>> Bug#35577 (CREATE PROCEDURE causes either crash or syntax error
>>>> depending on build)
>>>>
>>>> mysql-test/r/parser_stack.result@stripped, 2008-03-28 11:32:04-06:00,
>>>> malff@stripped. +191 -0
>>>> Bug#35577 (CREATE PROCEDURE causes either crash or syntax error
>>>> depending on build)
>>>> diff -Nrup a/sql/sp.cc b/sql/sp.cc
>>>> --- a/sql/sp.cc 2008-02-19 08:27:17 -07:00
>>>> +++ b/sql/sp.cc 2008-03-28 11:32:03 -06:00
>>>> @@ -443,6 +443,8 @@ db_load_routine(THD *thd, int type, sp_n
>>>> {
>>>> Lex_input_stream lip(thd, defstr.c_ptr(), defstr.length());
>>>> thd->m_lip= &lip;
>>>> + Parser_state parser_state;
>>>> + thd->m_parser_state= &parser_state;
>>>>
>>> Add a comment here that thd->m_parser_state will be set to NULL in
>>> the parser END_OF_INPUT rule.
>>>
>> What this points to is that, in fact, the interface exposed by the parser
>> to the caller is awkward.
>>
>> The caller should not have to set *2* attributes in THD to invoke the
>> parser, and since the parser will work even when thd->parser_state is
>> NULL in most cases (when no overflow occur), the code around
>> END_OF_INPUT was changed to prevent bugs, should another place in the
>> code invoke MYSQLparse() ... which happens as the code evolves
>> (partitions, events).
>>
>> This in turn is awkward, since THD::m_parser_state is set in one place
>> and cleared in another, as you have noted by asking for a comment.
>>
>
> What I meant - what triggered a "read light" for me - that you store a
> pointer to a local variable in a thd and leave that stack frame without
> resetting thd->m_parser_state. That is thd->m_parser_state is invalid
> and points nowhere.
>
> I wanted an explanation and a comment why it's safe.
> It could be safe because thd->m_parser_state is reset in the parser
> END_OF_INPUT rule. It could be safe because a caller must always set
> m_parser_state before invoking a parser so there's no chance for parser
> to access a rogue pointer. If the latter is the reason - it's too error
> prone and should be fixed, that is when we leave that stack frame
> m_parser_state should not point into it anymore.
>
>
Agreed, thd->m_parser_state should be set to NULL to avoid the dangling
pointer.
I will fix that with other review comments, if any, before pushing.
For the record, I have been told in the past (by others) that coding
like this was "not acceptable" (sic),
since it's a useless instruction that "degrade performances", and that
writing robust code was in fact "bad".
I am glad to see that there are some sane people left in this company.
>>>> lex_start(thd);
>>>> thd->spcont= NULL;
>>>> ret= MYSQLparse(thd);
>>>>
> ...
>
>>> Ok, I agree that in principle is the correct approach, the parser state
>>> should not be in the Lex.
>>>
>> Good.
>>
>>
>>> But why did you create a second class for the parser state ? There is
>>> already Lex_input_stream class - which contain parser state
>>>
>>> variables, such as tok_start, tok_end, next_state, yylval, etc.
>>>
>> These variables (yylval, tok_start and friends) are by definition the
>> interface exposed by the lexer to the parser, so it's normal to have
>> both sql_lex.cc and sql_yacc.yy dependencies to these attributes in
>> Lex_input_stream.
>>
>>
>>> Surely, there's not much sense in splitting parser state in two
>>> classes - yyss and yyvs belong to the same class where yylval and
>>> next_state do.
>>>
>>>
>> Actually, I think there is, and this is far reaching.
>>
> ...
>
>> I agree that, *today*, the content of:
>> (a) - Lex_input_stream
>> (b) - yyss / yysv
>> has exactly the same life cycle, and that there is a 1-1 relationship
>> between the two.
>>
> ...<skip>...
>
>> I hope that this (rather long) reply will convince you that yyss and
>> yysv don't belong *in* Lex_input_stream, even when I agree that yyss
>> and yysv are really *close* to Lex_input_stream.
>>
>
> Okay, you say there's a logical difference, but today they have a same
> lifetine and 1:1 relationship.
>
> What about moving them in two separate structures - as logically as you
> want - and put these two in one. Like in:
>
> struct Lex_input_stream /* rename as you please */
> {
> struct Lexer_state lstate;
> struct Parser_state pstate;
> }
>
> this way you'll emphasise conceptual differences between different state
> variables, but still show that they have a same lifecycle now. Later
> it'd be easy to separate them if the need arises.
>
>
Indeed, this is what was implemented already, see the 2nd patch.
Regards,
-- Marc