From: Jorgen Loland Date: November 15 2010 10:10am Subject: Re: bzr push into mysql-next-mr-bugfixing branch (guilhem:3228 to 3229) List-Archive: http://lists.mysql.com/commits/123867 Message-Id: <4CE106FF.3090709@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Hello Guilhem, We're almost done with this bath now :-) On 11/12/2010 03:32 PM, Guilhem Bichot wrote: > Hello, > > Jorgen Loland a écrit, Le 09.11.2010 13:55: >> Review of changesets up to 3218: >> >> === modified file 'mysql-test/r/optimizer_trace_no_prot.result' >> --- mysql-test/r/optimizer_trace_no_prot.result 2010-09-22 19:50:59 +0000 >> +++ mysql-test/r/optimizer_trace_no_prot.result 2010-10-05 13:19:55 +0000 >> > @@ -1962,7 +1949,15 @@ QUERY_ID TRACE MISSING_BYTES_BEYOND_MAX_ >> > } >> > ] /* steps */ >> > } 0 0 >> > -set optimizer_trace="skip_plan=off"; >> > +set @@optimizer_trace_features="greedy_search=on,misc=off"; >> > +explain select * from t1,t2; >> > +id select_type table type possible_keys key key_len ref rows Extra >> > +1 SIMPLE t2 ALL NULL NULL NULL NULL 2 > +1 SIMPLE t1 ALL NULL NULL >> NULL NULL 3 Using join buffer (BNL, regular buffers) >> > +select * from information_schema.OPTIMIZER_TRACE; >> > +QUERY_ID TRACE MISSING_BYTES_BEYOND_MAX_MEM_SIZE OS_MALLOC_ERROR >> > +0 0 0 >> > +set @@optimizer_trace_features=default; >> >> jl: So turning feature "misc" off is the same as disabling tracing? I >> don't find that all to intuitive. If this is correct (misc=off == >> enable=off), why don't we remove the misc feature name? > > What happens is that the top object (the {} around the entire trace) is > in category "misc" (every structure needs to have a category...), so if > you disable "misc", you disable this top object, and thus everything > below it. > Indeed "misc" isn't intended to be used by users, so I will hide it. Great :) >> > === modified file 'sql/opt_trace.cc' >> > --- sql/opt_trace.cc 2010-09-22 19:50:59 +0000 >> > +++ sql/opt_trace.cc 2010-09-27 08:00:25 +0000 >> > @@ -86,15 +105,27 @@ void Opt_trace_ooa::do_construct(Opt_tra >> > */ >> > size_t new_size= alloced + increment; >> > const size_t max_size= stmt->buffer.allowed_mem_size; >> > - if (new_size > max_size /* don't pre-allocate more than the limit >> */ && >> > - max_size >= alloced /* don't shrink string if limit exceeded */ ) >> > + if (new_size > max_size) // don't pre-allocate more than the limit >> > new_size= max_size; >> > - stmt->buffer.realloc(new_size - 1 /* because String::realloc has >> +1 */); >> > + const size_t overhead= >> > + 1 + (ALIGN_SIZE(1) - 1) // see start of String::realloc() >> > + /* >> > + Stay strictly below max_size or it would trigger truncation at >> > + next append(), which is potentially wrong for a "pre-emptive >> > + allocation" as we do here. >> > + */ >> > + + 1; >> >> jl: I have problems understanding the size of the overhead. Why is 1 >> used as parameter to ALIGN_SIZE? > > This is an horribly complicated piece of code on which I spent hours, > until truncation finally happened when I expected :-( > Trying to remember... > The idea is that the MySQL user wants to limit memory size. So we should > not be interested in the String's length but in the String's _allocated_ > length (which is possibly bigger). > We are going to re-allocate the String to make room for more bytes (as > you see we try to allocate in increments of 1024 to not have too > frequent realloc()). > So the simple code would be: > > // grow by 1024 bytes > size_t new_size= alloced + increment; > const size_t max_size= stmt->buffer.allowed_mem_size; > if (new_size > max_size) // don't pre-allocate more than the > // limit > new_size= max_size; > stmt->buffer.realloc(new_size); > But String::realloc() does not reallocate "new_size" bytes, it has a > more complex formula: it reallocates ALIGN_SIZE(length + 1). > So the simple code above will do > realloc(ALIGN_SIZE(new_size + 1)). > Imagine that max_size is 0 (because all previous traces, still kept due > to optimizer_trace_limit and optimizer_trace_offset, have consumed all > allowed memory). new_size is then 0, and the above realloc() allocates 8 > bytes. So for each new kept trace, we will eat 8 bytes. I found it was a > "security issue", or at least not "works as expectable". I wanted that > we never allocate if max_size is 0. Which is why we dive into the gory > details of String::realloc(): we say that this function has an overhead > of 1+(ALIGN_SIZE(1)-1). Ok I don't remember why I didn't write directly > ALIGN_SIZE(1): this means 8 bytes (sizeof(double) exactly), and if you > look at the start of String::realloc() this is indeed correct. Then I > finally add 1, as the comment in code explains. > Let me know what you think; there is probably a simpler way to write > this code... Ok, this makes a little more sense now. But the fact that you didn't remember exactly why you did it like this indicates that it would be a good idea to add to the comments :-) >> > --- sql/sql_view.cc 2010-08-30 06:38:09 +0000 >> > +++ sql/sql_view.cc 2010-10-05 10:29:43 +0000 >> > @@ -1138,6 +1139,12 @@ bool mysql_make_view(THD *thd, File_pars >> > table->definer.user.str= table->definer.host.str= 0; >> > table->definer.user.length= table->definer.host.length= 0; >> > >> > + Opt_trace_object oto0(thd->opt_trace); >> > + Opt_trace_object oto1(thd->opt_trace, "view"); >> > + oto1.add("db", table->db, table->db_length). >> > + add("name", table->table_name, table->table_name_length). >> > + add("in_select#", old_lex->select_lex.select_number); >> > + >> >> jl: This has been changed to add_str in a later cs, but don't we need >> to use add_str_escaped when we don't know the character set? > > Normally in MySQL (I just checked with Bar the specialist), TABLE's and > TABLE_LIST's db and table_name are in utf8. So we can pass them to > add_str() without charset conversion; but you are right that they can > contain characters like \n which must be escaped. I have put it in my > TODO to reconsider all add_str* calls in code, to see which ones need > escaping, and to add tests (see also same discussion in a review of your > range opt patch). I think we should add a few escaped table/view/column names to the tests to verify that everything is ship shape. Btw, at time of implementation of the range tracing, I had the impression that add_str did escaping, so I can imagine that there are some comments on that in your review :) >> opt_trace2server.cc >> ------------------- >> > if (thd->opt_trace->start(need_it_for_I_S, >> > (var & Opt_trace_context::FLAG_END_MARKER), >> > (var & Opt_trace_context::FLAG_ONE_LINE), >> > thd->variables.optimizer_trace_offset, >> > thd->variables.optimizer_trace_limit, >> > thd->variables.optimizer_trace_max_mem_size, >> > >> Opt_trace_context::feature_value(thd->variables.optimizer_trace_features))) >> >> >> jl: long line > > Yes... even just > Opt_trace_context::feature_value(thd->variables.optimizer_trace_features > is 72 chars, so I don't see how to break the line properly. What would > you suggest? Touchè. I see no good way to break it, but I would probably start a new line before the argument (between "(" and "thd"). I'm not sure if that is any better. I'll leave it for you to decide. >> > char buff[1024]; >> > String str(buff,(uint32) sizeof(buff), system_charset_info); >> > str.length(0); >> > /** >> > If this statement is not SELECT, what is shown here can be >> > inexact. INSERT SELECT is shown as SELECT. DELETE WHERE is shown >> > as SELECT WHERE. >> > This is acceptable given the audience (developers) and the goal (the >> > inexact parts are irrelevant for the optimizer). >> > */ >> > thd->lex->unit.print(&str, enum_query_type(QT_EXPAND_VIEW_DEFINITION | >> > QT_SHOW_SELECT_NUMBER)); >> >> jl: what happens if the expanded query turns out to be longer than >> 1024 characters? > > Most print() uses String::append(), which calls String::realloc(), which > has this logic: > if (this string's buffer was allocated) then realloc() it > else malloc() a new buffer and use it for the String. > Normally, when we exceed 1024 bytes for the first time, we fall into the > "else" case. > The 1024-byte buffer is only, I think, an optimization to use the stack > instead of the heap, most of the times. But is not a limitation. > It's by reading code that I say this, you may want to double-check, or > ask me to add a test with a very long query and run with Valgrind. > > I will make a commit when ready. Yes, if you're not 100% certain I would like you to double check that it does not break. Apart from that I'm happy with the explanation. -- Jørgen Løland | Senior Software Engineer | +47 73842138 Oracle MySQL Trondheim, Norway