Hello,
More thoughts:
Guilhem Bichot a écrit, Le 15.10.2010 20:45:
> Hello,
>
> I'm attaching a result file for discussion.
> I got it by using your patch and disabling the assertion in
> Opt_trace_context::syntax_error(), so that the test can proceed further.
> There is a first problem in "attached_conditions", but easy to solve and
> I have just pushed a fix for it (revno 3221). The more interesting
> happens at end of object "join_optimization" (look for the "invalid
> JSON" words around there).
>
> For such query:
>
> select * from t1, t2
> where t2.a between t1.a - interval 2 day and t1.a + interval 2 day;
>
> we can see that the range optimizer is called once per row of t1: we
> read a row of t1, and we call the range optimizer with the range "2 days
> before or after this row of t1". This is the so-called "range checked
> for each record" (visible in EXPLAIN). Two rows in t1, so two pieces of
> trace related to the range optimizer.
> In my test there are only two outer rows, but imagine if there would be
> 1000 outer rows: we would have 1000 traces of the range optimizer.
> What is the right approach?
> 1) Try to disable tracing at the right moment, to avoid this?
If we do this, the right moment could be in
join_init_quick_read_record(), putting this
Opt_trace_disable_I_S otd(tab->join->thd->opt_trace, true);
at start of the function.
That also fixes the syntax error in index_merge_myisam
--opt-trace-protocol , where we have the range optimizer called once per
evaluation of a subquery.
> 2) Or let this tracing be, saying that the range optimizer, even called
> 1000 times in the execution phase, is still an important optimization
> information, and just count on truncation by
> @@optimizer_trace_max_mem_size (default: 16kB) if things go unreasonably
> big? That doesn't sound too bad to me. If we go this way, I should
> define a proper container for those range-opt-at-exec-time traces: a
> "join_execution" object in the trace seems mandated, with a "steps"
> array underneath; each range optimizer trace becomes an element of the
> array. There is another problem: traces generated by execution have the
> potential to be big, so truncation by max_mem_size would become
> frequent; a truncated trace is not parsable by a JSON parser. So the
> only way for a user to get a parsable trace is to increase max_mem_size
> until it's big enough to host the execution trace, and it may be
> unreasonably big then. So maybe we need a flag in JOIN::exec()'s top
> object ("EXECUTION"?), so that the user can request no tracing in
> JOIN::exec() and its children. Then the trace stays small, below
> max_mem_size, not truncated, and parsable.
If we do this, then what should be the default for the "EXECUTION" flag:
enabled or disabled? disabled is safer, and more in line with the name
"optimizer trace". Its drawback is that we lose the join_optimization
block of subqueries (as they are optimized only at execution of the top
join).
> Disabling tracing at the start of JOIN::exec() is not a solution: in
> other queries (like the first one using a subquery, in
> optimizer_trace_no_prot.test), JOIN::optimize for the subquery is called
> by the top query's JOIN::exec... so if I disable tracing of
> JOIN::exec(), I lose tracing of the subquery's JOIN::optimize().
>
> Do you have an opinion?
Still asking the question :-)
I am hesitant. I think that, to keep things under control, I would
prefer to silence the execution phase, either by disabling the range opt
calls done in execution, or by having the EXECUTION flag off by default.
--
Mr. Guilhem Bichot <guilhem.bichot@stripped>
Oracle / MySQL / Optimizer team, Lead Software Engineer
Bordeaux, France
www.oracle.com / www.mysql.com