List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:October 16 2010 4:12pm
Subject:Re: one trace per row scanned, by range optimizer
View as plain text  
Hello,

More thoughts:

Guilhem Bichot a écrit, Le 15.10.2010 20:45:
> Hello,
> 
> I'm attaching a result file for discussion.
> I got it by using your patch and disabling the assertion in 
> Opt_trace_context::syntax_error(), so that the test can proceed further.
> There is a first problem in "attached_conditions", but easy to solve and 
> I have just pushed a fix for it (revno 3221). The more interesting 
> happens at end of object "join_optimization" (look for the "invalid 
> JSON" words around there).
> 
> For such query:
> 
> select * from t1, t2
> where t2.a between t1.a - interval 2 day and t1.a + interval 2 day;
> 
> we can see that the range optimizer is called once per row of t1: we 
> read a row of t1, and we call the range optimizer with the range "2 days 
> before or after this row of t1". This is the so-called "range checked 
> for each record" (visible in EXPLAIN). Two rows in t1, so two pieces of 
> trace related to the range optimizer.
> In my test there are only two outer rows, but imagine if there would be 
> 1000 outer rows: we would have 1000 traces of the range optimizer.
> What is the right approach?
> 1) Try to disable tracing at the right moment, to avoid this?

If we do this, the right moment could be in 
join_init_quick_read_record(), putting this
   Opt_trace_disable_I_S otd(tab->join->thd->opt_trace, true);
at start of the function.
That also fixes the syntax error in index_merge_myisam 
--opt-trace-protocol , where we have the range optimizer called once per 
evaluation of a subquery.

> 2) Or let this tracing be, saying that the range optimizer, even called 
> 1000 times in the execution phase, is still an important optimization 
> information, and just count on truncation by 
> @@optimizer_trace_max_mem_size (default: 16kB) if things go unreasonably 
> big? That doesn't sound too bad to me. If we go this way, I should 
> define a proper container for those range-opt-at-exec-time traces: a 
> "join_execution" object in the trace seems mandated, with a "steps" 
> array underneath; each range optimizer trace becomes an element of the 
> array. There is another problem: traces generated by execution have the 
> potential to be big, so truncation by max_mem_size would become 
> frequent; a truncated trace is not parsable by a JSON parser. So the 
> only way for a user to get a parsable trace is to increase max_mem_size 
> until it's big enough to host the execution trace, and it may be 
> unreasonably big then. So maybe we need a flag in JOIN::exec()'s top 
> object ("EXECUTION"?), so that the user can request no tracing in 
> JOIN::exec() and its children. Then the trace stays small, below 
> max_mem_size, not truncated, and parsable.

If we do this, then what should be the default for the "EXECUTION" flag: 
enabled or disabled? disabled is safer, and more in line with the name 
"optimizer trace". Its drawback is that we lose the join_optimization 
block of subqueries (as they are optimized only at execution of the top 
join).

> Disabling tracing at the start of JOIN::exec() is not a solution: in 
> other queries (like the first one using a subquery, in 
> optimizer_trace_no_prot.test), JOIN::optimize for the subquery is called 
> by the top query's JOIN::exec... so if I disable tracing of 
> JOIN::exec(), I lose tracing of the subquery's JOIN::optimize().
> 
> Do you have an opinion?

Still asking the question :-)
I am hesitant. I think that, to keep things under control, I would 
prefer to silence the execution phase, either by disabling the range opt 
calls done in execution, or by having the EXECUTION flag off by default.

-- 
Mr. Guilhem Bichot <guilhem.bichot@stripped>
Oracle / MySQL / Optimizer team, Lead Software Engineer
Bordeaux, France
www.oracle.com / www.mysql.com
Thread
one trace per row scanned, by range optimizerGuilhem Bichot15 Oct
  • Re: one trace per row scanned, by range optimizerGuilhem Bichot16 Oct
    • Re: one trace per row scanned, by range optimizerJorgen Loland18 Oct
      • Re: one trace per row scanned, by range optimizerGuilhem Bichot21 Oct
        • Re: one trace per row scanned, by range optimizerJorgen Loland21 Oct
          • Re: one trace per row scanned, by range optimizerGuilhem Bichot22 Oct