Hello Jorgen,
Jorgen Loland a écrit, Le 17.09.2010 15:08:
> #At
> file:///export/home/jl208045/mysql/wl4800/mysql-next-mr-opt-backporting-wl4800-patchcleanup/
> based on revid:guilhem@stripped
>
> 3203 Jorgen Loland 2010-09-17
> WL4800 - Optimizer trace/debugger
>
> Intended for premature review only - will not be pushed.
>
> Changeset contains
> * --trace-protocol for mtr
> * Tracing for the range optimizer
in the patch there are strings:
add("1", "found_records")
add("false", "usable")
which could be numbers or bools:
add(1, "found_records")
add(false, "usable")
The advantage is that if a language parses this (like Python with its
"json" module), 1 produces a number and false produces a bool, which can
then be used in computations and tests.
QA (Philip Stoev) had requested that we avoid spaces in key names: "key
parts" should be "key_parts", and so on. Thought it would be more grep-able.
About
oto1.add("SYSTEM table", "cause"):
"SYSTEM table" is the "cause" of what?
About
oto1.add("returning empty NULL row", "info"):
"info" is a new tag. Imagine how this would be used by an application
(which has parsed the trace into a Python dictionary or Perl hash): it
wants to know whether an empty NULL row is returned. Should it search
for the value of the "info" key into some object hierarchy, and compare
it to the desired value? My preference would be to have
oto1.add(true, "returning_empty_NULL_row")
instead. Then the app would search for key "returning_empty_NULL_row",
and if there is a such key or the value is true, then it knows.
Rephrasing: the property we are searching for should be described in the
key, if possible.
I have not followed this "logic" for the "cause" key; I have written
add("outer join", "cause")
instead of
add(true, "cause_is_outer_join").
and I'm still balancing between the two... The last one is in line with
my logic, the first one is a generic tag useful to just query "I want to
know the cause of this".
In situations like
Opt_trace_array ota(...);
for (...)
{
code adds one element to "ota" per for() iteration
}
in one case (pull_out_semijoin_tables()) I have put a short-cut at the
start of the function, so that the array is not shown if it will be
empty. I wanted to avoid showing subquery-related information if there
is no subquery.
That is not mandatory for every loop, just something to keep in mind and
probably apply in some cases.
The fact that tracing is in SQL_SELECT::test_quick_select(), which is
probably unavoidable, adds the problem that, this function being called
from several places, we always have to identify the proper context
(array or object). I see your "//temporary fix" about this. Those fixes
may have to stay or be replaced by what you suggested (the
"auto-open-object" feature you explained earlier), let's think a bit
more about this.
No objection to the approach taken for tracing, in the patch.
I don't understand certain new trace points, for example:
"possible min-max indices": [
...
{
"index": "i3",
"usable": false,
"cause": "not applicable"
}
why is this index not applicable? Should that be explained in the trace?
In code it seems to be that !keys_to_use.is_set(idx) is true. But I have
no clue why this key isn't allowed for use. In index_merge_myisam.test
with --trace-protocol there seems to be a lot of such mysterious "not
applicable" indices; they make the trace grow quite a bit alas. But
maybe it has to be (maybe they carry real information?).
"using quick", "TRP", "cost with sweep", "(SP)": we should find
alternative words understandable by a user. I heard that in MySQL
"quick" roughly means "range optimizer" but this isn't very intuitive ;-)
You could push the trace-protocol part now. I suggest naming it
--opt-trace-protocol and using also this name for variables in
mysqltest.cc (there are traces for so many things: DBUG trace,
dtrace...). trace_re could be changed to catch "EXPLAIN EXTENDED
SELECT". In run_trace():
- "select trace" is a fixed query: we don't need to allocate 256 bytes
for it, the default of 128 should do.
- I think we don't need ds_warnings_messages.