Hello Tor,
Tor Didriksen a écrit, Le 22.07.2010 09:06:
> #At file:///export/home/didrik/repo/next-mr-opt-backporting-valgrind/ based on
> revid:roy.lyseng@stripped
>
> 3217 Tor Didriksen 2010-07-22
> Bug #55042 valgrind reports error in best_access_path when running
> subquery_mat_all
>
> The code was using loose_scan_pos, even though it was explicitly set to
> invalid.
> @ sql/sql_select.cc
> Terminate search in optimize_wo_join_buffering() if we has a cost of DBL_MAX
> (as determined by best_access_path())
>
> Remove sj_inner_fanout output argument, it was unused.
>
> modified:
> sql/sql_select.cc
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2010-07-16 12:21:31 +0000
> +++ b/sql/sql_select.cc 2010-07-22 07:06:33 +0000
> @@ -13299,8 +13299,6 @@ static bool check_interleaving_with_nj(J
> table
> reopt_rec_count OUT New output record count
> reopt_cost OUT New join prefix cost
> - sj_inner_fanout OUT Fanout in the [first_tab; last_tab] range that
> - is produced by semi-join-inner tables.
>
> DESCRIPTION
> Given a join prefix [0; ... first_tab], change the access to the tables
> @@ -13317,23 +13315,19 @@ static bool check_interleaving_with_nj(J
> void optimize_wo_join_buffering(JOIN *join, uint first_tab, uint last_tab,
> table_map last_remaining_tables,
> bool first_alt, uint no_jbuf_before,
> - double *reopt_rec_count, double *reopt_cost,
> - double *sj_inner_fanout)
> + double *reopt_rec_count, double *reopt_cost)
> {
> - double cost, outer_fanout, inner_fanout= 1.0;
> + double cost=0.0, outer_fanout=1.0, inner_fanout= 1.0;
> table_map reopt_remaining_tables= last_remaining_tables;
> uint i;
>
> + DBUG_ENTER("optimize_wo_join_buffering");
> +
> if (first_tab > join->const_tables)
> {
> cost= join->positions[first_tab - 1].prefix_cost.total_cost();
> outer_fanout= join->positions[first_tab - 1].prefix_record_count;
> }
> - else
> - {
> - cost= 0.0;
> - outer_fanout= 1;
> - }
moving this "else" block to unconditional initializations is more a
matter of taste than a bugfix. I liked the original code better: it says
that if the table is the first in join order, the cost so far is 0, and
the fanout is 1, that's logical. Whereas the new code sets to 0 and 1
and corrects that if the table is not first.
But that is a detail, you can keep your change if you like it.
> for (i= first_tab; i <= last_tab; i++)
> reopt_remaining_tables |= join->positions[i].table->table->map;
> @@ -13346,16 +13340,28 @@ void optimize_wo_join_buffering(JOIN *jo
> if ((i == first_tab && first_alt) ||
> join->positions[i].use_join_buffer)
> {
> /* Find the best access method that would not use join buffering */
> - best_access_path(join, rs, reopt_remaining_tables, i,
> - test(i < no_jbuf_before), inner_fanout*outer_fanout,
> - &pos, &loose_scan_pos);
> + best_access_path(join,
> + rs,
> + reopt_remaining_tables,
> + i,
> + test(i < no_jbuf_before),
This isn't your code, but this test() is superfluous, we could just use
"i < no_jbuf_before", it is up to you.
> + inner_fanout*outer_fanout,
This isn't your code, but the coding style asks for spaces around *.
> + &pos,
> + &loose_scan_pos);
That makes 8 lines out of 3, and I found the 3 were sufficiently
readable, so... do you really want this layout change?
> }
> else
> pos= join->positions[i];
>
> - if ((i == first_tab && first_alt))
> + if (i == first_tab && first_alt)
> pos= loose_scan_pos;
>
> + if (pos.read_time == DBL_MAX)
I suggest adding a comment describing this condition (it must be about
something being impossible, though I don't picture myself what an
"impossible pos" can be).
> + {
> + *reopt_rec_count = DBL_MAX;
no space before =
> + *reopt_cost= DBL_MAX;
> + DBUG_VOID_RETURN;
> + }
> +
> reopt_remaining_tables &= ~rs->table->map;
Should this arithmetic above be moved to before the if(), to mark that
the table has been handled? Or maybe we can rather say that it has not
been handled?
> cost += pos.read_time;
> @@ -13525,11 +13533,10 @@ void advance_sj_state(JOIN *join, table_
> Got a complete FirstMatch range.
> Calculate correct costs and fanout
> */
> - double reopt_cost, reopt_rec_count, sj_inner_fanout;
> + double reopt_cost, reopt_rec_count;
> optimize_wo_join_buffering(join, pos->first_firstmatch_table, idx,
> remaining_tables, FALSE, idx,
> - &reopt_rec_count, &reopt_cost,
> - &sj_inner_fanout);
> + &reopt_rec_count, &reopt_cost);
> /*
> We don't yet know what are the other strategies, so pick the
> FirstMatch.
> @@ -13589,7 +13596,7 @@ void advance_sj_state(JOIN *join, table_
> first=join->positions + pos->first_loosescan_table;
> uint n_tables=
> my_count_bits(first->table->emb_sj_nest->sj_inner_tables);
> /* Got a complete LooseScan range. Calculate its cost */
> - double reopt_cost, reopt_rec_count, sj_inner_fanout;
> + double reopt_cost, reopt_rec_count;
> /*
> The same problem as with FirstMatch - we need to save POSITIONs
> somewhere but reserving space for all cases would require too
> @@ -13600,18 +13607,23 @@ void advance_sj_state(JOIN *join, table_
> TRUE, //first_alt
> pos->first_loosescan_table + n_tables,
> &reopt_rec_count,
> - &reopt_cost, &sj_inner_fanout);
> - /*
> - We don't yet have any other strategies that could handle this
> - semi-join nest (the other options are Duplicate Elimination or
> - Materialization, which need at least the same set of tables in
> - the join prefix to be considered) so unconditionally pick the
> - LooseScan.
> - */
> - pos->sj_strategy= SJ_OPT_LOOSE_SCAN;
> - *current_read_time= reopt_cost;
> - *current_record_count= reopt_rec_count;
> - handled_by_fm_or_ls= first->table->emb_sj_nest->sj_inner_tables;
> + &reopt_cost);
> + if (reopt_cost < DBL_MAX)
> + {
> + /*
> + We don't yet have any other strategies that could handle this
> + semi-join nest (the other options are Duplicate Elimination or
> + Materialization, which need at least the same set of tables in
> + the join prefix to be considered) so unconditionally pick the
> + LooseScan.
> + */
> + pos->sj_strategy= SJ_OPT_LOOSE_SCAN;
> + *current_read_time= reopt_cost;
> + *current_record_count= reopt_rec_count;
> + handled_by_fm_or_ls= first->table->emb_sj_nest->sj_inner_tables;
> + }
> + else
> + DBUG_PRINT("info", ("Cannot use LooseScan"));
> }
> }
The function comment of optimize_wo_join_buffering() does not suggest
that this function can "fail", I mean, can return a plan with DBL_MAX
which apparently means "not usable". I think the comment should say that
this is possible. And, is such "failure" possible only if first_alt is
true? If yes, I would state it in the comment. If no, i.e. it can happen
with first_alt==false, then the code for SJ_OPT_FIRSTMATCH:
optimize_wo_join_buffering(join, pos->first_firstmatch_table, idx,
remaining_tables, FALSE, idx,
&reopt_rec_count, &reopt_cost,
&sj_inner_fanout);
<cut comment>
pos->sj_strategy= SJ_OPT_FIRST_MATCH;
*current_read_time= reopt_cost;
*current_read_time= reopt_cost;
*current_record_count= reopt_rec_count;
handled_by_fm_or_ls= pos->firstmatch_need_tables;
is incorrect (needs to test for reopt_cost==DBL_MAX).