Hi Ole John,
I think including sort time is good to achieve earlier pruning of
plans. I also think that the patch looks good and have only some minor
comments.
OK to push this patch when you have considered my comments.
Olav
On 02/ 1/11 11:27 AM, Ole John Aske wrote:
> #At file:///net/fimafeng09/export/home/tmp/oleja/mysql/mysql-trunk/ based on
> revid:ole.john.aske@stripped
>
> 3553 Ole John Aske 2011-02-01
> Last addendum fix related to bug#59326, and particularly
> its relatives/duplicates bug#41740 and bug#58225.
>
> If a filesort is required by the final result, we can include
> the cost for the filesort in the pruning of 'bad' plans, and
> possible prune the plan at an earlier stage than today.
>
> NOTE1: The 'sort_cost' is calculated with logic similar
> to the one in plan_is_complete() which adds the sort_cost
> to the best plan.
>
> NOTE2: The 'sort_cost' is calculated as '= current_record_count'.
> This is not entirely correct when used in the pruning logic
> as we do not yet #rows in the *final* result which is what
> we have to sort. However, it should be a safe assumption that
> 'final #rows'>= 'intermediat #rows'. So this prune-calculation
> should always be on the safe side.
>
> NOTE3: Calculating 'sort_cost= current_record_count' is a bad
> estimate as most sort algorithms are O(n*log(n)).
>
> This fix reduce the query compile time for the infamous bug#58225 problem
> further from 50ms to ~20ms
>
> modified:
> sql/sql_select.cc
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc 2011-02-01 09:43:41 +0000
> +++ b/sql/sql_select.cc 2011-02-01 10:27:28 +0000
> @@ -8325,7 +8325,31 @@ best_extension_by_limited_search(JOIN
> backout_nj_sj_state(remaining_tables, s);
> continue;
> }
> - }
> +
> + /*
> + There might later apply an addition cost for filesorting
> + if the result can't be ordered by using an ordered index.
> + Can prune if 'cost sofar + later_sort' already exceeds 'best_read'.
> + Note: Similar code in plan_is_complete().
> + */
> + if (idx> join->const_tables&& //
> '->positions[const_tables]' valid?
I think the "positions array" should be valid also for the const tables
portion. If it is not then it would be good to know the reason why and
add that as comment instead.
On the other side: Since this is just a heuristics I think it is fine to
skip this test when idx is small then const_tables.
> + join->sort_by_table&&
> + join->sort_by_table !=
> + join->positions[join->const_tables].table->table)
> + {
> + double sort_time = current_record_count;
Two tiny proposal for changes:
1. add "const" to the sort_time.
2. remove the space in front of "="
About using current_record_count as an estimate for sort_time: I agree
with your comments in the commit message and I think it is a good idea
to use the same as in plan_is_complete(). Still, I feel that at least
for small number of records it would be more correct to divide this
number with the TIME_TO_COMPARE constant here.
> + if (current_read_time + sort_time>= join->best_read)
> + {
> + DBUG_EXECUTE("opt", print_plan(join, idx+1,
> + current_record_count,
> + read_time,
> + current_read_time,
> + "prune_by_sortcost_heuristic"););
> + backout_nj_sj_state(remaining_tables, s);
> + continue;
> + }
> + }
> + } // if (prune...)
>
> if ( (search_depth> 1)&& (remaining_tables&
> ~real_table_bit)& allowed_tables )
> {
>
>
>
>