On 24.06.11 15.00, Guilhem Bichot wrote:
> Roy Lyseng a écrit, Le 22.06.2011 14:46:
>> Hi Guilhem,
>> thank you for a clever bugfix, it certainly makes the logic of the optimizer
>> simpler and clearer.
>> I have a few typographical comments, and a suggestion for a followup patch
>> below, otherwise the bugfix is approved.
> ok, please see below
>> On 17.06.11 14.46, Guilhem Bichot wrote:
>>> #At file:///home/mysql_src/bzrrepos_new/mysql-next-mr-opt-backporting/ based
>>> on revid:jorgen.loland@stripped
>>> 3385 Guilhem Bichot 2011-06-17
>>> Fix for
>>> BUG#12586926 "EXTRA ROW WITH JOIN + GROUP BY + ORDER BY WITH JCL>=5 AND
>>> BUG#12619510 "JCL: MORE ROWS AND DIFFERENT OUTPUT WITH JCL>=5"
>>> BUG#12619868 "JCL: MORE ROWS OF OUTPUT WHEN JCL>=5":
>> I'd appreciate it if the commit log lines were max. 70 characters wide. Some
>> tools have a nasty habit of wrapping lines wider than that.
> I try to watch the length of lines (you already asked for it in the past) but
> sometimes I forget; gcommit does not warn about this.
>>> the function takes at attempt at disabling the choice of index ordering for
>>> GROUP BY
>> "an attempt"
>>> 3) The recently added assertion in setup_join_buffering() is removed;
>>> this assertion probably intended to say that if we disable join buffering in
>>> function we should already have known it before, when we computed costs; but
>>> - it could never fire before, because no_jbuf_after was always bigger than
>>> - now it can fire, due to "FORCE INDEX FOR (GROUP|ORDER) BY" (which now
>>> properly disables join buffering); in that case it's ok that the cost
>>> calculation was not aware: when the user uses FORCE INDEX she/he
>>> ignores cost calculations.
>> I do not agree to this. There is good reason to have join buffering considered
>> as early as possible.
>> If we have a join of tables t1, t2 and t3 and force to use an index for t1
>> (which effectlively makes the table first in the join order), efficient
>> calculation of the join plan for t2 and t3 still matters. By applying this
>> information at the earlier stage inside the greedy optimizer, would make the
>> cost estimates more accurate and hence there is a better chance of getting an
>> optimal plan.
> ok, now I understand.
>> It think it would be possible to do this as follows, in a separate followup
>> 1. When a table has a FORCE INDEX FOR GROUP BY/ORDER BY on it, make it first
>> in the join plan by setting all other tables to be dependent on this table
>> (unless the table itself has dependencies to it).
>> 2. When considering the first table in best_access_path(), set a flag in the
>> JOIN object that tells whether to consider join buffering for subsequent
>> tables or not.
>> 3. Take this flag into consideration when determining whether to use join
>> buffering for all tables.
> I think I could, in a follow-up patch, do 2) and 3) and restore your assertion.
> As for 1), it scares me to death - I fear it changes plans, or I mess up with
> table dependencies.
> Let me know if that compromise would be ok, and I'll put it on my todo.
Sure, no problem. As for point 1, we could check with others if they think this
is a good idea.