Hello,
Roy Lyseng a écrit, Le 22.06.2011 14:46:
> Hi Guilhem,
>
> thank you for a clever bugfix, it certainly makes the logic of the
> optimizer simpler and clearer.
Thanks!
> I have a few typographical comments, and a suggestion for a followup
> patch below, otherwise the bugfix is approved.
ok, please see below
> On 17.06.11 14.46, Guilhem Bichot wrote:
>> #At file:///home/mysql_src/bzrrepos_new/mysql-next-mr-opt-backporting/
>> based on revid:jorgen.loland@stripped
>>
>> 3385 Guilhem Bichot 2011-06-17
>> Fix for
>> BUG#12586926 "EXTRA ROW WITH JOIN + GROUP BY + ORDER BY WITH
>> JCL>=5 AND MRR ENABLED"
>> BUG#12619510 "JCL: MORE ROWS AND DIFFERENT OUTPUT WITH JCL>=5"
>> BUG#12619868 "JCL: MORE ROWS OF OUTPUT WHEN JCL>=5":
>
> I'd appreciate it if the commit log lines were max. 70 characters wide.
> Some tools have a nasty habit of wrapping lines wider than that.
I try to watch the length of lines (you already asked for it in the
past) but sometimes I forget; gcommit does not warn about this.
>> the function takes at attempt at disabling the choice of
>> index ordering for GROUP BY
>
> "an attempt"
fixed
>> 3) The recently added assertion in setup_join_buffering() is
>> removed;
>> this assertion probably intended to say that if we disable
>> join buffering in this
>> function we should already have known it before, when we
>> computed costs; but
>> - it could never fire before, because no_jbuf_after was
>> always bigger than tableno
>> - now it can fire, due to "FORCE INDEX FOR (GROUP|ORDER) BY"
>> (which now
>> properly disables join buffering); in that case it's ok that
>> the cost
>> calculation was not aware: when the user uses FORCE INDEX
>> she/he intentionally
>> ignores cost calculations.
>
> I do not agree to this. There is good reason to have join buffering
> considered as early as possible.
>
> If we have a join of tables t1, t2 and t3 and force to use an index for
> t1 (which effectlively makes the table first in the join order),
> efficient calculation of the join plan for t2 and t3 still matters. By
> applying this information at the earlier stage inside the greedy
> optimizer, would make the cost estimates more accurate and hence there
> is a better chance of getting an optimal plan.
ok, now I understand.
> It think it would be possible to do this as follows, in a separate
> followup patch:
>
> 1. When a table has a FORCE INDEX FOR GROUP BY/ORDER BY on it, make it
> first in the join plan by setting all other tables to be dependent on
> this table (unless the table itself has dependencies to it).
>
> 2. When considering the first table in best_access_path(), set a flag in
> the JOIN object that tells whether to consider join buffering for
> subsequent tables or not.
>
> 3. Take this flag into consideration when determining whether to use
> join buffering for all tables.
I think I could, in a follow-up patch, do 2) and 3) and restore your
assertion. As for 1), it scares me to death - I fear it changes plans,
or I mess up with table dependencies.
Let me know if that compromise would be ok, and I'll put it on my todo.
>> +CREATE TABLE t2 (
>> +pk int(11) NOT NULL AUTO_INCREMENT,
>> +col_varchar_key varchar(1) NOT NULL,
>> +PRIMARY KEY (pk)
>> +);
>
> Perhaps delete AUTO_INCREMENT.
done
>> +CREATE TABLE t1 (
>> + col_varchar_key varchar(1))
>> + ENGINE=MyISAM DEFAULT CHARSET=latin1;
>> +CREATE TABLE t2 (
>> + pk int(11) NOT NULL AUTO_INCREMENT,
>> + col_int_nokey int(11) NOT NULL,
>> + col_int_key int(11) NOT NULL,
>> + PRIMARY KEY (pk),
>> + KEY col_int_key (col_int_key)
>> +) ENGINE=MyISAM AUTO_INCREMENT=21 DEFAULT CHARSET=latin1;
>
> I think that you can delete AUTO_INCREMENT and DEFAULT CHARSET above.
done, and deleted ENGINE=MyISAM (it's superfluous).