Hello,
Jorgen Loland a écrit, Le 16.03.2011 15:35:
> Guilhem,
>
> Thank you for the thorough review. I had to think a little bit and write
> a function that prints the SEL_ARG tree before replying. See comments
> inline.
thanks for the clear graphs and the explanations here and on IRC.
Could you please spell completely "red-black tree" in opt_range.cc in
"all intervals in the list form an RB-tree"
?
> Guilhem Bichot wrote:
>>> + ] /* ranges */,
>>> + "index_only": false,
>>> + "records": 6,
>>> + "cost": 10.21,
>>> + "rowid_ordered": false,
>>> + "chosen": false,
>>> + "cause": "cost"
>>> + }
>>> + ] /* range_scan_alternatives */,
>>> + "analyzing_roworder_intersect": {
>>> + "usable": false,
>>> + "cause": "too_few_roworder_scans"
>>> + } /* analyzing_roworder_intersect */
>>> + } /* analyzing_range_alternatives */,
>>> + "chosen_range_access_summary": {
>>> + "range_access_plan": {
>>> + "type": "range_scan",
>>> + "index": "PRIMARY",
>>> + "records": 2,
>>> + "ranges": [
>>> + "4 <= key1a <= 4 AND 3 < key1b < 7",
>>
>> this one above is not printed (as indicated in the commit comment).
>>
>> In sel_arg_range_seq_next(), we seem to navigate:
>> walk_up_n_right:
>> while (key_tree->prev && key_tree->prev != &null_element)
>> {
>>
>> don't you need such "&& key_tree->prev != &null_element" in your
>> navigation too?
> Please see attachments that shows the SEL_ARG tree for this query
>
> EXPLAIN SELECT * FROM t2 WHERE (key1b > 2 and key1b < 4)
> or (key1b > 5 and key1b < 6)
> or (key1b > 8 and key1b < 9)
> or (key1b > 10 and key1b < 4)
> or (key1b > 15 and key1b < 16)
> or (key1b > 17 and key1b < 20)
> or (key1b > 22 and key1b < 24)
> or (key1b > 25 and key1b < 28)
> or (key1b > 30 and key1b < 35);
>
> The illustrations show this:
> * rb_prevnext shows prev pointers in black and next pointers in green
> * rb_leftright shows left and right pointers
>
> As can be seen, next/prev are null-pointers when reaching the end of the
> ranges. null_element is what left/right point to when a node does not
> have a child. Wikipedia has an excellent explanation of RB trees without
> next/prev pointers, and null_element is used exactly as described there.
>
> So, instead of adding key_tree->prev/next != &null_element to my code
> change, I tried to replace existing such tests with DBUG_ASSERTS. As
> expected, no mtr test failed.
I vote for putting DBUG_ASSERTs then.
And please put a comment explaining that NULL is sentinel for next/prev
whereas null_element is sentinel for left/right.
And chase wrong comparisons of &null_element with prev/next.
>>> === modified file 'sql/opt_range.cc'
>>> --- a/sql/opt_range.cc 2011-02-16 15:06:06 +0000
>>> +++ b/sql/opt_range.cc 2011-03-15 13:31:03 +0000
>>> @@ -2018,13 +2018,17 @@ void TRP_RANGE::trace_basic_info(const P
>>> add_utf8("index", cur_key.name).add("records", records);
>>>
>>> Opt_trace_array trace_range(param->thd->opt_trace, "ranges");
>>> - for (const SEL_ARG *current= key;
>>> - current;
>>> - current= current->next)
>>> +
>>> + const SEL_ARG *current_range= key;
>>
>> do you understand why TRP_RANGE::key is the _last_ of ranges?
>
> It's not :) It is the ROOT in the RB tree. Consider the trace output of
> the query pasted above without my patch:
>
> "analyzing_range_alternatives": {
> "range_scan_alternatives": [
> {
> "index": "i1b",
> "ranges": [
> "2 < key1b < 4",
> "5 < key1b < 6",
> "8 < key1b < 9",
> "15 < key1b < 16",
> "17 < key1b < 20",
> "22 < key1b < 24",
> "25 < key1b < 28",
> "30 < key1b < 35"
> ],
> "index_only": false,
> "records": 12,
> "cost": 22.41,
> "rowid_ordered": false,
> "chosen": true
> }
> ],
> "analyzing_roworder_intersect": {
> "usable": false,
> "cause": "too_few_roworder_scans"
> }
> },
> "chosen_range_access_summary": {
> "range_access_plan": {
> "type": "range_scan",
> "index": "i1b",
> "records": 12,
> "ranges": [
> "15 < key1b < 16",
> "17 < key1b < 20",
> "22 < key1b < 24",
> "25 < key1b < 28",
> "30 < key1b < 35"
> ]
> },
> "records_for_plan": 12,
> "cost_for_plan": 22.41,
> "chosen": true
> }
>
ok
>>
>> In the testcase (added to the .inc file), TRP_RANGE::key is set here
>> (opt_range.cc:5367):
>>
>> if (key_to_read)
>> {
>> idx= key_to_read - tree->keys;
>> if ((read_plan= new (param->mem_root) TRP_RANGE(*key_to_read, idx,
>> best_mrr_flags)))
>> (the constructor sets TRP_RANGE::key to *key_to_read).
>> Is there, in the existing code, some place where we take this
>> TRP_RANGE::key and go up following the "prev" pointers?
>
> When constructing QUICK_SELECT (get_quick_keys()) we recursively call
> get_quick_keys() on the left-pointer until it's left pointer is
> null_element. So this is a good point... maybe I should replace prev
> with left (and then do the !null_element thingy) in my patch. Do you
> prefer that?
Yes, it would be closer to existing code this way. And as you wrote in
another mail, it should make less hops.
>> Without an existing example, I'm unsure of what we're doing... I'm
>> unsure of what the _execution_ really searches for in the table:
>> both
>> "4 <= key1a <= 4 AND 3 < key1b < 7",
>> "5 <= key1a <= 5 AND 2 < key1b < 10"
>> or just
>> "5 <= key1a <= 5 AND 2 < key1b < 10"
>> ??
>
> The execution does the right thing due to the mentioned
> get_quick_keys(). Imagine that execution had missed this range... in
> that case, we would have missing record-bugs.
ok
>>
>>> + // make current_range point to the first interval
>>> + while (current_range->prev)
>>
>> is it 100% sure that current_range starts non-NULL (so that we can
>> read its "prev" above)?
>
> Something is terribly wrong if we try to do range access and have no
> ranges, so if that happens we have bigger problems at hand. If there are
> no ranges we shouldn't even consider to do range access.
ok
>>> + current_range= current_range->prev;
>>> +
>>> + while (current_range)
>>
>> in other existing code I find:
>> /* Step down if we can */
>> if (key_tree->next && key_tree->next != &null_element)
>> {
>> // Step down; update the tuple
>>
>> so don't we need "&& current_range != &null_element" in the above
>> while() condition?
>
> Explained above. I replaced it with a debug_assert that next is not
> null_element. No failures.
Let's have DBUG_ASSERT() then.
So your patch is all ok except that we agree that the patch should
navigate with "left" instead of "prev".