|List:||Commits||« Previous MessageNext Message »|
|From:||Guilhem Bichot||Date:||August 12 2010 7:11pm|
|Subject:||Re: bzr commit into mysql-next-mr-opt-backporting branch (guilhem:3208)|
|View as plain text|
Hello, Some more thoughts: Guilhem Bichot a écrit, Le 11.08.2010 17:26: >>>>> + JOIN_TAB *first_unmatched= join_tab->first_unmatched; >>>>> + if ((first_unmatched= first_unmatched->first_upper)&& >>>>> + first_unmatched->last_inner != join_tab) >>>> > + first_unmatched= 0; >>>> >>>> Suggestion: Rewrite the above four lines like this: >>>> JOIN_TAB *first_unmatched= join_tab->first_unmatched->first_upper; >>>> if (first_unmatched && first_unmatched->last_inner != > join_tab) >>>> first_unmatched= NULL; >>> >>> done, like this: >>> JOIN_TAB *first_unmatched= join_tab->first_unmatched->first_upper; >>> if (first_unmatched != NULL && >>> first_unmatched->last_inner != join_tab) >>> first_unmatched= NULL; >>> join_tab->first_unmatched= first_unmatched; >>> >>>> If possible, please document what this code block does... >>> >>> Jorgen also asked for documentation of this same block. In my patch >>> it has >>> documentation: >>> /* >>> From the point of view of the rest of execution, this record matches >>> (it has been built and satisfies conditions, no need to do more >>> evaluation >>> on it). See similar code in evaluate_join_record(). >>> */ >>> and in evaluate_join_record() it has comment: >>> /* >>> Check whether join_tab is not the last inner table >>> for another embedding outer join. >>> */ >>> So somehow, the first comment suggest to look at the second comment. >>> Do you mean I should copy-paste the comment "check whether etc" into >>> evaluate_null_etc() ? >> >> I think that "check whether..." can be derived from the code. But what >> does the assignment to first_unmatched really mean? This goes beyond >> your fix, so do not take it as a requirement. > > first_unmatched is described in the "IMPLEMENTATION" part of the > function comment of sub_select(). > What this code block is doing? consider this easy sample query: > SELECT > ... > FROM > t1 > LEFT JOIN > ( > ( > t2 > LEFT JOIN > (t3, t4) > ON t3.a=1 AND t2.b=t4.b > ), > ( > t5 > LEFT JOIN > ( > (t6, t7) > LEFT JOIN > t8 > ON t7.b=t8.b AND t6.b < 10 > ) > ON t6.b >= 2 AND t5.b=t7.b > ) > ON (t3.b=2 OR t3.c IS NULL) AND (t6.b=2 OR t6.c IS NULL) AND > (t1.b=t5.b OR t3.c IS NULL OR t6.c IS NULL or t8.c IS NULL) AND > (t1.a != 2) > > The query plan is t1,t2,t3,t4,t5,t7,t6,t8. When we are in > evaluate_join_record() for join_tab==t4 (I'm using a tree without my > patch), and come to just before the start of this code: > } > /* > Check whether join_tab is not the last inner table > for another embedding outer join. > */ > if ((first_unmatched= first_unmatched->first_upper) && > first_unmatched->last_inner != join_tab) > first_unmatched= 0; > first_unmatched is t3 and first_unmatched->first_upper is t2 > (understandable from the comment of sub_select()): so far we didn't > have a match in (t3,t4) (for the join to t2), so the first_unmatched of > t4 was t3, now we have a match, so we look up (first_upper) as it's time > to re-evaluate ON/WHERE conditions which were so far disabled: do we > have a match in t2 now (for the join to t1)? Well, t2's last_inner is > t8, which is not t4, so we go into first_unmatched=0: t4's > first_unmatched is now 0. > Why t2's last inner is t8? I really wonder, to me t8 is in a different > nest, joined with t2's nest via an _inner_ - not outer - join, so I > cannot see why t8 would matter to t2. > Even if it were correct, why should t4's first_unmatched be 0? I wonder. > How did I come to the simple query above? I put a DBUG_ASSERT(0) in the > "first_unmatched=0" block, and run the "optimizer tests" (group*.test, > join*.test select*.test subquery*.test to name a few), and it fired in a > single query of a single test (join_nested.test). > So it seems to be rarely executed. And if I comment out > "first_unmatched=0", join_nested.test still passes. > Sounds fishy? Oh yes. > However I believe that my patch is correct, because before my patch, the <cut the rest> What breaks my mental picture is that t2's last_inner is t8, I expected it to be t4: I expect last_inner to be the most inner table of the outer join of which t2 is the left member. IF it were the case, here is how I would explain this code: > if ((first_unmatched= first_unmatched->first_upper) && > first_unmatched->last_inner != join_tab) > first_unmatched= 0; We found a match in t3,t4; can we know say that the row of t2 is a match? Two cases: - if t4 is the last table of t2's nested join, for example the query is like t1 LEFT JOIN (t2 LEFT JOIN (blah LEFT JOIN (blih etc ... ( very deep ... (. ... t4 WHERE t2.a IS NULL)))))) then the row of t2 has passed all obstacles and can be called a match, so we can evaluate IS NULL: we evaluate, then will go up again following first_unmatched and first_upper (next iteration) to the enclosing nests - otherwise, the query could be like t1 LEFT JOIN (t2 LEFT JOIN (blah LEFT JOIN t4) LEFT JOIN t5 WHERE t2.a IS NULL) in which case, we need to look at t5, it's yet another obstacle for t2's row: indeed the IS NULL cannot yet be tested, as the presence of t5 rows will influence whether t2's row passes or is eliminated and a null-complemented t2 row has to be made. So it's not time to evaluate the portion of WHERE clause which applies to t2, so we break the loop and tell t4 that it does not have to deal with its first_unmatched (set it to 0); in execution of evaluate_join_record() for "t5 etc" further down the stack, we will finally reach the moment when all obstacles for t2's row have been passed, and only then we will evaluate the WHERE clause for t2.