From: Roy Lyseng Date: October 21 2010 9:21pm Subject: Re: bzr commit into mysql-next-mr-bugfixing branch (roy.lyseng:3260) Bug#45174 Bug#50019 Bug#52068 List-Archive: http://lists.mysql.com/commits/121580 Message-Id: <4CC0AEE3.5080504@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 21.10.10 22.30, Guilhem Bichot wrote: > Hi again, > What I don't understand is why here it's not evil to equate outer fields with > inner fields. I thought doing this was the cause of BUG#45174? > If we here create equality ot.a=it1.a, why won't this equality be used when > materializing the table and produce a bug? There is even a comment in this same > function which says "we must not construct general equalities like > ot1.col=it1.col"...? We need to enforce ot1.col=it1.col to execute the IN condition, but the convention is that we perform this only for the first table from the semijoin. If there is more than 1 column within the semijoin, the implication is that the additional columns are matched with the first SJ column, during materialization. The used_tables filtering within make_cond_for_table() and make_cond_after_sjm() will make sure that the equalities are evaluated at the correct place. > >>>> + The latter equalities are the ones that must be tagged with a marker, >>>> + to prevent them from being evaluated at the wrong place. >>>> */ >>>> - TABLE_LIST *emb_nest= - >>>> item_field->field->table->pos_in_table_list->embedding; >>>> - if (!item_const && emb_nest && emb_nest->sj_mat_exec) >>>> + head= item_const ? item_const : item_equal->get_first(item); >>>> + if (head == item) >>> >>> REQ this if(head==item) branch, I don't understand. >> >> Notice that we skip the list head when there is no const item... > > yes, makes sense to favour const item. > >> It means that if (head == item), "item" must be the first item that is within >> an SJM nest. > > Can't it instead be, in some cases, that "item" is merely the first of all items > of the Item_equal? I think you missed the point of the "Notice" above: If there is no const item, we start substitution from the second item in the multiple equality. So, the first item is never substituted (unless it is substituted with a constant value). >>> This function has comment: >>> "A regular semi-join materialization is always non-correlated, ie >>> the subquery is always resolved by performing a lookup generated in >>> create_subquery_equalities, hence this function should never produce >>> any condition for regular semi-join materialization. >>> For a scan semi-join materialization, this function may return a condition to be >>> checked." >>> >>> but, I see that this function does produce a condition for >>> materialization-lookup in some cases, before and after your patch. For example, >>> in subquery_sj_mat.test, in the section for BUG#45174, for the "EXPLAIN SELECT >>> varchar_nokey etc", a condition is produced by make_cond_after_sjm(). This >>> happens before and after your patch, though the produced conditions are >>> different (and I find that your patch makes things better, as it manages to test >>> more conditions on t2 when materializing; more conditions at low levels, is >>> better :-). >>> >>> So this comment looks wrong, but this isn't due to your patch. >> >> Nevertheless, it is a good catch, and it's definitely something that should be >> kept in sync. > > Then I suggest you amend this wrong comment. Will do. > >> I eliminated the call to make_join_after_sjm() in case of SJM-lookup, to >> eliminate generating the redundant after-condition. There were no result >> changes after this fix. > > ok > >>>> { >>>> + if (!cond) >>>> + return (Item*) NULL; >>> >>> REQ but make_cond_for_table() behaves differently: it crashes if >>> cond is NULL. Can we make those two twins have the same expectations on 'cond'? >> >> I think that if I should do that, then there are also quite a few "if (cond)" >> expressions that can be removed from the callers. But I also wonder if we >> should change these functions to follow "normal" error handling practice, ie >> returning TRUE on failure and FALSE otherwise. > > Apparently those functions may have only OS malloc errors (in "new"). In this > case they return NULL. It will lead to bad query results (I tested: simulated > failures of all "new" operator calls in those functions). > So indeed, returning NULL in this case is wrong, there should be an error instead. > It can either be a TRUE/FALSE bool error return code, then have "cond" as in/out > parameter as you suggest. Or keep Item* as return code, but use a special > impossible pointer to signal error; 0x1 is used in some places of the code > already as an "impossible pointer value". > But I'm not asking you to implement any of this; the shortest course of action > would probably be to just keep make_cond_after_sjm() as it was before the patch, > i.e. require that cond!=NULL in this function. > Then I can file a bug about those OOM problems. > How does this sound? Sounds good. > PS: I have replaced the next-mr-guilhem tree with something else (I needed to > test something on pb2 and this tree is registered in pb2; let me know if you > need it again and I'll restore it). > It's a good idea, but right now I have a lot of bugs and code reviews to complete... Thanks, Roy