List:Internals« Previous MessageNext Message »
From:Guilhem Bichot Date:March 12 2010 3:32pm
Subject:Re: Question about the subquery transform
View as plain text  
Hello,

Zhixuan Zhu a écrit, Le 04.03.2010 18:00:
> Hello,
> 
> We're using mysql-5.1.40.
> 
> I have a question about the subquery transform in MySQL's optimizer.
> 
> Given the following query:
> 
> select 1 from sub1 where (c1,c2) in (select min(c1),min(c2) from sub2
> group by c3 having count(c1) > 4);
> 
> It generates the following extended plan output:
> 
> mysql> show warnings;
>
> +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> | Level | Code | Message
> 
> 
> 
> 
> 
> 
> 
> 
>         |
>
> +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> | Note  | 1003 | select 1 AS `1` from `qrefd01m`.`sub1` where
>
> <in_optimizer>((`qrefd01m`.`sub1`.`c1`,`qrefd01m`.`sub1`.`c2`),<exists>(select
> min(`qrefd01m`.`sub2`.`c1`) AS `min(c1)`,min(`qrefd01m`.`sub2`.`c2`)
> AS `min(c2)` from `qrefd01m`.`sub2` group by `qrefd01m`.`sub2`.`c3`
> having ((count(`qrefd01m`.`sub2`.`c1`) > 4) and
> ((<cache>(`qrefd01m`.`sub1`.`c1`) = min(`qrefd01m`.`sub2`.`c1`)) or
> isnull(min(`qrefd01m`.`sub2`.`c1`))) and
> ((<cache>(`qrefd01m`.`sub1`.`c2`) = min(`qrefd01m`.`sub2`.`c2`)) or
> isnull(min(`qrefd01m`.`sub2`.`c2`))) and
> <is_not_null_test>(min(`qrefd01m`.`sub2`.`c1`)) and
> <is_not_null_test>(min(`qrefd01m`.`sub2`.`c2`))))) |
>
> +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> 1 row in set (0.00 sec)
> 
> I found this transformation very confusing. MySQL pushs the join
> condition down to the subquery, which is smart. But I don't understand
> why a "or isnull" has to be attached to the join. This is in the WHERE
> clause, and it's a IN variant (not NOTIN), can't NULL result be safely
> ignored? Even if it can't, the optimizer again adds a
> <is_not_null_test> checker to the having clause, which exactly
> contradict the "or isnull" filter it adds earlier.
> 
> So each of the item on the SELECT clause of the IN subquery is transformed to
> 
> (cache=item or item is null) having item is not null,

More precisely
(cache=item or item is null) and item is not null

> which to my understanding is equivalent to (cache = item).

Yes.

> I must have missed something or/and misunderstand something about this
> transformation. Please advice.

I suggest to try with a more recent 5.1.
Even if this still happens with that new version, it's not really a 
problem. It may be that the optimizer does something slightly 
inefficient here, but that has probably no performance impact: once the 
MIN() has been calculated, the HAVING clause has one single row per 
GROUP BY group to handle, so NULL checks probably don't add a penalty, 
even if they are useless.
I cannot tell more without debugging, but it doesn't look serious.

-- 
Mr. Guilhem Bichot <guilhem@stripped>
Sun Microsystems / MySQL, Lead Software Engineer
Bordeaux, France
www.sun.com / www.mysql.com
Thread
Question about the subquery transformZhixuan Zhu4 Mar
  • Re: Question about the subquery transformGuilhem Bichot12 Mar