List:General Discussion« Previous MessageNext Message »
From:Johan De Meersman Date:June 3 2010 10:51am
Subject:Re: Slow when using sub-query
View as plain text  
The short answer is that the optimizer is amazingly stupid about subqueries,
and it assumes that they are dependent even when they're not - that subquery
gets executed for every row in your main query.

The fastest way to do this, would probably be to run your subquery, have
your code assemble the appropriate IN clause, and then run your main query.

The long answer is that there's a rather good Advanced Tuning course that
addresses all this and more, as does Baron & C° 's excellent MySQL book.



On Wed, Jun 2, 2010 at 10:05 PM, Jerry Schwartz <jerry@stripped> wrote:

> I've heard that sub-queries aren't well-optimized, but this case seems
> ridiculous.
>
> First, a little setup:
>
> SELECT pub_id FROM pub WHERE pub_code = 'GD' INTO @P;
>
> === Inner Query by Itself ===
>
> us-gii >SELECT prod_pub_prod_id FROM prod
>    -> WHERE pub_id = @P
>    -> AND prod_discont = 0
>    -> GROUP BY prod_pub_prod_id
>    -> HAVING COUNT(*) > 1;
> +------------------+
> | prod_pub_prod_id |
> +------------------+
> | NULL             |
> | GDAE0106ICR      |
> | GDME0002TR       |
> | GDME0023IAR      |
> | GDME0059IAR      |
> +------------------+
> 5 rows in set (0.05 sec)
>
> us-gii >EXPLAIN
>    -> SELECT prod_pub_prod_id FROM prod
>    -> WHERE pub_id = @P
>    -> AND prod_discont = 0
>    -> GROUP BY prod_pub_prod_id
>    -> HAVING COUNT(*) > 1
>    -> \G
> *************************** 1. row ***************************
>           id: 1
>  select_type: SIMPLE
>        table: prod
>         type: ref
> possible_keys: pub_id,pub_id_2
>          key: pub_id
>      key_len: 48
>          ref: const
>         rows: 1543
>        Extra: Using where; Using temporary; Using filesort
>
> === Outer Query without Inner Query ===
>
> us-gii >SELECT prod_num FROM prod
>    -> WHERE pub_id = @P
>    ->  AND prod_pub_prod_id IN
>    ->  (
>    ->  NULL,
>    ->  'GDAE0106ICR',
>    ->  'GDME0002TR',
>    ->  'GDME0023IAR',
>    ->  'GDME0059IAR'
>    ->  )
>    -> ;
> +----------+
> | prod_num |
> +----------+
> |    83298 |
> |    85092 |
> |    88728 |
> |    97231 |
> |    97235 |
> |    98368 |
> |   107693 |
> |   112461 |
> +----------+
> 8 rows in set (0.01 sec)
>
> us-gii >EXPLAIN
>    -> SELECT prod_num FROM prod
>    -> WHERE pub_id = @P
>    ->  AND prod_pub_prod_id IN
>    ->  (
>    ->  NULL,
>    ->  'GDAE0106ICR',
>    ->  'GDME0002TR',
>    ->  'GDME0023IAR',
>    ->  'GDME0059IAR'
>    ->  )
>    -> \G
> *************************** 1. row ***************************
>           id: 1
>  select_type: SIMPLE
>        table: prod
>         type: ref
> possible_keys: prod_pub_prod_id,pub_id,pub_id_2
>          key: pub_id
>      key_len: 48
>          ref: const
>         rows: 1543
>        Extra: Using where
>
> === Outer Query with Sub-query ===
>
> us-gii >EXPLAIN
>    -> SELECT prod_num FROM prod
>    -> WHERE pub_id = @P
>    ->  AND prod_pub_prod_id IN
>    ->          (SELECT prod_pub_prod_id FROM prod
>    ->          WHERE pub_id = @P
>    ->          AND prod_discont = 0
>    ->          GROUP BY prod_pub_prod_id
>    ->          HAVING COUNT(*) > 1)
>    -> \G
> *************************** 1. row ***************************
>           id: 1
>  select_type: PRIMARY
>        table: prod
>         type: ref
> possible_keys: pub_id,pub_id_2
>          key: pub_id
>      key_len: 48
>          ref: const
>         rows: 1543
>        Extra: Using where
> *************************** 2. row ***************************
>           id: 2
>  select_type: DEPENDENT SUBQUERY
>        table: prod
>         type: index
> possible_keys: pub_id,pub_id_2
>          key: prod_pub_prod_id
>      key_len: 768
>          ref: NULL
>         rows: 72
>        Extra: Using where; Using filesort
>
> I don't know how long the Outer Query with Sub-query would take: I killed
> it
> after several minutes. I'm guessing that it has to do with the fact that
> the
> inner query is "dependent", but why is that happening?
>
> === Rewritten as Join ===
>
> us-gii >SELECT prod_num FROM prod JOIN
>    ->  (SELECT prod_pub_prod_id FROM prod
>    ->  WHERE pub_id = @P
>    ->  AND prod_discont = 0
>    ->  GROUP BY prod_pub_prod_id
>    ->  HAVING COUNT(*) > 1) AS x
>    -> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
>    -> WHERE prod.pub_id = @P
>    ->  AND prod.prod_discont = 0;
> +----------+
> | prod_num |
> +----------+
> |    98368 |
> |   107693 |
> |    83298 |
> |    85092 |
> |    88728 |
> |    97231 |
> |    97235 |
> |   112461 |
> +----------+
> 8 rows in set (0.05 sec)
>
> us-gii >EXPLAIN
>    -> SELECT prod_num FROM prod JOIN
>    ->  (SELECT prod_pub_prod_id FROM prod
>    ->  WHERE pub_id = @P
>    ->  AND prod_discont = 0
>    ->  GROUP BY prod_pub_prod_id
>    ->  HAVING COUNT(*) > 1) AS x
>    -> ON prod.prod_pub_prod_id = x.prod_pub_prod_id
>    -> WHERE prod.pub_id = @P
>    ->  AND prod.prod_discont = 0
>    -> \G
> *************************** 1. row ***************************
>           id: 1
>  select_type: PRIMARY
>        table: <derived2>
>         type: ALL
> possible_keys: NULL
>          key: NULL
>      key_len: NULL
>          ref: NULL
>         rows: 5
>        Extra:
> *************************** 2. row ***************************
>           id: 1
>  select_type: PRIMARY
>        table: prod
>         type: ref
> possible_keys: prod_pub_prod_id,pub_id,pub_id_2
>          key: prod_pub_prod_id
>      key_len: 768
>          ref: x.prod_pub_prod_id
>         rows: 1
>        Extra: Using where
> *************************** 3. row ***************************
>           id: 2
>  select_type: DERIVED
>        table: prod
>         type: ref
> possible_keys: pub_id,pub_id_2
>          key: pub_id
>      key_len: 48
>          ref:
>         rows: 1544
>        Extra: Using where; Using temporary; Using filesort
> 3 rows in set (0.03 sec)
>
> =====
>
> What a difference! I don't understand it, though.
>
> Does anyone want to take on the challenge of educating me?
>
> Regards,
>
> Jerry Schwartz
> Global Information Incorporated
> 195 Farmington Ave.
> Farmington, CT 06032
>
> 860.674.8796 / FAX: 860.674.8341
>
> www.the-infoshop.com
>
>
>
>
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql?unsub=1
>
>


-- 
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

Thread
Slow when using sub-queryJerry Schwartz2 Jun
  • Re: Slow when using sub-queryJohan De Meersman3 Jun
    • RE: Slow when using sub-queryJerry Schwartz3 Jun