Hello.
See:
http://bugs.mysql.com/bug.php?id=12106
"Matthias Pigulla" <mp@stripped> wrote:
> Hi all,
>
> can someone explain me why this query gets executed the way it is? I
> simplified it as much as possible and think it's a conceptual/logical
> thing, so I'll omit - at least for now - the table definitions and
> sample data for brevity.
>
> -- "Superquery"
> EXPLAIN SELECT t1.id, t1.name
> FROM document AS t1
> WHERE t1.id
> IN (
> -- "Subquery"
> SELECT DISTINCT data_id
> FROM wfd_reference
> WHERE wfd_field_id =3D699
> AND ref_data_id
> IN ( 171 )
> )
>
> This gives:
>
> *************************** 1. row ***************************
> id: 1
> select_type: PRIMARY
> table: t1
> type: ALL
> possible_keys: NULL
> key: NULL
> key_len: NULL
> ref: NULL
> rows: 277
> Extra: Using where
> *************************** 2. row ***************************
> id: 2
> select_type: DEPENDENT SUBQUERY
> table: wfd_reference
> type: ref
> possible_keys: field_data,test
> key: field_data
> key_len: 4
> ref: const,func
> rows: 4
> Extra: Using where; Using temporary
>
> However - why do we need the *dependent* subquery at all? The subquery
> can be executed on its own, as it does not depend on any information of
> the "superquery".
>
> EXPLAIN SELECT DISTINCT data_id
> FROM wfd_reference
> WHERE wfd_field_id =3D699
> AND ref_data_id
> IN ( 171 )=20
>
> *************************** 1. row ***************************
> id: 1
> select_type: SIMPLE
> table: wfd_reference
> type: ref
> possible_keys: field_data,test
> key: test
> key_len: 4
> ref: const,const
> rows: 9
> Extra: Using where; Using temporary
>
> Now if I just take the result of this query, concat the data_ids on the
> application level and build the superquery as follows:
>
> -- "two-staged superquery variant"
> EXPLAIN SELECT t1.id, t1.name
> FROM document AS t1
> WHERE t1.id
> IN (
> 32, 31, 30, 53, 56, 57, 58, 59, 60, 111
> )
>
> *************************** 1. row ***************************
> id: 1
> select_type: SIMPLE
> table: t1
> type: range
> possible_keys: PRIMARY
> key: PRIMARY
> key_len: 2
> ref: NULL
> rows: 10
> Extra: Using where
>
> the results seem to be much better.=20
>
> I was afraid of the dependent subquery for the "ALL" scan of t1, as t1
> will become huge. OTOH, the subquery will be very restrictive: The
> number of data_ids will always be very small, at least compared to the
> number of rows in t1.
>
> Are there any non-obvious reasons for the behaviour described above? Is
> that something that cannot be optimized right now? Am I too
> short-sighted with my "optimization" approach?
>
> Thanks a lot,
> Matthias
>
>
--
For technical support contracts, goto https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.NET http://www.ensita.net/
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Gleb Paharenko
/ /|_/ / // /\ \/ /_/ / /__ Gleb.Paharenko@stripped
/_/ /_/\_, /___/\___\_\___/ MySQL AB / Ensita.NET
<___/ www.mysql.com