List:General Discussion« Previous MessageNext Message »
From:Dan Buettner Date:October 25 2006 10:32pm
Subject:Re: Optimizer Bug?
View as plain text  
My understanding of what is happening here is this:

The 'rows' column of EXPLAIN output is an estimate of how many rows
MySQL thinks it will likely have to examine in a table to get your
answer.  When there's an index, it will hopefully be able to use that
to exmaine a small subset of the rows in the table.

Problem here is, MySQL thinks it will have to examine 1463 of 1950
rows.  At that point (or any point higher than about 30%) MySQL will
decide that a table scan may be faster.  Hence the decision to not use
the d_id index.

When you drop the index, MySQL can no longer plan to eliminate any
rows using an index, so it knows up front it will have to do a table
scan, giving you the 1950 answer for the table with no d_id index.

What's probably happening is that you have a large grouping of the
5098 number in your data, based on a quick read of your query.

Make sense?

Dan



On 10/25/06, David Hillman <david.hillman@stripped> wrote:
>
> All;
>
>     Am I crazy, or doesn't this have to be an optimizer/explain bug?
> SQL interspersed with comments follow...
>
>                 mysql> CREATE TEMPORARY TABLE `table_a` (   `s_id` int(11) NOT
> NULL
> default '0',   `r_id` int(11) NOT NULL default '0',   `d_id`            int
> (11) NOT NULL default '0',   `status` enum('open','close') NOT NULL
> default 'open',   key `s_id` ( `s_id` ),   key `d_id` ( `d_id` ) )
>                         SELECT MAX(fs.s_id) as s_id, fs.r_id, fs.d_id, fs.status
> FROM
> table_c AS fs WHERE fs.d_id=5098   AND fs.status='close' GROUP BY
> fs.r_id;
>                 Query OK, 1950 rows affected (0.03 sec)
>                 Records: 1950  Duplicates: 0  Warnings: 0
>
>     We created a temp table, and stuck 1950 rows in it.
>
>                 mysql> explain SELECT fs.s_id, fs.r_id, fst.*         FROM
> table_a
> AS fs         LEFT OUTER JOIN table_b AS fst ON
> fs.s_id=fst.s_id                        LEFT OUTER JOIN table_d AS ff ON
> ff.f_id=fst.f_id         WHERE fs.d_id='5098' AND ff.status='active';
>                 +----+-------------+-------+--------+----------------
> +---------------+---------+-------------------------+------
> +-------------+
>                 | id | select_type | table | type   | possible_keys  |
> key           | key_len | ref                     | rows | Extra       |
>                 +----+-------------+-------+--------+----------------
> +---------------+---------+-------------------------+------
> +-------------+
>                 |  1 | SIMPLE      | fs    | ALL    | d_id     | NULL          |
> NULL | NULL                    | 1463 | Using where |
>                 |  1 | SIMPLE      | fst   | ref    | s_id  | s_id |       4 |
> database.fs.s_id |    1 |             |
>                 |  1 | SIMPLE      | ff    | eq_ref | PRIMARY,status |
> PRIMARY       |       4 | database.fst.f_id     |    1 | Using where |
>                 +----+-------------+-------+--------+----------------
> +---------------+---------+-------------------------+------
> +-------------+
>                 3 rows in set (0.00 sec)
>
>     Now when we join on that temporary table_a ( aka "fs" ), there's
> two interesting things happening.  One, there's a possible_key called
> d_id, but it's not using it.  Two, it's allegedly an "ALL" join, but
> only showing 1463 of the 1950 rows.
>
>                 mysql> alter table table_a drop index d_id;
>                 Query OK, 1950 rows affected (0.01 sec)
>                 Records: 1950  Duplicates: 0  Warnings: 0
>
>     Now we drop the "possible_key" that it wasn't using anyway,
> and... run the same explain...
>
>                 mysql> explain SELECT fs.s_id, fs.r_id, fst.*         FROM
> table_a
> AS fs         LEFT OUTER JOIN table_b AS fst ON
> fs.s_id=fst.s_id                        LEFT OUTER JOIN table_d AS ff ON
> ff.f_id=fst.f_id         WHERE fs.d_id='5098' AND ff.status='active';
>                 +----+-------------+-------+--------+----------------
> +---------------+---------+-------------------------+------
> +-------------+
>                 | id | select_type | table | type   | possible_keys  |
> key           | key_len | ref                     | rows | Extra       |
>                 +----+-------------+-------+--------+----------------
> +---------------+---------+-------------------------+------
> +-------------+
>                 |  1 | SIMPLE      | fs    | ALL    | NULL           |
> NULL          |    NULL | NULL                    | 1950 | Using where |
>                 |  1 | SIMPLE      | fst   | ref    | s_id  | s_id |       4 |
> database.fs.s_id |    1 |             |
>                 |  1 | SIMPLE      | ff    | eq_ref | PRIMARY,status |
> PRIMARY       |       4 | database.fst.f_id     |    1 | Using where |
>                 +----+-------------+-------+--------+----------------
> +---------------+---------+-------------------------+------
> +-------------+
>                 3 rows in set (0.00 sec)
>
>      Now that join is still an "ALL", but it's looking at more rows?
> How can it look at more than "ALL" rows?  Why does dropping an index
> that it wasn't using change, well, anything?  Can someone make some
> sense of this for me?
>
>     This is on version 4.1.18.  Thanks.
>
> --
> David Hillman
> LiveText, Inc
> 1.866.LiveText x235
>
>
>
Thread
Optimizer Bug?David Hillman25 Oct
  • Re: Optimizer Bug?Dan Buettner26 Oct
    • Re: Optimizer Bug?David Hillman27 Oct
      • Re: Optimizer Bug?Dan Nelson27 Oct
        • Re: Optimizer Bug?David Hillman27 Oct
          • Re: Optimizer Bug?Dan Nelson27 Oct