List:General Discussion« Previous MessageNext Message »
From:Michael Widenius Date:November 9 1999 10:41pm
Subject:Optimizer behavior with date ranges optimal?
View as plain text  
>>>>> "maurice" == maurice  <maurice@stripped> writes:

>> Description:
maurice> We have the following table in which we record every login to our site:

maurice> CREATE TABLE logins (
maurice> registration int(10) unsigned DEFAULT '0' NOT NULL,
maurice> login_date datetime DEFAULT '0000-00-00 00:00:00' NOT NULL,
maurice> KEY registration (registration,login_date),
maurice> KEY login_date (login_date)
maurice> );                

maurice> We are running the following query to find all the registrants who
maurice> logged in the previous day:

maurice> SELECT DISTINCT registration 
maurice> FROM logins 
maurice> WHERE login_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) AND 
maurice> login_date < CURRENT_DATE();

maurice> When the number of matching rows is small, EXPLAIN indicates that a 
maurice> "range" search is used.  For example, with 101,000 total rows, 
maurice> 1,000 of which are from the previous day, explain shows:

maurice>
> +--------+-------+---------------+------------+---------+------+------+-------+
maurice> | table  | type  | possible_keys | key        | key_len | ref  | rows | Extra
> |
maurice>
> +--------+-------+---------------+------------+---------+------+------+-------+
maurice> | logins | range | login_date    | login_date |    NULL | NULL |  676 |      
> |
maurice>
> +--------+-------+---------------+------------+---------+------+------+-------+

maurice> That looks good.  However, when the number of rows for a specific day
maurice> becomes a larger percentage of the total rows, the optimizer chooses an 
maurice> "index" search based on the registration key.  Here's the EXPLAIN results 
maurice> with 120,000 total rows, 20,000 of which are from the previous day:

maurice>
> +--------+-------+---------------+--------------+---------+------+--------+-------------------------+
maurice> | table  | type  | possible_keys | key          | key_len | ref  | rows   |
> Extra                   |
maurice>
> +--------+-------+---------------+--------------+---------+------+--------+-------------------------+
maurice> | logins | index | login_date    | registration |      12 | NULL | 120000 |
> where used; Using index |
maurice>
> +--------+-------+---------------+--------------+---------+------+--------+-------------------------+
>          

maurice> If the registration column isn't referenced, it reverts back to a range
maurice> search:

maurice> EXPLAIN SELECT DISTINCT login_date FROM logins WHERE 
maurice> login_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) AND 
maurice> login_date < CURRENT_DATE();

maurice>
> +--------+-------+---------------+------------+---------+------+-------+---------------+
maurice> | table  | type  | possible_keys | key        | key_len | ref  | rows  | Extra
>         |
maurice>
> +--------+-------+---------------+------------+---------+------+-------+---------------+
maurice> | logins | range | login_date    | login_date |    NULL | NULL | 20048 | ;
> Using index |
maurice>
> +--------+-------+---------------+------------+---------+------+-------+---------------+  
>                          

maurice> My question is, is the optimizer making the best decision in this case?
maurice> Is it really faster to use an "index" search in the above scenario?

Hi!

MySQL tries to calculate how many 'blocks' it needs to access to
satisfy a query.

If it uses the 'login_date' index, MySQL will have to pull out each
row in a random order from disk.
If MySQL uses the registration key, MySQL will only have to consult
the index blocks, which hopefully are cached in the key buffer.

Which version is faster is largely depending on how your table is
laid out on your disk, but neither version should be that bad.

Regards,
Monty
Thread
Optimizer behavior with date ranges optimal?maurice1 Nov
  • Re: Optimizer behavior with date ranges optimal?sinisa1 Nov
    • Re: Optimizer behavior with date ranges optimal?Maurice Aubrey2 Nov
      • Re: Optimizer behavior with date ranges optimal?sinisa2 Nov
        • Re: Optimizer behavior with date ranges optimal?Maurice Aubrey2 Nov
          • Re: Optimizer behavior with date ranges optimal?sinisa3 Nov
            • Re: Optimizer behavior with date ranges optimal?Maurice Aubrey4 Nov
              • Re: Optimizer behavior with date ranges optimal?sinisa4 Nov
                • Re: Optimizer behavior with date ranges optimal?Maurice Aubrey4 Nov
                  • Re: Optimizer behavior with date ranges optimal?sinisa5 Nov
                    • Re: Optimizer behavior with date ranges optimal?Maurice Aubrey6 Nov
                      • Re: Optimizer behavior with date ranges optimal?sinisa6 Nov
          • Re: Optimizer behavior with date ranges optimal?Michael Widenius10 Nov
  • Re: Optimizer behavior with date ranges optimal?Steve Ruby2 Nov
    • Re: Optimizer behavior with date ranges optimal?Maurice Aubrey3 Nov
      • RE: Optimizer behavior with date ranges optimal?John Correa3 Nov
  • Re: Optimizer behavior with date ranges optimal?Steve Ruby2 Nov
  • Re: Optimizer behavior with date ranges optimal?Steve Ruby3 Nov
    • Re: Optimizer behavior with date ranges optimal?Michael Widenius10 Nov
  • Optimizer behavior with date ranges optimal?Michael Widenius10 Nov
  • Re: Optimizer behavior with date ranges optimal?Steve Ruby10 Nov