Ack... premature bug report... I waited for the table to get a little
larger (it was at 4000 rows, now it's at 140,000 rows), and now the
optimizer works like I expected it to, using range on the primary key.
I panicked a little too early.. I didn't even consider that it'd be faster
to do a full scan vs. a range when there were less than 'x' number of
rows...
Thanks;
Justin
----- Original Message -----
From: "Dan Nelson" <dnelson@stripped>
To: <jmat@stripped>
Cc: <mysql@stripped>
Sent: Monday, April 28, 2003 1:56 PM
Subject: Re: strange behavior with query optimizer regarding 'range'
> In the last episode (Apr 28), jmat@stripped said:
> > >Description:
> >
> > I have a table that collects hit statistics for a website. I am storing
the data hourly, based on a user
> > identifier. The table structure:
> >
> > CREATE TABLE hits_hourly (
> > uid mediumint(8) unsigned NOT NULL default '0',
> > hournum int(3) unsigned NOT NULL default '0',
> > hitcount smallint(5) unsigned NOT NULL default '0',
> > PRIMARY KEY (hournum,uid),
> > KEY uid (uid)
> > ) TYPE=MyISAM DELAY_KEY_WRITE=1;
> >
> > But... when I do more than 3...
> >
> > explain select sum(hitcount) from hits_hourly where hournum between
17560744 and 17560747;
> >
+-------------+------+---------------+------+---------+------+-------+------
-------+
> > | table | type | possible_keys | key | key_len | ref | rows |
Extra |
> >
+-------------+------+---------------+------+---------+------+-------+------
-------+
> > | hits_hourly | ALL | PRIMARY | NULL | NULL | NULL | 21674 |
Using where |
> >
+-------------+------+---------------+------+---------+------+-------+------
-------+
> >
> > we go back to a full table scan, and it's VERY slow (well,
comparitively... 0.40 seconds to .04 seconds).
>
> How many rows total are in your table? Mysql might have decided that a
> full table scan would be faster than a range scan and multiple random
> record pulls (it has to go to the record to get hitcount).
>
> > If I create a new index, consisting of (hournum,hitcount), it will
> > use that key consistently -- but that is duplicating a lot of data
> > into the index that shouldn't really be there.
>
> Remember, all indexes are "duplicating" data :) With that index,
> though, the query will always be very fast since it can use that index
> to generate your result without having to do any table lookups at all.
>
> --
> Dan Nelson
> dnelson@stripped
>