In the last episode (May 06), Brandon Shuey said:
> I have a table with several million (7+ million) rows of data. I want
> to do a grouped count over a portion of the data. However, MySQL is
> performing a complete table scan and IT TAKES FOREVER (several hours).
> How can I speed this up? I was hoping that indexes would have helped!
> Here is the table and the query and the explain result:
> The total data between '1999-05-01 00:00:00' and '1999-05-02
> 00:00:00' is 80488 (24.7% of all the records). Shouldn't it use an
> index for this?
MySQL has no idea of the record order in your file. Assuming a record
length of 370 bytes for your table and a disk block size of 8192 bytes,
you should be able to fit about 22 records in a disk block. If MySQL
thinks a query will return a large enough fraction of the entire table,
it will use a full table scan. After all, why do an index lookup, then
possibly random seeks through the table, when you're going to hit every
block anyway?
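The arithmetic behind that choice can be sketched like this (the
370-byte record length and 8192-byte block size are the assumptions
above; treating rows as uniformly scattered is a simplification):

```python
# Why the planner prefers a scan: with ~22 records per 8 KB block,
# even a modest selectivity touches nearly every block anyway.
BLOCK_SIZE = 8192   # bytes per disk block (assumed above)
RECORD_LEN = 370    # bytes per record (assumed above)

records_per_block = BLOCK_SIZE // RECORD_LEN

# If a query selects a fraction f of the rows and matching rows are
# scattered uniformly, a block escapes untouched only if every one of
# its rows misses the predicate.
f = 0.247  # the selectivity claimed in the question
p_untouched = (1 - f) ** records_per_block
print(records_per_block)                       # 22 records per block
print(f"{1 - p_untouched:.1%} of blocks hit")  # 99.8% of blocks hit
```

So at 24.7% selectivity the index would save almost no I/O, which is
exactly why the scan looks cheaper to the optimizer.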
Oracle has a concept of an "index-organized" table, where the data is
guaranteed to be stored in the same order as the primary key. This
makes pulling data by a range on the primary key very fast.
How many records do you have in your table again? First you said
7000000, then you said that 80488 = 24.7% of your database (i.e. about
330000). Even assuming you have 7 million records, that's only about
2.5 GB of table. Assuming a 5 MB/sec drive, a full table scan should
only take around 10 minutes. If you really have 7 million records,
though, MySQL certainly should have used the index, since 80488 rows
would be only about 1% of the table.
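To put numbers on both points, a minimal back-of-the-envelope sketch
(the 370-byte record length and 5 MB/sec drive are the assumptions
above, not measured values):

```python
# Back-of-the-envelope sizing for the 7-million-row case.
RECORD_LEN = 370          # bytes per record (assumed above)
NUM_RECORDS = 7_000_000   # the figure from the original question
DISK_BPS = 5_000_000      # 5 MB/sec sustained read (assumed above)

table_bytes = RECORD_LEN * NUM_RECORDS
scan_minutes = table_bytes / DISK_BPS / 60
print(f"table size: {table_bytes / 1e9:.1f} GB")  # 2.6 GB
print(f"full scan:  {scan_minutes:.0f} minutes")  # 9 minutes

# And the inconsistency in the question: if 80488 rows really are
# 24.7% of the table, it holds roughly 330,000 rows, not 7 million.
print(f"implied rows: {80488 / 0.247:.0f}")       # 325862
```

Either way, a scan measured in hours points at something other than
raw sequential read speed.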
If you can, see if you can split your table into smaller ones based on
date (partition by month, maybe). That will ease the penalty of a full
table scan. Also, if you aren't using SCSI disks, get some.