I have a table with several(7+million) million rows of data. I want to group count as
portion of the data. However, mysql is
performing a complete table scan and IT TAKES FOREVER (several hours).
How can I speed this thing up? I was hoping that indexes would have helped! Here is the
table and the query and the explain
result:
The total data between '1999-05-01 00:00:00' and '1999-05-02 00:00:00' is 80488 (24.7% of
all the records).
Shouldn't it use an index for this?
describe tblVisits;
+-----------------+-----------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+-----------+------+-----+---------------------+-------+
| VisitLogID | char(20) | | PRI | | |
| HostAccount | char(20) | | MUL | | |
| Style | char(20) | | | | |
| Category | char(20) | | | | |
| Server | char(20) | | | | |
| UserID | char(20) | | | | |
| Format | char(20) | | | | |
| Source | char(20) | | | | |
| RemoteIP | char(20) | | MUL | | |
| UserAgent | char(20) | | | | |
| RemoteHost | char(20) | | | | |
| Referer | char(100) | | | | |
| Date_Time | datetime | | MUL | 0000-00-00 00:00:00 | |
| ReferralAccount | char(20) | | | | |
| Summarized | int(11) | | MUL | 0 | |
| Quality | int(11) | | | 0 | |
+-----------------+-----------+------+-----+---------------------+-------+
show index from tblVisits;
+-----------+------------+------------------+--------------+-------------+-----------+-------------+----------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation |
Cardinality | Sub_part |
+-----------+------------+------------------+--------------+-------------+-----------+-------------+----------+
| tblVisits | 0 | PRIMARY | 1 | VisitLogID | A |
332995 | NULL |
| tblVisits | 1 | HostAccount_indx | 1 | HostAccount | A |
NULL | NULL |
| tblVisits | 1 | HostAccount_indx | 2 | Style | A |
NULL | NULL |
| tblVisits | 1 | HostAccount_indx | 3 | Category | A |
NULL | NULL |
| tblVisits | 1 | HostAccount_indx | 4 | Server | A |
NULL | NULL |
| tblVisits | 1 | Date_Time | 1 | Date_Time | A |
NULL | NULL |
| tblVisits | 1 | Date_Time | 2 | HostAccount | A |
NULL | NULL |
| tblVisits | 1 | Date_Time | 3 | Style | A |
NULL | NULL |
| tblVisits | 1 | Date_Time | 4 | Style | A |
NULL | NULL |
| tblVisits | 1 | Date_Time | 5 | Category | A |
NULL | NULL |
| tblVisits | 1 | Date_Time_2 | 1 | Date_Time | A |
NULL | NULL |
| tblVisits | 1 | Date_Time_2 | 2 | Server | A |
NULL | NULL |
| tblVisits | 1 | RemoteIP | 1 | RemoteIP | A |
NULL | NULL |
| tblVisits | 1 | RemoteIP | 2 | Date_Time | A |
NULL | NULL |
| tblVisits | 1 | Summarized | 1 | Summarized | A |
NULL | NULL |
| tblVisits | 1 | Summarized | 2 | Quality | A |
NULL | NULL |
+-----------+------------+------------------+--------------+-------------+-----------+-------------+----------+
EXPLAIN SELECT
Date_Format(Date_Time,'%Y-%m-%d') as cDate,
HostAccount,
Style,
Category,
Count(*)
FROM tblVisits
WHERE Date_Time >= '1999-05-01 00:00:00' and Date_Time <'1999-05-02 00:00:00'
GROUP BY cDate,HostAccount,Style,Category;
+-----------+------+-----------------------+------+---------+------+--------+------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-----------+------+-----------------------+------+---------+------+--------+------------+
| tblVisits | ALL | Date_Time,Date_Time_2 | NULL | NULL | NULL | 332995 | where used |
+-----------+------+-----------------------+------+---------+------+--------+------------+
select count(*) from tblVisits;
+----------+
| count(*) |
+----------+
| 332995 |
+----------+
Brandon Shuey
WebCentric Inc.
(316) 612-8030
brandon@stripped