List:MySQL on Win32« Previous MessageNext Message »
From:Michael Widenius Date:June 29 1999 3:01am
Subject:Problem with count(*)
View as plain text  
Hi!

>>>>> "Simson" == Simson L Garfinkel <simsong@stripped>
> writes:
Simson> #
Simson> # Table structure for table 'log'
Simson> #
Simson> CREATE TABLE log (
Simson>   logid int(11) DEFAULT '0' NOT NULL auto_increment,
Simson>   host int(11) DEFAULT '0' NOT NULL,
Simson>   hostip int(11) DEFAULT '0' NOT NULL,
Simson>   username int(11) DEFAULT '0' NOT NULL,
Simson>   when int(11) DEFAULT '0' NOT NULL,
Simson>   request_type int(11) DEFAULT '0' NOT NULL,
Simson>   request_url int(11) DEFAULT '0' NOT NULL,
Simson>   protocol int(11) DEFAULT '0' NOT NULL,
Simson>   result int(11) DEFAULT '0' NOT NULL,
Simson>   bytes int(11) DEFAULT '0' NOT NULL,
Simson>   refer int(11) DEFAULT '0' NOT NULL,
Simson>   agent int(11) DEFAULT '0' NOT NULL,
Simson>   request_file int(11) DEFAULT '0' NOT NULL,
Simson>   request_domain int(11) DEFAULT '0' NOT NULL,
Simson>   ssl enum('TRUE','FALSE') DEFAULT 'FALSE',
Simson>   image enum('TRUE','FALSE') DEFAULT 'FALSE',
Simson>   PRIMARY KEY (logid),
Simson>   KEY host (host),
Simson>   KEY hostip (hostip),
Simson>   KEY when (when),
Simson>   KEY request_type (request_type),
Simson>   KEY request_url (request_url),
Simson>   KEY result (result),
Simson>   KEY bytes (bytes),
Simson>   KEY refer (refer),
Simson>   KEY agent (agent),
Simson>   KEY request_file (request_file),
Simson>   KEY request_domain (request_domain)
Simson> );

Simson> #
Simson> # Table structure for table 'strings'
Simson> #
Simson> CREATE TABLE strings (
Simson>   id int(11) DEFAULT '0' NOT NULL auto_increment,
Simson>   string varchar(255) DEFAULT '' NOT NULL,
Simson>   lastref timestamp(14),
Simson>   PRIMARY KEY (id),
Simson>   KEY string (string)
Simson> );

Simson> I want to find out how many log entries have a result of 200. 

mysql> select count(*) from log;
Simson> +----------+
Simson> | count(*) |
Simson> +----------+
Simson> |   888688 |
Simson> +----------+
Simson> 1 row in set (0.00 sec)

mysql> select count(*) from log where result=200;
Simson> +----------+
Simson> | count(*) |
Simson> +----------+
Simson> |   681518 |
Simson> +----------+
Simson> 1 row in set (42.00 sec)

mysql> 

Simson> Hello? Why does it take 42 seconds to find out how many of the entires have
> result=200?
Simson> Result is a table key. So I run explain:

mysql> explain select count(*) from log where result=200;
Simson>
> +-------+------+---------------+--------+---------+------+--------+-------------+
Simson> | table | type | possible_keys | key    | key_len | ref  | rows   | Extra      
> |
Simson>
> +-------+------+---------------+--------+---------+------+--------+-------------+
Simson> | log   | ref  | result        | result |       4 | ???  | 706205 | Using index
> |
Simson>
> +-------+------+---------------+--------+---------+------+--------+-------------+
Simson> 1 row in set (0.02 sec)

mysql> 

Simson> Okay. Sure enough, the database tells me that the query will use the result
> index.
Simson> So this doesn't make any sense at all. 

Simson> Any thoughts?

Yes, of course.

The query finds 681518 entires.

Result is 4 bytes long
The pointer to the row is 4 bytes long
Each key block (1024 bytes) is normally about 2/3 full.

This means that you get about 1024*2/3/8 = 85 keys / key block

To find 681518 entries, MySQL needs to do 681518/85 = 8017 disk reads.

Each disk read will probably require a disk-seek.  If your disk requires 10
milliseconds / seek, it will take 80 seconds.  The caching will help
a little and put down the total to 40 seconds.


If you use 'isamchk -Si table'

the indexed will be ordered and you will have much fewer disk seeks.

To get things like the above faster, you should from time to time
create summary tables that holds information that you want to access
quickly.

For example a summary table grouped by: host and result would be much
smaller and faster than your current table...

Regards,
Monty



Thread
Problem with count(*)Simson L. Garfinkel25 Jun
  • Problem with count(*)Michael Widenius29 Jun
  • Re: Problem with count(*)Simson Garfinkel29 Jun
    • Re: Problem with count(*)Michael Widenius30 Jun
  • Re: Problem with count(*)Simson Garfinkel30 Jun
    • Re: Problem with count(*)Michael Widenius30 Jun