List:General Discussion« Previous MessageNext Message »
From:Michael Widenius Date:October 5 1999 7:27pm
Subject:Long count(*)
View as plain text  
>>>>> "Mike" == Mike Perik <mikep@stripped> writes:

Mike> Here is some diagnostic information concerning a 'select count(*) ..."
Mike> that takes 28 seconds.  I'm baffled because it is using indexes on the
Mike> two tables.   The EXPLAIN shows what seems to be a massive
Mike> reduction of rows in both tables.  I think this is bogus, I think it's
Mike> only showing the first 4 digits of the
Mike> number.  I say this because the first four digits of the number of rows
Mike> in each table matches them pretty
Mike> close.  What's up with this?

Mike> How can I speed this up?

Mike> We want to get a count of the total result set so we can do fixed size
Mike> paging of  results via the LIMIT option
Mike> on the select.  The subsequent select using the LIMIT option are very
Mike> quick.

<cut>


mysql> explain select count(*) from gas_alarms straight_join
Mike> gas_alarm_attributes where gas_alarms.internal_id =
Mike> gas_alarm_attributes.alarm_id and state = "closed" and timestamp <=
Mike> FROM_UNIXTIME(938719038);
Mike>
> +----------------------+------+---------------+------+---------+------------------------+------+-------------------------+

Mike> | table                | type | possible_keys | key  | key_len |
Mike> ref                    | rows | Extra                   |
Mike>
> +----------------------+------+---------------+------+---------+------------------------+------+-------------------------+

Mike> | gas_alarms           | ref  | PRIMARY,idx2  | idx2 |       1 |
Mike> closed                 | 2408 | ; Using index           |
Mike> | gas_alarm_attributes | ref  | idx1,idx2     | idx2 |       4 |
Mike> gas_alarms.internal_id | 7109 | where used; Using index |
Mike>
> +----------------------+------+---------------+------+---------+------------------------+------+-------------------------+

Mike> 2 rows in set (0.00 sec)

mysql> select count(*) from gas_alarms straight_join
Mike> gas_alarm_attributes where gas_alarms.internal_id =
Mike> gas_alarm_attributes.alarm_id and state = "closed" and timestamp <=
Mike> FROM_UNIXTIME(938719038);
Mike> +----------+
Mike> | count(*) |
Mike> +----------+
Mike> |   707588 |
Mike> +----------+
Mike> 1 row in set (23.15 sec)


Hi!

The timestamp can't be used as a key to solve this query as you use '<='

To know how much work MySQL has to do to find out the above result,
you can do.


select count(*) from gas_alarms where state = "closed";

select count(*) from gas_alarms straight_join gas_alarm_attributes
where gas_alarms.internal_id = gas_alarm_attributes.alarm_id and state = "closed";

Note that even if MySQL is fast, each key lookup to find the
gas_alarm_attributes that corresponds to the gas_alarms will take some
time.  If you have many rows to process the total query will always
take a lot of time.

The question is;  Can you figure out a way how to do the above
query to make it fast ? (Note that the only 'tools you have to your
disposal is to scanning through all entries, use a 'key' to look things 
up quickly (in the same way as you would use a sorted dictionary) and
store temporary results in new tables).  If you can find a solution,
you can almost always get MySQL to use it.  If you can't solve this, the
only way to make your application fast is to store some information
into summary tables that you update on the fly or in batch from time to time.

Regards,
Monty
Thread
Long count(*)Mike Perik30 Sep
  • Re: Long count(*)Jan Dvorak1 Oct
  • Long count(*)Michael Widenius5 Oct