List:General Discussion« Previous MessageNext Message »
From:Victor Danilchenko Date:April 11 2008 6:22pm
Subject:Re: MySQl 5.0 optimization problem
View as plain text  
	First of all, my bad -- I forgot to mention that I use MyISAM.

mysql> show table status from example like 'leads'\G
*************************** 1. row ***************************
            Name: leads
          Engine: MyISAM
         Version: 10
      Row_format: Dynamic
            Rows: 1267995
  Avg_row_length: 224
     Data_length: 284349972
Max_data_length: 281474976710655
    Index_length: 201081856
       Data_free: 0
  Auto_increment: NULL
     Create_time: 2008-04-11 14:03:14
     Update_time: 2008-04-11 14:04:26
      Check_time: 2008-04-11 14:07:51
       Collation: utf8_general_ci
        Checksum: NULL
  Create_options:
         Comment:


Ben Clewett wrote:
> Hi,
> 
> Can you please post your query?  I also need to know your table type as 
> different settings effect different table types?

	The query is simply:

 > select count(email1) from leads;

	The table structure is as follows:

mysql> describe leads;
+----------------------------+--------------+------+-----+---------------------+-------+
| Field                      | Type         | Null | Key | Default 
        | Extra |
+----------------------------+--------------+------+-----+---------------------+-------+
| id                         | varchar(36)  | NO   | PRI | 
        | deleted                    | tinyint(1)   | NO   |     | 0 
                | converted                  | tinyint(1)   | NO   | 
  | 0                   | date_entered               | datetime     | NO 
   |     | 0000-00-00 00:00:00 |       |
| date_modified              | datetime     | NO   |     | 0000-00-00 
00:00:00 |       |
| modified_user_id           | varchar(36)  | YES  |     | NULL 
        | assigned_user_id           | varchar(36)  | YES  | MUL | NULL 
                | created_by                 | varchar(36)  | YES  | 
  | NULL                | salutation                 | varchar(5)   | 
YES  |     | NULL                | first_name                 | 
varchar(25)  | YES  |     | NULL                | last_name 
      | varchar(25)  | YES  | MUL | NULL                | title 
              | varchar(100) | YES  |     | NULL                | 
refered_by                 | varchar(100) | YES  |     | NULL 
      | lead_source                | varchar(100) | YES  |     | NULL 
              | lead_source_description    | mediumtext   | YES  |     | 
NULL                | status                     | varchar(100) | YES  | 
     | NULL                | status_description         | mediumtext   | 
YES  |     | NULL                | department                 | 
varchar(100) | YES  |     | NULL                | reports_to_id 
      | varchar(36)  | YES  |     | NULL                | do_not_call 
              | char(3)      | YES  |     | 0                   | 
phone_home                 | varchar(25)  | YES  |     | NULL 
      | phone_mobile               | varchar(25)  | YES  |     | NULL 
              | phone_work                 | varchar(25)  | YES  |     | 
NULL                | phone_other                | varchar(25)  | YES  | 
     | NULL                | phone_fax                  | varchar(25)  | 
YES  |     | NULL                | email1                     | 
varchar(100) | YES  | MUL | NULL                | email2 
      | varchar(100) | YES  | MUL | NULL                | email_opt_out 
              | char(3)      | YES  |     | 0                   | 
primary_address_street     | varchar(150) | YES  |     | NULL 
      | primary_address_city       | varchar(100) | YES  |     | NULL 
              | primary_address_state      | varchar(100) | YES  |     | 
NULL               | primary_address_postalcode | varchar(20)  | YES  | 
     | NULL                | primary_address_country    | varchar(100) | 
YES  |     | NULL                | alt_address_street         | 
varchar(150) | YES  |     | NULL                | alt_address_city 
      | varchar(100) | YES  |     | NULL                | 
alt_address_state          | varchar(100) | YES  |     | NULL 
      | alt_address_postalcode     | varchar(20)  | YES  |     | NULL 
              | alt_address_country        | varchar(100) | YES  |     | 
NULL                | description                | mediumtext   | YES  | 
     | NULL                | account_name               | varchar(150) | 
YES  |     | NULL                | account_description        | 
mediumtext   | YES  |     | NULL                | contact_id 
      | varchar(36)  | YES  | MUL | NULL                | account_id 
              | varchar(36)  | YES  | MUL | NULL                | 
opportunity_id             | varchar(36)  | YES  | MUL | NULL 
      | opportunity_name           | varchar(255) | YES  |     | NULL 
              | opportunity_amount         | varchar(50)  | YES  |     | 
NULL                | campaign_id                | varchar(36)  | YES  | 
     | NULL                | portal_name                | varchar(255) | 
YES  |     | NULL                | portal_app                 | 
varchar(255) | YES  |     | NULL                | invalid_email 
      | tinyint(1)   | YES  |     | 0 
+----------------------------+--------------+------+-----+---------------------+-------+
50 rows in set (0.04 sec)


Indexes are as follows:


mysql> show index from leads;
+-------+------------+---------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name            | Seq_in_index | Column_name 
      | Collation | Cardinality | Sub_part | Packed | Null | Index_type 
| Comment |
+-------+------------+---------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| leads |          0 | PRIMARY             |            1 | id 
      | A         |     1267995 |     NULL | NULL   |      | BTREE 
|         |
| leads |          1 | idx_lead_last_first |            1 | last_name 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_lead_last_first |            2 | first_name 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_lead_last_first |            3 | deleted 
      | A         |     1267995 |     NULL | NULL   |      | BTREE 
|         |
| leads |          1 | idx_lead_del_stat   |            1 | last_name 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_lead_del_stat   |            2 | status 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_lead_del_stat   |            3 | deleted 
      | A         |     1267995 |     NULL | NULL   |      | BTREE 
|         |
| leads |          1 | idx_lead_del_stat   |            4 | first_name 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_lead_opp_del    |            1 | 
opportunity_id   | A         |     1267995 |     NULL | NULL   | YES  | 
BTREE      |         |
| leads |          1 | idx_lead_opp_del    |            2 | deleted 
      | A         |     1267995 |     NULL | NULL   |      | BTREE 
|         |
| leads |          1 | idx_leads_acct_del  |            1 | account_id 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_leads_acct_del  |            2 | deleted 
      | A         |     1267995 |     NULL | NULL   |      | BTREE 
|         |
| leads |          1 | idx_cont_email1     |            1 | email1 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_cont_email1     |            2 | deleted 
      | A         |     1267995 |     NULL | NULL   |      | BTREE 
|         |
| leads |          1 | idx_cont_email2     |            1 | email2 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
| leads |          1 | idx_cont_email2     |            2 | deleted 
      | A         |     1267995 |     NULL | NULL   |      | BTREE 
|         |
| leads |          1 | idx_lead_assigned   |            1 | 
assigned_user_id | A         |      126799 |     NULL | NULL   | YES  | 
BTREE      |         |
| leads |          1 | idx_lead_contact    |            1 | contact_id 
      | A         |     1267995 |     NULL | NULL   | YES  | BTREE 
|         |
+-------+------------+---------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
21 rows in set (0.00 sec)


	BTW, it's been up for a few hours, and the cache seems to have kicked 
in; now the count(email1) (and other fields) takes just a few seconds. 
However, this morning, right after I increased cache allocation and 
restarted the server, it was taking two minutes -- and I had replicated 
that result on a few different indexed columns of the same table.

	email1 an indexed varchar(100) field, I also thought it should simply 
use the index for count -- but it didn't, or did in an incredibly 
time-consuming way.


> 
> You are right that a SELECT COUNT(*) WHERE field = 'value' should hit 
> the index, but does depend on your query.
> 
> You might also try EXPLAIN before your query, which will show the 
> approximate number of rows you are hitting.  Look at SHOW TABLE STATUS 
> which will give an idea of the size of the rows and indexs.
> 
> Marry one to the other and it will give an idea of the cache settings to 
> get the query into cache.  But still depends a lot on the table type!
> 
> Ben
> 
> 
> Wm Mussatto wrote:
>> On Fri, April 11, 2008 06:47, Ben Clewett wrote:
>>> Are you using MyIsam or InnoDB?  Or something else?
>>>
>>> In either case the speed to get a COUNT() is largely down to the speed
>>> if your disks and size of disk caching.  A COUNT() forces the system to
>>> read every row in order to count them, and any large table is probably
>>> larger than your caches.
>>>
>>> In some ways this is not so important, since it is unusual for a query
>>> to want to read every row of a table.  (I have 250GB tables which have
>>> excellent performance but would take minutes to count every row :)
>>>
>>> It might be better to consider the type of queries you will be using,
>>> and the type of table, and optimise for that...
>>>
>>> Ben
>>>
>>> Victor Danilchenko wrote:
>>>>     Hi,
>>>>
>>>>     I am trying to optimize our DB server. We have one table which has
>>>> 1.3M entries, and the keys are GUIDs (so the key space is large).
>>>> However, I have it all indexed. The performance was iffy, though, so I
>>>> increased memory allocation, and the searches on the indexed fields 
>>>> seem
>>>> to be OK now. Still, a simple count(field) on this table still takes,
>>>> like, 2 minutes! I am guessing i am missing something obvious, but I
>>>> read through a few MySQL optimization guides, and it seems like i am
>>>> covering my bases.
>>>>
>>>>     Here is the relevant slice of my my.cnf:
>>>>
>>>> #
>>>> # * Fine Tuning
>>>> #
>>>> key_buffer              = 256M
>>>> max_allowed_packet      = 64M
>>>> thread_stack            = 512K
>>>> thread_cache_size       = 32
>>>> #
>>>> # * Query Cache Configuration
>>>> #
>>>> query_cache_limit       = 32M
>>>> query_cache_size        = 256M
>>>> query_cache_type        = 1
>>>> table_cache             = 512
>>>> sort_buffer_size        = 32M
>>>>
>>>>
>>>>     I am running Ubuntu 7.04, with 1GB of RAM. The relevant fields are
>>>> all indexed, but I can't imagine why a simple count() would take so
>>>> long, when the actual query by value on the same field is effectively
>>>> instant (after my cache setting expansion).
>>>>
>>>>     Does anyone have an idea of what I am missing? Also, if you think
>>>> any of the above settings seem wrong for a server with 1GB of RAM,
>>>> please let me know.
>> If the field you are counting is the first field in an index I would 
>> think
>> it would go much faster (system will just use the index).  If some of 
>> your
>> counts are fast and some are slow (you said iffy) that might explain the
>> difference.
>>
>> ------
>> William R. Mussatto
>> Systems Engineer
>> http://www.csz.com
>> 909-920-9154
>>
>>
> 
> 
> 


-- 
	Victor Danilchenko
	Senior Software Engineer, AskOnline.net
	victor@stripped - 617-273-0119
Thread
MySQl 5.0 optimization problemVictor Danilchenko11 Apr
  • Re: MySQl 5.0 optimization problemBen Clewett11 Apr
    • Re: MySQl 5.0 optimization problemWm Mussatto11 Apr
      • Re: MySQl 5.0 optimization problemBen Clewett11 Apr
        • Re: MySQl 5.0 optimization problemVictor Danilchenko11 Apr
    • Re: MySQl 5.0 optimization problemRob Wultsch11 Apr