List:General Discussion« Previous MessageNext Message »
From:Dan Nelson Date:November 27 2006 2:00am
Subject:Re: Performance Problems With Two Tables With Over 500K Rows
View as plain text  
In the last episode (Nov 26), John Kopanas said:
> Thanks a lot for your help.
> 
> The query should and only does return 1-6 rows depending on the id.
> Never more then that.  Here are the comperative EXPLAINs:
> 
> mysql> EXPLAIN SELECT * FROM purchased_services WHERE id = 1000;
>
> +----+-------------+--------------------+-------+---------------+---------+---------+-------+------+-------+
> | id | select_type | table              | type  | possible_keys | key   | key_len |
> ref   | rows | Extra |
>
> +----+-------------+--------------------+-------+---------------+---------+---------+-------+------+-------+
> |  1 | SIMPLE      | purchased_services | const | PRIMARY       | PRIMARY |       4 |
> const |    1 |       |
>
> +----+-------------+--------------------+-------+---------------+---------+---------+-------+------+-------+
> 1 row in set (0.00 sec)

This query definitly should run almost instantly, since it looks like a
direct lookup on the primary key.
 
> mysql> EXPLAIN SELECT * FROM purchased_services WHERE company_id = 1000;
>
> +----+-------------+--------------------+------+-------------------------------------+------+---------+------+--------+-------------+
> | id | select_type | table              | type | possible_keys              | key  |
> key_len | ref  | rows   | Extra       |
>
> +----+-------------+--------------------+------+-------------------------------------+------+---------+------+--------+-------------+
> |  1 | SIMPLE      | purchased_services | ALL  | purchased_services_company_id_index
> | NULL |    NULL | NULL | 626188 | Using where |
>
> +----+-------------+--------------------+------+-------------------------------------+------+---------+------+--------+-------------+
> 1 row in set (0.00 sec)

This EXPLAIN indicates that mysql thinks that the query would match
~600k rows and will do a full table scan.  Mysql only keeps a single
"cardinality" value for each index that estimates how many records have
a unique value in the index.  This can cause problems for the optimizer
if you have one value for say 60% of the table, and unique values for
the rest.  You can try adding a FORCE INDEX clause to the query and see
if that helps.
 
> Here is the explain for the SELECT COUNT(id)
> 
> mysql> EXPLAIN SELECT count(id) FROM companies;
>
> +----+-------------+-----------+-------+---------------+---------+---------+------+--------+-------------+
> | id | select_type | table     | type  | possible_keys | key     | key_len | ref  |
> rows   | Extra       |
>
> +----+-------------+-----------+-------+---------------+---------+---------+------+--------+-------------+
> |  1 | SIMPLE      | companies | index | NULL          | PRIMARY |  4 | NULL | 533821
> | Using index |
>
> +----+-------------+-----------+-------+---------------+---------+---------+------+--------+-------------+
> 1 row in set (0.10 sec)
> 
> The explain takes a fraction of a second and returns the amound of row
> plus some.  But when I just as for the count it took me 5 seconds.
> Something is broken.

Note that a primary index scan on an InnoDB table really is a full
table scan.  Try creating another index on just the id field and force
mysql to use it with a FORCE INDEX clause.  Innodb's query optimizer
will always prefer the primary index even if the secondary is smaller,
which is why you have to force it here.  The row count in the estimate
is off because Innodb's query optimizer doesn't know the exact row
count and has to guess.
 
> My innodb_buffer_pool_size is:
> innodb_buffer_pool_size         | 8388608
> 
> That looks like 8MB... that sounds small if I have a DB with over 1M
> rows to process.  No?

Yes, that's extremely small.  I'd go for at least 256M, and maybe 512M
if your machine will primarily be doing mysql duties.

-- 
	Dan Nelson
	dnelson@stripped
Thread
Performance Problems With Two Tables With Over 500K RowsJohn Kopanas25 Nov
  • Re: Performance Problems With Two Tables With Over 500K RowsJohn Kopanas26 Nov
    • Re: Re: Performance Problems With Two Tables With Over 500K RowsJohn Kopanas26 Nov
      • Re: Re: Re: Performance Problems With Two Tables With Over 500K RowsDan Buettner26 Nov
        • Re: Re: Re: Re: Performance Problems With Two Tables With Over 500K RowsJohn Kopanas27 Nov
  • Re: Performance Problems With Two Tables With Over 500K RowsDan Nelson26 Nov
    • Re: Re: Performance Problems With Two Tables With Over 500K RowsJohn Kopanas27 Nov
      • Re: Performance Problems With Two Tables With Over 500K RowsDan Nelson27 Nov
        • Re: Re: Performance Problems With Two Tables With Over 500K RowsJohn Kopanas27 Nov
          • Re: Performance Problems With Two Tables With Over 500K RowsDan Nelson27 Nov
            • Re: Re: Performance Problems With Two Tables With Over 500K RowsJohn Kopanas27 Nov
          • Re: Re: Performance Problems With Two Tables With Over 500KRowsmos27 Nov
        • Re: Performance Problems With Two Tables With Over 500K RowsDominik Klein27 Nov
  • Re: Performance Problems With Two Tables With Over 500K RowsDuncan Hill27 Nov