From: Johan De Meersman
Date: February 15 2013 12:21pm
Subject: MyISAM table size vs actual data, and performance

Hey list, 

I've got another peculiar thing going on :-) Let me give you a quick summary of the
situation first: we host a number of Drupal sites, each site and its DB on separate VMs
for reasons that are not important to this scenario. MySQL is 5.0.51a-24+lenny4-log
(Debian); I don't have the exact Drupal version here but it's likely to be a 5.x branch. 

The easy thing to say would of course be "upgrade your versions", but that's not an option
right now. I don't really care if that means I have no actual *fix* for the problem - I
know how to work around it. I'm just looking for a cause, ideally even a specific
known bug. Strangely enough, I'm seeing this on three distinct installs, yet others with
the same versions and setup (but different sites) don't seem to exhibit the issue. 

So, what I'm seeing is this: Drupal's "variable" table keeps growing, but there does not
seem to be more data. I understand how record allocation and free space in datafiles
works, but this is well beyond the normal behaviour. 
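
For what it's worth, the quickest way I have to spot the bloat from inside MySQL is a
query against information_schema, comparing the data file size to the unused space
inside it. This is just a sketch; DBNAME stands in for the actual schema name.

-- Rough bloat check: data file size vs. unused (deleted-block) space inside it.
-- 'DBNAME' is a placeholder for the actual schema name.
SELECT table_name,
       data_length,
       data_free,
       ROUND(100 * data_free / (data_length + 1), 1) AS pct_free
FROM   information_schema.TABLES
WHERE  table_schema = 'DBNAME'
  AND  table_name = 'variable';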


http://www.tuxera.be/filestore/heciexohhohj/df-year.png

As you can see here (the lime green line of /data), growth occurs gradually (the issue
happened in September as well) until it reaches a certain point. At that point,
performance on that table (notably SELECT * - it's a Drupal thing) pretty much
instantly plummets, and the query takes around half a minute to run - whereas now, after
reclaiming the free space, it takes 0.03 seconds. 

I don't have the exact numbers as I wasn't on-site yesterday evening, but since the disk
is 5GB, the reclaimed space yesterday must have been around 850MB - for a table that is
now 30MB. No records were deleted from the table; the workaround is as simple as
"OPTIMIZE TABLE variable" - simply rebuilding the table. The logs make no mention of a
crashed table, so it's very unlikely that this is a borked index. Even if it were, I
wouldn't expect a scan of 30MB in 1202 rows to take half a minute, on a table that is
accessed so often that its relevant blocks are bound to be in the filesystem cache. 
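
Just to be explicit about the workaround: it's nothing more than a rebuild. As far as I
know, a "null" ALTER does the same thing for MyISAM, so either of these works:

-- Rebuild the table, compacting the data file and indexes.
OPTIMIZE TABLE variable;

-- As far as I know, equivalent for MyISAM: a no-op engine change forces a rebuild.
ALTER TABLE variable ENGINE=MyISAM;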

The table's structure is fairly simple, too: 



CREATE TABLE `variable` (
  `name` varchar(128) NOT NULL DEFAULT '',
  `value` longtext NOT NULL,
  PRIMARY KEY (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8




I currently have another system where that table is also growing; here's a bit of a session: 


mysql> show table status like 'variable'; 
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+
| Name     | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation       | Checksum | Create_options | Comment |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+
| variable | MyISAM |      10 | Dynamic    | 1188 |            795 |   493277732 | 281474976710655 |        41984 | 492332716 |           NULL | 2011-12-13 16:18:53 | 2013-02-15 12:35:18 | 2012-10-17 15:45:11 | utf8_general_ci |     NULL |                |         |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+


12:36:55|root@xendbprod1-1:/data/mysql 0 # mysqlcheck --verbose DBNAME variable 
# Connecting to localhost... 
DBNAME.variable OK 
# Disconnecting from localhost... 

12:37:07|root@xendbprod1-1:/data/mysql 0 # mysqlcheck --verbose -g DBNAME variable 
# Connecting to localhost... 
DBNAME.variable OK 
# Disconnecting from localhost... 

mysql> show table status where name like "variable"; 
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+
| Name     | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation       | Checksum | Create_options | Comment |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+
| variable | MyISAM |      10 | Dynamic    | 1188 |            497 |   493277732 | 281474976710655 |        41984 | 492686616 |           NULL | 2011-12-13 16:18:53 | 2013-02-15 12:37:35 | 2013-02-15 12:37:07 | utf8_general_ci |     NULL |                |         |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+

1 row in set (0.00 sec) 

mysql> optimize table variable; 
+-----------------+----------+----------+----------+ 
| Table           | Op       | Msg_type | Msg_text |
+-----------------+----------+----------+----------+
| DBNAME.variable | optimize | status   | OK       |
+-----------------+----------+----------+----------+ 
1 row in set (2.37 sec) 

mysql> show table status where name like "variable"; 
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+
| Name     | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time          | Collation       | Checksum | Create_options | Comment |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+
| variable | MyISAM |      10 | Dynamic    | 1188 |            497 |      590940 | 281474976710655 |        39936 |         0 |           NULL | 2011-12-13 16:18:53 | 2013-02-15 12:39:30 | 2013-02-15 12:39:30 | utf8_general_ci |     NULL |                |         |
+----------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+---------------------+-----------------+----------+----------------+---------+

1 row in set (0.00 sec) 



As you can see, the table is almost 500MB. Running mysqlcheck shows that it is fine and
not in need of a format upgrade. Optimizing the table, however, reduces it to half an
MB... At no point in this cycle is an excessive amount of data inserted into the table.
Inserts, updates and deletes are fairly common, but not to the point where there would
have been a couple of hundred MB of data in the file. Had I left this one alone for some
more time, I'm sure it would eventually also have reached the tipping point where the
site suddenly becomes unusably slow. 
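
For what it's worth, a quick sanity check on how much data is really in the table is to
just sum the column lengths; I'd expect that to land in the same ballpark as the
post-OPTIMIZE Data_length, nowhere near the size of the bloated file:

-- Approximate payload actually stored in the table, in bytes.
-- LENGTH() counts bytes, so the sum is roughly comparable to Data_length.
SELECT COUNT(*)                          AS num_rows,
       SUM(LENGTH(name) + LENGTH(value)) AS payload_bytes
FROM   variable;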

The performance degradation isn't perfectly traceable to a single point in time; the
slow log does show that query being slow on occasion. It seems to be intermittent until
it reaches a point of no return, where the queries get slow enough that a cascade of
pending connections builds up until we run out of free handles and the site just stops
responding. 


Now, in my understanding, the size of the file, while unusual, really shouldn't have much
bearing on the execution time of a full table scan, should it? I mean, even a full
table scan is simply going to scan the row index and only read the blocks that are
actually in use? Hell, even if every single row had somehow ended up in a
different block (which the allocation algorithm should have prevented), that's still
only slightly over a thousand blocks, or half a meg. 
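
Next time one of these balloons, I'll probably also look at the raw file layout with
myisamchk; something along these lines (the path is assumed from our /data/mysql datadir,
and it's best run with mysqld stopped or the table flushed, since myisamchk reads the
files directly):

# Read-only description of the table's on-disk layout: record format,
# data file length, deleted (delete-linked) blocks, and so on.
# Path assumed; DBNAME is a placeholder for the actual schema directory.
myisamchk -dvv /data/mysql/DBNAME/variable.MYI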

Thus, my question to the smart people on this list is threefold: 

1) Has anyone seen this behaviour before, and do you perhaps know of a specific bug? 
2) What might cause the file to grow this large? 
3) How could the size of the file impact the speed of a full table scan if only a few
blocks in the file are actually in use? 

Am I missing something obvious, here? 

I have now converted those tables to InnoDB; time will tell if the issue reasserts itself.
In the meantime, though, I would very much like to understand what's going on. 
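
For completeness, the conversion itself is nothing fancier than a plain engine change
per table, something like:

-- Rebuild the table as InnoDB; rinse and repeat for the other affected tables.
ALTER TABLE variable ENGINE=InnoDB;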


Thank you for your thoughts, 
/johan 


-- 

What's tiny and yellow and very, very dangerous? 
A canary with the root password. 
