List:General Discussion« Previous MessageNext Message »
From:Rick James Date:February 21 2013 7:47pm
Subject:RE: MyISAM table size vs actual data, and performance
View as plain text  
* Smells like some huge LONGTEXTs were INSERTed, then DELETEd.  Perhaps just a single one of nearly 500M. 

* Yes, there is an impact on full table scans -- it has to step over the empty spots.  Or maybe not -- one big cow chip of 500MB would be easy to leap over.

* OPTIMIZE TABLE is the primary way to recover the space.  It _may_ be that space on the _end_ is automatically recovered.  If so, you might see the .MYD shrink even when OPTIMIZE is not run.

* LONGTEXT is almost never useful.  Do you really think there are thingies that big?  Consider changing it to MEDIUMTEXT -- that would truncate any biggies to 16MB.

* Smells like a key-value (EAV) schema design.  Such is destined to fail when trying to scale.  Yeah, you are probably stuck with Drupal.  Here are my comments and recommendations on EAV:  http://mysql.rjweb.org/doc.php/eav

* Please try to find a way in your Email client to display STATUS without losing the spacing.

* When you switched to InnoDB, I hope you had innodb_file_per_table turned on.  That way, you can actually recoup the space when doing ALTER.  Otherwise, you will be stuck with a bloated ibdata1 file that you cannot easily shrink.

* In InnoDB, the LONGTEXT will usually be stored separately, thereby making a full table scan relatively efficient.

> -----Original Message-----
> From: Johan De Meersman [mailto:vegivamp@stripped]
> Sent: Friday, February 15, 2013 4:21 AM
> To: mysql.
> Subject: MyISAM table size vs actual data, and performance
> 
> 
> 
> Hey list,
> 
> I've got another peculiar thing going on :-) Let me give you a quick
> summary of the situation first: we host a number of Drupal sites, each
> site and it's db on separate VMs for reasons that are not important to
> this scenario. MySQL is 5.0.51a-24+lenny4-log (Debian); I don't have
> the exact Drupal version here but it's likely to be a 5.x branch.
> 
> The easy thing to say would of course be "upgrade your versions", but
> that's not an option right now. I don't really care if that means I
> have no actual *fix* for the problem - I know how to work around it.
> I'm just looking for a cause, ideally maybe even a specific known bug.
> Strangely enough, I'm seeing this on three distinct installs; but
> others with the same versions and setup (but different sites) seem to
> not exhibit the issue.
> 
> So, what I'm seeing is this: Drupal's "variable" table keeps growing,
> but there does not seem to be more data. I understand how record
> allocation and free space in datafiles works, but this is well beyond
> the normal behaviour.
> 
> 
> http://www.tuxera.be/filestore/heciexohhohj/df-year.png
> 
> As you can see here (the lime green line of /data), growth occurs
> gradually (and the issue happened in september, as well), until it
> seems to reach a certain point. At some point, however, performance on
> that table (notably select * - it's a drupal thing) pretty much
> instantly plummets, and the query takes around half a minute to run -
> whereas now, after reclaiming the free space, it takes 0.03 seconds.
> 
> I don't have the exact numbers as I wasn't on-site yesterday evening,
> but since the disk is 5GB, the reclaimed space yesterday must have been
> around 850MB - for a table that is now 30MB. No records were deleted
> from the table, the workaround is as simple as "OPTIMIZE TABLE
> variable" - simply rebuild the table. The logs make no mention of a
> crashed table, so it's very unlikely that this is a borked index. Even
> if it were, I wouldn't expect a scan of 30MB in 1202 rows to take half
> a minute, on a table that is accessed so often that it's relevant
> blocks are bound to be in the filesystem cache.
> 
> The table's structure is fairly simple, too:
> 
> 
> 
> CREATE TABLE `variable` (
> `name` varchar(128) NOT NULL DEFAULT '', `value` longtext NOT NULL,
> PRIMARY KEY (`name`)
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8
> 
> 
> 
> 
> I currently have another system that's also growing that table, here's
> a bit of session:
> 
> 
> <blockquote>
> mysql> show table status like 'variable';
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> | Name | Engine | Version | Row_format | Rows | Avg_row_length |
> | Data_length | Max_data_length | Index_length | Data_free |
> | Auto_increment | Create_time | Update_time | Check_time | Collation |
> | Checksum | Create_options | Comment |
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> | variable | MyISAM | 10 | Dynamic | 1188 | 795 | 493277732 |
> | 281474976710655 | 41984 | 492332716 | NULL | 2011-12-13 16:18:53 |
> | 2013-02-15 12:35:18 | 2012-10-17 15:45:11 | utf8_general_ci | NULL |
> |
> | |
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> 
> 12:36:55|root@xendbprod1-1:/data/mysql 0 # mysqlcheck --verbose DBNAME
> variable # Connecting to localhost...
> DBBAME.variable OK
> # Disconnecting from localhost...
> 
> 12:37:07|root@xendbprod1-1:/data/mysql 0 # mysqlcheck --verbose -g
> DBNAME variable # Connecting to localhost...
> DBNAME.variable OK
> # Disconnecting from localhost...
> 
> mysql> show table status where name like "variable";
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> | Name | Engine | Version | Row_format | Rows | Avg_row_length |
> | Data_length | Max_data_length | Index_length | Data_free |
> | Auto_increment | Create_time | Update_time | Check_time | Collation |
> | Checksum | Create_options | Comment |
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> | variable | MyISAM | 10 | Dynamic | 1188 | 497 | 493277732 |
> | 281474976710655 | 41984 | 492686616 | NULL | 2011-12-13 16:18:53 |
> | 2013-02-15 12:37:35 | 2013-02-15 12:37:07 | utf8_general_ci | NULL |
> |
> | |
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> 1 row in set (0.00 sec)
> 
> mysql> optimize table variable;
> +-----------------+----------+----------+----------+
> | Table | Op | Msg_type | Msg_text |
> +-----------------+----------+----------+----------+
> | DBNAME.variable | optimize | status | OK |
> +-----------------+----------+----------+----------+
> 1 row in set (2.37 sec)
> 
> mysql> show table status where name like "variable";
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> | Name | Engine | Version | Row_format | Rows | Avg_row_length |
> | Data_length | Max_data_length | Index_length | Data_free |
> | Auto_increment | Create_time | Update_time | Check_time | Collation |
> | Checksum | Create_options | Comment |
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> | variable | MyISAM | 10 | Dynamic | 1188 | 497 | 590940 |
> | 281474976710655 | 39936 | 0 | NULL | 2011-12-13 16:18:53 | 2013-02-15
> | 12:39:30 | 2013-02-15 12:39:30 | utf8_general_ci | NULL | | |
> +----------+--------+---------+------------+------+----------------+---
> ----------+-----------------+--------------+-----------+---------------
> -+---------------------+---------------------+---------------------+---
> --------------+----------+----------------+---------+
> 1 row in set (0.00 sec)
> 
> </blockquote>
> 
> 
> As you can see, the table is almost 500MB. Running mysqlcheck shows
> that it is fine and not in need of a format upgrade. Optimizing the
> table, however, results in it being reduced to half an MB... At any
> point in this cycle, however, no excessive amount of data is inserted
> in the table. Insert, updates and deletes are fairly common, but not to
> the point where there would have been a couple of hundred MB of data in
> the file. Had I left this one for some more time, I'm sure that it
> would eventually also have reached the tipping point where the side
> suddenly becomes unusably slow.
> 
> The performance degradation isn't perfectly traceable to a single point
> in time; the slowlog does show that query being slow on occasion;
> however it seems that it is intermittent until it reaches a point of no
> return, when the queries get slow enough that a cascade of pending
> connections happens until we run out of free handles and the site just
> stops responding.
> 
> 
> Now, in my understanding, the size of the file, while unusual, really
> shouldn't have much bearing on the execution time of a full table scan,
> should it? I mean, even a full tablescan is simply going to scan the
> row index and only read the blocks that are actually in use? Hell, even
> IF every single row would somehow have ended up in a different block
> (which the allocation alghorithm should have prevented), that's still
> only slightly over a thousand blocks, or half a meg.
> 
> Thus, my question to the smart people on this list is threefold:
> 
> 1) Has anyone seen this behaviour before, and maybe know of a specific
> bug?
> 2) What might cause the file to grow this large?
> 3) How could the size of the file impact the speed of a full tablescan
> if only a few blocks in the file are actually in use?
> 
> Am I missing something obvious, here?
> 
> I have now updated those tables to InnoDB, time will tell if the issue
> reasserts itself. In the mean time, though I would very much like to
> understand what's going on.
> 
> 
> Thank you for your thoughts,
> /johan
> 
> 
> --
> 
> What's tiny and yellow and very, very dangerous?
> A canary with the root password.
Thread
MyISAM table size vs actual data, and performanceJohan De Meersman15 Feb
  • RE: MyISAM table size vs actual data, and performanceRick James21 Feb
    • Re: MyISAM table size vs actual data, and performanceJohan De Meersman22 Feb