Thanks for taking a look at this. Comments inline.
Kevin Lewis wrote:
> I was trying your approach out today and found that the pruning cycle
> that allows blob pages to be reused is not called on a load-based event
> for this test.
> The load-based system I implemented will make a call to check whether to
> start the scavenger after every 64 records allocated. This was to avoid
> doing math constantly when allocating record cache memory. So after 64
> records have been added to the record cache, it will check how many
> bytes have been added by subtracting the cache size at the time the
> scavenger last ran from the current cache size. If this is more
> than 2% of the cache, then it will call the scavenger.
Thanks for describing this...
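To check my understanding, here is a minimal Python sketch of the
trigger as you describe it. It is only a model, not Falcon code: the
class and method names are hypothetical, and I assume that a scavenge
simply records the current cache size as the new baseline. Only the 64
record interval and the 2% threshold come from your description.

```python
class RecordCacheMonitor:
    """Hypothetical model of the load-based scavenger trigger."""

    CHECK_INTERVAL = 64       # records allocated between checks (from the description)
    GROWTH_THRESHOLD = 0.02   # fraction of the total cache size (from the description)

    def __init__(self, cache_size_bytes):
        self.cache_size_bytes = cache_size_bytes
        self.bytes_in_use = 0
        self.bytes_at_last_scavenge = 0
        self.records_since_check = 0
        self.scavenge_count = 0

    def on_record_allocated(self, record_bytes):
        """Account for one allocation; return True if a scavenge was triggered."""
        self.bytes_in_use += record_bytes
        self.records_since_check += 1
        # Skip the arithmetic unless 64 records have been allocated since the last check.
        if self.records_since_check < self.CHECK_INTERVAL:
            return False
        self.records_since_check = 0
        growth = self.bytes_in_use - self.bytes_at_last_scavenge
        if growth > self.GROWTH_THRESHOLD * self.cache_size_bytes:
            self.scavenge()
            return True
        return False

    def scavenge(self):
        # Assumption: this model frees nothing, it only resets the baseline.
        self.bytes_at_last_scavenge = self.bytes_in_use
        self.scavenge_count += 1
```

Feeding this model 162-byte allocations against a 250 MB cache shows
how long the trigger stays silent when each update adds very little to
the record cache.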
> Debugging it, I found that each RecordVersion object for this updated
> blob record uses up only 162 bytes (in debug mode) of the record cache.
> With a default cache size of 250 MB, it will only call the scavenger
> after every 5 MB of record cache use. But these blob records only add
> 162 bytes per update to the cache, while adding 1.1 MB to new blob pages.
Right, as you mentioned in the bug report, load-based scavenging will
not be triggered until after about 33000 updates in such a scenario.
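For reference, the arithmetic behind that estimate can be spelled out
as below. This is a back-of-the-envelope sketch using only the numbers
quoted above; I assume 1 MB = 1024 * 1024 bytes and that no scheduled
scavenge runs in the meantime. The variable names are mine.

```python
import math

MB = 1024 * 1024

cache_size = 250 * MB              # default record cache size
threshold_bytes = cache_size * 2 // 100   # 2% of the cache, i.e. 5 MB
record_version_bytes = 162         # per-update cost in the record cache (debug build)
blob_bytes_per_update = 1.1 * MB   # new blob pages written per update

# Updates needed before the record cache has grown by more than the threshold.
updates_needed = math.ceil(threshold_bytes / record_version_bytes)

# Blob page growth accumulated over that many updates, in units of 1024 MB.
blob_growth_gb = updates_needed * blob_bytes_per_update / (1024 * MB)

print(updates_needed, round(blob_growth_gb, 1))
```

Under these assumptions it takes a little over 32000 updates to reach
the 5 MB threshold, by which time roughly 35 GB of new blob pages would
have been written.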
> So the reason your test seemed to keep the file size down was that a
> regular scheduled scavenge just happened to run. Those scavenge cycles
> will run no matter how much has recently been added to cache.
I see, thanks for clarifying. (The first scavenge cycle seems to occur
almost immediately after server startup every time, though.)
Question (just trying to understand this better):
The same default value of the falcon_scavenge_schedule variable also
existed with the old scavenger. Are you saying that the old scavenger did
not prune these old records at the regular scavenge cycles _unless_ the
record cache was 75% full?
> This means that the success of your test is timing dependent.
> I think I need to re-open this bug and add some kind of blob page
> tracker that will signal the scavenger after a certain number of blob
> pages have been added (preferably, added by update).
I created a new bug for this, http://bugs.mysql.com/bug.php?id=42202
"Unacceptable tablespace growth when doing rapid BLOB updates". Feel
free to update the bug report if needed.
> As for your test script, you should fill the blob field with values that
> are independent of the filing system:
> INSERT INTO t1 (myblob) VALUES (repeat('a', 1*1024*1024));
> UPDATE t1 SET myblob = (repeat('b', 1*1024*1024));
> A file named 'MYSQL_TEST_DIR/include/UnicodeData.txt' does not work very
> well on Windows.
Good point, I will modify the test accordingly. However, I also use a
file to temporarily store the size of the tablespace, so I need to work
out a solution for that as well. The solution would probably be either:
a) do some file path separator trickery if on Windows, or
b) disable the test on Windows.
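To illustrate option a), here is a rough sketch of the idea in Python.
The actual test uses Perl, where File::Spec->catfile would play the
role of os.path.join. The function names and the default file name are
hypothetical and do not come from the test itself.

```python
import os


def size_file_path(tmp_dir, name="tablespace_size.txt"):
    """Build the temp file path with the platform's own separator."""
    return os.path.join(tmp_dir, name)


def save_size(tablespace_path, size_file):
    """Measure the tablespace file and store its size (in bytes) as text."""
    size = os.path.getsize(tablespace_path)
    with open(size_file, "w") as f:
        f.write(str(size))
    return size


def load_size(size_file):
    """Read back a size previously stored by save_size()."""
    with open(size_file) as f:
        return int(f.read().strip())
```

Since no separator is hard-coded, the same code should run unchanged on
Windows and Unix, provided the directory passed in is itself valid on
the platform.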
> You could try to set the scavenge schedule to every 2 or 5 seconds for
> this test, and it will probably work pretty well. But it would still be
> timing dependent.
Yes, but not too badly, I guess, given the sleeps I added between the
BLOB updates. I'll try setting a non-default schedule in the new version
of the test to reduce timing dependence.
> John H. Embretsen wrote:
>> based on revid:cpowers@stripped
>> 2964 John H. Embretsen 2009-01-16
>> Regression test for bug 41870 - Unbounded tablespace growth when
>> updating BLOB record
>> This test measures the size of the default (FALCON_USER) tablespace
>> file both before and after updating a 1.1 MB BLOB multiple (20)
>> times. If no or very little space was released and re-used during
>> this time, the test will fail. File size is measured using Perl code.
>> Test case may be sensitive to changes in the behavior of
>> the Falcon scavenger, and is a --big-test.
>> per-file messages:
>> Expected test result. Efforts have been made to make this as
>> informative as possible while still being relatively robust.
>> mysqld options needed for 1.1 MB BLOB updates given the current
>> default mysql-test settings.
>> New test case. Involves Perl code and the use/passing of
>> environment variables. Should be able to detect regressions of bug 41870.