List:Falcon Storage Engine« Previous MessageNext Message »
From:Kevin Lewis Date:January 17 2009 5:39am
Subject:Re: bzr commit into mysql-6.0-falcon-team branch (john.embretsen:2964)
View as plain text  
John,

I was trying you approach out today and found that the pruning cycle 
that allows blob pages to be reused is not called on a load-based event 
for this test.

The load-based system I implemented will make a call to check whether to 
start the scavenger after every 64 records allocated.  This was to avoid 
doing math constantly when allocating record cache memory.  So after 64 
records have been added to the record cache, it will check how many 
bytes have been added by subtracting the current cache size from the 
cache size the last time the scavenger actually ran.  If this is more 
than 2% of the cache, then it will call the scavenger.

Debugging it, I found that each RecordVersion object for this updated 
blob record uses up only 162 bytes (in debug mode) of the record cache. 
    With a default cache size of 250 Mb, it will only call the scavenger 
after every 5 Mb of record cache use.  But these blob records only add 
162 bytes per update to the cache, while adding 1.1 Mb to new blob pages.

So the reason you test seemed to keep the file size down was that a 
regular scheduled scavenge just happened to run.  Those scavenge cycles 
will run no matter how much has recently been added to cache.

This means that the success of your test is timing dependent.

I think I need to re-open this bug and add some kind of blob page 
tracker that will signal the scavenger after a certain number of blob 
pages have been added (preferably, added by update).

As for your test script, you should fill the blob field with values that 
are independent of the filing system;

INSERT INTO t1 (myblob) VALUES (repeat('a', 1*1024*1024));
UPDATE t1 SET myblob = (repeat('b', 1*1024*1024));

A file named 'MYSQL_TEST_DIR/include/UnicodeData.txt' does not work very 
well on Windows.

You could try to set the scavenge schedule to every 2 or 5 seconds for 
this test, and it will probably work pretty well.  But it would still be 
timing dependent.

Kevin


John H. Embretsen wrote:
> #At file:///export/home/tmp/je159969/mysql/bzr-repos/build-mysql-6.0-falcon-team/
> based on revid:cpowers@stripped
> 
>  2964 John H. Embretsen	2009-01-16
>       Regression test for bug 41870 - Unbounded tablespace growth when updating BLOB
> record
>       
>       This test measures the size of the default (FALCON_USER) tablespace file
>       both before and after updating a 1.1 MB BLOB multiple (20) times.
>       If no or very little space was released and re-used during this time, the
>       test will fail. File size is measured using Perl code.
>       
>       Test case may be sensitive to changes in the behavior of the Falcon scavenger,
> and is a --big-test.
> added:
>   mysql-test/suite/falcon/r/falcon_blob_space-big.result
>   mysql-test/suite/falcon/t/falcon_blob_space-big-master.opt
>   mysql-test/suite/falcon/t/falcon_blob_space-big.test
> 
> per-file messages:
>   mysql-test/suite/falcon/r/falcon_blob_space-big.result
>     Expected test result. Efforts have been made to make this as informative as
>     possible while still being relatively robust.
>   mysql-test/suite/falcon/t/falcon_blob_space-big-master.opt
>     mysqld options needed for 1.1 MB BLOB updates given the current default
> mysql-test settings.
>   mysql-test/suite/falcon/t/falcon_blob_space-big.test
>     New test case. Involves Perl code and the use/passing of environment variables.
> Should be able to detect regressions of bug 41870.
> === added file 'mysql-test/suite/falcon/r/falcon_blob_space-big.result'
> --- a/mysql-test/suite/falcon/r/falcon_blob_space-big.result	1970-01-01 00:00:00
> +0000
> +++ b/mysql-test/suite/falcon/r/falcon_blob_space-big.result	2009-01-16 13:02:31
> +0000
> @@ -0,0 +1,54 @@
> +*** Bug #41870 ***
> +SET @@storage_engine = 'Falcon';
> +DROP TABLE IF EXISTS t1;
> +CREATE TABLE t1 (
> +pk INT AUTO_INCREMENT PRIMARY KEY, 
> +myblob LONGBLOB
> +);
> +
> +** Load BLOB data into table
> +** (actual full path not shown as it may vary)
> +
> +INSERT INTO t1 (myblob) VALUES
> (LOAD_FILE('MYSQL_TEST_DIR/include/UnicodeData.txt'));
> +SELECT pk, LENGTH(myblob) FROM t1;
> +pk	LENGTH(myblob)
> +1	1117369
> +** Recording initial size of falcon_user.fts tablespace file
> +
> +** Updating BLOB data 20 times using the following statement:
> +** UPDATE t1 SET myblob = LOAD_FILE('MYSQL_TEST_DIR/include/UnicodeData.txt') WHERE
> pk = 1;
> +
> +20 updates left
> +19 updates left
> +18 updates left
> +17 updates left
> +16 updates left
> +15 updates left
> +14 updates left
> +13 updates left
> +12 updates left
> +11 updates left
> +10 updates left
> +9 updates left
> +8 updates left
> +7 updates left
> +6 updates left
> +5 updates left
> +4 updates left
> +3 updates left
> +2 updates left
> +1 updates left
> +
> +** Checking file size of default tablespace file (falcon_user.fts)...
> +** Tablespace growth threshold: 20971520 bytes
> +** File size growth below test threshold. Test OK.
> +
> +** Final checks and cleanup...
> +
> +SELECT count(*) FROM t1;
> +count(*)
> +1
> +SELECT pk, LENGTH(myblob) FROM t1;
> +pk	LENGTH(myblob)
> +1	1117369
> +DROP TABLE t1;
> 
> === added file 'mysql-test/suite/falcon/t/falcon_blob_space-big-master.opt'
> --- a/mysql-test/suite/falcon/t/falcon_blob_space-big-master.opt	1970-01-01 00:00:00
> +0000
> +++ b/mysql-test/suite/falcon/t/falcon_blob_space-big-master.opt	2009-01-16 13:02:31
> +0000
> @@ -0,0 +1,3 @@
> +--max_allowed_packet=2MB
> +--secure_file_priv=
> +
> 
> === added file 'mysql-test/suite/falcon/t/falcon_blob_space-big.test'
> --- a/mysql-test/suite/falcon/t/falcon_blob_space-big.test	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/falcon/t/falcon_blob_space-big.test	2009-01-16 13:02:31 +0000
> @@ -0,0 +1,243 @@
> +--source include/have_falcon.inc
> +--source include/big_test.inc
> +
> +#
> +# Bug #41870: Unbounded tablespace growth when updating BLOB record
> +#
> +# This test inserts a ~1.1 MB BLOB into a table and updates it
> +# i times (i = 20) with the same data. The test measures the size
> +# of the falcon_user.fts tablespace file before and after updating 
> +# BLOB data. If the file size growth is "too large" (i.e. close to
> +# "i" times the BLOB data), we probably have introduced a regression 
> +# after Bug#41870 was fixed.
> +#
> +# This test requires a few megabytes of storage. Also, with 
> +# disk-based storage Falcon currently needs some time to process 
> +# this kind of data in the database, so it's --big-test.
> +#
> +# This test depends to some degree on the timing of the Falcon
> +# scavenger, as old records/used space is pruned for the first
> +# time after about 1 second. Hence the sleeps, to allow the test
> +# to succeed even in main memory (--mem). If scavenger timing
> +# changes, one might have to modify this test (e.g. number of
> +# BLOB updates, sleeps, size of BLOB, etc.).
> +#
> +# NOTE: This test requires some special server options for the
> +#       loading of > 1 MB BLOBs to go ok, see -master.opt file.
> +#
> +#  --max_allowed_packet=2MB - default is 1 MB, needs to be at least 
> +#                             as large as the BLOB data file itself.
> +#  --secure_file_priv=      - LOAD_FILE does not work without 
> +#                             unsetting (or adjusting) this.
> +#
> +
> +--echo *** Bug #41870 ***
> +
> +# ----------------------------------------------------- #
> +# --- Initialisation                                --- #
> +# ----------------------------------------------------- #
> +let $engine = 'Falcon';
> +eval SET @@storage_engine = $engine;
> +
> +--disable_warnings
> +DROP TABLE IF EXISTS t1;
> +--enable_warnings
> +
> +CREATE TABLE t1 (
> +  pk INT AUTO_INCREMENT PRIMARY KEY, 
> +  myblob LONGBLOB
> +);
> +
> +
> +#
> +# Load BLOB data into table. UnicodeData.txt is used because it is
> +# sufficiently large (1 MB BLOBs are quite common in the field)
> +# and already part of the mysql-test resources.
> +#
> +--echo
> +--echo ** Load BLOB data into table
> +--echo ** (actual full path not shown as it may vary)
> +--echo
> +
> +# Echo the statement without using/expanding the MYSQL_TEST_DIR variable.
> +--echo INSERT INTO t1 (myblob) VALUES
> (LOAD_FILE('MYSQL_TEST_DIR/include/UnicodeData.txt'));
> +
> +# Since full path will vary, we disable query log for inserts and updates
> +--disable_query_log
> +eval INSERT INTO t1 (myblob)
> +       VALUES (LOAD_FILE('$MYSQL_TEST_DIR/include/UnicodeData.txt'));
> +--enable_query_log
> +
> +#
> +# Check that the data load went ok. 
> +# Otherwise we have NULL in the myblob column.
> +#
> +SELECT pk, LENGTH(myblob) FROM t1;
> +
> +#
> +# We need to know the server's datadir in order to check Falcon's
> +# tablespace files. Store the datadir path as an environment
> +# variable so that we can read it from Perl later in the test.
> +# (Using Perl to check and compare file sizes).
> +#
> +let MYSQLD_DATADIR= `SELECT @@datadir`;
> +
> +#
> +# Check and make a note of the size of the falcon tablespace 
> +# before updating data. This is to allow the test to run e.g. 
> +# several times in a row, or after other tests using the same 
> +# server instance, without failing.
> +#
> +# Storing the file size value in a file, as we would have a 
> +# hard time retreiving it later (in a different perl section
> +# or in the test script itself) otherwise.
> +#
> +perl;
> +  #
> +  # Get hold of path to current datadir
> +  #
> +  $datadir= $ENV{'MYSQLD_DATADIR'};
> +  if (length($datadir) < 1)
> +  {
> +    print "ERROR: Unable to get path to datadir (MYSQLD_DATADIR)\n";
> +  }
> +
> +  #
> +  # Measure the size of the default Falcon tablespace file:
> +  #
> +  $filesize_before = -s "$datadir/falcon_user.fts";
> +  if ($filesize_before < 1)
> +  {
> +    print "ERROR: Unable to read tablespace file falcon_user.fts\n";
> +    print "       Looked in datadir: $datadir\n";
> +  }
> +
> +  #
> +  # Using a file in MYSQL_TMP_DIR as temporary storage,
> +  # to be able to read the value later in the test.
> +  #
> +
> +  print "** Recording initial size of falcon_user.fts tablespace file\n";
> +  $tmp_file= "$ENV{'MYSQL_TMP_DIR'}/falcon_filesize.tmp";
> +  open(OUTFILE, ">$tmp_file")
> +    or die("Unable to write to file $tmp_file: $!\n");
> +  print OUTFILE "$filesize_before\n";
> +  close(OUTFILE);
> +
> +EOF
> +
> +
> +# ----------------------------------------------------- #
> +# --- Test                                          --- #
> +# ----------------------------------------------------- #
> +
> +#
> +# i = number of BLOB updates to do in this test before
> +#     measuring tablespace file size.
> +#
> +let $i=20;
> +
> +--echo
> +--echo ** Updating BLOB data $i times using the following statement:
> +--echo ** UPDATE t1 SET myblob = LOAD_FILE('MYSQL_TEST_DIR/include/UnicodeData.txt')
> WHERE pk = 1;
> +--echo
> +
> +--disable_query_log
> +while ($i)
> +{
> +  --echo $i updates left
> +  eval UPDATE t1 
> +         SET myblob = LOAD_FILE('$MYSQL_TEST_DIR/include/UnicodeData.txt') 
> +         WHERE pk = 1;
> +  # Sleep to allow the scavenger to catch up (e.g. if test is run with --mem).
> +  # This is not really needed on slower storage systems/disks if
> +  # $i is large enough. Adjustable by setting --sleep option of MTR.
> +  --sleep 0.1
> +  dec $i;
> +}
> +--enable_query_log
> +
> +## Check tablespace file size after updates
> +--echo
> +--echo ** Checking file size of default tablespace file (falcon_user.fts)...
> +
> +perl;
> +  #
> +  # Get hold of path to current datadir, previously stored as env var.
> +  #
> +  $datadir= $ENV{'MYSQLD_DATADIR'};
> +  if (length($datadir) < 1)
> +  {
> +    print "ERROR: Unable to get path to datadir\n";
> +  }
> +
> +  #
> +  # Measure current file size
> +  #
> +  $filesize_after = -s "$datadir/falcon_user.fts";
> +  
> +  #
> +  # Now, read the value (file size) recorded at the beginning of the test.
> +  # We assume that the file $tmp_file  contains a single line with a 
> +  # single number representing the original file size in bytes.
> +  #
> +
> +  $tmp_file= "$ENV{'MYSQL_TMP_DIR'}/falcon_filesize.tmp";
> +  open(FILE, "<", $tmp_file) or die("Unable to read $tmp_file!: $!\n");
> +  $filesize_before=<FILE>;
> +  close(FILE);
> +
> +  if ($filesize_before < 1)
> +  {
> +    print "ERROR: Unable to read original file size\n";
> +    print "       Expected to find a single number in the file $tmp_file,\n";
> +    print "       but did not\n";
> +  }
> +
> +  #
> +  # Calculate file size growth.
> +  #
> +  $growth= $filesize_after - $filesize_before;
> +  
> +  #
> +  # threshold = tablespace file size growth threshold. If the 
> +  #             tablespace file size exceeds this threshold, 
> +  #             we call it a test failure. 
> +  #
> +  #   Current threshold: $i * approx. BLOB size (rounded down)
> +  #
> +
> +  $threshold= 20 * 1 * 1024 * 1024;
> +  print "** Tablespace growth threshold: $threshold bytes\n";
> +  
> +  if ($growth > $threshold)
> +  {
> +    print "FAILURE: Tablespace filesize growth exceeded our threshold!\n";
> +    print "         Size of tablespace file was: $filesize_after\n";
> +    print "         Growth during BLOB updates was: $growth\n";
> +    print "\n";
> +    print "         Please investigate if this is a regression or if the\n";
> +    print "         test case needs adjustments (see bug 41870).\n";
> +  }
> +  else
> +  {
> +    print "** File size growth below test threshold. Test OK.\n";
> +  }
> +EOF
> +
> +--echo
> +--echo ** Final checks and cleanup...
> +--echo
> +
> +# ----------------------------------------------------- #
> +# --- Check                                         --- #
> +# ----------------------------------------------------- #
> +SELECT count(*) FROM t1;
> +SELECT pk, LENGTH(myblob) FROM t1;
> +
> +# ----------------------------------------------------- #
> +# --- Final cleanup                                 --- #
> +# ----------------------------------------------------- #
> +DROP TABLE t1;
> +# Unset temporarily used environment variable
> +let MYSQLD_DATADIR=;
> 
> 
Thread