From: Ann W. Harrison Date: October 30 2008 3:26pm Subject: Re: Unthawable List-Archive: http://lists.mysql.com/falcon/115 Message-Id: <4909D240.2050303@mysql.com> MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Content-Transfer-Encoding: 7BIT Philip, I had written: >>> Philip's cases are valuable tests, but not typical usage... With >>> reasonable settings, chills will happen only during loads and mass >>> inserts. The chances of modifying or deleting a record that's been >>> chilled are small, and the chances of having a multi-level modify / >>> chill / delete cycle are lower still. Records are chilled only when the amount of data in new versions created by a single transaction reach a threshold set by a condition variable. Nothing any other transaction does has any effect on chilling. Record versions are thawed when the transaction that created them needs to re-read them or in the (hopefully very rare) case when the transaction that created them has committed but is not yet write complete. > > Please find below an email from Kevin on another thread that describes a > potential chill/thaw/backlogging scenario that is not a load or a mass > insert -- Kevin's example is a long running report that makes multiple passes over the same data. It's the typical example used to show the value of repeatable read isolation mode. It has nothing to do with chilling or thawing. A long-running report is very unlikely to touch record versions created by a mass-load or update that is committed but not write complete. If it did, the record versions would be thawed on first reference and would stay in the cache. Such a report might cause backlogging, which is different from chilling. Record version are chilled by moving them to the serial log. Normally new record versions are moved to the serial log when the transaction commits. In the case of mass-loads and updates, records are moved when their total amount of data hits a threshold. This frees cache space and (normally) has little or no extra cost since those records would go to the serial log anyway. The two cases where there is cost are a rollback and a thaw. When a transaction rolls back, its changes are normally just dumped from the record cache; if the records have been chilled, the cost of copying them to the serial log was wasted. Thaws are expected to be rare because normally mass-load and mass-update transactions don't turn around and re-read the data they've just created. Backlogging is a different thing. When the cache fills and can't be emptied by scavenging unneeded old versions and mature records that haven't been referenced recently, then chains of record versions are written to a special tablespace called the backlog. Backlogging is an expensive operation - first, there's a write to the disk, and second, the record version chains must be read back into memory on reference and to be scavenged. Currently, backlogging happens only after a chill - that was an easy place to do it, but it is not the only time it should be done. Aside from currently happening at the same time, chilling and backlogging are completely different operations. Having got that rant out, let me go back to Kevin's complex multi- pass report. Currently, if such a report is running and another connection truncates the table, the reporting transaction will see no data after the truncation. That an interaction between a non-transactional DDL statement and a repeatable read transaction that makes the second transaction's reads not repeatable. What Kevin (and Jim and I) are proposing is adding yet another sync object to tables that when taken in shared mode guarantees the continued existence of the table. Truncating (or dropping, or altering to drop columns) the table requires exclusive access to that sync object. > attempting to generate a report by making multiple passes > through the entire data while writing intermediate values in a summary > table. A big multi-pass report does put a load on any database system. If the records referenced are mature (i.e. there are no uncommitted versions and no old versions), then they may be scavenged during the report to free up space in the cache. If they're required again, they'll be read from the database again. If any of the records being reported on have multiple versions, they'll stay in the cache until scavenging no longer releases enough space, then they may be backlogged. They'll never be chilled - reading never causes chilling. > If you decide to mark the start of your execution by inserting a > timestamp in some log table, your transaction has become an update one > and thus the serial log will grow and the memory will fill up with old > record versions. Yes, both those things could happen, but for different reasons. Any long-running transaction causes old versions of records to be kept alive. That's what's necessary to preserve a stable snapshot of the data for an old transaction. It doesn't matter if the old transaction is creating a complex report or waiting for interaction with a user who's gone on holiday. All MVCC engines keep old versions for old transactions. Oracle and InnoDB keep then in their logs and when they hit some threshold, drop them arbitrarily, breaking the repeatable read. PostgreSQL and Firebird store old versions in the database, causing the database to fill with old cruft and leading to long, boring discussions of vacuuming and garbage collection. Falcon keeps old versions in the record cache or in the backlog. But the basic fact is that an MVCC repeatable read requires that the database keep old versions of records somewhere. Locking systems handle the problem differently. In a repeatable read locking transaction, every record that's read is set to a state where it can't be modified until the reader has committed. In those systems, instead of filling the record cache, log, or database with old versions, you fill the DBA's office with annoyed users who can't get their work done until the month-end report completes - or until they figure out who it was who went on holiday with a transaction open. In Falcon - and all other systems that I know of that use logs as an intermediary between cache and database - an open write transaction keeps the serial log from being truncated. The serial log will not fill memory - the number of buffers allocated to it is set at start-up. I suppose it's possible to run some sort of scavenge on the log, keeping the active transaction entries and throwing out everything that's write complete, but we don't do that. And I'm not sure that keeping the size of the log down in the case of someone going on holiday with an update transaction open is worth fixing. Sincerely, Ann