List: Commits
From: Guilhem Bichot
Date: May 15 2007 6:09pm
Subject: bk commit into 5.1 tree (guilhem:1.2526)
Below is the list of changes that have just been committed into a local
5.1 repository of guilhem. When guilhem does a push these changes will
be propagated to the main repository and, within 24 hours after the
push, to the public repository.
For information on how to access the public repository
see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html

ChangeSet@stripped, 2007-05-15 18:09:22+02:00, guilhem@stripped +56 -0
  WL#866 - Online backup driver for MyISAM.
  When this driver (file myisam_backup_driver.cc) is to do a backup,
  it starts physical logging of changes for target tables (file
  mi_backup_log.c), dirtily copies tables, then the physical log.
  When this driver does a restore, it copies back tables and the log,
  applies the log (file mi_examine_log.c).
  The data file is always backed up; for the index file two methods are
  available: either only its 64KB header is copied and then restore will
  repair indexes, or the whole index file is copied. For now this is
  driven by an environment variable (for testing!), see TODOs below.
  It is an online backup in this sense: it starts and runs without
  disturbing any running or new update except at the very end: when
  creating the validity point it locks all tables with a read lock
  (so, stalls new update statements and waits for running update
  statements to finish); if only short statements are used this should
  not be a problem; if long statements, it will be.
  I recommend that reviewers read, in order, myisam_backup_driver.cc,
  mi_backup_log.c, mi_log.c then other files.
  Doxygen: new files are properly documented, old files which I
  significantly changed get a minimal header plus proper documentation
  of their new pieces (new members of existing data structures, for
  example).
  Testing: a proper test of the "online" adjective would require
  large tables and timings (verify that the backup was not blocked by this
  or that statement etc); instead of making my own cooking, it's better
  to confer with the backup team about how they want to do it. So, the
  only testing in this patch is similar to how the driver of the Archive
  engine is tested, in backup.test: offline.
  TODO: better testing (more concurrency, --myisam-use-mmap, DATA|INDEX
  DIRECTORY clauses in CREATE TABLE...; I tested those three by hand,
  however).
  TODO: when stopping logging, implement a method which does not
  do LOCK TABLES READ on non-open tables but only on already open ones
  (see comments in Backup::lock_tables_TL_READ_NO_INSERT()); first see if
  other engines need this, which depends on the following TODO item.
  TODO: come to a consensus on how a driver should do its synchronization,
  with the backup team.
  TODO: backup kernel should block OPTIMIZE|REPAIR|TRUNCATE TABLE.
  TODO: if indexes are not copied, possibility to build them at backup
  time instead of at restore time (=> requires a large tmp directory)
  TODO: throttling of the backup job by the backup kernel (implemented
  temporarily in this driver via configurable sleep read from getenv())
  TODO: larger buffers (between kernel and driver) may give more
  sequential disk accesses and fewer syscalls (observed 25% speedup of
  the backup job by using 2MB buffers instead of 2KB).
  TODO: fix mi_log*() functions to support file descriptors >65535.
  TODO: benchmark if the size of the IO_CACHE of the backup log makes
  a difference
  TODO: convenient way to decide if the index pages should be backed
  up or not (variable, clause in BACKUP command...); for now getenv().
  TODO: better estimates in size() and init_size().
  TODO: enable my_atomic unit test under Windows in Pushbuild because
  I am using atomic operations in this patch.
  TODO: benchmark the overhead added by MyISAM's online backup code
  when no online backup is happening (to do that, undefine
  HAVE_MYISAM_BACKUP_LOGGING in myisamdef.h).
  TODO: benchmark the overhead when in backup mode vs when not in backup
  mode, for different amounts of throttling (i.e. of sleep).

  include/atomic/nolock.h@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +7 -1
    empty structs don't have the same size in C (0) and in C++ (1),
    causes mismatches in MyISAM code, so I'm using a char instead and 
    talking with Serg about the definite fix.
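    A minimal sketch of this workaround, assuming hypothetical type names
    (placeholder_lock_t and struct shared_state are not from the patch): a
    single char member gives the placeholder the same size in C and in C++.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a "no lock needed" type. An empty struct is a
   GNU C extension with size 0, while C++ gives it size 1; one char member
   gives size 1 in both languages. */
typedef struct { char unused; } placeholder_lock_t;

/* A structure embedding the placeholder then has the same layout whether
   the including file is compiled as C or as C++. */
struct shared_state
{
  placeholder_lock_t lock;
  int counter;
};

/* Returns the offset of 'counter'; it is never 0 because 'lock' occupies
   at least one byte in front of it. */
static size_t check_layout(void)
{
  assert(sizeof(placeholder_lock_t) == 1);
  return offsetof(struct shared_state, counter);
}
```

    With an empty struct the offset of 'counter' could differ between the
    C and C++ translation units that share the header, which is the kind
    of mismatch the comment above describes.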

  include/keycache.h@stripped, 2007-05-15 18:09:18+02:00, guilhem@stripped +10 -2
    A block now has an optional callback/callback_argument pair which is
    called after the block goes to the file. This callback and its argument
    are set through key_cache_write() (i.e. when the block is potentially
    put into the cache), and are reset to NULL after flush.
    This is used by MyISAM's online backup logging.

  include/my_global.h@stripped, 2007-05-15 18:09:18+02:00, guilhem@stripped +4 -2
    When testing a C++ file against "gcc -Wold-style-cast", these casts
    surfaced. static_cast_C_or_CPP means "a static cast which works in
    C and in C++, without warnings".
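    As an illustration only, a macro of this kind could look like the
    sketch below. The name CAST_C_OR_CPP and its (type, expr) shape are
    assumptions; the actual static_cast_C_or_CPP in my_global.h may be
    written differently.

```c
#include <assert.h>

/* Hypothetical macro: expands to static_cast<> under a C++ compiler (so
   -Wold-style-cast stays quiet) and to a plain C cast otherwise. */
#ifdef __cplusplus
#define CAST_C_OR_CPP(type, expr) (static_cast<type>(expr))
#else
#define CAST_C_OR_CPP(type, expr) ((type) (expr))
#endif

/* Example use in code that may sit in a header shared by C and C++ files. */
static unsigned int low_byte(long value)
{
  assert(value >= 0);
  return CAST_C_OR_CPP(unsigned int, value & 0xff);
}
```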

  include/my_sys.h@stripped, 2007-05-15 18:09:18+02:00, guilhem@stripped +26 -11
    New members of IO_CACHE:
    * post_write (called by my_b_flush_io_cache()),
    * hard_write_error_in_the_past (like the existing 'error' except that
    once set it is not reset until cache is reinitialized; 'error' is
    reset at next _my_b_write() call for example). This is useful when we want
    to lazily (once in a while) monitor if an IO_CACHE got a write error
    (and so, file is not usable) instead of monitoring each IO_CACHE write.
    These are used by MyISAM's online backup.
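    The difference between the two error members can be sketched as
    follows. mini_cache and its functions are hypothetical miniatures, not
    the real IO_CACHE API; only the member names 'error' and
    'hard_write_error_in_the_past' come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a write cache; not the real IO_CACHE. */
typedef struct
{
  int error;                         /* status of the latest write only */
  int hard_write_error_in_the_past;  /* sticky until reinitialized */
} mini_cache;

static void mini_cache_init(mini_cache *c)
{
  assert(c != NULL);
  c->error= 0;
  c->hard_write_error_in_the_past= 0;
}

/* Simulated write; 'fail' stands for the underlying write() failing. */
static int mini_cache_write(mini_cache *c, int fail)
{
  assert(c != NULL);
  c->error= 0;                           /* reset on every write */
  if (fail)
  {
    c->error= -1;
    c->hard_write_error_in_the_past= 1;  /* not cleared by later writes */
    return 1;
  }
  return 0;
}
```

    A monitoring thread that looks at the cache only once in a while can
    then test the sticky member and still notice a failure that a later
    successful write has already erased from 'error'.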

  include/myisam.h@stripped, 2007-05-15 18:09:18+02:00, guilhem@stripped +31 -3
    declarations' update.
    As most of the logic of myisamlog moved out of myisamlog.c into
    mi_examine_log() in mi_examine_log.c, mi_examine_log() cannot use
    globals of myisamlog.c anymore; a new structure MI_EXAMINE_LOG_PARAM
    is introduced instead (like MI_CHECK).

  libmysql/libmysql.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +1 -1
    small fix (nothing to do with backup, but popped up while testing)

  mysql-test/r/backup.result@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +80 -12
    result update. Note the increase in backup size, which is due to:
    - one more table to back up
    - the MyISAM native driver copies the index file which is always at
    least 1 KB, while the generic locking driver (which was used for MyISAM
    so far) copies only rows, so gives a smaller backup.
    The message in CHECK TABLE is expected (we copy a table which is
    open at this moment - that's a goal of an online backup :)

  mysql-test/r/backup_no_data.result@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +6 -6
    When an empty MyISAM table was backed up with the generic locking
    driver, the backup's data was 0 bytes (as 0 rows); but now with
    the native driver, we back up the index file which is 1024 bytes
    if the table is empty.

  mysql-test/t/backup.test@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +34 -1
    testing of online backup for MyISAM (though not in real online
    conditions, that will happen later).
    A second MyISAM table, tasking2, with an index and delay_key_write.

  mysql-test/t/key_cache.test@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +3 -3
    As the size of the BLOCK_LINK structure has grown (to accommodate
    the new flush_callback and flush_callback_arg members), the number
    of available blocks for a fixed key cache size has decreased
    (a similar fix will be needed for 64-bit systems too)

  mysys/mf_iocache.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +56 -19
    Calling the new post_write hook. Note that it does not work with
    SEQ_READ_APPEND IO_CACHE's, because they don't maintain the
    write position in a variable (as they are O_APPEND) and I don't
    want to do a tell() (and I don't need a post_write for such caches).
    Setting IO_CACHE::hard_write_error_in_the_past at the same time we
    set IO_CACHE::error to -1 in functions which do writes.
    IO_CACHE::hard_write_error_in_the_past is however not reset for
    each write, that's its difference from IO_CACHE::error.

  mysys/mf_keycache.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +92 -40
    A block now has an optional callback/callback_argument pair which is
    called after the block goes to the file. This callback and its argument
    are set through key_cache_write() (i.e. when the block is potentially
    put into the cache), and are reset to NULL after flush.
    This is used by MyISAM's online backup logging.
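    A minimal sketch of the mechanism described above, assuming a
    hypothetical one-block cache (mini_block and its functions are not the
    key cache API; only the idea of a callback/argument pair set at write
    time and reset after flush comes from this comment):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*flush_callback_t)(void *arg, const char *page, size_t len);

/* Hypothetical cache block with an optional post-flush callback. */
typedef struct
{
  char data[16];
  int dirty;
  flush_callback_t flush_callback;
  void *flush_callback_arg;
} mini_block;

/* Writing into the cache only records the callback; nothing is logged. */
static void mini_block_write(mini_block *b, const char *src, size_t len,
                             flush_callback_t cb, void *arg)
{
  assert(len <= sizeof(b->data));
  memcpy(b->data, src, len);
  b->dirty= 1;
  b->flush_callback= cb;
  b->flush_callback_arg= arg;
}

/* Flushing "writes" the block, then calls the callback once and resets it. */
static void mini_block_flush(mini_block *b)
{
  if (!b->dirty)
    return;
  /* real code would write the block to the index file here */
  b->dirty= 0;
  if (b->flush_callback)
    b->flush_callback(b->flush_callback_arg, b->data, sizeof(b->data));
  b->flush_callback= NULL;
  b->flush_callback_arg= NULL;
}

/* Example callback: counts how many flushed pages would be logged. */
static void count_flush(void *arg, const char *page, size_t len)
{
  (void) page;
  (void) len;
  ++*(int *) arg;
}
```

    Two writes to the same block followed by one flush invoke the callback
    once, so a backup log fed from the callback gets one record instead of
    two.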

  mysys/my_thr_init.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +10 -4
    a dedicated mutex for the MyISAM log (better not use THR_LOCK_myisam
    which is already used at every mi_open()/mi_close(), and also it made
    coding easier).

  sql/backup/archive.h@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +13 -0
    Question for the backup team

  sql/backup/data_backup.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +11 -1
    I will not push that. However, detecting end() errors helped me;
    it should be done.

  sql/backup/meta_backup.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +47 -3
    No need to lock tables to get their CREATE information;
    locking tables here prevents online backup from starting while a table
    is being updated, which is an issue. Here is a quick hack which I
    needed for my tests; I will not push that.
    Note about a Valgrind error.

  sql/log_event.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +1 -2
    As start_bulk_insert() has been called, end_bulk_insert() must always
    be called, no matter if some error happened for a row (Mats approved).

  sql/mysqld.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +7 -7
    emphasizing that the MyISAM log set from the command line is the
    logical one.

  sql/sql_cache.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +3 -2
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  sql/sql_class.h@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +2 -1
    a new type of system thread: backup utility thread (used by the
    MyISAM online backup driver).

  sql/sql_load.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +1 -2
    cast not needed

  sql/sql_repl.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +4 -1
    new prototype

  sql/sql_repl.h@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +1 -1
    new prototype

  sql/sql_table.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +3 -0
    As start_bulk_insert() has been called, end_bulk_insert() must be
    called.

  storage/myisam/CMakeLists.txt@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +2 -1
    include new files in build

  storage/myisam/Makefile.am@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +2 -2
    include new files in build

  storage/myisam/ha_myisam.cc@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +6 -3
    MyISAM now has an online backup driver, see myisam_backup_driver.cc.
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisam/ha_myisam.h@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +5 -0
    MyISAM now has an online backup driver

  storage/myisam/mi_backup_log.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +239 -0
    Functions to start and stop backup logging for a set of tables.
    See comments of the functions.

  storage/myisam/mi_backup_log.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +0 -0

  storage/myisam/mi_check.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +3 -1
    mi_state_info_write() needs the MI_INFO now, as it is used by backup
    logging. Note: REPAIR/OPTIMIZE should be blocked by the MySQL
    layer as they are DDLs (some engines convert OPTIMIZE to ALTER TABLE
    for example).

  storage/myisam/mi_close.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +27 -4
    If we failed to flush the key cache, we need to mark the table as
    corrupted (fix for an unlikely bug).
    mi_state_info_write() needs the MI_INFO now, as it is used by backup
    logging.
    If the table has logged a MI_LOG_OPEN to the backup log (because it
    has logged some writes there), it needs to log a MI_LOG_CLOSE too.
    Regarding the added assertion, see revision comment in mi_locking.c.

  storage/myisam/mi_create.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +4 -1
    Ensure that no table is truncated while an online backup is
    running on it. We *could* support TRUNCATE during an online backup of
    MyISAM but not without the server's help. But anyway we know that
    TRUNCATE is going to be an issue (it deletes InnoDB's consistent
    reads, is a DDL...).

  storage/myisam/mi_delete.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +7 -4
    update to new prototype.
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisam/mi_delete_all.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +9 -1
    Log to the backup log the MI_DELETE_ALL operation

  storage/myisam/mi_dynrec.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +12 -4
    Log to the backup log the operation of writing bytes to the data file.
    Always log _after_ the write.

  storage/myisam/mi_examine_log.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +847 -0
    This is the examine_log() function taken out of myisamlog.c, as we
    now need it in the server too, to restore from an online backup.
    It is renamed to mi_examine_log().
    examine_log() used global variables of myisamlog.c, that is not possible
    now so mi_examine_log() takes in parameter a new structure
    MI_EXAMINE_LOG_PARAM whose content corresponds to the old global
    variables.
    The function is extended to understand new log commands found in
    backup logs: MI_LOG_WRITE_BYTES_MYI, MI_LOG_WRITE_BYTES_MYD,
    MI_LOG_WRITE_BYTES_MYI_FROM_KEY_CACHE, as well as MI_LOG_DELETE_ALL
    which wasn't handled properly (a bug).
    To see the real code changes I made (because most lines are just
    moved from myisamlog.c), you should diff mi_examine_log() with
    examine_log() from the old myisamlog.c (I recommend it).
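    To illustrate why replaying "write bytes" records makes a dirtily
    copied file consistent, here is a minimal sketch. The write_record
    layout and apply_log() are hypothetical and work on an in-memory
    image; the real record format and mi_examine_log() are more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one "write bytes" log record. */
typedef struct
{
  size_t offset;      /* where the bytes were written in the file */
  size_t length;      /* how many bytes */
  const char *bytes;  /* the bytes themselves */
} write_record;

/* Replays records in log order onto a file image.
   Returns 0 on success, 1 if a record falls outside the image. */
static int apply_log(char *image, size_t image_size,
                     const write_record *log, size_t count)
{
  size_t i;
  assert(image != NULL && log != NULL);
  for (i= 0; i < count; i++)
  {
    if (log[i].offset > image_size ||
        log[i].length > image_size - log[i].offset)
      return 1;
    memcpy(image + log[i].offset, log[i].bytes, log[i].length);
  }
  return 0;
}
```

    Because each record carries the final bytes for its range, replaying
    the log in order is harmless for ranges the copy already captured and
    repairs the ranges it captured too early.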

  storage/myisam/mi_examine_log.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +0 -0

  storage/myisam/mi_extra.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +8 -2
    When we set up a write cache (HA_EXTRA_WRITE_CACHE) we give the cache
    a post_write call, which will make sure that writes to the data file
    will go to the backup log if needed.
    It is more efficient than logging all my_b_write()s as it generates
    fewer log records, and anyway we needed the post_write callback for
    when logging is enabled while the table is already in the middle of
    using a write cache (when non-logged my_b_write()s have passed already).
    mi_state_info_write() needs the MI_INFO now, as it is used by backup
    logging

  storage/myisam/mi_locking.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +64 -43
    mi_state_info_write() needs the MI_INFO now, as it is used by backup
    logging.
    Comments explaining why some writes to the index file needn't go into
    the backup log.
    New function mi_remap_file_and_write_state_for_unlock(), used when
    unlocking a table and in mi_backup_stop_logging_for_tables().
    I added DBUG_ASSERT(!(info->opt_flag & WRITE_CACHE_USED)); originally
    for online backup purposes, then I saw it fired in some non-backup test
    (alter_table.test); turned out to be a forgotten end_bulk_insert() in
    sql_table.cc.

  storage/myisam/mi_log.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +404 -48
    mi_log() can now open a logical or physical (==backup) log.
    Logging functions now use IO_CACHE.
    _myisam_log() now expects caller to have the log's mutex (this change
    is to be able to call _myisam_log() from another logging function as
    a single atomic unit, without releasing the mutex).
    Logging functions now need to check if the log is opened inside the
    log's mutex, because the backup log closes at some point (while
    the old debug, logical log, was either open or closed for the lifetime
    of the MySQL server process).
    _myisam_log_command() may log MI_LOG_OPEN on top of its command
    (needed for backup logging).
    _myisam_log_record() is used only by logical logging, renamed it.
    Three functions are added for the three different ways that 
    an entry can be written to the backup log.

  storage/myisam/mi_open.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +41 -13
    When opening a table, detect if it must do backup logging.
    mi_state_info_write() needs the MI_INFO now, as it is used by backup
    logging.
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name,
    indeed backup logging needs this unresolved name, and backup logging
    sometimes cannot have MI_INFO available, but always has MYISAM_SHARE
    (see myisam_log_from_key_cache_for_backup()). It is also more
    efficient (why duplicate the same string in all MI_INFOs for the same
    table?).

  storage/myisam/mi_page.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +25 -7
    When writing a page to the key cache, if the table is in backup mode,
    we pass a callback to the key cache, which will be invoked when the
    key cache later flushes this page to the file. This callback may
    be NULL if we don't want to store index page changes into the backup
    log.
    It's more efficient to log at flush time than when we write to the
    key cache; writes to the same key page accumulate...

  storage/myisam/mi_panic.c@stripped, 2007-05-15 18:09:19+02:00, guilhem@stripped +11 -7
    update for new prototype.
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisam/mi_rrnd.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +1 -3
    cosmetic change

  storage/myisam/mi_static.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +51 -5
    New log in MyISAM: on top of the existing logical log (which is for
    debugging), a new physical log is added (for online backup).

  storage/myisam/mi_test2.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +1 -1
    update for new prototype

  storage/myisam/mi_test3.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +2 -2
    update for new prototype

  storage/myisam/mi_update.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +6 -4
    update to new prototype.
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisam/mi_write.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +5 -4
    update to new prototype.
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisam/myisam_backup_driver.cc@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +1681 -0
    MyISAM online backup driver. See this file's first lines (namespace
    myisam_backup) for a description of how it works.
    This is the file which really drives the backup/restore inside MyISAM
    (see it as the master which calls functions from other files to
    perform some subtasks).

  storage/myisam/myisam_backup_driver.cc@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +0 -0

  storage/myisam/myisam_ftdump.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +2 -1
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisam/myisamdef.h@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +97 -18
    Adding member "backup_logging" to MYISAM_SHARE, tells if the table
    is currently doing backup logging or not.
    It is read and set with atomic operations (this provides the needed
    synchronization between the "writer" thread (if the table is being
    written now) and the "backup" thread).
    Adding member "MI_LOG_OPEN_stored_in_backup_log" to MI_INFO, to
    remember if we already stored MI_LOG_OPEN in this backup log for this
    MI_INFO (it needs to be in MI_INFO and not MYISAM_SHARE, because
    log applying uses MI_INFO's dfile as a key to identify a table).
    A new mutex THR_LOCK_myisam_log to protect MyISAM logs (instead of
    THR_LOCK_myisam, for concurrency and coding reasons explained
    elsewhere).
    New commands can be stored in the log (actually, only in the backup
    log): MI_LOG_WRITE_BYTES_MYD, MI_LOG_WRITE_BYTES_MYI,
    MI_LOG_WRITE_BYTES_MYI_FROM_KEY_CACHE.
    Inline functions to log a command/pwrite to the backup log with an
    additional MI_LOG_OPEN if first time.
    Updates to myisam_log_* macros to adapt to the fact that logs are now
    an IO_CACHE and not a plain file.
    mi_state_info_write() needs the MI_INFO now, as it is used by backup
    logging.
    Note: all extern vars touched in this file have their documentation
    in the file where they are defined.
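    The role of the atomically accessed "backup_logging" member described
    above can be sketched like this. The sketch swaps in C11 <stdatomic.h>
    for the my_atomic API used by the patch, and mini_share with its two
    functions is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical miniature of a table share. */
typedef struct
{
  atomic_int backup_logging;  /* 1 while writes must go to the backup log */
  int records_logged;         /* stand-in for the backup log itself */
} mini_share;

/* Called by the "backup" thread to switch logging on or off. */
static void backup_set_logging(mini_share *s, int on)
{
  assert(s != NULL);
  atomic_store(&s->backup_logging, on);
}

/* Called by the "writer" thread after each write to the table's files. */
static void writer_did_write(mini_share *s)
{
  assert(s != NULL);
  if (atomic_load(&s->backup_logging))
    s->records_logged++;  /* real code would append a log record */
}
```

    The writer only pays for one atomic load per write when no backup is
    running, which is the overhead some of the benchmarking TODOs in the
    changeset comment are about.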

  storage/myisam/myisamlog.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +72 -620
    The main function of this file, examine_log(), is now needed in
    the server to be able to do a restore from an online backup,
    so moves to mi_examine_log.c.
    I changed a meaningless min() to max().
    New log commands (MI_LOG_etc) have long names so I widened the
    space for displaying their name to 34 characters.

  storage/myisam/myisampack.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +15 -5
    mi_state_info_write() needs the MI_INFO now, as it is used by backup
    logging

  storage/myisammrg/ha_myisammrg.cc@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +2 -2
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisammrg/myrg_info.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +2 -1
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

  storage/myisammrg/myrg_rrnd.c@stripped, 2007-05-15 18:09:20+02:00, guilhem@stripped +1 -1
    MI_INFO::filename is now MYISAM_SHARE::unresolv_file_name

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User:	guilhem
# Host:	gbichot4.local
# Root:	/home/mysql_src/mysql-5.1-new-bak

--- 1.217/include/my_sys.h	2007-01-24 18:56:57 +01:00
+++ 1.218/include/my_sys.h	2007-05-15 18:09:18 +02:00
@@ -13,6 +13,11 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
+/**
+   @file
+   @brief mysys library API
+*/
+
 #ifndef _my_sys_h
 #define _my_sys_h
 C_MODE_START
@@ -336,7 +341,7 @@
 } DYNAMIC_STRING;
 
 struct st_io_cache;
-typedef int (*IO_CACHE_CALLBACK)(struct st_io_cache*);
+typedef int (*IO_CACHE_CALLBACK)(struct st_io_cache *, const byte *, uint, my_off_t);
 
 #ifdef THREAD
 typedef struct st_io_cache_share
@@ -435,21 +440,29 @@
   */
   enum cache_type type;
   /*
-    Callbacks when the actual read I/O happens. These were added and
-    are currently used for binary logging of LOAD DATA INFILE - when a
-    block is read from the file, we create a block create/append event, and
-    when IO_CACHE is closed, we create an end event. These functions could,
-    of course be used for other things
-  */
-  IO_CACHE_CALLBACK pre_read;
-  IO_CACHE_CALLBACK post_read;
-  IO_CACHE_CALLBACK pre_close;
+    Callbacks were added and are currently used for binary logging of LOAD
+    DATA INFILE - when a block is read from the file, we create a block
+    create/append event, and when IO_CACHE is closed, we create an end event;
+    also used to write the MyISAM WRITE_CACHE blocks to the MyISAM backup
+    log. These functions could, of course be used for other things. Note: some
+    callbacks share the same argument ("arg").
+  */
+  IO_CACHE_CALLBACK pre_read;  /**< called before reading from disk */
+  IO_CACHE_CALLBACK post_read; /**< called after reading from disk */
+  IO_CACHE_CALLBACK pre_close; /**< called before ending the cache */
+  /**
+     @brief called after writing to disk; does not work with SEQ_READ_APPEND.
+     The reason is that SEQ_READ_APPEND IO_CACHE does not maintain the
+     write position in a variable (as it is O_APPEND); to make it work we
+     should pass the result of tell() to post_write.
+  */
+  IO_CACHE_CALLBACK post_write;
   /*
     Counts the number of times, when we were forced to use disk. We use it to
     increase the binlog_cache_disk_use status variable.
   */
   ulong disk_writes;
-  void* arg;				/* for use by pre/post_read */
+  void *arg;			     /**< used by pre/post_read,post_write */
   char *file_name;			/* if used with 'open_cached_file' */
   char *dir,*prefix;
   File file; /* file descriptor */
@@ -461,6 +474,8 @@
     partial.
   */
   int	seek_not_done,error;
+  /** @brief cumulative 'error' since last [re]init_io_cache() */
+  int hard_write_error_in_the_past;
   /* buffer_length is memory size allocated for buffer or write_buffer */
   uint	buffer_length;
   /* read_length is the same as buffer_length except when we use async io */

--- 1.80/include/myisam.h	2006-12-23 20:19:45 +01:00
+++ 1.81/include/myisam.h	2007-05-15 18:09:18 +02:00
@@ -13,7 +13,11 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/* This file should be included when using myisam_funktions */
+/**
+   @file
+
+   @brief This file should be included when using myisam_funktions
+*/
 
 #ifndef _myisam_h
 #define _myisam_h
@@ -32,6 +36,7 @@
 #endif
 #include "my_handler.h"
 #include <mysql/plugin.h>
+#include <hash.h>
 
 /*
   There is a hard limit for the maximum number of keys as there are only
@@ -259,7 +264,7 @@
 /* invalidator function reference for Query Cache */
 typedef void (* invalidator_by_filename)(const char * filename);
 
-extern my_string myisam_log_filename;		/* Name of logfile */
+extern my_string myisam_logical_log_filename;		/* Name of logfile */
 extern ulong myisam_block_size;
 extern ulong myisam_concurrent_insert;
 extern my_bool myisam_flush,myisam_delay_key_write,myisam_single_user;
@@ -305,11 +310,34 @@
 extern int mi_reset(struct st_myisam_info *file);
 extern ha_rows mi_records_in_range(struct st_myisam_info *info,int inx,
                                    key_range *min_key, key_range *max_key);
-extern int mi_log(int activate_log);
+enum enum_mi_log_action { MI_LOG_ACTION_CLOSE= 0, MI_LOG_ACTION_OPEN };
+enum enum_mi_log_type { MI_LOG_PHYSICAL= 0, MI_LOG_LOGICAL };
+extern int mi_log(enum enum_mi_log_action action,
+                  enum enum_mi_log_type type, const char *log_filename);
 extern int mi_is_changed(struct st_myisam_info *info);
 extern int mi_delete_all_rows(struct st_myisam_info *info);
 extern ulong _mi_calc_blob_length(uint length , const byte *pos);
 extern uint mi_get_pointer_length(ulonglong file_length, uint def);
+extern int mi_backup_start_logging_for_tables(HASH *hash_of_tables,
+                                              const char *backup_log_name);
+extern int mi_backup_stop_logging_for_tables(my_bool cancel);
+/**
+   IN and OUT structure for instructing how to apply a MyISAM log and later
+   getting statistics about this log.
+*/
+typedef struct mi_examine_log_param
+{
+  uint verbose, update, max_files, re_open_count, recover, prefix_remove,
+    opt_processes;
+  ulong number_of_commands;
+  my_off_t start_offset,record_pos;
+  const char *log_filename, *filepath, *write_filename, *record_pos_file;
+  ulong com_count[11][3]; /**< count of commands found in log, their errors */
+  my_bool (*table_selection_hook)(const char *); /**< to filter tables */
+} MI_EXAMINE_LOG_PARAM;
+extern const char *mi_log_command_name[];
+extern void mi_examine_log_param_init(MI_EXAMINE_LOG_PARAM *param);
+extern int mi_examine_log(MI_EXAMINE_LOG_PARAM *param);
 
 /* this is used to pass to mysql_myisamchk_table -- by Sasha Pachev */
 

--- 1.271/libmysql/libmysql.c	2007-01-24 18:56:57 +01:00
+++ 1.272/libmysql/libmysql.c	2007-05-15 18:09:19 +02:00
@@ -1361,7 +1361,7 @@
 {
   DBUG_ENTER("mysql_stat");
   if (simple_command(mysql,COM_STATISTICS,0,0,0))
-    return mysql->net.last_error;
+    DBUG_RETURN(mysql->net.last_error);
   DBUG_RETURN((*mysql->methods->read_statistics)(mysql));
 }
 

--- 1.40/storage/myisam/Makefile.am	2006-12-31 01:06:39 +01:00
+++ 1.41/storage/myisam/Makefile.am	2007-05-15 18:09:19 +02:00
@@ -89,7 +89,7 @@
 			mi_rrnd.c mi_scan.c mi_cache.c \
 			mi_statrec.c mi_packrec.c mi_dynrec.c \
 			mi_update.c mi_write.c mi_unique.c \
-			mi_delete.c \
+			mi_delete.c mi_backup_log.c \
 			mi_rprev.c mi_rfirst.c mi_rlast.c mi_rsame.c \
 			mi_rsamepos.c mi_panic.c mi_close.c mi_create.c\
 			mi_range.c mi_dbug.c mi_checksum.c mi_log.c \
@@ -98,7 +98,7 @@
 			mi_keycache.c mi_preload.c \
 			ft_parser.c ft_stopwords.c ft_static.c \
 			ft_update.c ft_boolean_search.c ft_nlq_search.c sort.c \
-			ha_myisam.cc \
+			mi_examine_log.c ha_myisam.cc myisam_backup_driver.cc \
 			rt_index.c rt_key.c rt_mbr.c rt_split.c sp_key.c
 CLEANFILES =		test?.MY? FT?.MY? isam.log mi_test_all rt_test.MY? sp_test.MY?
 

--- 1.158/storage/myisam/mi_check.c	2007-01-24 22:39:50 +01:00
+++ 1.159/storage/myisam/mi_check.c	2007-05-15 18:09:19 +02:00
@@ -1391,6 +1391,8 @@
   MI_SORT_PARAM sort_param;
   DBUG_ENTER("mi_repair");
 
+  /* because we don't handle OPTIMIZE/REPAIR in online backup yet */
+  DBUG_ASSERT(!info->s->backup_logging);
   bzero((char *)&sort_info, sizeof(sort_info));
   bzero((char *)&sort_param, sizeof(sort_param));
   start_records=info->state->records;
@@ -4321,7 +4323,7 @@
     */
     if (info->lock_type == F_WRLCK)
       share->state.state= *info->state;
-    if (mi_state_info_write(share->kfile,&share->state,1+2))
+    if (mi_state_info_write(info, share->kfile, &share->state, 1+2))
       goto err;
     share->changed=0;
   }

--- 1.24/storage/myisam/mi_close.c	2006-12-31 01:06:39 +01:00
+++ 1.25/storage/myisam/mi_close.c	2007-05-15 18:09:19 +02:00
@@ -52,6 +52,8 @@
   }
   if (info->opt_flag & (READ_CACHE_USED | WRITE_CACHE_USED))
   {
+    /* Logically there should not be a WRITE_CACHE at this stage */
+    DBUG_ASSERT(!(info->opt_flag & WRITE_CACHE_USED));
     if (end_io_cache(&info->rec_cache))
       error=my_errno;
     info->opt_flag&= ~(READ_CACHE_USED | WRITE_CACHE_USED);
@@ -67,7 +69,11 @@
 	flush_key_blocks(share->key_cache, share->kfile,
 			 share->temporary ? FLUSH_IGNORE_CHANGED :
 			 FLUSH_RELEASE))
+    {
       error=my_errno;
+      mi_print_error(share, HA_ERR_CRASHED);
+      mi_mark_crashed(info);		/* Mark that table must be checked */
+    }
     if (share->kfile >= 0)
     {
       /*
@@ -77,7 +83,7 @@
         may be using the file at this point
       */
       if (share->mode != O_RDONLY && mi_is_crashed(info))
-	mi_state_info_write(share->kfile, &share->state, 1);
+	mi_state_info_write(info, share->kfile, &share->state, 1);
       if (my_close(share->kfile,MYF(0)))
         error = my_errno;
     }
@@ -93,6 +99,7 @@
 #ifdef THREAD
     thr_lock_delete(&share->lock);
     VOID(pthread_mutex_destroy(&share->intern_lock));
+    my_atomic_rwlock_destroy(&share->current_writes_rwlock);
     {
       int i,keys;
       keys = share->state.header.keys;
@@ -110,10 +117,26 @@
     my_free((gptr)info->ftparser_param, MYF(0));
     info->ftparser_param= 0;
   }
-  if (info->dfile >= 0 && my_close(info->dfile,MYF(0)))
-    error = my_errno;
+  if (info->dfile >= 0)
+  {
+    /*
+      If we stored the MI_LOG_CLOSE log record after closing dfile, this
+      sequence could appear in the log (two concurrent threads, one closing,
+      the other opening and getting the _same_ descriptor for dfile):
+      MI_LOG_OPEN  logged by thread2
+      MI_LOG_CLOSE logged by thread1
+      Record logged by thread2
+      Since MI_LOG_OPEN and MI_LOG_CLOSE carry the same dfile, applying the
+      record would fail.
+    */
+    if (info->MI_LOG_OPEN_stored_in_backup_log)
+      _myisam_log_command(&myisam_backup_log, MI_LOG_CLOSE, info,
+                          NULL, 0, error, NULL);
+    if (my_close(info->dfile,MYF(0)))
+      error = my_errno;
+  }
 
-  myisam_log_command(MI_LOG_CLOSE,info,NULL,0,error);
+  myisam_log_command_logical(MI_LOG_CLOSE, info, NULL, 0, error);
   my_free((gptr) info,MYF(0));
 
   if (error)
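
The ordering problem described in the mi_close.c comment above (MI_LOG_CLOSE
must be logged before my_close() frees the descriptor) can be illustrated with
a toy replay loop. This is only a sketch under assumptions: the record layout,
the names (struct rec, replay(), MAX_DFILE) and the replay logic are
hypothetical and are not taken from mi_examine_log.c; the only idea borrowed
from the patch is that log entries identify a table by its dfile number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified log record: not the real on-disk format. */
enum rec_type { REC_OPEN, REC_CLOSE, REC_WRITE };

struct rec
{
  enum rec_type type;
  int dfile;           /* descriptor number stored in the log entry */
  const char *table;   /* only meaningful for REC_OPEN */
};

#define MAX_DFILE 16

/*
  Replays the records in order, keeping a dfile -> table map.
  Returns the number of REC_WRITE entries whose dfile was not open.
*/
int replay(const struct rec *recs, size_t n)
{
  const char *open_table[MAX_DFILE] = { NULL };
  size_t i;
  int failed = 0;
  for (i = 0; i < n; i++)
  {
    assert(recs[i].dfile >= 0 && recs[i].dfile < MAX_DFILE);
    switch (recs[i].type)
    {
    case REC_OPEN:
      open_table[recs[i].dfile] = recs[i].table;
      break;
    case REC_CLOSE:
      open_table[recs[i].dfile] = NULL;
      break;
    case REC_WRITE:
      if (open_table[recs[i].dfile] == NULL)
        failed++;                 /* write refers to a closed descriptor */
      break;
    }
  }
  return failed;
}

/* Close logged AFTER my_close(): thread2 reuses descriptor 5 in between. */
int replay_close_logged_after_my_close(void)
{
  static const struct rec recs[] = {
    { REC_OPEN,  5, "t2" },   /* thread2 got descriptor 5, freed by thread1 */
    { REC_CLOSE, 5, NULL },   /* thread1 logs its close too late */
    { REC_WRITE, 5, NULL }    /* thread2's write now matches no open table */
  };
  return replay(recs, sizeof(recs) / sizeof(recs[0]));
}

/* Close logged BEFORE my_close(): the order the patch enforces. */
int replay_close_logged_before_my_close(void)
{
  static const struct rec recs[] = {
    { REC_CLOSE, 5, NULL },
    { REC_OPEN,  5, "t2" },
    { REC_WRITE, 5, NULL }
  };
  return replay(recs, sizeof(recs) / sizeof(recs[0]));
}
```

In this model the first sequence leaves one write that cannot be applied and
the second leaves none, which is why the close entry is written while the
descriptor is still owned by the closing thread.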

--- 1.66/storage/myisam/mi_create.c	2007-01-19 13:07:07 +01:00
+++ 1.67/storage/myisam/mi_create.c	2007-05-15 18:09:19 +02:00
@@ -568,6 +568,9 @@
     share.state.create_time= (long) time((time_t*) 0);
 
   pthread_mutex_lock(&THR_LOCK_myisam);
+  /* because we don't handle CREATE or TRUNCATE in online backup yet */
+  DBUG_ASSERT((options & HA_OPTION_TMP_TABLE) || !backup_hash_of_tables ||
+              !hash_search(backup_hash_of_tables, name, strlen(name)));
 
   if (ci->index_file_name)
   {
@@ -687,7 +690,7 @@
   }
 
   DBUG_PRINT("info", ("write state info and base info"));
-  if (mi_state_info_write(file, &share.state, 2) ||
+  if (mi_state_info_write(NULL /* no MI_INFO */, file, &share.state, 2) ||
       mi_base_info_write(file, &share.base))
     goto err;
 #ifndef DBUG_OFF

--- 1.44/storage/myisam/mi_delete.c	2006-12-31 01:06:39 +01:00
+++ 1.45/storage/myisam/mi_delete.c	2007-05-15 18:09:19 +02:00
@@ -100,13 +100,15 @@
   info->state->records--;
 
   mi_sizestore(lastpos,info->lastpos);
-  myisam_log_command(MI_LOG_DELETE,info,(byte*) lastpos,sizeof(lastpos),0);
+  myisam_log_command_logical(MI_LOG_DELETE, info,
+                             (byte*) lastpos, sizeof(lastpos), 0);
   VOID(_mi_writeinfo(info,WRITEINFO_UPDATE_KEYFILE));
   allow_break();			/* Allow SIGHUP & SIGINT */
   if (info->invalidator != 0)
   {
-    DBUG_PRINT("info", ("invalidator... '%s' (delete)", info->filename));
-    (*info->invalidator)(info->filename);
+    DBUG_PRINT("info", ("invalidator... '%s' (delete)",
+                        info->s->unresolv_file_name));
+    (*info->invalidator)(info->s->unresolv_file_name);
     info->invalidator=0;
   }
   DBUG_RETURN(0);
@@ -114,7 +116,8 @@
 err:
   save_errno=my_errno;
   mi_sizestore(lastpos,info->lastpos);
-  myisam_log_command(MI_LOG_DELETE,info,(byte*) lastpos, sizeof(lastpos),0);
+  myisam_log_command_logical(MI_LOG_DELETE, info,
+                             (byte*) lastpos, sizeof(lastpos), 0);
   if (save_errno != HA_ERR_RECORD_CHANGED)
   {
     mi_print_error(info->s, HA_ERR_CRASHED);

--- 1.21/storage/myisam/mi_delete_all.c	2006-12-31 01:06:39 +01:00
+++ 1.22/storage/myisam/mi_delete_all.c	2007-05-15 18:09:19 +02:00
@@ -47,7 +47,7 @@
   for (i=0 ; i < share->base.keys ; i++)
     state->key_root[i]= HA_OFFSET_ERROR;
 
-  myisam_log_command(MI_LOG_DELETE_ALL,info,(byte*) 0,0,0);
+  myisam_log_command_logical(MI_LOG_DELETE_ALL, info, (byte*) 0, 0, 0);
   /*
     If we are using delayed keys or if the user has done changes to the tables
     since it was locked then there may be key blocks in the key cache
@@ -56,6 +56,14 @@
   if (my_chsize(info->dfile, 0, 0, MYF(MY_WME)) ||
       my_chsize(share->kfile, share->base.keystart, 0, MYF(MY_WME))  )
     goto err;
+  /*
+    For MyISAM physical logging: we do the logging with the command below;
+    _mi_writeinfo() will also do some logging.
+  */
+  if (unlikely(mi_get_backup_logging_state(info->s)))
+    _myisam_log_command(&myisam_backup_log, MI_LOG_DELETE_ALL, info,
+                        (byte*) 0, 0, 0,
+                        &info->MI_LOG_OPEN_stored_in_backup_log);
   VOID(_mi_writeinfo(info,WRITEINFO_UPDATE_KEYFILE));
 #ifdef HAVE_MMAP
   /* Resize mmaped area */

--- 1.58/storage/myisam/mi_dynrec.c	2007-02-13 16:33:30 +01:00
+++ 1.59/storage/myisam/mi_dynrec.c	2007-05-15 18:09:19 +02:00
@@ -193,6 +193,7 @@
 uint mi_mmap_pwrite(MI_INFO *info, byte *Buffer,
                      uint Count, my_off_t offset, myf MyFlags)
 {
+  uint ret;
   DBUG_PRINT("info", ("mi_write with mmap %d\n", info->dfile));
   if (info->s->concurrent_insert)
     rw_rdlock(&info->s->mmap_lock);
@@ -209,16 +210,19 @@
     memcpy(info->s->file_map + offset, Buffer, Count); 
     if (info->s->concurrent_insert)
       rw_unlock(&info->s->mmap_lock);
-    return 0;
+    ret= 0;
   }
   else
   {
     info->s->nonmmaped_inserts++;
     if (info->s->concurrent_insert)
       rw_unlock(&info->s->mmap_lock);
-    return my_pwrite(info->dfile, Buffer, Count, offset, MyFlags);
+    ret= my_pwrite(info->dfile, Buffer, Count, offset, MyFlags);
   }
-
+  if (unlikely(mi_get_backup_logging_state(info->s)))
+    myisam_log_pwrite_for_backup(MI_LOG_WRITE_BYTES_MYD,
+                                 info, Buffer, Count, offset);
+  return ret;
 }
 
 
@@ -227,7 +231,11 @@
 uint mi_nommap_pwrite(MI_INFO *info, byte *Buffer,
                       uint Count, my_off_t offset, myf MyFlags)
 {
-  return my_pwrite(info->dfile, Buffer, Count, offset, MyFlags);
+  uint ret= my_pwrite(info->dfile, Buffer, Count, offset, MyFlags);
+  if (unlikely(mi_get_backup_logging_state(info->s)))
+    myisam_log_pwrite_for_backup(MI_LOG_WRITE_BYTES_MYD,
+                                 info, Buffer, Count, offset);
+  return ret;
 }
 
 

--- 1.53/storage/myisam/mi_extra.c	2007-02-13 16:33:30 +01:00
+++ 1.54/storage/myisam/mi_extra.c	2007-05-15 18:09:19 +02:00
@@ -144,6 +144,11 @@
 			 HA_STATE_WRITE_AT_END |
 			 HA_STATE_EXTEND_BLOCK);
       }
+#ifdef HAVE_MYISAM_BACKUP_LOGGING
+    info->rec_cache.post_write=
+      (IO_CACHE_CALLBACK)myisam_log_flushed_write_cache_for_backup;
+    info->rec_cache.arg= info;
+#endif
     break;
   case HA_EXTRA_PREPARE_FOR_UPDATE:
     if (info->s->data_file_type != DYNAMIC_RECORD)
@@ -247,7 +252,7 @@
 	}
       }
       share->state.state= *info->state;
-      error=mi_state_info_write(share->kfile,&share->state,1 | 2);
+      error=mi_state_info_write(info, share->kfile, &share->state, 1 | 2);
     }
     break;
   case HA_EXTRA_FORCE_REOPEN:
@@ -383,7 +388,8 @@
   {
     char tmp[1];
     tmp[0]=function;
-    myisam_log_command(MI_LOG_EXTRA,info,(byte*) tmp,1,error);
+    myisam_log_command_logical(MI_LOG_EXTRA, info,
+                               (byte*) tmp, 1, error);
   }
   DBUG_RETURN(error);
 } /* mi_extra */

--- 1.52/storage/myisam/mi_locking.c	2007-01-03 10:28:53 +01:00
+++ 1.53/storage/myisam/mi_locking.c	2007-05-15 18:09:19 +02:00
@@ -13,11 +13,13 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/*
-  locking of isam-tables.
-  reads info from a isam-table. Must be first request before doing any furter
-  calls to any isamfunktion.  Is used to allow many process use the same
-  isamdatabase.
+/**
+   @file
+   @brief locking of isam tables.
+
+   Reads info from an isam table. Must be the first request before doing any
+   further calls to any isam function.  Is used to allow many processes to use
+   the same isam database.
 */
 
 #include "ftdefs.h"
@@ -70,6 +72,8 @@
       }
       if (info->opt_flag & (READ_CACHE_USED | WRITE_CACHE_USED))
       {
+        /* Logically there should not be a WRITE_CACHE at this stage */
+        DBUG_ASSERT(!(info->opt_flag & WRITE_CACHE_USED));
 	if (end_io_cache(&info->rec_cache))
 	{
 	  error=my_errno;
@@ -81,41 +85,12 @@
       {
 	DBUG_PRINT("info",("changed: %u  w_locks: %u",
 			   (uint) share->changed, share->w_locks));
-	if (share->changed && !share->w_locks)
-	{
-#ifdef HAVE_MMAP
-    if ((info->s->mmaped_length != info->s->state.state.data_file_length) &&
-        (info->s->nonmmaped_inserts > MAX_NONMAPPED_INSERTS))
-    {
-      if (info->s->concurrent_insert)
-        rw_wrlock(&info->s->mmap_lock);
-      mi_remap_file(info, info->s->state.state.data_file_length);
-      info->s->nonmmaped_inserts= 0;
-      if (info->s->concurrent_insert)
-        rw_unlock(&info->s->mmap_lock);
-    }
-#endif
-	  share->state.process= share->last_process=share->this_process;
-	  share->state.unique=   info->last_unique=  info->this_unique;
-	  share->state.update_count= info->last_loop= ++info->this_loop;
-          if (mi_state_info_write(share->kfile, &share->state, 1))
-	    error=my_errno;
-	  share->changed=0;
-	  if (myisam_flush)
-	  {
-	    if (my_sync(share->kfile, MYF(0)))
-	      error= my_errno;
-	    if (my_sync(info->dfile, MYF(0)))
-	      error= my_errno;
-	  }
-	  else
-	    share->not_flushed=1;
-	  if (error)
-          {
-            mi_print_error(info->s, HA_ERR_CRASHED);
-	    mi_mark_crashed(info);
-          }
-	}
+	if (share->changed && !share->w_locks &&
+            mi_remap_file_and_write_state_for_unlock(info))
+        {
+          mi_print_error(share, HA_ERR_CRASHED);
+          mi_mark_crashed(info);
+        }
 	if (info->lock_type != F_EXTRA_LCK)
 	{
 	  if (share->r_locks)
@@ -254,8 +229,8 @@
   pthread_mutex_unlock(&share->intern_lock);
 #if defined(FULL_LOG) || defined(_lint)
   lock_type|=(int) (flag << 8);		/* Set bit to set if real lock */
-  myisam_log_command(MI_LOG_LOCK,info,(byte*) &lock_type,sizeof(lock_type),
-		     error);
+  myisam_log_command(&myisam_logical_log_file, MI_LOG_LOCK, info,
+                     (byte*) &lock_type, sizeof(lock_type), error);
 #endif
   DBUG_RETURN(error);
 } /* mi_lock_database */
@@ -451,7 +426,7 @@
       share->state.process= share->last_process=   share->this_process;
       share->state.unique=  info->last_unique=	   info->this_unique;
       share->state.update_count= info->last_loop= ++info->this_loop;
-      if ((error=mi_state_info_write(share->kfile, &share->state, 1)))
+      if ((error= mi_state_info_write(info, share->kfile, &share->state, 1)))
 	olderror=my_errno;
 #ifdef __WIN__
       if (myisam_flush)
@@ -537,6 +512,12 @@
     {
       mi_int2store(buff,share->state.open_count);
       buff[2]=1;				/* Mark that it's changed */
+      /*
+        This is a write to the kfile but we don't record it in the backup log;
+        anyway the backup locks tables and so gets them in a consistent state
+        (not at half of a statement, which is what _mi_mark_file_changed()
+        serves to detect).
+      */
       DBUG_RETURN(my_pwrite(share->kfile,buff,sizeof(buff),
                             sizeof(share->state.header),
                             MYF(MY_NABP)));
@@ -566,6 +547,11 @@
     {
       share->state.open_count--;
       mi_int2store(buff,share->state.open_count);
+      /*
+        This is a write to the kfile but we don't record it in the backup log;
+        anyway the backup is copying tables while they are open by others so
+        the "open_count" in the resulting backup is meaningless.
+      */
       write_error=my_pwrite(share->kfile,buff,sizeof(buff),
 			    sizeof(share->state.header),
 			    MYF(MY_NABP));
@@ -574,4 +560,39 @@
       lock_error=mi_lock_database(info,old_lock);
   }
   return test(lock_error || write_error);
+}
+
+
+int mi_remap_file_and_write_state_for_unlock(MI_INFO *info)
+{
+  MYISAM_SHARE *share= info->s;
+  int error= 0;
+#ifdef HAVE_MMAP
+  if ((info->s->mmaped_length != info->s->state.state.data_file_length) &&
+      (info->s->nonmmaped_inserts > MAX_NONMAPPED_INSERTS))
+  {
+    if (info->s->concurrent_insert)
+      rw_wrlock(&info->s->mmap_lock);
+    mi_remap_file(info, info->s->state.state.data_file_length);
+    info->s->nonmmaped_inserts= 0;
+    if (info->s->concurrent_insert)
+      rw_unlock(&info->s->mmap_lock);
+  }
+#endif
+  share->state.process= share->last_process=share->this_process;
+  share->state.unique=   info->last_unique=  info->this_unique;
+  share->state.update_count= info->last_loop= ++info->this_loop;
+  if (mi_state_info_write(info, share->kfile, &share->state, 1))
+    error=my_errno;
+  share->changed=0;
+  if (myisam_flush)
+  {
+    if (my_sync(share->kfile, MYF(0)))
+      error= my_errno;
+    if (my_sync(info->dfile, MYF(0)))
+      error= my_errno;
+  }
+  else
+    share->not_flushed=1;
+  return error;
 }

--- 1.15/storage/myisam/mi_log.c	2007-02-23 12:13:50 +01:00
+++ 1.16/storage/myisam/mi_log.c	2007-05-15 18:09:19 +02:00
@@ -13,9 +13,32 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/*
-  Logging of MyISAM commands and records on logfile for debugging
-  The log can be examined with help of the myisamlog command.
+/**
+   @file
+   @brief Logging of MyISAM commands and records for online backup, or debugging
+
+   The log can be examined with the myisamlog utility.
+
+   Writes to the logical log happen when the logical operation happens.
+
+   Writes to the physical log happen when the physical operation happens,
+   i.e. when the file is written, which can be at three moments:
+   -# when the row write directly writes to the file (mi_[no]mmap_pwrite)
+   -# if the row write went to a WRITE_CACHE, when this cache gets written to
+      the file
+   -# if the row write went to the key cache, when this key cache block gets
+      written ("flushed") to the file
+   Additionally, an entry for opening and an entry for closing the table are
+   written to the backup log.
+   In the case of the first "direct row write" or "WRITE_CACHE" log write for
+   a certain MI_INFO, an entry for opening (MI_LOG_OPEN) is written. That is
+   because the physical entries refer to the table by the file descriptor of
+   the data file; the MI_LOG_OPEN entry links this number to a table
+   name. The entry for closing is written by mi_close() if an entry for
+   opening had been written before.
+   In the case of "key cache" log writes, MI_INFO may be gone (freed) when the
+   log write happens, i.e. MI_LOG_OPEN cannot be used, so the entry for the
+   write has to contain the file's name.
 */
 
 #include "myisamdef.h"
@@ -36,45 +59,124 @@
 #define GETPID() myisam_pid
 #endif
 
-	/* Activate logging if flag is 1 and reset logging if flag is 0 */
-
+/** the log_type global variable is probably legacy, it's always 0 now */
 static int log_type=0;
 ulong myisam_pid=0;
 
-int mi_log(int activate_log)
+/**
+   @brief Sets up a log object (logical or physical log), or destroys it
+
+   @details Backup log (a.k.a. physical log) contains each call to OS write
+   functions on the MyISAM files; logical log contains each call to
+   higher-level operations like mi_write()/mi_update().
+   Backup log is required to be able to make a consistent table from a dirty
+   table (indeed, a dirty table contains internally inconsistent records, so
+   applying a mi_write() to it is impossible). Backup log is idempotent.
+   Logical log is used to debug MyISAM.
+   Both logs are IO_CACHE to be fast.
+
+   @param  action           what to do (open it, close it, etc)
+   @param  type             physical or logical
+   @param  log_filename     only for physical log (logical log has a static
+   name)
+
+   @note logs are not created with MY_WAIT_IF_FULL: a log can itself be the
+   cause of filling the disk, so better corrupt it (and make a backup
+   fail for example) than prevent other normal operations.
+
+   @todo A realistic benchmark to see if the size of the IO_CACHE makes any
+   difference during an online backup.
+*/
+int mi_log(enum enum_mi_log_action action, enum enum_mi_log_type type,
+           const char *log_filename)
 {
   int error=0;
   char buff[FN_REFLEN];
+  int access_flags;
+  File file;
+  IO_CACHE *log;
+  uint cache_size;
   DBUG_ENTER("mi_log");
 
-  log_type=activate_log;
-  if (activate_log)
+  pthread_mutex_lock(&THR_LOCK_myisam_log);
+  if (type == MI_LOG_LOGICAL)
+  {
+    DBUG_ASSERT(!log_filename);
+    log_filename= myisam_logical_log_filename;
+    log         = &myisam_logical_log;
+    /* O_APPEND as file may exist and we want to keep it */
+    access_flags= O_WRONLY | O_BINARY | O_APPEND;
+    /* small cache size to not lose too much in case of crash */
+    cache_size= IO_SIZE;
+  }
+  else
+  {
+    DBUG_ASSERT(type == MI_LOG_PHYSICAL);
+    log         = &myisam_backup_log;
+    /* We want to fail if file exists */
+    access_flags= O_WRONLY | O_BINARY | O_TRUNC | O_EXCL;
+    /*
+      We want a large IO_CACHE to have large contiguous disk writes.
+      In many systems this size is affordable. In small embedded ones it is
+      not, but do they use online backup?
+    */
+    cache_size= IO_SIZE*256;
+  }
+  if (action == MI_LOG_ACTION_OPEN)
   {
     if (!myisam_pid)
       myisam_pid=(ulong) getpid();
-    if (myisam_log_file < 0)
+    if (!my_b_inited(log))
     {
-      if ((myisam_log_file = my_create(fn_format(buff,myisam_log_filename,
-						"",".log",4),
-				      0,(O_RDWR | O_BINARY | O_APPEND),MYF(0)))
-	  < 0)
-	DBUG_RETURN(my_errno);
+      DBUG_ASSERT(log_filename);
+      fn_format(buff, log_filename, "", "", MY_UNPACK_FILENAME);
+      /* For logical log we'll seek at end */
+      if ((file= my_create(buff,
+                           0, access_flags,
+                           MYF(MY_WME | ME_WAITTANG))) < 0)
+        error= my_errno;
+      else if (init_io_cache(log, file,
+                             cache_size, WRITE_CACHE,
+                             my_tell(file,MYF(MY_WME)), 0,
+                             MYF(MY_WME | MY_NABP)))
+      {
+        error= my_errno;
+        my_close(file, MYF(MY_WME));
+      }
     }
   }
-  else if (myisam_log_file >= 0)
+  else /* close */
   {
-    error=my_close(myisam_log_file,MYF(0)) ? my_errno : 0 ;
-    myisam_log_file= -1;
+    DBUG_ASSERT(action == MI_LOG_ACTION_CLOSE);
+    if (my_b_inited(log))
+    {
+      if (end_io_cache(log) ||
+          my_close(log->file,MYF(MY_WME)))
+        error= my_errno;
+      log->file= -1;
+    }
   }
+  pthread_mutex_unlock(&THR_LOCK_myisam_log);
   DBUG_RETURN(error);
 }
 
 
-	/* Logging of records and commands on logfile */
-	/* All logs starts with command(1) dfile(2) process(4) result(2) */
+/**
+   @brief Logs a MyISAM command to log
 
-void _myisam_log(enum myisam_log_commands command, MI_INFO *info,
-		 const byte *buffert, uint length)
+   @param  log              pointer to the log's IO_CACHE
+   @param  command          MyISAM command (MI_LOG_OPEN, etc)
+   @param  info             MI_INFO
+   @param  buffert          usually argument to the command (e.g. name of file
+                            to open for MI_LOG_OPEN), not NULL
+   @param  length           length of buffert
+
+   @note Contrary to other myisam_log functions in this file, this one
+   requires the caller to hold THR_LOCK_myisam_log.
+*/
+void _myisam_log(IO_CACHE *log,
+                 enum myisam_log_commands command, MI_INFO *info,
+                 const byte *buffert, uint length)
 {
   char buff[11];
   int error,old_errno;
@@ -85,20 +187,56 @@
   mi_int2store(buff+1,info->dfile);
   mi_int4store(buff+3,pid);
   mi_int2store(buff+9,length);
-
-  pthread_mutex_lock(&THR_LOCK_myisam);
-  error=my_lock(myisam_log_file,F_WRLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
-  VOID(my_write(myisam_log_file,buff,sizeof(buff),MYF(0)));
-  VOID(my_write(myisam_log_file,buffert,length,MYF(0)));
-  if (!error)
-    error=my_lock(myisam_log_file,F_UNLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
-  pthread_mutex_unlock(&THR_LOCK_myisam);
+  safe_mutex_assert_owner(&THR_LOCK_myisam_log);
+  /*
+    We need to check that 'log' is not closed, this can happen for a backup
+    log. Indeed we do not have full control on the table from the thread doing
+    mi_backup_stop_logging_for_tables(); it could be a "dirty" backup stop (in
+    the middle of writes) or even a non-dirty one (table can be in
+    mi_lock_database(F_UNLCK) and thus want to flush its header).
+  */
+  if (likely(my_b_inited(log) != 0))
+  {
+    error=my_lock(log->file,F_WRLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
+    /*
+      Any failure to write the log does not prevent the table write (table
+      should still be usable even though log breaks),
+      but sets up log->hard_write_error_in_the_past, which is regularly tested
+      by the module copying the log (myisam_backup.cc).
+    */
+    VOID(my_b_write(log, buff, sizeof(buff)));
+    VOID(my_b_write(log, buffert, length));
+    if (!error)
+      error=my_lock(log->file,F_UNLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
+  }
   my_errno=old_errno;
 }
 
 
-void _myisam_log_command(enum myisam_log_commands command, MI_INFO *info,
-			 const byte *buffert, uint length, int result)
+/**
+   @brief Logs a MyISAM command and its return code to log
+
+   @param  log              pointer to the log's IO_CACHE
+   @param  command          MyISAM command (MI_LOG_OPEN, etc)
+   @param  info             MI_INFO
+   @param  buffert          usually argument to the command (e.g. name of file
+                            to open for MI_LOG_OPEN), may be NULL
+   @param  length           length of buffert
+   @param  result           return code of the command
+   @param  MI_LOG_OPEN_stored_in_log If non-NULL, pointer to a variable
+                                     telling if MI_LOG_OPEN has already been
+                                     stored for this MI_INFO in this log; if
+                                     the variable tells that no, writes a
+                                     MI_LOG_OPEN and sets the variable to
+                                     TRUE.
+
+   @todo info->dfile can exceed 65535: use 2 bytes; if >=65535 then store
+   65535 and use 4 bytes.
+*/
+void _myisam_log_command(IO_CACHE *log,
+                         enum myisam_log_commands command, MI_INFO *info,
+                         const byte *buffert, uint length, int result,
+                         my_bool *MI_LOG_OPEN_stored_in_log)
 {
   char buff[9];
   int error,old_errno;
@@ -109,20 +247,53 @@
   mi_int2store(buff+1,info->dfile);
   mi_int4store(buff+3,pid);
   mi_int2store(buff+7,result);
-  pthread_mutex_lock(&THR_LOCK_myisam);
-  error=my_lock(myisam_log_file,F_WRLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
-  VOID(my_write(myisam_log_file,buff,sizeof(buff),MYF(0)));
-  if (buffert)
-    VOID(my_write(myisam_log_file,buffert,length,MYF(0)));
-  if (!error)
-    error=my_lock(myisam_log_file,F_UNLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
-  pthread_mutex_unlock(&THR_LOCK_myisam);
+  mi_int2store(buff+9,length);
+  /*
+    Reasons to not use THR_LOCK_myisam here:
+    - better concurrency (not stealing THR_LOCK_myisam which is used for opens
+    and closes)
+    - mi_close() flushes indexes while holding THR_LOCK_myisam, and that flush
+    can cause log writes, so we would lock the mutex twice.
+  */
+  pthread_mutex_lock(&THR_LOCK_myisam_log);
+  if (likely(my_b_inited(log) != 0))
+  {
+    if (MI_LOG_OPEN_stored_in_log && !*MI_LOG_OPEN_stored_in_log)
+    {
+      _myisam_log(&myisam_backup_log, MI_LOG_OPEN, info,
+                  info->s->unresolv_file_name,
+                  strlen(info->s->unresolv_file_name));
+      *MI_LOG_OPEN_stored_in_log= TRUE;
+    }
+    error=my_lock(log->file,F_WRLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
+    VOID(my_b_write(log, buff, sizeof(buff)));
+    if (buffert)
+      VOID(my_b_write(log, buffert, length));
+    if (!error)
+      error=my_lock(log->file,F_UNLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
+  }
+  pthread_mutex_unlock(&THR_LOCK_myisam_log);
   my_errno=old_errno;
 }
 
 
-void _myisam_log_record(enum myisam_log_commands command, MI_INFO *info,
-			const byte *record, my_off_t filepos, int result)
+/**
+   @brief Logs a MyISAM command and its return code to the logical log. The
+   command involves a record (MI_LOG_WRITE etc).
+
+   @param  command          MyISAM command (MI_LOG_WRITE, etc)
+   @param  info             MI_INFO
+   @param  record           record to write/update/etc, not NULL
+   @param  filepos          offset in data file where record starts
+   @param  result           return code of the command
+
+   @note Used only by logical logging. Cannot be used for backup logging (a
+   backup is a dirty copy, a logical record write will not apply to the
+   copy).
+*/
+void _myisam_log_record_logical(enum myisam_log_commands command,
+                                MI_INFO *info, const byte *record,
+                                my_off_t filepos, int result)
 {
   char buff[21],*pos;
   int error,old_errno;
@@ -140,10 +311,12 @@
   mi_int2store(buff+7,result);
   mi_sizestore(buff+9,filepos);
   mi_int4store(buff+17,length);
-  pthread_mutex_lock(&THR_LOCK_myisam);
-  error=my_lock(myisam_log_file,F_WRLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
-  VOID(my_write(myisam_log_file,buff,sizeof(buff),MYF(0)));
-  VOID(my_write(myisam_log_file,(byte*) record,info->s->base.reclength,MYF(0)));
+  pthread_mutex_lock(&THR_LOCK_myisam_log);
+  error= my_lock(myisam_logical_log.file, F_WRLCK, 0L, F_TO_EOF,
+                 MYF(MY_SEEK_NOT_DONE));
+  VOID(my_b_write(&myisam_logical_log, buff, sizeof(buff)));
+  VOID(my_b_write(&myisam_logical_log, (byte*) record,
+                  info->s->base.reclength));
   if (info->s->base.blobs)
   {
     MI_BLOB *blob,*end;
@@ -153,11 +326,194 @@
 	 blob++)
     {
       memcpy_fixed(&pos,record+blob->offset+blob->pack_length,sizeof(char*));
-      VOID(my_write(myisam_log_file,pos,blob->length,MYF(0)));
+      VOID(my_b_write(&myisam_logical_log, pos, blob->length));
     }
   }
   if (!error)
-    error=my_lock(myisam_log_file,F_UNLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));
-  pthread_mutex_unlock(&THR_LOCK_myisam);
+    error= my_lock(myisam_logical_log.file, F_UNLCK, 0L, F_TO_EOF,
+                   MYF(MY_SEEK_NOT_DONE));
+  pthread_mutex_unlock(&THR_LOCK_myisam_log);
   my_errno=old_errno;
+}
+
+
+/* THE FOLLOWING FUNCTIONS SERVE ONLY FOR BACKUP LOGGING */
+
+/**
+   @brief Logs a pwrite() (done to the data or index file) to the backup log.
+
+   Also logs MI_LOG_OPEN if first time. Thus, a MI_INFO will write MI_LOG_OPEN
+   to the log only if it is doing a write to the table: a table which does
+   only reads during the backup logs nothing.
+
+   @param  command          MyISAM command (MI_LOG_WRITE_BYTES_MYD, etc)
+   @param  info             MI_INFO
+   @param  buffert          argument to the pwrite
+   @param  length           length of buffer
+   @param  filepos          offset in file where buffer was written
+
+   @note length may be small (for example, if updating only a numeric field of
+   a record, it could be only a few bytes), so we try to minimize the size of
+   the log entry's header.
+*/
+void myisam_log_pwrite_for_backup(enum myisam_log_commands command,
+                                  MI_INFO *info, const byte *buffert,
+                                  uint length, my_off_t filepos)
+{
+  char buff[21], *ptr;
+  int old_errno;
+  DBUG_ENTER("myisam_log_pwrite_for_backup");
+  old_errno= my_errno;
+  buff[0]= (char) command;
+  mi_int2store(buff+1, info->dfile);
+  /* pid and result are not needed */
+  /*
+    filepos is coded in variable-length: 4 bytes, or if the first 4 bytes are
+    UINT_MAX32 (4G-1), then the next 8 bytes.
+  */
+  if (filepos >= UINT_MAX32)
+  {
+    mi_int4store(buff+3, UINT_MAX32);
+    mi_int8store(buff+7, filepos);
+    ptr= buff+15;
+  }
+  else
+  {
+    mi_int4store(buff+3, filepos);
+    ptr= buff+7;
+  }
+  /* length is 2 bytes, or if the first 2 bytes are 65535, the next 4 bytes */
+  if (length >= UINT_MAX16)
+  {
+    mi_int2store(ptr, UINT_MAX16);
+    mi_int4store(ptr+2, length);
+    ptr+= 6;
+  }
+  else
+  {
+    mi_int2store(ptr, length);
+    ptr+= 2;
+  }
+  pthread_mutex_lock(&THR_LOCK_myisam_log);
+  if (likely(my_b_inited(&myisam_backup_log) != 0))
+  {
+    if (!info->MI_LOG_OPEN_stored_in_backup_log)
+    {
+      _myisam_log(&myisam_backup_log, MI_LOG_OPEN, info,
+                  info->s->unresolv_file_name,
+                  strlen(info->s->unresolv_file_name));
+      /*
+        It is important to set the variable under mutex; one instant after
+        unlocking the mutex, the log may be closed and so it would be wrong
+        to say that the MI_LOG_OPEN is in the log (it would possibly influence
+        a next backup job).
+      */
+      info->MI_LOG_OPEN_stored_in_backup_log= TRUE;
+    }
+    /*
+      This backup logging does not work with --external-locking (the backup log
+      is private to the mysqld).
+    */
+#ifdef BACKUP_DOESNT_WORK_WITH_EXTERNAL_LOCKING
+    error= my_lock(myisam_backup_log.file, F_WRLCK, 0L, F_TO_EOF,
+                   MYF(MY_SEEK_NOT_DONE));
+#endif
+    VOID(my_b_write(&myisam_backup_log, buff, ptr-buff));
+    VOID(my_b_write(&myisam_backup_log, (byte*) buffert, length));
+#ifdef BACKUP_DOESNT_WORK_WITH_EXTERNAL_LOCKING
+    if (!error)
+      error= my_lock(myisam_backup_log.file, F_UNLCK, 0L, F_TO_EOF,
+                     MYF(MY_SEEK_NOT_DONE));
+#endif
+  }
+  pthread_mutex_unlock(&THR_LOCK_myisam_log);
+  my_errno= old_errno;
+  DBUG_VOID_RETURN;
+}
+
+
+/**
+   @brief Logs when the WRITE_CACHE is flushed to the data file, to the backup
+   log.
+
+   @param  cache_for_table  pointer to the table's WRITE_CACHE IO_CACHE
+   @param  buffert          argument to the pwrite
+   @param  length           length of buffer
+   @param  filepos          offset in file where buffer was written
+
+   @return Operation status, always 0
+     @retval 0      ok (function returns int and not void just to match the
+                    definition of IO_CACHE_CALLBACK)
+*/
+int myisam_log_flushed_write_cache_for_backup(IO_CACHE *cache_for_table,
+                                              const byte *buffert,
+                                              uint length, my_off_t filepos)
+{
+  MI_INFO *info= (MI_INFO*)(cache_for_table->arg);
+  DBUG_ENTER("myisam_log_flushed_write_cache_for_backup");
+  if (unlikely(mi_get_backup_logging_state(info->s)))
+    myisam_log_pwrite_for_backup(MI_LOG_WRITE_BYTES_MYD, info,
+                                 buffert, length, filepos);
+  DBUG_RETURN(0);
+}
+
+
+/**
+   @brief Logs when the key cache flushes a page to the file (so far, always
+   the index file), to the backup log. Argument cannot be a MI_INFO* (the
+   MI_INFO which put the page in the key cache may have been freed long ago
+   when the page is finally flushed), it is MYISAM_SHARE* which is sure to be
+   valid.
+
+   @param  arg              MYISAM_SHARE* where the block belongs
+   @param  filedes          descriptor to the file
+   @param  buffert          argument to the pwrite
+   @param  length           length of buffer
+   @param  filepos          offset in file where buffer was written
+
+   @todo Make records shorter (no pid, no result, variable-length coding);
+   however the gain does not represent that much (12 bytes compared to a key
+   page of 1024 bytes: 1%); the length of the file name is already bigger.
+*/
+void myisam_log_from_key_cache_for_backup(void *arg,
+                                          int filedes __attribute__((unused)),
+                                          const byte *buffert,
+                                          uint length, my_off_t filepos)
+{
+  MYISAM_SHARE *s= (MYISAM_SHARE *)arg;
+  DBUG_ENTER("myisam_log_from_key_cache_for_backup");
+  if (unlikely(index_pages_in_backup_log && mi_get_backup_logging_state(s)))
+  {
+    char *name= s->unresolv_file_name;
+    uint name_len;
+    char buff[23];
+    int old_errno;
+    old_errno= my_errno;
+    name_len= strlen(name);
+    bzero(buff, sizeof(buff));
+    buff[0]= (char)MI_LOG_WRITE_BYTES_MYI_FROM_KEY_CACHE;
+    mi_int2store(buff + 1, -1); /* because no info->dfile */
+    mi_sizestore(buff + 9, filepos);
+    mi_int4store(buff + 17, length);
+    mi_int2store(buff + 21, name_len);
+    pthread_mutex_lock(&THR_LOCK_myisam_log);
+    if (likely(my_b_inited(&myisam_backup_log) != 0))
+    {
+#ifdef BACKUP_DOESNT_WORK_WITH_EXTERNAL_LOCKING
+      error= my_lock(myisam_backup_log.file, F_WRLCK, 0L, F_TO_EOF,
+                     MYF(MY_SEEK_NOT_DONE));
+#endif
+      VOID(my_b_write(&myisam_backup_log, buff, sizeof(buff)));
+      VOID(my_b_write(&myisam_backup_log, name, name_len));
+      VOID(my_b_write(&myisam_backup_log, buffert, length));
+#ifdef BACKUP_DOESNT_WORK_WITH_EXTERNAL_LOCKING
+      if (!error)
+        error= my_lock(myisam_backup_log.file, F_UNLCK, 0L, F_TO_EOF,
+                       MYF(MY_SEEK_NOT_DONE));
+#endif
+    }
+    pthread_mutex_unlock(&THR_LOCK_myisam_log);
+    my_errno= old_errno;
+  }
+  DBUG_VOID_RETURN;
 }
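
A minimal sketch of the 23-byte record header that
myisam_log_from_key_cache_for_backup() builds above, written as standalone C
so the offsets can be checked in isolation. store_be() and
build_key_cache_record_header() are hypothetical names standing in for the
mi_int2store()/mi_int4store()/mi_sizestore() calls. The meaning of the zeroed
bytes at offsets 3..8 (pid, then result) is inferred from the @todo and from
the old log reader, not stated by the function itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the mi_*store macros: high byte first. */
static void store_be(unsigned char *to, uint64_t value, size_t bytes)
{
  size_t i;
  for (i= 0; i < bytes; i++)
    to[i]= (unsigned char) (value >> (8 * (bytes - 1 - i)));
}

enum { KEY_CACHE_RECORD_HEADER_SIZE= 23 };   /* same size as char buff[23] */

/*
  Fills the fixed-size header. In the log it is followed by the file name
  (name_len bytes) and then by the page contents (length bytes).
*/
static void build_key_cache_record_header(unsigned char *header,
                                          unsigned char command,
                                          uint64_t filepos, uint32_t length,
                                          const char *name)
{
  size_t name_len= strlen(name);
  assert(name_len <= 0xFFFF);               /* must fit the 2-byte field */
  memset(header, 0, KEY_CACHE_RECORD_HEADER_SIZE);
  header[0]= command;                       /* offset 0: command code */
  store_be(header + 1, 0xFFFF, 2);          /* offset 1: file number, -1 */
  /* offsets 3..8 stay zero: pid (4 bytes), result (2 bytes); inferred */
  store_be(header + 9, filepos, 8);         /* offset 9: position in file */
  store_be(header + 17, length, 4);         /* offset 17: page length */
  store_be(header + 21, name_len, 2);       /* offset 21: name length */
}
```

The 12 bytes mentioned in the @todo would roughly correspond to the pid and
result fields plus whatever a variable-length coding of the position and
length fields could save.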

--- 1.117/storage/myisam/mi_open.c	2007-02-23 12:23:42 +01:00
+++ 1.118/storage/myisam/mi_open.c	2007-05-15 18:09:19 +02:00
@@ -94,6 +94,11 @@
   head_length=sizeof(share_buff.state.header);
   bzero((byte*) &info,sizeof(info));
 
+  /*
+    'name' is an unresolved name (no .sym or Unix symbolic link
+    resolution). Backup logging needs it. We resolve 'name' into 'org_name'
+    and 'name_buff'.
+  */
   my_realpath(name_buff, fn_format(org_name,name,"",MI_NAME_IEXT,
                                    MY_UNPACK_FILENAME),MYF(0));
   pthread_mutex_lock(&THR_LOCK_myisam);
@@ -292,6 +297,7 @@
 			 &share->rec,
 			 (share->base.fields+1)*sizeof(MI_COLUMNDEF),
 			 &share->blobs,sizeof(MI_BLOB)*share->base.blobs,
+                         &share->unresolv_file_name,strlen(name)+1,
 			 &share->unique_file_name,strlen(name_buff)+1,
 			 &share->index_file_name,strlen(index_name)+1,
 			 &share->data_file_name,strlen(data_name)+1,
@@ -313,6 +319,7 @@
     memcpy((char*) share->state.key_del,
 	   (char*) key_del, (sizeof(my_off_t) *
 			     share->state.header.max_block_size_index));
+    strmov(share->unresolv_file_name,name);
     strmov(share->unique_file_name, name_buff);
     share->unique_name_length= strlen(name_buff);
     strmov(share->index_file_name,  index_name);
@@ -526,6 +533,7 @@
 #ifdef THREAD
     thr_lock_init(&share->lock);
     VOID(pthread_mutex_init(&share->intern_lock,MY_MUTEX_INIT_FAST));
+    my_atomic_rwlock_init(&share->backup_logging_rwlock);
     for (i=0; i<keys; i++)
       VOID(my_rwlock_init(&share->key_root_lock[i], NULL));
     VOID(my_rwlock_init(&share->mmap_lock, NULL));
@@ -575,7 +583,6 @@
 				   share->base.max_key_length),
 		       &info.lastkey,share->base.max_key_length*3+1,
 		       &info.first_mbr_key, share->base.max_key_length,
-		       &info.filename,strlen(name)+1,
 		       &info.rtree_recursion_state,have_rtree ? 1024 : 0,
 		       NullS))
     goto err;
@@ -584,7 +591,6 @@
   if (!have_rtree)
     info.rtree_recursion_state= NULL;
 
-  strmov(info.filename,name);
   memcpy(info.blobs,share->blobs,sizeof(MI_BLOB)*share->base.blobs);
   info.lastkey2=info.lastkey+share->base.max_key_length;
 
@@ -646,11 +652,18 @@
   m_info->open_list.data=(void*) m_info;
   myisam_open_list=list_add(myisam_open_list,&m_info->open_list);
 
+  if (backup_hash_of_tables &&
+      hash_search(backup_hash_of_tables, share->unique_file_name,
+                  share->unique_name_length))
+    m_info->s->backup_logging= TRUE;
   pthread_mutex_unlock(&THR_LOCK_myisam);
-  if (myisam_log_file >= 0)
+  if (my_b_inited(&myisam_logical_log))
   {
     intern_filename(name_buff,share->index_file_name);
-    _myisam_log(MI_LOG_OPEN,m_info,name_buff,(uint) strlen(name_buff));
+    pthread_mutex_lock(&THR_LOCK_myisam_log);
+    _myisam_log(&myisam_logical_log, MI_LOG_OPEN, m_info, name_buff,
+                (uint) strlen(name_buff));
+    pthread_mutex_unlock(&THR_LOCK_myisam_log);
   }
   DBUG_RETURN(m_info);
 
@@ -846,15 +859,25 @@
 }
 
 
-/*
-   Function to save and store the header in the index file (.MYI)
+/**
+   @brief Function to save and store the header in the index file (.MYI)
+
+   @param  info             the table
+   @param  file             file descriptor of the index file
+   @param  state            state of the table
+   @param  pWrite           if bit 0 is set use my_pwrite(), else my_write()
+
+   @return Operation status
+     @retval 0      ok
+     @retval !=0    error
 */
 
-uint mi_state_info_write(File file, MI_STATE_INFO *state, uint pWrite)
+uint mi_state_info_write(MI_INFO *info, File file,
+                         MI_STATE_INFO *state, uint pWrite)
 {
   uchar  buff[MI_STATE_INFO_SIZE + MI_STATE_EXTRA_SIZE];
   uchar *ptr=buff;
-  uint	i, keys= (uint) state->header.keys,
+  uint	i, ret, keys= (uint) state->header.keys,
 	key_blocks=state->header.max_block_size_index;
   DBUG_ENTER("mi_state_info_write");
 
@@ -906,11 +929,16 @@
     }
   }
 
-  if (pWrite & 1)
-    DBUG_RETURN(my_pwrite(file,(char*) buff, (uint) (ptr-buff), 0L,
-			  MYF(MY_NABP | MY_THREADSAFE)));
-  DBUG_RETURN(my_write(file,  (char*) buff, (uint) (ptr-buff),
-		       MYF(MY_NABP)));
+  ret= (pWrite & 1) ? my_pwrite(file,(char*) buff, (uint) (ptr-buff), 0L,
+                                MYF(MY_NABP | MY_THREADSAFE)) :
+    my_write(file,  (char*) buff, (uint) (ptr-buff),
+             MYF(MY_NABP));
+  if (unlikely(info /* mi_create() passes info==0 */ &&
+               mi_get_backup_logging_state(info->s)))
+    myisam_log_pwrite_for_backup(MI_LOG_WRITE_BYTES_MYI, 
+                                 info, (char*) buff,
+                                 (uint) (ptr-buff), 0L);
+  DBUG_RETURN(ret);
 }
 
 

--- 1.27/storage/myisam/mi_page.c	2006-12-31 01:06:40 +01:00
+++ 1.28/storage/myisam/mi_page.c	2007-05-15 18:09:19 +02:00
@@ -93,11 +93,23 @@
     length=keyinfo->block_length;
   }
 #endif
-  DBUG_RETURN((key_cache_write(info->s->key_cache,
-                         info->s->kfile,page, level, (byte*) buff,length,
-			 (uint) keyinfo->block_length,
-			 (int) ((info->lock_type != F_UNLCK) ||
-				info->s->delay_key_write))));
+  /*
+    It's more efficient to have the key cache write to the backup log when it
+    flushes than write now (a mass insert may touch the same index page again
+    and again; if we write now we write one log record per row write; if we
+    write only at flush time we write one log record per touched page).
+  */
+  DBUG_RETURN(key_cache_write(info->s->key_cache,
+                              info->s->kfile, page, level, (byte*)buff, length,
+                              (uint) keyinfo->block_length,
+                              (int) ((info->lock_type != F_UNLCK) ||
+                                     info->s->delay_key_write),
+#ifdef HAVE_MYISAM_BACKUP_LOGGING
+                              myisam_log_from_key_cache_for_backup
+#else
+                              NULL
+#endif
+                              , info->s));
 } /* mi_write_keypage */
 
 
@@ -116,10 +128,16 @@
   mi_sizestore(buff,old_link);
   info->s->state.changed|= STATE_NOT_SORTED_PAGES;
   DBUG_RETURN(key_cache_write(info->s->key_cache,
-                              info->s->kfile, pos , level, buff,
+                              info->s->kfile, pos, level, buff,
 			      sizeof(buff),
 			      (uint) keyinfo->block_length,
-			      (int) (info->lock_type != F_UNLCK)));
+			      (int) (info->lock_type != F_UNLCK),
+#ifdef HAVE_MYISAM_BACKUP_LOGGING
+                              myisam_log_from_key_cache_for_backup
+#else
+                              NULL
+#endif
+                              , info->s));
 } /* _mi_dispose */
 
 

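A toy model of the trade-off described in the mi_write_keypage() comment
above: if the log record is written from a hook called at flush time, a page
touched many times costs one record instead of one per row write. All names
here (toy_cache, toy_cache_write, toy_cache_flush, count_log_record) are
hypothetical. The real key cache takes the hook as an argument of
key_cache_write() rather than storing it in the cache as this sketch does.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_PAGES= 4 };

/* Hypothetical hook type, simplified to (arg, page number). */
typedef void (*toy_post_write_hook)(void *arg, size_t page);

typedef struct
{
  int dirty[TOY_PAGES];
  toy_post_write_hook hook;   /* may be NULL, like the #else branch above */
  void *hook_arg;
} toy_cache;

/* Marks a page dirty; nothing is logged at this point. */
static void toy_cache_write(toy_cache *cache, size_t page)
{
  assert(page < TOY_PAGES);
  cache->dirty[page]= 1;
}

/* Writes out dirty pages and calls the hook once per flushed page. */
static void toy_cache_flush(toy_cache *cache)
{
  size_t page;
  for (page= 0; page < TOY_PAGES; page++)
  {
    if (!cache->dirty[page])
      continue;
    cache->dirty[page]= 0;
    if (cache->hook)
      cache->hook(cache->hook_arg, page);
  }
}

/* Example hook: counts the log records that would be written. */
static void count_log_record(void *arg, size_t page)
{
  (void) page;
  ++*(unsigned *) arg;
}
```

With this model, 1000 row writes spread over two pages followed by one flush
call the hook twice, where logging at write time would have produced 1000
records.
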
--- 1.13/storage/myisam/mi_panic.c	2006-12-31 01:06:40 +01:00
+++ 1.14/storage/myisam/mi_panic.c	2007-05-15 18:09:19 +02:00
@@ -54,6 +54,7 @@
 	  error=my_errno;
       if (info->opt_flag & READ_CACHE_USED)
       {
+        /* QQ Why do we flush a READ_CACHE? it's a no-op */
 	if (flush_io_cache(&info->rec_cache))
 	  error=my_errno;
 	reinit_io_cache(&info->rec_cache,READ_CACHE,0,
@@ -78,15 +79,17 @@
       {					/* Open closed files */
 	char name_buff[FN_REFLEN];
 	if (info->s->kfile < 0)
-	  if ((info->s->kfile= my_open(fn_format(name_buff,info->filename,"",
-					      N_NAME_IEXT,4),info->mode,
-				    MYF(MY_WME))) < 0)
+	  if ((info->s->kfile= my_open(fn_format(name_buff,
+                                                 info->s->unresolv_file_name,
+                                                 "", N_NAME_IEXT, 4),
+                                       info->mode, MYF(MY_WME))) < 0)
 	    error = my_errno;
 	if (info->dfile < 0)
 	{
-	  if ((info->dfile= my_open(fn_format(name_buff,info->filename,"",
-					      N_NAME_DEXT,4),info->mode,
-				    MYF(MY_WME))) < 0)
+	  if ((info->dfile= my_open(fn_format(name_buff,
+                                              info->s->unresolv_file_name,
+                                              "", N_NAME_DEXT, 4),
+                                    info->mode, MYF(MY_WME))) < 0)
 	    error = my_errno;
 	  info->rec_cache.file=info->dfile;
 	}
@@ -103,7 +106,8 @@
   }
   if (flag == HA_PANIC_CLOSE)
   {
-    VOID(mi_log(0));				/* Close log if neaded */
+    /* Close log if needed */
+    VOID(mi_log(MI_LOG_ACTION_CLOSE, MI_LOG_LOGICAL, NULL));
     ft_free_stopwords();
   }
   pthread_mutex_unlock(&THR_LOCK_myisam);

--- 1.9/storage/myisam/mi_rrnd.c	2006-12-31 01:06:40 +01:00
+++ 1.10/storage/myisam/mi_rrnd.c	2007-05-15 18:09:20 +02:00
@@ -31,10 +31,8 @@
 
 int mi_rrnd(MI_INFO *info, byte *buf, register my_off_t filepos)
 {
-  my_bool skip_deleted_blocks;
+  my_bool skip_deleted_blocks= 0;
   DBUG_ENTER("mi_rrnd");
-
-  skip_deleted_blocks=0;
 
   if (filepos == HA_OFFSET_ERROR)
   {

--- 1.21/storage/myisam/mi_static.c	2006-12-31 01:06:40 +01:00
+++ 1.22/storage/myisam/mi_static.c	2007-05-15 18:09:20 +02:00
@@ -13,9 +13,11 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/*
-  Static variables for MyISAM library. All definied here for easy making of
-  a shared library
+/**
+   @file
+   @brief Static variables for MyISAM library
+
+   All defined here for easy making of a shared library
 */
 
 #ifndef _global_h
@@ -27,8 +29,27 @@
 { (uchar) 254, (uchar) 254,'\007', '\001', };
 uchar	NEAR myisam_pack_file_magic[]=
 { (uchar) 254, (uchar) 254,'\010', '\002', };
-my_string myisam_log_filename=(char*) "myisam.log";
-File	myisam_log_file= -1;
+my_string myisam_logical_log_filename= (char*)"myisam.log";
+/**
+   @brief log used by online backup (physical log).
+   Most of its entries are physical, for example "write these bytes at this
+   offset". A mi_write() with lots of BLOBs in many places will thus cause
+   lots of entries in this log. It also contains some logical entries like
+   MI_LOG_DELETE_ALL.
+   As the backup log is used for online backup, which copies a table file
+   dirtily, the backup log needs to be physical: a big data record can
+   easily be corrupted in the copy (the copy may contain a mix of old pieces
+   and new pieces for the same record, if it was an update), which prevents a
+   logical log from working.
+*/
+IO_CACHE myisam_backup_log;
+/**
+   @brief log used for debugging (logical log).
+   Its records are logical, for example "write this record".
+   For example, a mi_write() with lots of BLOBs will cause one entry in this
+   log.
+*/
+IO_CACHE myisam_logical_log;
 uint	myisam_quick_table_bits=9;
 ulong	myisam_block_size= MI_KEY_BLOCK_LENGTH;		/* Best by test */
 my_bool myisam_flush=0, myisam_delay_key_write=0, myisam_single_user=0;
@@ -40,6 +61,15 @@
 my_off_t myisam_max_temp_length= MAX_FILE_SIZE;
 ulong    myisam_bulk_insert_tree_size=8192*1024;
 ulong    myisam_data_pointer_size=4;
+/** @brief human-readable names of commands storable in MyISAM logs */
+const char *mi_log_command_name[]=
+{"open","write","update","delete","close","extra","lock",
+ /* This one does not appear in enum myisam_log_commands so it messes things up */
+#ifdef DISABLED_UNTIL_MONTY_DECIDES
+ "re-open",
+#endif
+ "delete-all", "write-bytes-to-MYD", "write-bytes-to-MYI",
+ "write-bytes-to-MYI-from-key-cache", NullS};
 
 /*
   read_vec[] is used for converting between P_READ_KEY.. and SEARCH_
@@ -59,3 +89,19 @@
   SEARCH_BIGGER, SEARCH_BIGGER, SEARCH_SMALLER, SEARCH_BIGGER, SEARCH_SMALLER,
   SEARCH_BIGGER, SEARCH_SMALLER, SEARCH_SMALLER
 };
+
+/** @brief hash of all tables which we are going to back up online */
+HASH *backup_hash_of_tables;
+/**
+   @brief whether changes to index pages should be logged to the backup log
+   
+   If changes to index pages are not stored into the backup log, the table's
+   copy will need to have its index repaired.
+   
+   @note Changes to the header of the index file of a table in backup mode
+   are always logged because the header is not redundant with the data file.
+   
+   @todo Make it settable (command-line option, SQL clause in BACKUP
+   command...); currently (for the sake of testing) it uses a getenv().
+*/
+my_bool index_pages_in_backup_log;
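
The comment in mi_log_command_name[] notes that "re-open" has no counterpart
in enum myisam_log_commands, which would shift every later name by one. A
possible way to catch such a mismatch at compile time is sketched below with
C11 _Static_assert. The toy_* names and the trailing TOY_LOG_COMMAND_COUNT
enumerator are assumptions made for the illustration; the patch defines no
such count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of enum myisam_log_commands, same order. */
enum toy_log_command
{
  TOY_LOG_OPEN, TOY_LOG_WRITE, TOY_LOG_UPDATE, TOY_LOG_DELETE,
  TOY_LOG_CLOSE, TOY_LOG_EXTRA, TOY_LOG_LOCK, TOY_LOG_DELETE_ALL,
  TOY_LOG_WRITE_BYTES_MYD, TOY_LOG_WRITE_BYTES_MYI,
  TOY_LOG_WRITE_BYTES_MYI_FROM_KEY_CACHE,
  TOY_LOG_COMMAND_COUNT                 /* assumption: not in the patch */
};

/* Names in enum order, NULL-terminated like mi_log_command_name[]. */
static const char *toy_log_command_name[]=
{"open","write","update","delete","close","extra","lock",
 "delete-all", "write-bytes-to-MYD", "write-bytes-to-MYI",
 "write-bytes-to-MYI-from-key-cache", NULL};

/* Fails to compile if a name is added or removed without touching the enum. */
_Static_assert(sizeof(toy_log_command_name) / sizeof(toy_log_command_name[0])
               == TOY_LOG_COMMAND_COUNT + 1,
               "name table out of sync with enum");

/* Bounds-checked lookup for a command code read from a log. */
static const char *toy_command_name(unsigned command)
{
  return command < TOY_LOG_COMMAND_COUNT ?
    toy_log_command_name[command] : "unknown";
}
```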

--- 1.35/storage/myisam/mi_test2.c	2006-12-31 01:06:40 +01:00
+++ 1.36/storage/myisam/mi_test2.c	2007-05-15 18:09:20 +02:00
@@ -214,7 +214,7 @@
 		&create_info,create_flag))
     goto err;
   if (use_log)
-    mi_log(1);
+    mi_log(MI_LOG_ACTION_OPEN, MI_LOG_LOGICAL, NULL);
   if (!(file=mi_open(filename,2,HA_OPEN_ABORT_IF_LOCKED)))
     goto err;
   if (!silent)

--- 1.21/storage/myisam/mi_test3.c	2006-12-31 01:06:40 +01:00
+++ 1.22/storage/myisam/mi_test3.c	2007-05-15 18:09:20 +02:00
@@ -169,7 +169,7 @@
   MI_INFO *file,*file1,*file2=0,*lock;
 
   if (use_log)
-    mi_log(1);
+    mi_log(MI_LOG_ACTION_OPEN, MI_LOG_LOGICAL, NULL);
   if (!(file1=mi_open(filename,O_RDWR,HA_OPEN_WAIT_IF_LOCKED)) ||
       !(file2=mi_open(filename,O_RDWR,HA_OPEN_WAIT_IF_LOCKED)))
   {
@@ -214,7 +214,7 @@
   mi_close(file1);
   mi_close(file2);
   if (use_log)
-    mi_log(0);
+    mi_log(MI_LOG_ACTION_CLOSE, MI_LOG_LOGICAL, NULL);
   if (error)
   {
     printf("%2d: Aborted\n",id); fflush(stdout);

--- 1.27/storage/myisam/mi_update.c	2007-01-03 10:28:53 +01:00
+++ 1.28/storage/myisam/mi_update.c	2007-05-15 18:09:20 +02:00
@@ -170,7 +170,7 @@
 
   info->update= (HA_STATE_CHANGED | HA_STATE_ROW_CHANGED | HA_STATE_AKTIV |
 		 key_changed);
-  myisam_log_record(MI_LOG_UPDATE,info,newrec,info->lastpos,0);
+  myisam_log_record_logical(MI_LOG_UPDATE, info, newrec, info->lastpos, 0);
   /*
     Every myisam function that updates myisam table must end with
     call to _mi_writeinfo(). If operation (second param of
@@ -185,8 +185,9 @@
   allow_break();				/* Allow SIGHUP & SIGINT */
   if (info->invalidator != 0)
   {
-    DBUG_PRINT("info", ("invalidator... '%s' (update)", info->filename));
-    (*info->invalidator)(info->filename);
+    DBUG_PRINT("info", ("invalidator... '%s' (update)",
+                        info->s->unresolv_file_name));
+    (*info->invalidator)(info->s->unresolv_file_name);
     info->invalidator=0;
   }
   DBUG_RETURN(0);
@@ -231,7 +232,8 @@
 		 key_changed);
 
  err_end:
-  myisam_log_record(MI_LOG_UPDATE,info,newrec,info->lastpos,my_errno);
+  myisam_log_record_logical(MI_LOG_UPDATE, info, newrec,
+                            info->lastpos, my_errno);
   VOID(_mi_writeinfo(info,WRITEINFO_UPDATE_KEYFILE));
   allow_break();				/* Allow SIGHUP & SIGINT */
   if (save_errno == HA_ERR_KEY_NOT_FOUND)

--- 1.67/storage/myisam/mi_write.c	2007-01-03 10:28:53 +01:00
+++ 1.68/storage/myisam/mi_write.c	2007-05-15 18:09:20 +02:00
@@ -154,12 +154,13 @@
 		 HA_STATE_ROW_CHANGED);
   info->state->records++;
   info->lastpos=filepos;
-  myisam_log_record(MI_LOG_WRITE,info,record,filepos,0);
+  myisam_log_record_logical(MI_LOG_WRITE, info, record, filepos, 0);
   VOID(_mi_writeinfo(info, WRITEINFO_UPDATE_KEYFILE));
   if (info->invalidator != 0)
   {
-    DBUG_PRINT("info", ("invalidator... '%s' (update)", info->filename));
-    (*info->invalidator)(info->filename);
+    DBUG_PRINT("info", ("invalidator... '%s' (update)",
+                        info->s->unresolv_file_name));
+    (*info->invalidator)(info->s->unresolv_file_name);
     info->invalidator=0;
   }
 
@@ -231,7 +232,7 @@
   my_errno=save_errno;
 err2:
   save_errno=my_errno;
-  myisam_log_record(MI_LOG_WRITE,info,record,filepos,my_errno);
+  myisam_log_record_logical(MI_LOG_WRITE, info, record, filepos, my_errno);
   VOID(_mi_writeinfo(info,WRITEINFO_UPDATE_KEYFILE));
   allow_break();			/* Allow SIGHUP & SIGINT */
   DBUG_RETURN(my_errno=save_errno);

--- 1.97/storage/myisam/myisamdef.h	2007-01-03 10:28:53 +01:00
+++ 1.98/storage/myisam/myisamdef.h	2007-05-15 18:09:20 +02:00
@@ -13,7 +13,10 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/* This file is included by all internal myisam files */
+/**
+   @file
+   @brief This file is included by all internal myisam files
+*/
 
 #include "myisam.h"			/* Structs & some defines */
 #include "myisampack.h"			/* packing of keys */
@@ -21,6 +24,7 @@
 #ifdef THREAD
 #include <my_pthread.h>
 #include <thr_lock.h>
+#include <my_atomic.h>
 #else
 #include <my_no_pthread.h>
 #endif
@@ -154,6 +158,7 @@
 
 #define MAX_NONMAPPED_INSERTS 1000      
 
+/** @brief Information shared by all open instances of the same table */
 typedef struct st_mi_isam_share {	/* Shared between opens */
   MI_STATE_INFO state;
   MI_BASE_INFO base;
@@ -219,6 +224,15 @@
   uint     nonmmaped_inserts;           /* counter of writing in non-mmaped
                                            area */
   rw_lock_t mmap_lock;
+  /**
+     @brief Whether this table is doing backup logging (1) or not (0).
+     Read and set under MYISAM_SHARE::backup_logging_rwlock
+  */
+  volatile int32 backup_logging;
+  /** @brief for protecting MYISAM_SHARE::backup_logging */
+  my_atomic_rwlock_t backup_logging_rwlock;
+  /** @brief file name before resolving any symlink or expanding directory */
+  char *unresolv_file_name;
 } MYISAM_SHARE;
 
 
@@ -231,6 +245,7 @@
   uint error;
 } MI_BIT_BUFF;
 
+/** @brief Information local to the table's instance */
 struct st_myisam_info {
   MYISAM_SHARE *s;			/* Shared between open:s */
   MI_STATUS_INFO *state,save_state;
@@ -241,7 +256,6 @@
   DYNAMIC_ARRAY *ft1_to_ft2;            /* used only in ft1->ft2 conversion */
   MEM_ROOT      ft_memroot;             /* used by the parser               */
   MYSQL_FTPARSER_PARAM *ftparser_param; /* share info between init/deinit   */
-  char *filename;			/* parameter to open filename       */
   uchar *buff,				/* Temp area for key                */
 	*lastkey,*lastkey2;		/* Last used search key             */
   uchar *first_mbr_key;			/* Searhed spatial key              */
@@ -295,6 +309,13 @@
 #ifdef __WIN__
  my_bool owned_by_merge;                       /* This MyISAM table is part of a merge union */
 #endif
+  /**
+     @brief whether MI_INFO has already stored MI_LOG_OPEN in backup log.
+     Set to TRUE only by writer thread under THR_LOCK_myisam_log atomically
+     with logging the MI_LOG_OPEN; set to FALSE only by backup thread after
+     closing the log.
+  */
+  my_bool MI_LOG_OPEN_stored_in_backup_log;
 #ifdef THREAD
   THR_LOCK_DATA lock;
 #endif
@@ -464,7 +485,7 @@
 #define mi_unique_store(A,B)    mi_int4store((A),(B))
 
 #ifdef THREAD
-extern pthread_mutex_t THR_LOCK_myisam;
+extern pthread_mutex_t THR_LOCK_myisam, THR_LOCK_myisam_log;
 #endif
 #if !defined(THREAD) || defined(DONT_USE_RW_LOCKS)
 #define rw_wrlock(A) {}
@@ -478,8 +499,10 @@
 extern uchar NEAR myisam_file_magic[],NEAR myisam_pack_file_magic[];
 extern uint NEAR myisam_read_vec[],NEAR myisam_readnext_vec[];
 extern uint myisam_quick_table_bits;
-extern File myisam_log_file;
+extern IO_CACHE myisam_logical_log, myisam_backup_log;
 extern ulong myisam_pid;
+extern HASH *backup_hash_of_tables;
+extern my_bool index_pages_in_backup_log;
 
 	/* This is used by _mi_calc_xxx_key_length och _mi_store_key */
 
@@ -679,13 +702,25 @@
 #define SORT_BUFFER_INIT	(2048L*1024L-MALLOC_OVERHEAD)
 #define MIN_SORT_BUFFER		(4096-MALLOC_OVERHEAD)
 
+/** @brief Commands storable in MyISAM logs (L/B= in logical/backup log) */
 enum myisam_log_commands {
-  MI_LOG_OPEN,MI_LOG_WRITE,MI_LOG_UPDATE,MI_LOG_DELETE,MI_LOG_CLOSE,MI_LOG_EXTRA,MI_LOG_LOCK,MI_LOG_DELETE_ALL
+  MI_LOG_OPEN, /**< when mi_open() LB */
+  MI_LOG_WRITE, /**< when mi_write() L */
+  MI_LOG_UPDATE, /**< when mi_update() L */
+  MI_LOG_DELETE, /**< when mi_delete() L */
+  MI_LOG_CLOSE, /**< when mi_close() LB */
+  MI_LOG_EXTRA, /**< when mi_extra() L */
+  MI_LOG_LOCK, /**< when mi_lock_database() L */
+  MI_LOG_DELETE_ALL, /**< when mi_delete_all() LB */
+  MI_LOG_WRITE_BYTES_MYD, /**< when MyISAM writes to the data file B */
+  MI_LOG_WRITE_BYTES_MYI, /**< when MyISAM writes to the index file B */
+  MI_LOG_WRITE_BYTES_MYI_FROM_KEY_CACHE /**< when key cache flushes a page B */
 };
-
-#define myisam_log(a,b,c,d) if (myisam_log_file >= 0) _myisam_log(a,b,c,d)
-#define myisam_log_command(a,b,c,d,e) if (myisam_log_file >= 0) _myisam_log_command(a,b,c,d,e)
-#define myisam_log_record(a,b,c,d,e) if (myisam_log_file >= 0) _myisam_log_record(a,b,c,d,e)
+extern const char *mi_log_command_name[];
+/** @brief logs a command if this log is open */
+#define myisam_log_command_logical(a,b,c,d,e) if (my_b_inited(&myisam_logical_log)) _myisam_log_command(&myisam_logical_log,a,b,c,d,e,NULL)
+/** @brief logs a command involving a record if this log is open */
+#define myisam_log_record_logical(a,b,c,d,e) if (my_b_inited(&myisam_logical_log)) _myisam_log_record_logical(a,b,c,d,e)
 
 #define fast_mi_writeinfo(INFO) if (!(INFO)->s->tot_locks) (void) _mi_writeinfo((INFO),0)
 #define fast_mi_readinfo(INFO) ((INFO)->lock_type == F_UNLCK) && _mi_readinfo((INFO),F_RDLCK,1)
@@ -700,14 +735,56 @@
                                     MI_BLOCK_INFO *info, byte **rec_buff_p,
                                     File file, my_off_t filepos);
 extern void _my_store_blob_length(byte *pos,uint pack_length,uint length);
-extern void _myisam_log(enum myisam_log_commands command,MI_INFO *info,
-		       const byte *buffert,uint length);
-extern void _myisam_log_command(enum myisam_log_commands command,
-			       MI_INFO *info, const byte *buffert,
-			       uint length, int result);
-extern void _myisam_log_record(enum myisam_log_commands command,MI_INFO *info,
-			      const byte *record,my_off_t filepos,
-			      int result);
+extern void _myisam_log(IO_CACHE *log, enum myisam_log_commands command,
+                        MI_INFO *info, const byte *buffert, uint length);
+extern void _myisam_log_command(IO_CACHE *log,
+                                enum myisam_log_commands command,
+                                MI_INFO *info, const byte *buffert,
+                                uint length, int result,
+                                my_bool *MI_LOG_OPEN_stored_in_log);
+extern void _myisam_log_record_logical(enum myisam_log_commands command,
+                                       MI_INFO *info, const byte *record,
+                                       my_off_t filepos, int result);
+extern void myisam_log_pwrite_for_backup(enum myisam_log_commands command,
+                                         MI_INFO *info, const byte *buffert,
+                                         uint length, my_off_t filepos);
+extern int
+myisam_log_flushed_write_cache_for_backup(IO_CACHE *cache_to_table,
+                                          const byte *buffert,
+                                          uint length, my_off_t offset);
+extern void
+myisam_log_from_key_cache_for_backup(void *arg,
+                                     int filedes, const byte *buffert,
+                                     uint length, my_off_t offset);
+
+/**
+   @details
+   Online backup logging for MyISAM is always compiled in.
+   But if one wants to benchmark the overhead it adds, when no backup is
+   running, one can undefine the symbol below.
+*/
+#define HAVE_MYISAM_BACKUP_LOGGING 1
+#ifdef HAVE_MYISAM_BACKUP_LOGGING
+static inline int32 mi_get_backup_logging_state(MYISAM_SHARE *s)
+{
+  int32 ret;
+  my_atomic_rwlock_rdlock(&s->backup_logging_rwlock);
+  ret= my_atomic_load32(&s->backup_logging);
+  my_atomic_rwlock_rdunlock(&s->backup_logging_rwlock);
+  return ret;
+}
+static inline void
+mi_set_backup_logging_state(MYISAM_SHARE *s, int32 new_state)
+{
+  my_atomic_rwlock_wrlock(&s->backup_logging_rwlock);
+  my_atomic_store32(&s->backup_logging, new_state);
+  my_atomic_rwlock_wrunlock(&s->backup_logging_rwlock);
+}
+#else
+#define mi_get_backup_logging_state(A) 0
+#define mi_set_backup_logging_state(A,B)
+#endif
+
 extern void mi_report_error(int errcode, const char *file_name);
 extern my_bool _mi_memmap_file(MI_INFO *info);
 extern void _mi_unmap_file(MI_INFO *info);
@@ -723,8 +800,10 @@
 extern uint mi_nommap_pwrite(MI_INFO *info, byte *Buffer,
                              uint Count, my_off_t offset, myf MyFlags);
 
-uint mi_state_info_write(File file, MI_STATE_INFO *state, uint pWrite);
+uint mi_state_info_write(MI_INFO *info, File file,
+                         MI_STATE_INFO *state, uint pWrite);
 uchar *mi_state_info_read(uchar *ptr, MI_STATE_INFO *state);
+int mi_remap_file_and_write_state_for_unlock(MI_INFO *info);
 uint mi_state_info_read_dsk(File file, MI_STATE_INFO *state, my_bool pRead);
 uint mi_base_info_write(File file, MI_BASE_INFO *base);
 uchar *my_n_base_info_read(uchar *ptr, MI_BASE_INFO *base);
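
For readers unfamiliar with my_atomic, the get/set pair added in this header
can be pictured with standard C11 atomics. This is a swapped-in technique for
illustration only: the patch uses my_atomic_load32()/my_atomic_store32()
guarded by a my_atomic_rwlock_t, not <stdatomic.h>, and toy_share and the
toy_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical cut-down share: only the backup logging flag. */
typedef struct
{
  _Atomic int32_t backup_logging;   /* 1 = table logs to the backup log */
} toy_share;

/* Reader side: called on every write path, so it must stay cheap. */
static inline int32_t toy_get_backup_logging_state(toy_share *s)
{
  return atomic_load(&s->backup_logging);
}

/* Writer side: called when backup logging is turned on or off. */
static inline void toy_set_backup_logging_state(toy_share *s, int32_t state)
{
  atomic_store(&s->backup_logging, state);
}
```

As in the patch, writers to the table only ever read the flag, and a single
controlling thread flips it, so a plain atomic load and store are enough for
this sketch.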

--- 1.36/storage/myisam/myisamlog.c	2006-12-31 01:06:40 +01:00
+++ 1.37/storage/myisam/myisamlog.c	2007-05-15 18:09:20 +02:00
@@ -13,7 +13,15 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/* write whats in isam.log */
+/**
+   @file
+   @brief Utility to display and apply a MyISAM logical or physical log to tables
+
+   Prints what is in a MyISAM (logical or physical/backup) log, optionally
+   applies the changes to tables (all tables or only a set specified on the
+   command line). Works standalone (tables must not be modified by the
+   server during this).
+*/
 
 #ifndef USE_MY_FUNC
 #define USE_MY_FUNC
@@ -26,58 +34,15 @@
 #include <sys/resource.h>
 #endif
 
-#define FILENAME(A) (A ? A->show_name : "Unknown")
-
-struct file_info {
-  long process;
-  int  filenr,id;
-  uint rnd;
-  my_string name,show_name,record;
-  MI_INFO *isam;
-  bool closed,used;
-  ulong accessed;
-};
-
-struct test_if_open_param {
-  my_string name;
-  int max_id;
-};
-
-struct st_access_param
-{
-  ulong min_accessed;
-  struct file_info *found;
-};
-
 #define NO_FILEPOS (ulong) ~0L
 
-extern int main(int argc,char * *argv);
 static void get_options(int *argc,char ***argv);
-static int examine_log(my_string file_name,char **table_names);
-static int read_string(IO_CACHE *file,gptr *to,uint length);
-static int file_info_compare(void *cmp_arg, void *a,void *b);
-static int test_if_open(struct file_info *key,element_count count,
-			struct test_if_open_param *param);
-static void fix_blob_pointers(MI_INFO *isam,byte *record);
-static int test_when_accessed(struct file_info *key,element_count count,
-			      struct st_access_param *access_param);
-static void file_info_free(struct file_info *info);
-static int close_some_file(TREE *tree);
-static int reopen_closed_file(TREE *tree,struct file_info *file_info);
-static int find_record_with_key(struct file_info *file_info,byte *record);
-static void printf_log(const char *str,...);
-static bool cmp_filename(struct file_info *file_info,my_string name);
-
-static uint verbose=0,update=0,test_info=0,max_files=0,re_open_count=0,
-  recover=0,prefix_remove=0,opt_processes=0;
-static my_string log_filename=0,filepath=0,write_filename=0,record_pos_file=0;
-static ulong com_count[10][3],number_of_commands=(ulong) ~0L,
-	     isamlog_process;
-static my_off_t isamlog_filepos,start_offset=0,record_pos= HA_OFFSET_ERROR;
-static const char *command_name[]=
-{"open","write","update","delete","close","extra","lock","re-open",
- "delete-all", NullS};
+static my_bool matches_list_of_tables(const char *isam_file_name);
+
+static MI_EXAMINE_LOG_PARAM mi_exl;
+static char **table_names;
 
+static uint test_info=0;
 
 int main(int argc, char **argv)
 {
@@ -85,40 +50,59 @@
   ulong total_count,total_error,total_recover;
   MY_INIT(argv[0]);
 
-  log_filename=myisam_log_filename;
+  mi_examine_log_param_init(&mi_exl);
+  mi_exl.log_filename= myisam_logical_log_filename; /* the default */
   get_options(&argc,&argv);
+  if (argv[0]) /* some table names passed on command line */
+  {
+    table_names= argv;
+    mi_exl.table_selection_hook= matches_list_of_tables;
+  }
+
   /* Number of MyISAM files we can have open at one time */
-  max_files= (my_set_max_open_files(min(max_files,8))-6)/2;
-  if (update)
+  mi_exl.max_files= (my_set_max_open_files(max(mi_exl.max_files,8))-6)/2;
+
+  /*
+    Program must work in all conditions: support symbolic links.
+    It should not be a security risk.
+  */
+#ifdef USE_SYMDIR
+  my_use_symdir= 1;
+#endif
+
+  if (mi_exl.update)
     printf("Trying to %s MyISAM files according to log '%s'\n",
-	   (recover ? "recover" : "update"),log_filename);
-  error= examine_log(log_filename,argv);
-  if (update && ! error)
+	   (mi_exl.recover ? "recover" : "update"),mi_exl.log_filename);
+
+  error= mi_examine_log(&mi_exl);
+
+  if (mi_exl.update && ! error)
     puts("Tables updated successfully");
   total_count=total_error=total_recover=0;
-  for (i=first=0 ; command_name[i] ; i++)
+  for (i=first=0 ; mi_log_command_name[i] ; i++)
   {
-    if (com_count[i][0])
+    if (mi_exl.com_count[i][0])
     {
       if (!first++)
       {
-	if (verbose || update)
+	if (mi_exl.verbose || mi_exl.update)
 	  puts("");
-	puts("Commands   Used count    Errors   Recover errors");
+	puts("Commands                         Used count    Errors"
+             " Recover errors");
       }
-      printf("%-12s%9ld%10ld%17ld\n",command_name[i],com_count[i][0],
-	     com_count[i][1],com_count[i][2]);
-      total_count+=com_count[i][0];
-      total_error+=com_count[i][1];
-      total_recover+=com_count[i][2];
+      printf("%-34s%9ld%10ld%15ld\n",mi_log_command_name[i],mi_exl.com_count[i][0],
+	     mi_exl.com_count[i][1],mi_exl.com_count[i][2]);
+      total_count+=mi_exl.com_count[i][0];
+      total_error+=mi_exl.com_count[i][1];
+      total_recover+=mi_exl.com_count[i][2];
     }
   }
   if (total_count)
     printf("%-12s%9ld%10ld%17ld\n","Total",total_count,total_error,
 	   total_recover);
-  if (re_open_count)
+  if (mi_exl.re_open_count)
     printf("Had to do %d re-open because of too few possibly open files\n",
-	   re_open_count);
+	   mi_exl.re_open_count);
   VOID(mi_panic(HA_PANIC_CLOSE));
   my_free_open_file_info();
   my_end(test_info ? MY_CHECK_ERROR | MY_GIVE_INFO : MY_CHECK_ERROR);
@@ -154,11 +138,11 @@
 	  else
 	    pos= *(++*argv);
 	}
-	number_of_commands=(ulong) atol(pos);
+	mi_exl.number_of_commands= (ulong) atol(pos);
 	pos=" ";
 	break;
       case 'u':
-	update=1;
+	mi_exl.update=1;
 	break;
       case 'f':
 	if (! *++pos)
@@ -168,7 +152,7 @@
 	  else
 	    pos= *(++*argv);
 	}
-	max_files=(uint) atoi(pos);
+	mi_exl.max_files=(uint) atoi(pos);
 	pos=" ";
 	break;
       case 'i':
@@ -182,7 +166,7 @@
 	  else
 	    pos= *(++*argv);
 	}
-	start_offset=(my_off_t) strtoll(pos,NULL,10);
+	mi_exl.start_offset=(my_off_t) strtoll(pos,NULL,10);
 	pos=" ";
 	break;
       case 'p':
@@ -193,14 +177,14 @@
 	  else
 	    pos= *(++*argv);
 	}
-	prefix_remove=atoi(pos);
+	mi_exl.prefix_remove=atoi(pos);
 	break;
       case 'r':
-	update=1;
-	recover++;
+	mi_exl.update=1;
+	mi_exl.recover++;
 	break;
       case 'P':
-	opt_processes=1;
+	mi_exl.opt_processes=1;
 	break;
       case 'R':
 	if (! *++pos)
@@ -210,14 +194,14 @@
 	  else
 	    pos= *(++*argv);
 	}
-	record_pos_file=(char*) pos;
+	mi_exl.record_pos_file=(char*) pos;
 	if (!--*argc)
 	  goto err;
-	record_pos=(my_off_t) strtoll(*(++*argv),NULL,10);
+	mi_exl.record_pos=(my_off_t) strtoll(*(++*argv),NULL,10);
 	pos=" ";
 	break;
       case 'v':
-	verbose++;
+	mi_exl.verbose++;
 	break;
       case 'w':
 	if (! *++pos)
@@ -227,7 +211,7 @@
 	  else
 	    pos= *(++*argv);
 	}
-	write_filename=(char*) pos;
+	mi_exl.write_filename=(char*) pos;
 	pos=" ";
 	break;
       case 'F':
@@ -238,7 +222,7 @@
 	  else
 	    pos= *(++*argv);
 	}
-	filepath= (char*) pos;
+	mi_exl.filepath= (char*) pos;
 	pos=" ";
 	break;
       case 'V':
@@ -253,7 +237,7 @@
 	if (version)
 	  break;
 	puts("Write info about whats in a MyISAM log file.");
-	printf("If no file name is given %s is used\n",log_filename);
+	printf("If no file name is given %s is used\n",mi_exl.log_filename);
 	puts("");
 	printf(usage,my_progname);
 	puts("");
@@ -284,7 +268,7 @@
   }
   if (*argc >= 1)
   {
-    log_filename=(char*) pos;
+    mi_exl.log_filename=(char*) pos;
     (*argc)--;
     (*argv)++;
   }
@@ -296,549 +280,17 @@
 }
 
 
-static int examine_log(my_string file_name, char **table_names)
-{
-  uint command,result,files_open;
-  ulong access_time,length;
-  my_off_t filepos;
-  int lock_command,mi_result;
-  char isam_file_name[FN_REFLEN],llbuff[21],llbuff2[21];
-  uchar head[20];
-  gptr	buff;
-  struct test_if_open_param open_param;
-  IO_CACHE cache;
-  File file;
-  FILE *write_file;
-  enum ha_extra_function extra_command;
-  TREE tree;
-  struct file_info file_info,*curr_file_info;
-  DBUG_ENTER("examine_log");
-
-  if ((file=my_open(file_name,O_RDONLY,MYF(MY_WME))) < 0)
-    DBUG_RETURN(1);
-  write_file=0;
-  if (write_filename)
-  {
-    if (!(write_file=my_fopen(write_filename,O_WRONLY,MYF(MY_WME))))
-    {
-      my_close(file,MYF(0));
-      DBUG_RETURN(1);
-    }
-  }
-
-  init_io_cache(&cache,file,0,READ_CACHE,start_offset,0,MYF(0));
-  bzero((gptr) com_count,sizeof(com_count));
-  init_tree(&tree,0,0,sizeof(file_info),(qsort_cmp2) file_info_compare,1,
-	    (tree_element_free) file_info_free, NULL);
-  VOID(init_key_cache(dflt_key_cache,KEY_CACHE_BLOCK_SIZE,KEY_CACHE_SIZE,
-                      0, 0));
-
-  files_open=0; access_time=0;
-  while (access_time++ != number_of_commands &&
-	 !my_b_read(&cache,(byte*) head,9))
-  {
-    isamlog_filepos=my_b_tell(&cache)-9L;
-    file_info.filenr= mi_uint2korr(head+1);
-    isamlog_process=file_info.process=(long) mi_uint4korr(head+3);
-    if (!opt_processes)
-      file_info.process=0;
-    result= mi_uint2korr(head+7);
-    if ((curr_file_info=(struct file_info*) tree_search(&tree, &file_info,
-							tree.custom_arg)))
-    {
-      curr_file_info->accessed=access_time;
-      if (update && curr_file_info->used && curr_file_info->closed)
-      {
-	if (reopen_closed_file(&tree,curr_file_info))
-	{
-	  command=sizeof(com_count)/sizeof(com_count[0][0])/3;
-	  result=0;
-	  goto com_err;
-	}
-      }
-    }
-    command=(uint) head[0];
-    if (command < sizeof(com_count)/sizeof(com_count[0][0])/3 &&
-	(!table_names[0] || (curr_file_info && curr_file_info->used)))
-    {
-      com_count[command][0]++;
-      if (result)
-	com_count[command][1]++;
-    }
-    switch ((enum myisam_log_commands) command) {
-    case MI_LOG_OPEN:
-      if (!table_names[0])
-      {
-	com_count[command][0]--;		/* Must be counted explicite */
-	if (result)
-	  com_count[command][1]--;
-      }
-
-      if (curr_file_info)
-	printf("\nWarning: %s is opened with same process and filenumber\nMaybe you should use the -P option ?\n",
-	       curr_file_info->show_name);
-      if (my_b_read(&cache,(byte*) head,2))
-	goto err;
-      file_info.name=0;
-      file_info.show_name=0;
-      file_info.record=0;
-      if (read_string(&cache,(gptr*) &file_info.name,
-		      (uint) mi_uint2korr(head)))
-	goto err;
-      {
-	uint i;
-	char *pos,*to;
-
-	/* Fix if old DOS files to new format */
-	for (pos=file_info.name; (pos=strchr(pos,'\\')) ; pos++)
-	  *pos= '/';
-
-	pos=file_info.name;
-	for (i=0 ; i < prefix_remove ; i++)
-	{
-	  char *next;
-	  if (!(next=strchr(pos,'/')))
-	    break;
-	  pos=next+1;
-	}
-	to=isam_file_name;
-	if (filepath)
-	  to=convert_dirname(isam_file_name,filepath,NullS);
-	strmov(to,pos);
-	fn_ext(isam_file_name)[0]=0;	/* Remove extension */
-      }
-      open_param.name=file_info.name;
-      open_param.max_id=0;
-      VOID(tree_walk(&tree,(tree_walk_action) test_if_open,(void*) &open_param,
-		     left_root_right));
-      file_info.id=open_param.max_id+1;
-      /*
-       * In the line below +10 is added to accomodate '<' and '>' chars
-       * plus '\0' at the end, so that there is place for 7 digits.
-       * It is  improbable that same table can have that many entries in 
-       * the table cache.
-       * The additional space is needed for the sprintf commands two lines
-       * below.
-       */ 
-      file_info.show_name=my_memdup(isam_file_name,
-				    (uint) strlen(isam_file_name)+10,
-				    MYF(MY_WME));
-      if (file_info.id > 1)
-	sprintf(strend(file_info.show_name),"<%d>",file_info.id);
-      file_info.closed=1;
-      file_info.accessed=access_time;
-      file_info.used=1;
-      if (table_names[0])
-      {
-	char **name;
-	file_info.used=0;
-	for (name=table_names ; *name ; name++)
-	{
-	  if (!strcmp(*name,isam_file_name))
-	    file_info.used=1;			/* Update/log only this */
-	}
-      }
-      if (update && file_info.used)
-      {
-	if (files_open >= max_files)
-	{
-	  if (close_some_file(&tree))
-	    goto com_err;
-	  files_open--;
-	}
-	if (!(file_info.isam= mi_open(isam_file_name,O_RDWR,
-				      HA_OPEN_WAIT_IF_LOCKED)))
-	  goto com_err;
-	if (!(file_info.record=my_malloc(file_info.isam->s->base.reclength,
-					 MYF(MY_WME))))
-	  goto end;
-	files_open++;
-	file_info.closed=0;
-      }
-      VOID(tree_insert(&tree, (gptr) &file_info, 0, tree.custom_arg));
-      if (file_info.used)
-      {
-	if (verbose && !record_pos_file)
-	  printf_log("%s: open -> %d",file_info.show_name, file_info.filenr);
-	com_count[command][0]++;
-	if (result)
-	  com_count[command][1]++;
-      }
-      break;
-    case MI_LOG_CLOSE:
-      if (verbose && !record_pos_file &&
-	  (!table_names[0] || (curr_file_info && curr_file_info->used)))
-	printf_log("%s: %s -> %d",FILENAME(curr_file_info),
-	       command_name[command],result);
-      if (curr_file_info)
-      {
-	if (!curr_file_info->closed)
-	  files_open--;
-        VOID(tree_delete(&tree, (gptr) curr_file_info, 0, tree.custom_arg));
-      }
-      break;
-    case MI_LOG_EXTRA:
-      if (my_b_read(&cache,(byte*) head,1))
-	goto err;
-      extra_command=(enum ha_extra_function) head[0];
-      if (verbose && !record_pos_file &&
-	  (!table_names[0] || (curr_file_info && curr_file_info->used)))
-	printf_log("%s: %s(%d) -> %d",FILENAME(curr_file_info),
-		   command_name[command], (int) extra_command,result);
-      if (update && curr_file_info && !curr_file_info->closed)
-      {
-	if (mi_extra(curr_file_info->isam, extra_command, 0) != (int) result)
-	{
-	  fflush(stdout);
-	  VOID(fprintf(stderr,
-		       "Warning: error %d, expected %d on command %s at %s\n",
-		       my_errno,result,command_name[command],
-		       llstr(isamlog_filepos,llbuff)));
-	  fflush(stderr);
-	}
-      }
-      break;
-    case MI_LOG_DELETE:
-      if (my_b_read(&cache,(byte*) head,8))
-	goto err;
-      filepos=mi_sizekorr(head);
-      if (verbose && (!record_pos_file ||
-		      ((record_pos == filepos || record_pos == NO_FILEPOS) &&
-		       !cmp_filename(curr_file_info,record_pos_file))) &&
-	  (!table_names[0] || (curr_file_info && curr_file_info->used)))
-	printf_log("%s: %s at %ld -> %d",FILENAME(curr_file_info),
-		   command_name[command],(long) filepos,result);
-      if (update && curr_file_info && !curr_file_info->closed)
-      {
-	if (mi_rrnd(curr_file_info->isam,curr_file_info->record,filepos))
-	{
-	  if (!recover)
-	    goto com_err;
-	  if (verbose)
-	    printf_log("error: Didn't find row to delete with mi_rrnd");
-	  com_count[command][2]++;		/* Mark error */
-	}
-	mi_result=mi_delete(curr_file_info->isam,curr_file_info->record);
-	if ((mi_result == 0 && result) ||
-	    (mi_result && (uint) my_errno != result))
-	{
-	  if (!recover)
-	    goto com_err;
-	  if (mi_result)
-	    com_count[command][2]++;		/* Mark error */
-	  if (verbose)
-	    printf_log("error: Got result %d from mi_delete instead of %d",
-		       mi_result, result);
-	}
-      }
-      break;
-    case MI_LOG_WRITE:
-    case MI_LOG_UPDATE:
-      if (my_b_read(&cache,(byte*) head,12))
-	goto err;
-      filepos=mi_sizekorr(head);
-      length=mi_uint4korr(head+8);
-      buff=0;
-      if (read_string(&cache,&buff,(uint) length))
-	goto err;
-      if ((!record_pos_file ||
-	  ((record_pos == filepos || record_pos == NO_FILEPOS) &&
-	   !cmp_filename(curr_file_info,record_pos_file))) &&
-	  (!table_names[0] || (curr_file_info && curr_file_info->used)))
-      {
-	if (write_file &&
-	    (my_fwrite(write_file,buff,length,MYF(MY_WAIT_IF_FULL | MY_NABP))))
-	  goto end;
-	if (verbose)
-	  printf_log("%s: %s at %ld, length=%ld -> %d",
-		     FILENAME(curr_file_info),
-		     command_name[command], filepos,length,result);
-      }
-      if (update && curr_file_info && !curr_file_info->closed)
-      {
-	if (curr_file_info->isam->s->base.blobs)
-	  fix_blob_pointers(curr_file_info->isam,buff);
-	if ((enum myisam_log_commands) command == MI_LOG_UPDATE)
-	{
-	  if (mi_rrnd(curr_file_info->isam,curr_file_info->record,filepos))
-	  {
-	    if (!recover)
-	    {
-	      result=0;
-	      goto com_err;
-	    }
-	    if (verbose)
-	      printf_log("error: Didn't find row to update with mi_rrnd");
-	    if (recover == 1 || result ||
-		find_record_with_key(curr_file_info,buff))
-	    {
-	      com_count[command][2]++;		/* Mark error */
-	      break;
-	    }
-	  }
-	  mi_result=mi_update(curr_file_info->isam,curr_file_info->record,
-			      buff);
-	  if ((mi_result == 0 && result) ||
-	      (mi_result && (uint) my_errno != result))
-	  {
-	    if (!recover)
-	      goto com_err;
-	    if (verbose)
-	      printf_log("error: Got result %d from mi_update instead of %d",
-			 mi_result, result);
-	    if (mi_result)
-	      com_count[command][2]++;		/* Mark error */
-	  }
-	}
-	else
-	{
-	  mi_result=mi_write(curr_file_info->isam,buff);
-	  if ((mi_result == 0 && result) ||
-	      (mi_result && (uint) my_errno != result))
-	  {
-	    if (!recover)
-	      goto com_err;
-	    if (verbose)
-	      printf_log("error: Got result %d from mi_write instead of %d",
-			 mi_result, result);
-	    if (mi_result)
-	      com_count[command][2]++;		/* Mark error */
-	  }
-	  if (!recover && filepos != curr_file_info->isam->lastpos)
-	  {
-	    printf("error: Wrote at position: %s, should have been %s",
-		   llstr(curr_file_info->isam->lastpos,llbuff),
-		   llstr(filepos,llbuff2));
-	    goto end;
-	  }
-	}
-      }
-      my_free(buff,MYF(0));
-      break;
-    case MI_LOG_LOCK:
-      if (my_b_read(&cache,(byte*) head,sizeof(lock_command)))
-	goto err;
-      memcpy_fixed(&lock_command,head,sizeof(lock_command));
-      if (verbose && !record_pos_file &&
-	  (!table_names[0] || (curr_file_info && curr_file_info->used)))
-	printf_log("%s: %s(%d) -> %d\n",FILENAME(curr_file_info),
-		   command_name[command],lock_command,result);
-      if (update && curr_file_info && !curr_file_info->closed)
-      {
-	if (mi_lock_database(curr_file_info->isam,lock_command) !=
-	    (int) result)
-	  goto com_err;
-      }
-      break;
-    case MI_LOG_DELETE_ALL:
-      if (verbose && !record_pos_file &&
-	  (!table_names[0] || (curr_file_info && curr_file_info->used)))
-	printf_log("%s: %s -> %d\n",FILENAME(curr_file_info),
-		   command_name[command],result);
-      break;
-    default:
-      fflush(stdout);
-      VOID(fprintf(stderr,
-		   "Error: found unknown command %d in logfile, aborted\n",
-		   command));
-      fflush(stderr);
-      goto end;
-    }
-  }
-  end_key_cache(dflt_key_cache,1);
-  delete_tree(&tree);
-  VOID(end_io_cache(&cache));
-  VOID(my_close(file,MYF(0)));
-  if (write_file && my_fclose(write_file,MYF(MY_WME)))
-    DBUG_RETURN(1);
-  DBUG_RETURN(0);
-
- err:
-  fflush(stdout);
-  VOID(fprintf(stderr,"Got error %d when reading from logfile\n",my_errno));
-  fflush(stderr);
-  goto end;
- com_err:
-  fflush(stdout);
-  VOID(fprintf(stderr,"Got error %d, expected %d on command %s at %s\n",
-	       my_errno,result,command_name[command],
-	       llstr(isamlog_filepos,llbuff)));
-  fflush(stderr);
- end:
-  end_key_cache(dflt_key_cache, 1);
-  delete_tree(&tree);
-  VOID(end_io_cache(&cache));
-  VOID(my_close(file,MYF(0)));
-  if (write_file)
-    VOID(my_fclose(write_file,MYF(MY_WME)));
-  DBUG_RETURN(1);
-}
-
-
-static int read_string(IO_CACHE *file, register gptr *to, register uint length)
-{
-  DBUG_ENTER("read_string");
-
-  if (*to)
-    my_free((gptr) *to,MYF(0));
-  if (!(*to= (gptr) my_malloc(length+1,MYF(MY_WME))) ||
-      my_b_read(file,(byte*) *to,length))
-  {
-    if (*to)
-      my_free(*to,MYF(0));
-    *to= 0;
-    DBUG_RETURN(1);
-  }
-  *((char*) *to+length)= '\0';
-  DBUG_RETURN (0);
-}				/* read_string */
-
-
-static int file_info_compare(void* cmp_arg __attribute__((unused)),
-			     void *a, void *b)
-{
-  long lint;
-
-  if ((lint=((struct file_info*) a)->process -
-       ((struct file_info*) b)->process))
-    return lint < 0L ? -1 : 1;
-  return ((struct file_info*) a)->filenr - ((struct file_info*) b)->filenr;
-}
-
-	/* ARGSUSED */
-
-static int test_if_open (struct file_info *key,
-			 element_count count __attribute__((unused)),
-			 struct test_if_open_param *param)
-{
-  if (!strcmp(key->name,param->name) && key->id > param->max_id)
-    param->max_id=key->id;
-  return 0;
-}
-
-
-static void fix_blob_pointers(MI_INFO *info, byte *record)
-{
-  byte *pos;
-  MI_BLOB *blob,*end;
-
-  pos=record+info->s->base.reclength;
-  for (end=info->blobs+info->s->base.blobs, blob= info->blobs;
-       blob != end ;
-       blob++)
-  {
-    memcpy_fixed(record+blob->offset+blob->pack_length,&pos,sizeof(char*));
-    pos+=_mi_calc_blob_length(blob->pack_length,record+blob->offset);
-  }
-}
-
-	/* close the file with hasn't been accessed for the longest time */
-	/* ARGSUSED */
-
-static int test_when_accessed (struct file_info *key,
-			       element_count count __attribute__((unused)),
-			       struct st_access_param *access_param)
+static my_bool matches_list_of_tables(const char *isam_file_name)
 {
-  if (key->accessed < access_param->min_accessed && ! key->closed)
+  if (table_names && table_names[0])
   {
-    access_param->min_accessed=key->accessed;
-    access_param->found=key;
-  }
-  return 0;
-}
-
-
-static void file_info_free(struct file_info *fileinfo)
-{
-  DBUG_ENTER("file_info_free");
-  if (update)
-  {
-    if (!fileinfo->closed)
-      VOID(mi_close(fileinfo->isam));
-    if (fileinfo->record)
-      my_free(fileinfo->record,MYF(0));
-  }
-  my_free(fileinfo->name,MYF(0));
-  my_free(fileinfo->show_name,MYF(0));
-  DBUG_VOID_RETURN;
-}
-
-
-
-static int close_some_file(TREE *tree)
-{
-  struct st_access_param access_param;
-
-  access_param.min_accessed=LONG_MAX;
-  access_param.found=0;
-
-  VOID(tree_walk(tree,(tree_walk_action) test_when_accessed,
-		 (void*) &access_param,left_root_right));
-  if (!access_param.found)
-    return 1;			/* No open file that is possibly to close */
-  if (mi_close(access_param.found->isam))
-    return 1;
-  access_param.found->closed=1;
-  return 0;
-}
-
-
-static int reopen_closed_file(TREE *tree, struct file_info *fileinfo)
-{
-  char name[FN_REFLEN];
-  if (close_some_file(tree))
-    return 1;				/* No file to close */
-  strmov(name,fileinfo->show_name);
-  if (fileinfo->id > 1)
-    *strrchr(name,'<')='\0';		/* Remove "<id>" */
-
-  if (!(fileinfo->isam= mi_open(name,O_RDWR,HA_OPEN_WAIT_IF_LOCKED)))
-    return 1;
-  fileinfo->closed=0;
-  re_open_count++;
-  return 0;
-}
-
-	/* Try to find record with uniq key */
-
-static int find_record_with_key(struct file_info *file_info, byte *record)
-{
-  uint key;
-  MI_INFO *info=file_info->isam;
-  uchar tmp_key[MI_MAX_KEY_BUFF];
-
-  for (key=0 ; key < info->s->base.keys ; key++)
-  {
-    if (mi_is_key_active(info->s->state.key_map, key) &&
-	info->s->keyinfo[key].flag & HA_NOSAME)
+    char **name;
+    for (name= table_names ; *name ; name++)
     {
-      VOID(_mi_make_key(info,key,tmp_key,record,0L));
-      return mi_rkey(info,file_info->record,(int) key,(char*) tmp_key,0,
-		     HA_READ_KEY_EXACT);
+      if (!strcmp(*name, isam_file_name))
+        return 1;
     }
+    return 0;
   }
   return 1;
-}
-
-
-static void printf_log(const char *format,...)
-{
-  char llbuff[21];
-  va_list args;
-  va_start(args,format);
-  if (verbose > 2)
-    printf("%9s:",llstr(isamlog_filepos,llbuff));
-  if (verbose > 1)
-    printf("%5ld ",isamlog_process);	/* Write process number */
-  (void) vprintf((char*) format,args);
-  putchar('\n');
-  va_end(args);
-}
-
-
-static bool cmp_filename(struct file_info *file_info, my_string name)
-{
-  if (!file_info)
-    return 1;
-  return strcmp(file_info->name,name) ? 1 : 0;
 }
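
Note on the hunk above: matches_list_of_tables() takes over the name loop
that examine_log() used to run inline for each MI_LOG_OPEN record. Below is
a minimal standalone sketch of the same rule, assuming the list is passed as
an argument instead of being read from the file-scope table_names. The name
name_matches_list and its first parameter are illustrative only and are not
part of this changeset. An empty or missing list matches every table;
otherwise only exact names match.

```c
#include <stddef.h>
#include <string.h>

/*
  Hypothetical standalone variant of the filter: the patch reads a
  file-scope table_names array, here the list is an explicit argument.
  Returns 1 if the table should be processed, 0 otherwise.
*/
static int name_matches_list(char **table_names, const char *isam_file_name)
{
  char **name;
  if (!table_names || !table_names[0])
    return 1;                         /* no filter given: every table matches */
  for (name= table_names; *name; name++)
  {
    if (!strcmp(*name, isam_file_name))
      return 1;                       /* exact, case-sensitive match */
  }
  return 0;                           /* filter given, name not in it */
}
```

With a NULL-terminated list such as { "db/t1", "db/t2", NULL }, the sketch
accepts "db/t1" and rejects "db/t3"; with no list at all it accepts both.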

--- 1.64/storage/myisam/myisampack.c	2007-01-24 18:57:02 +01:00
+++ 1.65/storage/myisam/myisampack.c	2007-05-15 18:09:20 +02:00
@@ -506,9 +506,12 @@
   /* Create temporary or join file */
 
   if (backup)
-    VOID(fn_format(org_name,isam_file->filename,"",MI_NAME_DEXT,2));
+    VOID(fn_format(org_name,isam_file->s->unresolv_file_name,
+                   "", MI_NAME_DEXT, MY_REPLACE_EXT));
   else
-    VOID(fn_format(org_name,isam_file->filename,"",MI_NAME_DEXT,2+4+16));
+    VOID(fn_format(org_name,isam_file->s->unresolv_file_name,
+                   "", MI_NAME_DEXT,
+                   MY_REPLACE_EXT|MY_UNPACK_FILENAME|MY_RESOLVE_SYMLINKS));
   if (!test_only && result_table)
   {
     /* Make a new indexfile based on first file in list */
@@ -693,7 +696,9 @@
     {
       if (backup)
       {
-	if (my_rename(org_name,make_old_name(temp_name,isam_file->filename),
+	if (my_rename(org_name,
+                      make_old_name(temp_name,
+                                    isam_file->s->unresolv_file_name),
 		      MYF(MY_WME)))
 	  error=1;
 	else
@@ -2977,7 +2982,12 @@
   VOID(my_chsize(share->kfile, share->base.keystart, 0, MYF(0)));
   if (share->base.keys)
     isamchk_neaded=1;
-  DBUG_RETURN(mi_state_info_write(share->kfile,&share->state,1+2));
+  /*
+    The function below will not write to the MyISAM backup log file, because
+    it is only called by myisampack where isam_file->backup_logging is
+    FALSE (and thus the write will not even be tried).
+  */
+  DBUG_RETURN(mi_state_info_write(isam_file, share->kfile, &share->state, 1+2));
 }
 
 
@@ -3010,7 +3020,7 @@
   if (isam_file->s->base.keys)
     isamchk_neaded=1;
   state.changed=STATE_CHANGED | STATE_NOT_ANALYZED; /* Force check of table */
-  DBUG_RETURN (mi_state_info_write(file,&state,1+2));
+  DBUG_RETURN (mi_state_info_write(isam_file, file, &state, 1+2));
 }
 
 

--- 1.13/storage/myisammrg/myrg_info.c	2006-12-31 01:06:41 +01:00
+++ 1.14/storage/myisammrg/myrg_info.c	2007-05-15 18:09:20 +02:00
@@ -50,7 +50,8 @@
       info->records+=file->table->s->state.state.records;
       info->del+=file->table->s->state.state.del;
       DBUG_PRINT("info2",("table: %s, offset: %lu",
-                  file->table->filename,(ulong) file->file_offset));
+                          file->table->s->unresolv_file_name,
+                          (ulong) file->file_offset));
     }
     x->records= info->records;
     x->deleted= info->del;

--- 1.17/storage/myisammrg/myrg_rrnd.c	2006-12-31 01:06:41 +01:00
+++ 1.18/storage/myisammrg/myrg_rrnd.c	2007-05-15 18:09:20 +02:00
@@ -111,6 +111,6 @@
       start=mid;
   }
   DBUG_PRINT("info",("offset: %lu, table: %s",
-		     (ulong) pos, start->table->filename));
+		     (ulong) pos, start->table->s->unresolv_file_name));
   DBUG_RETURN(start);
 }

--- 1.67/mysys/mf_iocache.c	2007-02-13 23:21:46 +01:00
+++ 1.68/mysys/mf_iocache.c	2007-05-15 18:09:19 +02:00
@@ -129,6 +129,9 @@
 }
 
 
+/* FUNCTIONS TO SET UP OR RESET A CACHE */
+
+
 /*
   Initialize an IO_CACHE object
 
@@ -166,7 +169,7 @@
   info->file= file;
   info->type= TYPE_NOT_SET;	    /* Don't set it until mutex are created */
   info->pos_in_file= seek_offset;
-  info->pre_close = info->pre_read = info->post_read = 0;
+  info->pre_close= info->pre_read= info->post_read= info->post_write= NULL;
   info->arg = 0;
   info->alloced_buffer = 0;
   info->buffer=0;
@@ -275,7 +278,7 @@
 
   /* End_of_file may be changed by user later */
   info->end_of_file= end_of_file;
-  info->error=0;
+  info->error= info->hard_write_error_in_the_past= 0;
   info->type= type;
   init_functions(info);
 #ifdef HAVE_AIOWAIT
@@ -402,7 +405,7 @@
     }
   }
   info->type=type;
-  info->error=0;
+  info->error= info->hard_write_error_in_the_past= 0;
   init_functions(info);
 
 #ifdef HAVE_AIOWAIT
@@ -419,6 +422,8 @@
 } /* reinit_io_cache */
 
 
+/* FUNCTIONS TO DO READS TO THE CACHE */
+
 
 /*
   Read buffered.
@@ -1466,14 +1471,19 @@
   byte buff;
   IO_CACHE_CALLBACK pre_read,post_read;
   if ((pre_read = info->pre_read))
-    (*pre_read)(info);
+    (*pre_read)(info, NULL, 0, 0);
   if ((*(info)->read_function)(info,&buff,1))
     return my_b_EOF;
   if ((post_read = info->post_read))
-    (*post_read)(info);
+    (*post_read)(info, NULL, 0, 0);
   return (int) (uchar) buff;
 }
 
+/* FUNCTIONS TO DO WRITES TO THE CACHE */
+
+#define set_hard_write_error(A)                         \
+  ((A)->error= (A)->hard_write_error_in_the_past= -1)
+
 /* 
    Write a byte buffer to IO_CACHE and flush to disk
    if IO_CACHE is full.
@@ -1491,7 +1501,7 @@
   if (info->pos_in_file+info->buffer_length > info->end_of_file)
   {
     my_errno=errno=EFBIG;
-    return info->error = -1;
+    return set_hard_write_error(info);
   }
 
   rest_length=(uint) (info->write_end - info->write_pos);
@@ -1515,13 +1525,16 @@
       */
       if (my_seek(info->file,info->pos_in_file,MY_SEEK_SET,MYF(0)))
       {
-        info->error= -1;
+        set_hard_write_error(info);
         return (1);
       }
       info->seek_not_done=0;
     }
     if (my_write(info->file,Buffer,(uint) length,info->myflags | MY_NABP))
-      return info->error= -1;
+      return set_hard_write_error(info);
+
+    if (info->post_write)
+      (*(info->post_write))(info, Buffer, length, info->pos_in_file);
 
 #ifdef THREAD
     /*
@@ -1566,7 +1579,7 @@
   */
   DBUG_ASSERT(!info->share);
 #endif
-
+  DBUG_ASSERT(!info->post_write); /* unsupported */
   lock_append_buffer(info);
   rest_length=(uint) (info->write_end - info->write_pos);
   if (Count <= rest_length)
@@ -1586,7 +1599,7 @@
     if (my_write(info->file,Buffer,(uint) length,info->myflags | MY_NABP))
     {
       unlock_append_buffer(info);
-      return info->error= -1;
+      return set_hard_write_error(info);
     }
     Count-=length;
     Buffer+=length;
@@ -1639,12 +1652,21 @@
   {
     /* Of no overlap, write everything without buffering */
     if (pos + Count <= info->pos_in_file)
-      return my_pwrite(info->file, Buffer, Count, pos,
-		       info->myflags | MY_NABP);
+    {
+      int ret= my_pwrite(info->file, Buffer, Count, pos,
+                         info->myflags | MY_NABP);
+      if (unlikely(ret))
+        set_hard_write_error(info);
+      if (info->post_write)
+        (*(info->post_write))(info, Buffer, Count, pos);
+      return ret;
+    }
     /* Write the part of the block that is before buffer */
     length= (uint) (info->pos_in_file - pos);
     if (my_pwrite(info->file, Buffer, length, pos, info->myflags | MY_NABP))
-      info->error=error=-1;
+      error= set_hard_write_error(info);
+    if (info->post_write)
+      (*(info->post_write))(info, Buffer, length, pos);
     Buffer+=length;
     pos+=  length;
     Count-= length;
@@ -1705,7 +1727,7 @@
     if (info->file == -1)
     {
       if (real_open_cached_file(info))
-	DBUG_RETURN((info->error= -1));
+	DBUG_RETURN(set_hard_write_error(info));
     }
     LOCK_APPEND_BUFFER;
 
@@ -1733,29 +1755,44 @@
 	    MY_FILEPOS_ERROR)
 	{
 	  UNLOCK_APPEND_BUFFER;
-	  DBUG_RETURN((info->error= -1));
+	  DBUG_RETURN(set_hard_write_error(info));
 	}
 	if (!append_cache)
 	  info->seek_not_done=0;
       }
-      if (!append_cache)
-	info->pos_in_file+=length;
+
       info->write_end= (info->write_buffer+info->buffer_length-
 			((pos_in_file+length) & (IO_SIZE-1)));
 
       if (my_write(info->file,info->write_buffer,length,
 		   info->myflags | MY_NABP))
-	info->error= -1;
+	set_hard_write_error(info);
       else
 	info->error= 0;
+
       if (!append_cache)
       {
+        /*
+          This post_write is really POST-write; callers depend on this! So
+          always call it after writing to the file, not before.
+        */
+        if (info->post_write)
+          (*(info->post_write))(info, info->write_buffer,
+                                length, info->pos_in_file);
+        /*
+          The addition below will make the info->pos_in_file be the end of
+          written block; whereas the value we needed in post_write is the
+          value before the addition. That's why we called post_write before
+          this.
+        */
+	info->pos_in_file+=length;
         set_if_bigger(info->end_of_file,(pos_in_file+length));
       }
       else
       {
 	info->end_of_file+=(info->write_pos-info->append_read_pos);
 	DBUG_ASSERT(info->end_of_file == my_tell(info->file,MYF(0)));
+        DBUG_ASSERT(!info->post_write); /* unsupported */
       }
 
       info->append_read_pos=info->write_pos=info->write_buffer;
@@ -1809,7 +1846,7 @@
 
   if ((pre_close=info->pre_close))
   {
-    (*pre_close)(info);
+    (*pre_close)(info, NULL, 0, 0);
     info->pre_close= 0;
   }
   if (info->alloced_buffer)
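
Note on the mf_iocache.c changes above: two ideas recur in these hunks. The
post_write callback is invoked after the bytes have reached the file and is
given the file offset of the block, so in the flush path it has to run
before pos_in_file is advanced. Write failures also set a second, sticky
flag that a later successful flush does not clear. The sketch below shows
both on a toy cache, assuming an in-memory array in place of a real file.
Every name in it (MINI_CACHE, mini_flush, mini_write, write_trace) is
hypothetical and is not the mysys API.

```c
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this is not the mysys IO_CACHE API. */
typedef struct mini_cache MINI_CACHE;
typedef void (*mini_post_write)(MINI_CACHE *cache, const unsigned char *buf,
                                size_t length, unsigned long offset);

struct mini_cache
{
  unsigned char file[256];            /* stands in for the file on disk */
  unsigned char buffer[16];           /* write buffer */
  size_t used;                        /* bytes pending in buffer */
  unsigned long pos_in_file;          /* file offset of buffer[0] */
  int error;                          /* result of the latest flush */
  int hard_write_error_in_the_past;   /* sticky: never cleared by a flush */
  mini_post_write post_write;         /* may be NULL */
  void *arg;                          /* opaque pointer for the callback */
};

/* Example callback state: remembers what the callback was told. */
struct write_trace
{
  unsigned calls;
  unsigned long last_offset;
  size_t total_length;
};

static void trace_post_write(MINI_CACHE *cache, const unsigned char *buf,
                             size_t length, unsigned long offset)
{
  struct write_trace *trace= (struct write_trace *) cache->arg;
  (void) buf;
  trace->calls++;
  trace->last_offset= offset;
  trace->total_length+= length;
}

static int mini_flush(MINI_CACHE *cache)
{
  size_t length= cache->used;
  if (!length)
    return 0;
  if (cache->pos_in_file + length > sizeof(cache->file))
  {
    /* "file too big": set both the current and the sticky error flag */
    return cache->error= cache->hard_write_error_in_the_past= -1;
  }
  memcpy(cache->file + cache->pos_in_file, cache->buffer, length);
  cache->error= 0;
  /* Callback first, while pos_in_file is still the start of the block... */
  if (cache->post_write)
    cache->post_write(cache, cache->buffer, length, cache->pos_in_file);
  /* ...and only then advance to the end of the written block. */
  cache->pos_in_file+= length;
  cache->used= 0;
  return 0;
}

static int mini_write(MINI_CACHE *cache, const unsigned char *data,
                      size_t count)
{
  while (count)
  {
    size_t room= sizeof(cache->buffer) - cache->used;
    size_t n= count < room ? count : room;
    memcpy(cache->buffer + cache->used, data, n);
    cache->used+= n;
    data+= n;
    count-= n;
    if (cache->used == sizeof(cache->buffer) && mini_flush(cache))
      return -1;
  }
  return 0;
}
```

A caller would zero a MINI_CACHE, set post_write to trace_post_write and arg
to a zeroed struct write_trace, then call mini_write() and mini_flush(). The
trace then holds the offset at which each flushed block started, which is
the kind of information a physical change log needs to record.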

--- 1.65/mysys/mf_keycache.c	2007-02-23 12:23:37 +01:00
+++ 1.66/mysys/mf_keycache.c	2007-05-15 18:09:19 +02:00
@@ -13,29 +13,30 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/*
-  These functions handle keyblock cacheing for ISAM and MyISAM tables.
-
-  One cache can handle many files.
-  It must contain buffers of the same blocksize.
-  init_key_cache() should be used to init cache handler.
-
-  The free list (free_block_list) is a stack like structure.
-  When a block is freed by free_block(), it is pushed onto the stack.
-  When a new block is required it is first tried to pop one from the stack.
-  If the stack is empty, it is tried to get a never-used block from the pool.
-  If this is empty too, then a block is taken from the LRU ring, flushing it
-  to disk, if neccessary. This is handled in find_key_block().
-  With the new free list, the blocks can have three temperatures:
-  hot, warm and cold (which is free). This is remembered in the block header
-  by the enum BLOCK_TEMPERATURE temperature variable. Remembering the
-  temperature is neccessary to correctly count the number of warm blocks,
-  which is required to decide when blocks are allowed to become hot. Whenever
-  a block is inserted to another (sub-)chain, we take the old and new
-  temperature into account to decide if we got one more or less warm block.
-  blocks_unused is the sum of never used blocks in the pool and of currently
-  free blocks. blocks_used is the number of blocks fetched from the pool and
-  as such gives the maximum number of in-use blocks at any time.
+/**
+   @file
+   @brief These functions handle keyblock caching for ISAM and MyISAM tables.
+
+   One cache can handle many files.
+   It must contain buffers of the same blocksize.
+   init_key_cache() should be used to init cache handler.
+   
+   The free list (free_block_list) is a stack like structure.
+   When a block is freed by free_block(), it is pushed onto the stack.
+   When a new block is required it is first tried to pop one from the stack.
+   If the stack is empty, it is tried to get a never-used block from the pool.
+   If this is empty too, then a block is taken from the LRU ring, flushing it
+   to disk, if necessary. This is handled in find_key_block().
+   With the new free list, the blocks can have three temperatures:
+   hot, warm and cold (which is free). This is remembered in the block header
+   by the enum BLOCK_TEMPERATURE temperature variable. Remembering the
+   temperature is necessary to correctly count the number of warm blocks,
+   which is required to decide when blocks are allowed to become hot. Whenever
+   a block is inserted to another (sub-)chain, we take the old and new
+   temperature into account to decide if we got one more or less warm block.
+   blocks_unused is the sum of never used blocks in the pool and of currently
+   free blocks. blocks_used is the number of blocks fetched from the pool and
+   as such gives the maximum number of in-use blocks at any time.
 */
 
 #include "mysys_priv.h"
@@ -144,6 +145,9 @@
   uint hits_left;         /* number of hits left until promotion             */
   ulonglong last_hit_time; /* timestamp of the last hit                      */
   KEYCACHE_CONDVAR *condvar; /* condition variable for 'no readers' event    */
+  /** @brief called when block goes to file*/
+  KEYCACHE_POST_WRITE_CALLBACK post_write;
+  void                         *post_write_arg;    /**< post_write's argument*/
 };
 
 KEY_CACHE dflt_key_cache_var;
@@ -255,6 +259,10 @@
 #define keycache_pthread_cond_signal pthread_cond_signal
 #endif /* defined(KEYCACHE_DEBUG) */
 
+static int key_cache_pwrite(int Filedes, const byte *Buffer, uint Count,
+                            my_off_t offset, myf MyFlags,
+                            KEYCACHE_POST_WRITE_CALLBACK callback,
+                            void *callback_arg);
 static inline uint next_power(uint value)
 {
   return (uint) my_round_up_to_next_power((uint32) value) << 1;
@@ -1497,6 +1505,7 @@
         }
         keycache->blocks_unused--;
         block->status= 0;
+        block->post_write= NULL;
         block->length= 0;
         block->offset= keycache->key_cache_block_size;
         block->requests= 1;
@@ -1574,11 +1583,18 @@
 	      The call is thread safe because only the current
 	      thread might change the block->hash_link value
             */
-	    error= my_pwrite(block->hash_link->file,
-			     block->buffer+block->offset,
-			     block->length - block->offset,
-			     block->hash_link->diskpos+ block->offset,
-			     MYF(MY_NABP | MY_WAIT_IF_FULL));
+	    error= key_cache_pwrite(block->hash_link->file,
+                                    block->buffer + block->offset,
+                                    block->length - block->offset,
+                                    block->hash_link->diskpos + block->offset,
+                                    MYF(MY_NABP | MY_WAIT_IF_FULL),
+                                    block->post_write,
+                                    block->post_write_arg);
+            /*
+              Callback is reset after flush. Next calls to key_cache_write()
+              will specify the callback they want for their data.
+            */
+            block->post_write= NULL;
             keycache_pthread_mutex_lock(&keycache->cache_lock);
 	    keycache->global_cache_write++;
           }
@@ -1605,6 +1621,7 @@
           link_to_file_list(keycache, block, file,
                             (my_bool)(block->hash_link ? 1 : 0));
           block->status= error? BLOCK_ERROR : 0;
+          block->post_write= NULL;
           block->length= 0;
           block->offset= keycache->key_cache_block_size;
           block->hash_link= hash_link;
@@ -2033,6 +2050,10 @@
       length              length of the buffer
       dont_write          if is 0 then all dirty pages involved in writing
                           should have been flushed from key cache
+      callback            a function which should be called after the block
+                          has been written to its file (i.e. only at flush
+                          time); NULL if no function
+      callback_arg        argument which will be passed to 'callback'
 
   RETURN VALUE
     0 if a success, 1 - otherwise.
@@ -2050,7 +2071,9 @@
                     File file, my_off_t filepos, int level,
                     byte *buff, uint length,
                     uint block_length  __attribute__((unused)),
-                    int dont_write)
+                    int dont_write,
+                    KEYCACHE_POST_WRITE_CALLBACK callback,
+                    void *callback_arg)
 {
   reg1 BLOCK_LINK *block;
   int error=0;
@@ -2064,7 +2087,9 @@
   {
     /* Force writing from buff into disk */
     keycache->global_cache_write++;
-    if (my_pwrite(file, buff, length, filepos, MYF(MY_NABP | MY_WAIT_IF_FULL)))
+    if (key_cache_pwrite(file, buff, length, filepos,
+                         MYF(MY_NABP | MY_WAIT_IF_FULL),
+                         callback, callback_arg))
       DBUG_RETURN(1);
   }
 
@@ -2107,8 +2132,9 @@
         {
           keycache->global_cache_w_requests++;
           keycache->global_cache_write++;
-          if (my_pwrite(file, (byte*) buff, length, filepos,
-		        MYF(MY_NABP | MY_WAIT_IF_FULL)))
+          if (key_cache_pwrite(file, (byte*) buff, length, filepos,
+                               MYF(MY_NABP | MY_WAIT_IF_FULL),
+                               callback, callback_arg))
             error=1;
 	}
         goto next_block;
@@ -2143,6 +2169,8 @@
       }
 
       block->status|=BLOCK_READ;
+      block->post_write= callback;
+      block->post_write_arg= callback_arg;
 
       /* Unregister the request */
       block->hash_link->requests--;
@@ -2174,8 +2202,9 @@
   {
     keycache->global_cache_w_requests++;
     keycache->global_cache_write++;
-    if (my_pwrite(file, (byte*) buff, length, filepos,
-		  MYF(MY_NABP | MY_WAIT_IF_FULL)))
+    if (key_cache_pwrite(file, (byte*) buff, length, filepos,
+                         MYF(MY_NABP | MY_WAIT_IF_FULL),
+                         callback, callback_arg))
       error=1;
   }
 
@@ -2279,11 +2308,14 @@
     KEYCACHE_DBUG_PRINT("flush_cached_blocks",
                         ("block %u to be flushed", BLOCK_NUMBER(block)));
     keycache_pthread_mutex_unlock(&keycache->cache_lock);
-    error= my_pwrite(file,
-		     block->buffer+block->offset,
-		     block->length - block->offset,
-                     block->hash_link->diskpos+ block->offset,
-                     MYF(MY_NABP | MY_WAIT_IF_FULL));
+    error= key_cache_pwrite(file,
+                            block->buffer + block->offset,
+                            block->length - block->offset,
+                            block->hash_link->diskpos + block->offset,
+                            MYF(MY_NABP | MY_WAIT_IF_FULL),
+                            block->post_write,
+                            block->post_write_arg);
+    block->post_write= NULL;
     keycache_pthread_mutex_lock(&keycache->cache_lock);
     keycache->global_cache_write++;
     if (error)
@@ -2292,7 +2324,7 @@
       if (!last_errno)
         last_errno= errno ? errno : -1;
     }
-    #ifdef THREAD
+#ifdef THREAD
     /*
       Let to proceed for possible waiting requests to write to the block page.
       It might happen only during an operation to resize the key cache.
@@ -2627,6 +2659,26 @@
   key_cache->global_cache_w_requests= 0; /* Key_write_requests */
   key_cache->global_cache_write= 0;      /* Key_writes */
   DBUG_RETURN(0);
+}
+
+
+/**
+   @brief Does a my_pwrite() to the file and then calls callback
+   
+   Arguments are those of my_pwrite() plus the callback and its argument.
+
+   @note The callback is really POST-write; callers depend on this! So always
+   call it after writing to the file, not before.
+*/ 
+static int key_cache_pwrite(int Filedes, const byte *Buffer, uint Count,
+                            my_off_t offset, myf MyFlags,
+                            KEYCACHE_POST_WRITE_CALLBACK callback,
+                            void *callback_arg)
+{
+  int ret= my_pwrite(Filedes, Buffer, Count, offset, MyFlags);
+  if (callback)
+    (*callback)(callback_arg, Filedes, Buffer, Count, offset);
+  return ret;
 }
 
 

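The post-write callback pattern introduced by key_cache_pwrite() above can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical and for illustration only: demo_pwrite() stands in for my_pwrite() and writes to an in-memory buffer, and post_write_cb is a simplified stand-in for KEYCACHE_POST_WRITE_CALLBACK (it has no file descriptor argument). As in the patch, the callback runs after the write and regardless of the write's return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type, loosely modelled on the patch's callback. */
typedef void (*post_write_cb)(void *arg, const char *buf,
                              size_t count, size_t offset);

/* Stand-in for my_pwrite(): copies into an in-memory "file".
   Returns 0 on success, 1 if the write would go out of bounds. */
static int demo_pwrite(char *file, size_t file_size,
                       const char *buf, size_t count, size_t offset)
{
  if (offset > file_size || count > file_size - offset)
    return 1;
  memcpy(file + offset, buf, count);
  return 0;
}

/* Writes first, then calls the callback (if any): strictly POST-write. */
static int demo_pwrite_with_callback(char *file, size_t file_size,
                                     const char *buf, size_t count,
                                     size_t offset,
                                     post_write_cb callback,
                                     void *callback_arg)
{
  int ret;
  assert(file != NULL && buf != NULL);
  ret= demo_pwrite(file, file_size, buf, count, offset);
  if (callback)
    (*callback)(callback_arg, buf, count, offset);
  return ret;
}

/* Example callback: accumulates the number of bytes it was told about. */
static void count_bytes(void *arg, const char *buf,
                        size_t count, size_t offset)
{
  (void) buf;
  (void) offset;
  *(size_t *) arg+= count;
}
```

A caller would pass count_bytes and a pointer to a size_t counter to observe each write, or NULL for both when no notification is wanted.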
--- 1.50/mysys/my_thr_init.c	2007-02-23 12:26:43 +01:00
+++ 1.51/mysys/my_thr_init.c	2007-05-15 18:09:19 +02:00
@@ -13,9 +13,9 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/*
-  Functions to handle initializating and allocationg of all mysys & debug
-  thread variables.
+/**
+   @file
+   @brief Functions for initialization and allocation of all mysys & debug thread variables.
 */
 
 #include "mysys_priv.h"
@@ -28,8 +28,12 @@
 pthread_key(struct st_my_thread_var, THR_KEY_mysys);
 #endif /* USE_TLS */
 pthread_mutex_t THR_LOCK_malloc,THR_LOCK_open,
-	        THR_LOCK_lock,THR_LOCK_isam,THR_LOCK_myisam,THR_LOCK_heap,
+                THR_LOCK_lock, THR_LOCK_isam, THR_LOCK_heap,
                 THR_LOCK_net, THR_LOCK_charset, THR_LOCK_threads;
+/** @brief for insert/delete in the list of MyISAM open tables */
+pthread_mutex_t THR_LOCK_myisam;
+/** @brief for writing to the MyISAM logs */
+pthread_mutex_t THR_LOCK_myisam_log;
 pthread_cond_t  THR_COND_threads;
 uint            THR_thread_count= 0;
 uint 		my_thread_end_wait_time= 5;
@@ -135,6 +139,7 @@
   pthread_mutex_init(&THR_LOCK_lock,MY_MUTEX_INIT_FAST);
   pthread_mutex_init(&THR_LOCK_isam,MY_MUTEX_INIT_SLOW);
   pthread_mutex_init(&THR_LOCK_myisam,MY_MUTEX_INIT_SLOW);
+  pthread_mutex_init(&THR_LOCK_myisam_log,MY_MUTEX_INIT_SLOW);
   pthread_mutex_init(&THR_LOCK_heap,MY_MUTEX_INIT_FAST);
   pthread_mutex_init(&THR_LOCK_net,MY_MUTEX_INIT_FAST);
   pthread_mutex_init(&THR_LOCK_charset,MY_MUTEX_INIT_FAST);
@@ -193,6 +198,7 @@
   pthread_mutex_destroy(&THR_LOCK_lock);
   pthread_mutex_destroy(&THR_LOCK_isam);
   pthread_mutex_destroy(&THR_LOCK_myisam);
+  pthread_mutex_destroy(&THR_LOCK_myisam_log);
   pthread_mutex_destroy(&THR_LOCK_heap);
   pthread_mutex_destroy(&THR_LOCK_net);
   pthread_mutex_destroy(&THR_LOCK_charset);

--- 1.211/storage/myisam/ha_myisam.cc	2007-04-04 09:51:42 +02:00
+++ 1.212/storage/myisam/ha_myisam.cc	2007-05-15 18:09:19 +02:00
@@ -992,7 +992,7 @@
   param.thd= thd;
   param.tmpdir= &mysql_tmpdir_list;
   param.out_flag= 0;
-  strmov(fixed_name,file->filename);
+  strmov(fixed_name,file->s->unresolv_file_name);
 
   // Don't lock tables if we have used LOCK TABLE
   if (!thd->locked_tables && 
@@ -1687,11 +1687,11 @@
      if table is symlinked (Ie;  Real name is not same as generated name)
    */
     data_file_name= index_file_name= 0;
-    fn_format(name_buff, file->filename, "", MI_NAME_DEXT,
+    fn_format(name_buff, file->s->unresolv_file_name, "", MI_NAME_DEXT,
               MY_APPEND_EXT | MY_UNPACK_FILENAME);
     if (strcmp(name_buff, misam_info.data_file_name))
       data_file_name=misam_info.data_file_name;
-    fn_format(name_buff, file->filename, "", MI_NAME_IEXT,
+    fn_format(name_buff, file->s->unresolv_file_name, "", MI_NAME_IEXT,
               MY_APPEND_EXT | MY_UNPACK_FILENAME);
     if (strcmp(name_buff, misam_info.index_file_name))
       index_file_name=misam_info.index_file_name;
@@ -1969,6 +1969,9 @@
   myisam_hton->create= myisam_create_handler;
   myisam_hton->panic= myisam_panic;
   myisam_hton->flags= HTON_CAN_RECREATE | HTON_SUPPORT_LOG_TABLES;
+#ifndef EMBEDDED_LIBRARY
+  myisam_hton->get_backup_engine= myisam_backup_engine;
+#endif
   return 0;
 }
 

--- 1.81/storage/myisam/ha_myisam.h	2006-12-31 01:06:39 +01:00
+++ 1.82/storage/myisam/ha_myisam.h	2007-05-15 18:09:19 +02:00
@@ -137,3 +137,8 @@
   int net_read_dump(NET* net);
 #endif
 };
+
+#ifndef EMBEDDED_LIBRARY
+// MyISAM goes into embedded, where there is no backup defined
+Backup_result_t myisam_backup_engine(handlerton *self, Backup_engine* &be);
+#endif

--- 1.117/storage/myisammrg/ha_myisammrg.cc	2007-02-27 18:31:47 +01:00
+++ 1.118/storage/myisammrg/ha_myisammrg.cc	2007-05-15 18:09:20 +02:00
@@ -464,7 +464,7 @@
 
       if (!(ptr = (TABLE_LIST *) thd->calloc(sizeof(TABLE_LIST))))
 	goto err;
-      split_file_name(open_table->table->filename, &db, &name);
+      split_file_name(open_table->table->s->unresolv_file_name, &db, &name);
       if (!(ptr->table_name= thd->strmake(name.str, name.length)))
 	goto err;
       if (db.length && !(ptr->db= thd->strmake(db.str, db.length)))
@@ -573,7 +573,7 @@
     LEX_STRING db, name;
     LINT_INIT(db.str);
 
-    split_file_name(open_table->table->filename, &db, &name);
+    split_file_name(open_table->table->s->unresolv_file_name, &db, &name);
     if (open_table != first)
       packet->append(',');
     /* Report database for mapped table if it isn't in current database */

--- 1.273/sql/log_event.cc	2007-03-01 15:16:13 +01:00
+++ 1.274/sql/log_event.cc	2007-05-15 18:09:19 +02:00
@@ -6632,8 +6632,7 @@
 
 int Write_rows_log_event::do_after_row_operations(TABLE *table, int error)
 {
-  if (error == 0)
-    error= table->file->ha_end_bulk_insert();
+  error|= table->file->ha_end_bulk_insert();
   return error;
 }
 

--- 1.612/sql/mysqld.cc	2007-04-04 09:51:41 +02:00
+++ 1.613/sql/mysqld.cc	2007-05-15 18:09:19 +02:00
@@ -621,7 +621,7 @@
 /* Static variables */
 
 static bool kill_in_progress, segfaulted;
-static my_bool opt_do_pstack, opt_bootstrap, opt_myisam_log;
+static my_bool opt_do_pstack, opt_bootstrap, opt_myisam_logical_log;
 static int cleanup_done;
 static ulong opt_specialflag, opt_myisam_block_size;
 static char *opt_update_logname, *opt_binlog_index_name;
@@ -3438,8 +3438,8 @@
   pthread_attr_setstacksize(&connection_attrib, NW_THD_STACKSIZE);
 #endif
 
-  if (opt_myisam_log)
-    (void) mi_log(1);
+  if (opt_myisam_logical_log)
+    (void) mi_log(MI_LOG_ACTION_OPEN, MI_LOG_LOGICAL, NULL);
 
   /* call ha_init_key_cache() on all key caches to init them */
   process_key_caches(&ha_init_key_cache);
@@ -5342,8 +5342,8 @@
    (gptr*) &log_error_file_ptr, (gptr*) &log_error_file_ptr, 0, GET_STR,
    OPT_ARG, 0, 0, 0, 0, 0, 0},
   {"log-isam", OPT_ISAM_LOG, "Log all MyISAM changes to file.",
-   (gptr*) &myisam_log_filename, (gptr*) &myisam_log_filename, 0, GET_STR,
-   OPT_ARG, 0, 0, 0, 0, 0, 0},
+   (gptr*) &myisam_logical_log_filename, (gptr*) &myisam_logical_log_filename,
+   0, GET_STR, OPT_ARG, 0, 0, 0, 0, 0, 0},
   {"log-long-format", '0',
    "Log some extra information to update log. Please note that this option is deprecated; see --log-short-format option.", 
    0, 0, 0, GET_NO_ARG, NO_ARG, 0, 0, 0, 0, 0, 0},
@@ -7065,7 +7065,7 @@
   opt_logname= opt_update_logname= opt_binlog_index_name= opt_slow_logname= 0;
   opt_tc_log_file= (char *)"tc.log";      // no hostname in tc_log file name !
   opt_secure_auth= 0;
-  opt_bootstrap= opt_myisam_log= 0;
+  opt_bootstrap= opt_myisam_logical_log= 0;
   mqh_used= 0;
   segfaulted= kill_in_progress= 0;
   cleanup_done= 0;
@@ -7348,7 +7348,7 @@
     thd_startup_options|=OPTION_BIG_TABLES;
     break;
   case (int) OPT_ISAM_LOG:
-    opt_myisam_log=1;
+    opt_myisam_logical_log=1;
     break;
   case (int) OPT_UPDATE_LOG:
     opt_update_log=1;

--- 1.103/sql/sql_cache.cc	2006-12-23 20:19:53 +01:00
+++ 1.104/sql/sql_cache.cc	2007-05-15 18:09:19 +02:00
@@ -2394,8 +2394,9 @@
         {
           char key[MAX_DBKEY_LENGTH];
           uint32 db_length;
-          uint key_length= filename_2_table_key(key, table->table->filename,
-                                                &db_length);
+          uint key_length=
+            filename_2_table_key(key, table->table->s->unresolv_file_name,
+                                 &db_length);
           (++block_table)->n= ++n;
           /*
             There are not callback function for for MyISAM, and engine data

--- 1.342/sql/sql_class.h	2007-03-01 13:04:45 +01:00
+++ 1.343/sql/sql_class.h	2007-05-15 18:09:19 +02:00
@@ -809,7 +809,8 @@
   SYSTEM_THREAD_SLAVE_SQL= 4,
   SYSTEM_THREAD_NDBCLUSTER_BINLOG= 8,
   SYSTEM_THREAD_EVENT_SCHEDULER= 16,
-  SYSTEM_THREAD_EVENT_WORKER= 32
+  SYSTEM_THREAD_EVENT_WORKER= 32,
+  SYSTEM_THREAD_BACKUP= 64
 };
 
 

--- 1.116/sql/sql_load.cc	2007-03-01 13:04:45 +01:00
+++ 1.117/sql/sql_load.cc	2007-05-15 18:09:19 +02:00
@@ -928,8 +928,7 @@
 	cache.read_function = _my_b_net_read;
 
       if (mysql_bin_log.is_open())
-	cache.pre_read = cache.pre_close =
-	  (IO_CACHE_CALLBACK) log_loaded_block;
+	cache.pre_read= cache.pre_close = log_loaded_block;
 #endif
     }
   }

--- 1.395/sql/sql_table.cc	2007-04-04 09:51:42 +02:00
+++ 1.396/sql/sql_table.cc	2007-05-15 18:09:19 +02:00
@@ -6797,7 +6797,10 @@
 					     (SQL_SELECT *) 0, HA_POS_ERROR, 1,
 					     &examined_rows)) ==
 	HA_POS_ERROR)
+    {
+      to->file->ha_end_bulk_insert();
       goto err;
+    }
   };
 
   /* Tell handler that we have values for all columns in the to table */

--- 1.2/mysql-test/r/backup_no_data.result	2007-04-18 16:22:49 +02:00
+++ 1.3/mysql-test/r/backup_no_data.result	2007-05-15 18:09:19 +02:00
@@ -73,10 +73,10 @@
 Backup Summary
 Backed up   1  table in database empty_db.
  
- header     =      235 bytes
- data       =        0 bytes
+ header     =      242 bytes
+ data       =     1029 bytes
               --------------
- total             235 bytes
+ total            1271 bytes
 DROP DATABASE empty_db;
 SHOW DATABASES;
 Database
@@ -96,10 +96,10 @@
 Restore Summary
 Restored   1  table in database empty_db.
  
- header     =      235 bytes
- data       =        0 bytes
+ header     =      242 bytes
+ data       =     1029 bytes
               --------------
- total             235 bytes
+ total            1271 bytes
 USE empty_db;
 SHOW TABLES;
 Tables_in_empty_db
--- New file ---
+++ storage/myisam/mi_backup_log.c	07/05/15 18:09:20
/* Copyright (C) 2007 MySQL AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
*/

/**
   @file
   @brief Starting and stopping backup logging for MyISAM tables
*/

#include "myisamdef.h"


/**
  @brief Starts MyISAM physical logging for a set of tables

  @details
  A condition of correctness of online backup is that:
  after the copy process has started (i.e. after the function below has
  terminated), any update done to a table-to-back-up must be present in the
  log. This guides the algorithm below.

  All writes (my_write, my_pwrite, memcpy to mmap'ed area, my_chsize) to the
  data or index file are done this way:
  @code
  {
    write_to_data_or_index_file;
    if ((atomic read of MYISAM_SHARE::backup_logging)==1)
      write log record to backup log;
  }
  @endcode
  
  The present function sets MYISAM_SHARE::backup_logging to 1 using an atomic
  write. This atomic write happens either before or after the atomic read
  above. If before, the change will be in the log. If after, it is also after
  write_to_data_or_index_file and thus the change will be in the copy. So
  correctness is always guaranteed. Note the importance of always checking
  MYISAM_SHARE::backup_logging _after_ write_to_data_or_index_file for the
  reasoning to hold.

  @param  hash_of_tables   The hash of tables for which to turn logging on
  @param  backup_log_name  Name of the backup log file to use

  @return Operation status
    @retval 0      ok
    @retval !=0    error; then caller should call
                   mi_backup_stop_logging_for_tables(TRUE)
*/
int mi_backup_start_logging_for_tables(HASH *hash_of_tables,
                                       const char *backup_log_name)
{
  LIST *list_item;
  int error= 0;
  DBUG_ENTER("mi_backup_start_logging_for_tables");
  DBUG_ASSERT(hash_inited(hash_of_tables));

#ifndef HAVE_MYISAM_BACKUP_LOGGING
  DBUG_ASSERT(0);
  DBUG_RETURN(1);
#endif

  pthread_mutex_lock(&THR_LOCK_myisam);
  if (backup_hash_of_tables) /* a backup already running */
  {
    pthread_mutex_unlock(&THR_LOCK_myisam);
    DBUG_ASSERT(0); /* because it should not happen */
    DBUG_RETURN(1);
  }
  backup_hash_of_tables= hash_of_tables;

  if (mi_log(MI_LOG_ACTION_OPEN, MI_LOG_PHYSICAL, backup_log_name))
  {
    error= 1;
    goto end;
  }
  /* Go through all open MyISAM tables (MI_INFOs) */
  for (list_item= myisam_open_list; list_item; list_item= list_item->next)
  {
    MI_INFO *info= (MI_INFO*)list_item->data;
    MYISAM_SHARE *share= info->s;
    DBUG_PRINT("info",("table '%s' 0x%lx tested against hash",
                       share->unique_file_name, (ulong)info));
    if (!hash_search(backup_hash_of_tables, share->unique_file_name,
                     share->unique_name_length) ||
        share->temporary)
      continue;
    mi_set_backup_logging_state(info->s, 1);
  }

end:
  switch(error)
  {
  case 2: /* for errors which happened after log was opened; impossible now */
    mi_log(MI_LOG_ACTION_CLOSE, MI_LOG_PHYSICAL, NULL);
    /* fall through */
  case 1:
    backup_hash_of_tables= NULL;
  }
  pthread_mutex_unlock(&THR_LOCK_myisam);
  DBUG_RETURN(error);
}


/**
  @brief Stops MyISAM physical logging

  @param  cancel           if this is a stop of a cancelled backup (tables are
                           being updated now, and log will not be used) or not
                           (tables are locked now (caller guarantees), and log
                           needs to be consistent, it's to create a backup
                           validity point).

  @return Operation status
    @retval 0      ok
    @retval !=0    error

  @note Even if 'cancel' is FALSE, tables may be being written now (in
  practice caller has read-locked tables, but those tables may be just going
  out of a write (after thr_unlock(), before or inside
  mi_lock_database(F_UNLCK) which may be flushing the index header or index
  pages)).
*/
int mi_backup_stop_logging_for_tables(my_bool cancel)
{
  int error= 0;
  LIST *list_item;
  DBUG_ENTER("mi_backup_stop_logging_for_tables");
  pthread_mutex_lock(&THR_LOCK_myisam);
  if (!backup_hash_of_tables) /* no backup running */
  {
    pthread_mutex_unlock(&THR_LOCK_myisam);
    DBUG_RETURN(0); /* it's ok if it happens */
  }
  backup_hash_of_tables= NULL; /* its freeing is done by the caller */

  for (list_item= myisam_open_list; list_item; list_item= list_item->next)
  {
    MI_INFO *info= (MI_INFO*)list_item->data;
    MYISAM_SHARE *share= info->s;
    if (!info->s->backup_logging)
      continue;
    if (!cancel)
    {
      pthread_mutex_lock(&share->intern_lock);
      /*
        It is possible that some statement just finished, has not called
        mi_lock_database(F_UNLCK) yet, and so some key blocks would still be
        in memory even if !delay_key_write. So we have to flush below even
        in this case, to put them into the log.
        Possibility to avoid this flush: implement a new
        JUST_CALL_CALLBACK flush mode which would be like FLUSH_KEEP except
        that:
        - it does not my_pwrite() the block to the index file, only calls
        the callback (to put the block into the log)
        - and so it leaves the block in the "changed_blocks" list, does not
        increment the counter of flushed pages; however resets the callback.
        The advantage would be that log write is more sequential than index
        file writes (even though they are sorted during flush).

        It is also possible (same scenario) that some WRITE_CACHE is not
        flushed yet. This should not happen but it does (can just be a
        forgotten mi_extra(HA_EXTRA_NO_CACHE)); so mi_close() and
        mi_lock_database(F_UNLCK) flush the cache; so we have to do it here
        too, to put the data into the log.

        It is also possible (same scenario) that the index's header has not
        been written yet and nobody is going to do it for us; indeed this can
        happen (two concurrent threads): thread1 has just done
        mi_lock_database(F_WRLCK), is blocked by the thr_lock of our caller,
        thread2 has finished its write statement and is going to execute
        mi_lock_database(F_UNLCK); no index header flush will be done by the
        mi_lock_database(F_UNLCK) of thread2 as w_locks is >0 (due to
        thread1). And no index header flush will be done by thread1 as it is
        blocked. So, we need to flush the index header here, to put it into
        the log.

        Of course, for the flushing above to reach the log, it has to be done
        before setting share->backup_logging to FALSE and before closing the
        log.
      */
      if ((index_pages_in_backup_log &&
          flush_key_blocks(share->key_cache, share->kfile, FLUSH_KEEP)) ||
          ((info->opt_flag & WRITE_CACHE_USED) &&
           flush_io_cache(&info->rec_cache)) ||
          (share->changed && mi_remap_file_and_write_state_for_unlock(info)))
      {
        error= 1; /* we continue, because log has to be closed anyway */
        mi_print_error(share, HA_ERR_CRASHED);
        mi_mark_crashed(info);		/* Mark that table must be checked */
      }
      pthread_mutex_unlock(&share->intern_lock);
    } /* ... if (!cancel) */
  } /* ... for (list_item=...) */    

  /*
    Don't delete log, we need it to send it to the backup stream,
    but close it now, so that its IO_CACHE goes to disk (so that all log is
    visible to the my_read() calls).
  */
  if (mi_log(MI_LOG_ACTION_CLOSE, MI_LOG_PHYSICAL, NULL))
    error= 1;

  for (list_item= myisam_open_list; list_item; list_item= list_item->next)
  {
    MI_INFO *info= (MI_INFO*)list_item->data;
    MYISAM_SHARE *share= info->s;
    mi_set_backup_logging_state(info->s, 0);
    /*
      We reset MI_LOG_OPEN_stored_in_backup_log. How is this safe with a
      concurrent logging operation (like myisam_log_pwrite_for_backup()) which
      may want to set it to TRUE at the same time?
      The concurrent logging operation runs either before or after log closing
      (serialization ensured by THR_LOCK_myisam_log). If before, we (resetter)
      win. If after, the concurrent logging operation finds the log closed and
      so will not change MI_LOG_OPEN_stored_in_backup_log (so we win again).
      Note the importance of closing the log before, for the reasoning to
      hold.
    */
    info->MI_LOG_OPEN_stored_in_backup_log= FALSE;
  }

  pthread_mutex_unlock(&THR_LOCK_myisam);
  /*
    From this moment on, from the point of view of MyISAM, a new backup can
    start (new log will use a different tmp name).
  */
  DBUG_RETURN(error);
}
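
The "write first, then check the logging flag" ordering described in the comment of mi_backup_start_logging_for_tables() can be sketched as follows. This is only an illustration under stated assumptions: struct demo_table, demo_write() and demo_start_logging() are hypothetical names, the data file and the backup log are in-memory buffers, and C11 <stdatomic.h> is used in place of whatever atomic primitives the server uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define DEMO_SIZE 64

/* Hypothetical stand-in for a table share with its data file and log. */
struct demo_table
{
  atomic_int backup_logging;  /* 1 while a backup wants changes logged */
  char data[DEMO_SIZE];       /* stands in for the data or index file */
  char log[DEMO_SIZE];        /* stands in for the physical backup log */
  size_t log_len;
};

static void demo_table_init(struct demo_table *t)
{
  atomic_init(&t->backup_logging, 0);
  memset(t->data, 0, sizeof(t->data));
  memset(t->log, 0, sizeof(t->log));
  t->log_len= 0;
}

/*
  Step 1: write to the file. Step 2: only then read the flag.
  If the flag was set before the read, the change reaches the log.
  If it is set after the read, the file write already happened, so a
  copy started after the flag is set will contain the change.
*/
static void demo_write(struct demo_table *t, size_t offset,
                       const char *buf, size_t count)
{
  assert(offset <= DEMO_SIZE && count <= DEMO_SIZE - offset);
  memcpy(t->data + offset, buf, count);
  if (atomic_load(&t->backup_logging) == 1)
  {
    assert(count <= DEMO_SIZE - t->log_len);
    memcpy(t->log + t->log_len, buf, count);
    t->log_len+= count;
  }
}

/* What the backup does before it starts copying the file. */
static void demo_start_logging(struct demo_table *t)
{
  atomic_store(&t->backup_logging, 1);
}
```

Swapping the two steps in demo_write() would break the argument: a writer could read the flag as 0, the backup could then set the flag and copy the file, and the late write would end up in neither the copy nor the log.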

--- New file ---
+++ storage/myisam/mi_examine_log.c	07/05/15 18:09:20
/* Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */

/**
   @file
   @brief Function to display and apply a MyISAM logical or physical log to tables

   Prints what is in a MyISAM (logical or physical/backup) log, optionally
   applies the changes to tables (all tables or only a set specified on the
   command line). Is used both by the standalone program myisamlog and by the
   restore code of the MyISAM online backup driver.
*/

#ifndef USE_MY_FUNC
#define USE_MY_FUNC
#endif

#include "myisamdef.h"
#include <my_tree.h>
#include <stdarg.h>
#ifdef HAVE_GETRUSAGE
#include <sys/resource.h>
#endif

#define FILENAME(A) (A ? A->show_name : "Unknown")

/** In some cases we do not want to flush the index header in mi_close() */
static my_bool update_index_on_close= TRUE;

struct file_info {
  long process;
  int  filenr,id;
  uint rnd;
  my_string name,show_name,record;
  MI_INFO *isam;
  /**
     Whether 'isam' is currently closed. A file which is not 'used' is always
     'closed' (why open it?). A 'used' file may temporarily be closed because
     of the max open file descriptors limit (but if we later meet a command
     which wants to use this file, we will re-open it).
  */
  bool closed;
  /** If this table matches the inclusion rules (or has to be ignored) */
  bool used;
  ulong accessed;
};

struct test_if_open_param {
  my_string name;
  int max_id;
};

struct st_access_param
{
  ulong min_accessed;
  struct file_info *found;
};

#define NO_FILEPOS (ulong) ~0L

void mi_examine_log_param_init(MI_EXAMINE_LOG_PARAM *param);
int mi_examine_log(MI_EXAMINE_LOG_PARAM *param);
static int read_string(IO_CACHE *file,gptr *to,uint length);
static int file_info_compare(void *cmp_arg, void *a,void *b);
static int test_if_open(struct file_info *key,element_count count,
			struct test_if_open_param *param);
static void fix_blob_pointers(MI_INFO *isam,byte *record);
static int test_when_accessed(struct file_info *key,element_count count,
			      struct st_access_param *access_param);
static void file_info_free(struct file_info *info);
static int close_some_file(TREE *tree);
static int reopen_closed_file(TREE *tree,struct file_info *file_info);
static int find_record_with_key(struct file_info *file_info,byte *record);
static int mi_close_wrap(MI_INFO *info);
static void printf_log(uint verbose, ulong isamlog_process,
                       my_off_t isamlog_filepos, const char *format,...);
static bool cmp_filename(struct file_info *file_info, const char *name);


void mi_examine_log_param_init(MI_EXAMINE_LOG_PARAM *mi_exl)
{
  bzero((gptr) mi_exl,sizeof(*mi_exl));
  mi_exl->number_of_commands= (ulong) ~0L;
  mi_exl->record_pos= HA_OFFSET_ERROR;
}


/**
  @brief Applies a MyISAM logical or physical log to tables.

  @details Applies either to all tables referenced by the log, or only to a
  subset specified in mi_exl->table_selection_hook.

  @param  mi_exl           Parameters controlling how the log is applied

  @return Operation status
    @retval 0      ok
    @retval !=0    error
*/
int mi_examine_log(MI_EXAMINE_LOG_PARAM *mi_exl)
{
  ulong isamlog_process;
  my_off_t isamlog_filepos;
  uint command,result,files_open;
  ulong access_time,length;
  my_off_t filepos;
  int lock_command,mi_result;
  char isam_file_name[FN_REFLEN];
  char llbuff[21],llbuff2[21];
  uchar head[20];
  gptr	buff;
  struct test_if_open_param open_param;
  IO_CACHE cache;
  File log_file;
  char current_open_index_name[FN_REFLEN]; /**< to not open/close repeatedly */
  File current_open_index_fd= -1; /**< to not open/close repeatedly */
  FILE *write_file;
  enum ha_extra_function extra_command;
  TREE tree;
  struct file_info file_info,*curr_file_info;
  my_bool entry_has_short_header[]= {0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0};

  DBUG_ENTER("mi_examine_log");

  if ((log_file=my_open(mi_exl->log_filename,O_RDONLY,MYF(MY_WME))) < 0)
    DBUG_RETURN(1);
  write_file=0;
  if (mi_exl->write_filename)
  {
    if (!(write_file=my_fopen(mi_exl->write_filename,O_WRONLY,MYF(MY_WME))))
    {
      my_close(log_file,MYF(0));
      DBUG_RETURN(1);
    }
  }

  init_io_cache(&cache,log_file,0,READ_CACHE,mi_exl->start_offset,0,MYF(0));
  bzero((gptr) mi_exl->com_count,sizeof(mi_exl->com_count));
  init_tree(&tree,0,0,sizeof(file_info),(qsort_cmp2) file_info_compare,1,
	    (tree_element_free) file_info_free, NULL);
  VOID(init_key_cache(dflt_key_cache,KEY_CACHE_BLOCK_SIZE,KEY_CACHE_SIZE,
                      0, 0));

  files_open=0; access_time=0;
  current_open_index_name[0]= 0;
  while (access_time++ != mi_exl->number_of_commands &&
	 !my_b_read(&cache,(byte*) head,9))
  {
    isamlog_filepos=my_b_tell(&cache)-9L;
    command=(uint) head[0];
    file_info.filenr= mi_uint2korr(head+1);
    isamlog_process=file_info.process=(long) mi_uint4korr(head+3);
    if (!mi_exl->opt_processes)
      file_info.process=0;
    result= entry_has_short_header[command] ? 0: mi_uint2korr(head+7);
    if ((curr_file_info=(struct file_info*) tree_search(&tree, &file_info,
							tree.custom_arg)))
    {
      curr_file_info->accessed=access_time;
      if (mi_exl->update && curr_file_info->used && curr_file_info->closed)
      {
	if (reopen_closed_file(&tree,curr_file_info))
	{
	  command=sizeof(mi_exl->com_count)/sizeof(mi_exl->com_count[0][0])/3;
	  result=0;
	  goto com_err;
	}
        mi_exl->re_open_count++;
      }
    }
    DBUG_PRINT("info",("command: %u curr_file_info: 0x%lx used: %u",
                       command, (ulong)curr_file_info,
                       curr_file_info ? curr_file_info->used : 0));
    /*
      We update our statistic (how many commands issued, per command type),
      if this is a valid command about a file we want to include.
      There are two commands for which decision must be postponed, because
      curr_file_info is meaningless for them.
    */
    if ((command <
         sizeof(mi_exl->com_count)/sizeof(mi_exl->com_count[0][0])/3) &&
        (!mi_exl->table_selection_hook ||
         (curr_file_info && curr_file_info->used)) &&
        (((enum myisam_log_commands) command) != MI_LOG_OPEN) &&
        (((enum myisam_log_commands) command) !=
         MI_LOG_WRITE_BYTES_MYI_FROM_KEY_CACHE))
    {
      mi_exl->com_count[command][0]++;
      if (result)
        mi_exl->com_count[command][1]++;
    }
    switch ((enum myisam_log_commands) command) {
    case MI_LOG_OPEN:
      if (curr_file_info)
	printf("\nWarning: %s is opened with same process and filenumber\nMaybe you should use the -P option ?\n",
	       curr_file_info->show_name);
      if (my_b_read(&cache,(byte*) head,2))
	goto err;
      file_info.name=0;
      file_info.show_name=0;
      file_info.record=0;
      if (read_string(&cache,(gptr*) &file_info.name,
		      (uint) mi_uint2korr(head)))
	goto err;
      {
	uint i;
	char *pos,*to;

	/* Fix if old DOS files to new format */
	for (pos=file_info.name; (pos=strchr(pos,'\\')) ; pos++)
	  *pos= '/';

	pos=file_info.name;
	for (i=0 ; i < mi_exl->prefix_remove ; i++)
	{
	  char *next;
	  if (!(next=strchr(pos,'/')))
	    break;
	  pos=next+1;
	}
	to=isam_file_name;
	if (mi_exl->filepath)
	  to=convert_dirname(isam_file_name,mi_exl->filepath,NullS);
	strmov(to,pos);
	fn_ext(isam_file_name)[0]=0;	/* Remove extension */
      }
      open_param.name=file_info.name;
      open_param.max_id=0;
      VOID(tree_walk(&tree,(tree_walk_action) test_if_open,(void*) &open_param,
		     left_root_right));
      file_info.id=open_param.max_id+1;
      /*
       * In the line below +10 is added to accommodate '<' and '>' chars
       * plus '\0' at the end, so that there is place for 7 digits.
       * It is improbable that the same table can have that many entries in
       * the table cache.
       * The additional space is needed for the sprintf commands two lines
       * below.
       */ 
      file_info.show_name=my_memdup(isam_file_name,
				    (uint) strlen(isam_file_name)+10,
				    MYF(MY_WME));
      if (file_info.id > 1)
	sprintf(strend(file_info.show_name),"<%d>",file_info.id);
      file_info.closed=1;
      file_info.accessed=access_time;
      file_info.used= !mi_exl->table_selection_hook ||
        ((*(mi_exl->table_selection_hook))(isam_file_name));
      if (mi_exl->update && file_info.used)
      {
	if (files_open >= mi_exl->max_files)
	{
	  if (close_some_file(&tree))
	    goto com_err;
	  files_open--;
	}
	if (!(file_info.isam= mi_open(isam_file_name, O_RDWR,
                                      HA_OPEN_FOR_REPAIR | HA_OPEN_WAIT_IF_LOCKED)))
	  goto com_err;
	if (!(file_info.record=my_malloc(file_info.isam->s->base.reclength,
					 MYF(MY_WME))))
	  goto end;
	files_open++;
	file_info.closed=0;
      }
      VOID(tree_insert(&tree, (gptr) &file_info, 0, tree.custom_arg));
      if (file_info.used)
      {
	if (mi_exl->verbose && !mi_exl->record_pos_file)
	  printf_log(mi_exl->verbose,
                     isamlog_process, isamlog_filepos,
                     "%s: open -> %d",file_info.show_name, file_info.filenr);
	mi_exl->com_count[command][0]++;
        /* given how we log MI_LOG_OPEN, "result" is always 0 here */
	if (result)
	  mi_exl->com_count[command][1]++;
      }
      break;
    case MI_LOG_CLOSE:
      if (mi_exl->verbose && !mi_exl->record_pos_file &&
	  (!mi_exl->table_selection_hook ||
           (curr_file_info && curr_file_info->used)))
	printf_log(mi_exl->verbose,
                   isamlog_process, isamlog_filepos,
                   "%s: %s -> %d",FILENAME(curr_file_info),
                   mi_log_command_name[command],result);
      if (curr_file_info)
      {
	if (!curr_file_info->closed)
	  files_open--;
	VOID(tree_delete(&tree, (gptr) curr_file_info, 0, tree.custom_arg));
      }
      break;
    case MI_LOG_EXTRA:
      if (my_b_read(&cache,(byte*) head,1))
	goto err;
      extra_command=(enum ha_extra_function) head[0];
      if (mi_exl->verbose && !mi_exl->record_pos_file &&
	  (!mi_exl->table_selection_hook ||
           (curr_file_info && curr_file_info->used)))
	printf_log(mi_exl->verbose,
                   isamlog_process, isamlog_filepos,
                   "%s: %s(%d) -> %d",FILENAME(curr_file_info),
		   mi_log_command_name[command], (int) extra_command,result);
      if (mi_exl->update && curr_file_info && !curr_file_info->closed)
      {
	if (mi_extra(curr_file_info->isam, extra_command, 0) != (int) result)
	{
	  fflush(stdout);
	  VOID(fprintf(stderr,
		       "Warning: error %d, expected %d on command %s at %s\n",
		       my_errno,result,mi_log_command_name[command],
		       llstr(isamlog_filepos,llbuff)));
	  fflush(stderr);
	}
      }
      break;
    case MI_LOG_DELETE:
      if (my_b_read(&cache,(byte*) head,8))
	goto err;
      filepos=mi_sizekorr(head);
      if (mi_exl->verbose && (!mi_exl->record_pos_file ||
		      ((mi_exl->record_pos == filepos ||
			mi_exl->record_pos == NO_FILEPOS) &&
		       !cmp_filename(curr_file_info,mi_exl->record_pos_file))) &&
	  (!mi_exl->table_selection_hook ||
           (curr_file_info && curr_file_info->used)))
	printf_log(mi_exl->verbose,
                   isamlog_process, isamlog_filepos,
                   "%s: %s at %ld -> %d",FILENAME(curr_file_info),
		   mi_log_command_name[command],(long) filepos,result);
      if (mi_exl->update && curr_file_info && !curr_file_info->closed)
      {
	if (mi_rrnd(curr_file_info->isam,curr_file_info->record,filepos))
	{
	  if (!mi_exl->recover)
	    goto com_err;
	  if (mi_exl->verbose)
	    printf_log(mi_exl->verbose,
                       isamlog_process, isamlog_filepos,
                       "error: Didn't find row to delete with mi_rrnd");
	  mi_exl->com_count[command][2]++;		/* Mark error */
	}
	mi_result=mi_delete(curr_file_info->isam,curr_file_info->record);
	if ((mi_result == 0 && result) ||
	    (mi_result && (uint) my_errno != result))
	{
	  if (!mi_exl->recover)
	    goto com_err;
	  if (mi_result)
	    mi_exl->com_count[command][2]++;		/* Mark error */
	  if (mi_exl->verbose)
	    printf_log(mi_exl->verbose,
                       isamlog_process, isamlog_filepos,
                       "error: Got result %d from mi_delete instead of %d",
		       mi_result, result);
	}
      }
      break;
    case MI_LOG_WRITE:
    case MI_LOG_UPDATE:
      if (my_b_read(&cache,(byte*) head,12))
	goto err;
      filepos=mi_sizekorr(head);
      length=mi_uint4korr(head+8);
      buff=0;
      if (read_string(&cache,&buff,(uint) length))
	goto err;
      if ((!mi_exl->record_pos_file ||
	  ((mi_exl->record_pos == filepos || mi_exl->record_pos == NO_FILEPOS) &&
	   !cmp_filename(curr_file_info,mi_exl->record_pos_file))) &&
	  (!mi_exl->table_selection_hook ||
           (curr_file_info && curr_file_info->used)))
      {
	if (write_file &&
	    (my_fwrite(write_file,buff,length,MYF(MY_WAIT_IF_FULL | MY_NABP))))
	  goto end;
	if (mi_exl->verbose)
	  printf_log(mi_exl->verbose,
                     isamlog_process, isamlog_filepos,
                     "%s: %s at %s, length=%lu -> %d",
		     FILENAME(curr_file_info),
		     mi_log_command_name[command],
		     llstr(filepos,llbuff),length,result);
      }
      if (mi_exl->update && curr_file_info && !curr_file_info->closed)
      {
	if (curr_file_info->isam->s->base.blobs)
	  fix_blob_pointers(curr_file_info->isam,buff);
	if ((enum myisam_log_commands) command == MI_LOG_UPDATE)
	{
	  if (mi_rrnd(curr_file_info->isam,curr_file_info->record,filepos))
	  {
	    if (!mi_exl->recover)
	    {
	      result=0;
	      goto com_err;
	    }
	    if (mi_exl->verbose)
	      printf_log(mi_exl->verbose,
                         isamlog_process, isamlog_filepos,
                         "error: Didn't find row to update with mi_rrnd");
	    if (mi_exl->recover == 1 || result ||
		find_record_with_key(curr_file_info,buff))
	    {
	      mi_exl->com_count[command][2]++;		/* Mark error */
	      break;
	    }
	  }
	  mi_result=mi_update(curr_file_info->isam,curr_file_info->record,
			      buff);
	  if ((mi_result == 0 && result) ||
	      (mi_result && (uint) my_errno != result))
	  {
	    if (!mi_exl->recover)
	      goto com_err;
	    if (mi_exl->verbose)
	      printf_log(mi_exl->verbose,
                         isamlog_process, isamlog_filepos,
                         "error: Got result %d from mi_update instead of %d",
			 mi_result, result);
	    if (mi_result)
	      mi_exl->com_count[command][2]++;		/* Mark error */
	  }
	}
	else
	{
	  mi_result=mi_write(curr_file_info->isam,buff);
	  if ((mi_result == 0 && result) ||
	      (mi_result && (uint) my_errno != result))
	  {
	    if (!mi_exl->recover)
	      goto com_err;
	    if (mi_exl->verbose)
	      printf_log(mi_exl->verbose,
                         isamlog_process, isamlog_filepos,
                         "error: Got result %d from mi_write instead of %d",
			 mi_result, result);
	    if (mi_result)
	      mi_exl->com_count[command][2]++;		/* Mark error */
	  }
	  if (!mi_exl->recover && filepos != curr_file_info->isam->lastpos)
	  {
	    printf("error: Wrote at position: %s, should have been %s",
		   llstr(curr_file_info->isam->lastpos,llbuff),
		   llstr(filepos,llbuff2));
	    goto end;
	  }
	}
      }
      my_free(buff,MYF(0));
      break;
    case MI_LOG_WRITE_BYTES_MYI:
    case MI_LOG_WRITE_BYTES_MYD:
      {
        /* unpack variable-length filepos and length */
        char *ptr;
        filepos= mi_uint4korr(head+3);
        if (filepos == UINT_MAX32)
        {
          /*
            'filepos' is stored in bytes 7-14 (8 bytes); 'head' already
            contains 9 bytes (0-8), so we need to read 6 more bytes.
            Plus 2 for 'length'.
          */
          if (my_b_read(&cache, (byte*) head + 9, 6 + 2))
            goto err;
          filepos= mi_uint8korr(head + 7);
          ptr= head + 15;
        }
        else
          ptr= head + 7;
        length= mi_uint2korr(ptr);
        if (length == UINT_MAX16)
        {
          /* we need to read 4 more bytes to know 'length' */
          if (my_b_read(&cache, (byte*) ptr + 2, 4))
            goto err;
          length= mi_uint4korr(ptr + 2);
        }
      }
      buff=0;
      if (read_string(&cache,&buff,(uint) length))
        goto err;
      if ((!mi_exl->record_pos_file ||
           ((mi_exl->record_pos == filepos ||
             mi_exl->record_pos == NO_FILEPOS) &&
            !cmp_filename(curr_file_info,mi_exl->record_pos_file))) &&
          (!mi_exl->table_selection_hook ||
           (curr_file_info && curr_file_info->used)))
      {
        if (write_file &&
            (my_fwrite(write_file,buff,length,MYF(MY_WAIT_IF_FULL | MY_NABP))))
          goto end;
        if (mi_exl->verbose)
          printf_log(mi_exl->verbose,
                     isamlog_process, isamlog_filepos,
                     "%s: %s at %s, length=%lu -> %d",
                     FILENAME(curr_file_info),
                     mi_log_command_name[command], llstr(filepos,llbuff),length,result);
      }
      if (mi_exl->update && curr_file_info && !curr_file_info->closed)
      {
        update_index_on_close= FALSE;
        if (my_pwrite((command == MI_LOG_WRITE_BYTES_MYI) ?
                      curr_file_info->isam->s->kfile :
                      curr_file_info->isam->dfile,
                      buff,length,filepos,MYF(MY_NABP)))
          goto com_err;
      }
      my_free(buff,MYF(0));
      break;
    case MI_LOG_WRITE_BYTES_MYI_FROM_KEY_CACHE:
      /* We are given a file name and data to write to it */
      if (my_b_read(&cache,(byte*) head,14))
	goto err;
      filepos=mi_sizekorr(head);
      length=mi_uint4korr(head+8);
      uint name_len= mi_uint2korr(head+12);
      gptr name_buff= NULL;
      buff= NULL;
      if (read_string(&cache,&name_buff,(uint) name_len) ||
          read_string(&cache,&buff,(uint) length))
	goto err;
      if ((!mi_exl->record_pos_file ||
	  ((mi_exl->record_pos == filepos || mi_exl->record_pos == NO_FILEPOS) &&
	   !strcmp(name_buff, mi_exl->record_pos_file))) &&
          !mi_exl->table_selection_hook)
      {
	if (write_file &&
	    (my_fwrite(write_file,buff,length,MYF(MY_WAIT_IF_FULL | MY_NABP))))
	  goto end;
	if (mi_exl->verbose)
	  printf_log(mi_exl->verbose,
                     isamlog_process, isamlog_filepos,
                     "%s: %s at %s, length=%lu -> %d",
		     name_buff,
		     mi_log_command_name[command], llstr(filepos,llbuff),length,result);
      }
      if (mi_exl->update)
      {
        /* here we use isam_file_name as a temporary buffer */
        fn_format(isam_file_name, name_buff, "",
                  MI_NAME_IEXT, MY_UNPACK_FILENAME);
        /*
          If this is the same file as the last write-from-key-cache, we re-use
          its open descriptor (saves open/close/open/close if several key
          blocks were flushed successively).
        */
        if (strcmp(current_open_index_name, isam_file_name))
        {
          if (((current_open_index_fd >= 0) &&
               my_close(current_open_index_fd, MYF(MY_WME))) ||
              ((current_open_index_fd=
                my_open(isam_file_name, O_WRONLY, MYF(MY_WME))) < 0))
            goto com_err;
          bmove(current_open_index_name, isam_file_name,
                sizeof(current_open_index_name));
        } /* otherwise, it's the same index file, already opened */
        update_index_on_close= FALSE;
        if (my_pwrite(current_open_index_fd,buff,length,filepos,MYF(MY_NABP)))
          goto com_err;
      }
      if (!mi_exl->table_selection_hook ||
          (*(mi_exl->table_selection_hook))(name_buff))
      {
        mi_exl->com_count[command][0]++;
        if (result)
          mi_exl->com_count[command][1]++;
      }
      my_free(name_buff,MYF(0));
      my_free(buff,MYF(0));
      break;
    case MI_LOG_LOCK:
      if (my_b_read(&cache,(byte*) head,sizeof(lock_command)))
	goto err;
      memcpy_fixed(&lock_command,head,sizeof(lock_command));
      if (mi_exl->verbose && !mi_exl->record_pos_file &&
	  (!mi_exl->table_selection_hook ||
           (curr_file_info && curr_file_info->used)))
	printf_log(mi_exl->verbose,
                   isamlog_process, isamlog_filepos,
                   "%s: %s(%d) -> %d",FILENAME(curr_file_info),
		   mi_log_command_name[command],lock_command,result);
      if (mi_exl->update && curr_file_info && !curr_file_info->closed)
      {
	if (mi_lock_database(curr_file_info->isam,lock_command) !=
	    (int) result)
	  goto com_err;
      }
      break;
    case MI_LOG_DELETE_ALL:
      if (mi_exl->verbose && !mi_exl->record_pos_file &&
	  (!mi_exl->table_selection_hook ||
           (curr_file_info && curr_file_info->used)))
	printf_log(mi_exl->verbose,
                   isamlog_process, isamlog_filepos,
                   "%s: %s -> %d",FILENAME(curr_file_info),
		   mi_log_command_name[command],result);
      if (mi_exl->update && curr_file_info && !curr_file_info->closed)
      {
	if (mi_delete_all_rows(curr_file_info->isam) != (int) result)
	  goto com_err;
      }
      break;
    default:
      fflush(stdout);
      VOID(fprintf(stderr,
		   "Error: found unknown command %d in logfile, aborted\n",
		   command));
      fflush(stderr);
      goto end;
    }
  }
  end_key_cache(dflt_key_cache,1);
  delete_tree(&tree);
  VOID(end_io_cache(&cache));
  VOID(my_close(log_file,MYF(0)));
  if (write_file && my_fclose(write_file,MYF(MY_WME)))
    DBUG_RETURN(1);
  if (current_open_index_fd >= 0)
    my_close(current_open_index_fd, MYF(MY_WME));
  DBUG_RETURN(0);

 err:
  fflush(stdout);
  VOID(fprintf(stderr,"Got error %d when reading from logfile\n",my_errno));
  fflush(stderr);
  goto end;
 com_err:
  fflush(stdout);
  VOID(fprintf(stderr,"Got error %d, expected %d on command %s at %s\n",
	       my_errno,result,mi_log_command_name[command],
	       llstr(isamlog_filepos,llbuff)));
  fflush(stderr);
 end:
  end_key_cache(dflt_key_cache, 1);
  delete_tree(&tree);
  VOID(end_io_cache(&cache));
  VOID(my_close(log_file,MYF(0)));
  if (write_file)
    VOID(my_fclose(write_file,MYF(MY_WME)));
  if (current_open_index_fd >= 0)
    my_close(current_open_index_fd, MYF(MY_WME));
  DBUG_RETURN(1);
}
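
/*
  Illustration only, not part of the changeset: a minimal sketch of the
  variable-length "filepos, then length" encoding unpacked in the
  MI_LOG_WRITE_BYTES_MYI / MI_LOG_WRITE_BYTES_MYD case above. A 4-byte
  position equal to 0xFFFFFFFF announces an 8-byte position; a 2-byte length
  equal to 0xFFFF announces a 4-byte length. The names be16(), be32(), be64()
  and decode_pos_and_length() are hypothetical; the readers assume
  high-byte-first storage as a stand-in for the mi_uint*korr macros, and the
  buffer is assumed to start at the position field.
*/

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High-byte-first readers (assumed stand-ins for mi_uint2korr and friends) */
static uint32_t be16(const unsigned char *p)
{
  return ((uint32_t) p[0] << 8) | (uint32_t) p[1];
}

static uint32_t be32(const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
         ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

static uint64_t be64(const unsigned char *p)
{
  return ((uint64_t) be32(p) << 32) | (uint64_t) be32(p + 4);
}

/*
  Decodes the position and the length starting at 'buf'.
  Returns the number of bytes consumed (6, 10, 14 or 18).
*/
static size_t decode_pos_and_length(const unsigned char *buf,
                                    uint64_t *filepos, uint32_t *length)
{
  const unsigned char *ptr= buf;
  assert(buf && filepos && length);

  *filepos= be32(ptr);
  ptr+= 4;
  if (*filepos == UINT32_MAX)           /* escape: real position follows */
  {
    *filepos= be64(ptr);
    ptr+= 8;
  }
  *length= be16(ptr);
  ptr+= 2;
  if (*length == UINT16_MAX)            /* escape: real length follows */
  {
    *length= be32(ptr);
    ptr+= 4;
  }
  return (size_t) (ptr - buf);
}
```

/*
  The short form costs 6 bytes per logged write and the long form 18, which
  is presumably why the escape values are used instead of always storing
  8 + 4 bytes.
*/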


static int read_string(IO_CACHE *file, register gptr *to, register uint length)
{
  DBUG_ENTER("read_string");

  if (*to)
    my_free((gptr) *to,MYF(0));
  if (!(*to= (gptr) my_malloc(length+1,MYF(MY_WME))) ||
      my_b_read(file,(byte*) *to,length))
  {
    if (*to)
      my_free(*to,MYF(0));
    *to= 0;
    DBUG_RETURN(1);
  }
  *((char*) *to+length)= '\0';
  DBUG_RETURN (0);
}				/* read_string */


static int file_info_compare(void* cmp_arg __attribute__((unused)),
			     void *a, void *b)
{
  long lint;

  if ((lint=((struct file_info*) a)->process -
       ((struct file_info*) b)->process))
    return lint < 0L ? -1 : 1;
  return ((struct file_info*) a)->filenr - ((struct file_info*) b)->filenr;
}

	/* ARGSUSED */

static int test_if_open (struct file_info *key,
			 element_count count __attribute__((unused)),
			 struct test_if_open_param *param)
{
  if (!strcmp(key->name,param->name) && key->id > param->max_id)
    param->max_id=key->id;
  return 0;
}


static void fix_blob_pointers(MI_INFO *info, byte *record)
{
  byte *pos;
  MI_BLOB *blob,*end;

  pos=record+info->s->base.reclength;
  for (end=info->blobs+info->s->base.blobs, blob= info->blobs;
       blob != end ;
       blob++)
  {
    memcpy_fixed(record+blob->offset+blob->pack_length,&pos,sizeof(char*));
    pos+=_mi_calc_blob_length(blob->pack_length,record+blob->offset);
  }
}

	/* close the file which hasn't been accessed for the longest time */
	/* ARGSUSED */

static int test_when_accessed (struct file_info *key,
			       element_count count __attribute__((unused)),
			       struct st_access_param *access_param)
{
  if (key->accessed < access_param->min_accessed && ! key->closed)
  {
    access_param->min_accessed=key->accessed;
    access_param->found=key;
  }
  return 0;
}


static void file_info_free(struct file_info *fileinfo)
{
  DBUG_ENTER("file_info_free");
  /* The 2 conditions below can be true only if 'update' */
  if (!fileinfo->closed)
    VOID(mi_close_wrap(fileinfo->isam));
  if (fileinfo->record)
    my_free(fileinfo->record,MYF(0));
  my_free(fileinfo->name,MYF(0));
  my_free(fileinfo->show_name,MYF(0));
  DBUG_VOID_RETURN;
}



static int close_some_file(TREE *tree)
{
  struct st_access_param access_param;

  access_param.min_accessed=LONG_MAX;
  access_param.found=0;

  VOID(tree_walk(tree,(tree_walk_action) test_when_accessed,
		 (void*) &access_param,left_root_right));
  if (!access_param.found)
    return 1;			/* No open file that can be closed */
  if (mi_close_wrap(access_param.found->isam))
    return 1;
  access_param.found->closed=1;
  return 0;
}
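
/*
  Illustration only, not part of the changeset: close_some_file() walks the
  tree with test_when_accessed() to pick the open file with the smallest
  access time. The same selection rule over a plain array can be sketched as
  below; struct open_file and pick_file_to_close() are hypothetical names,
  and the array stands in for the TREE of struct file_info.
*/

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct open_file
{
  long accessed;                        /* last access "time" (a counter) */
  int  closed;                          /* non-zero if already closed */
};

/* Returns the least recently accessed open entry, or NULL if none is open */
static struct open_file *pick_file_to_close(struct open_file *files,
                                            size_t count)
{
  long min_accessed= LONG_MAX;
  struct open_file *found= NULL;
  size_t i;

  assert(files || count == 0);
  for (i= 0; i < count; i++)
  {
    if (files[i].accessed < min_accessed && !files[i].closed)
    {
      min_accessed= files[i].accessed;
      found= &files[i];
    }
  }
  return found;
}
```

/*
  A NULL result corresponds to the "return 1" branch of close_some_file():
  every tracked file is already closed, so nothing can be evicted.
*/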


static int reopen_closed_file(TREE *tree, struct file_info *fileinfo)
{
  char name[FN_REFLEN];
  if (close_some_file(tree))
    return 1;				/* No file to close */
  strmov(name,fileinfo->show_name);
  if (fileinfo->id > 1)
    *strrchr(name,'<')='\0';		/* Remove "<id>" */

  if (!(fileinfo->isam= mi_open(name,O_RDWR,
                                HA_OPEN_FOR_REPAIR | HA_OPEN_WAIT_IF_LOCKED)))
    return 1;
  fileinfo->closed=0;
  return 0;
}

	/* Try to find record with unique key */

static int find_record_with_key(struct file_info *file_info, byte *record)
{
  uint key;
  MI_INFO *info=file_info->isam;
  uchar tmp_key[MI_MAX_KEY_BUFF];

  for (key=0 ; key < info->s->base.keys ; key++)
  {
    if (mi_is_key_active(info->s->state.key_map, key) &&
	info->s->keyinfo[key].flag & HA_NOSAME)
    {
      VOID(_mi_make_key(info,key,tmp_key,record,0L));
      return mi_rkey(info,file_info->record,(int) key,(char*) tmp_key,0,
		     HA_READ_KEY_EXACT);
    }
  }
  return 1;
}


/**
   In practice this is only called if verbose>=1. When mi_examine_log() is
   used in the server it is with verbose==0 so this is not called.
*/
static void printf_log(uint verbose, ulong isamlog_process,
                       my_off_t isamlog_filepos, const char *format,...)
{
  char llbuff[21];
  va_list args;
  va_start(args,format);
  if (verbose > 2)
    printf("%9s:",llstr(isamlog_filepos,llbuff));
  if (verbose > 1)
    printf("%5ld ",isamlog_process);	/* Write process number */
  (void) vprintf((char*) format,args);
  putchar('\n');
  va_end(args);
}


static bool cmp_filename(struct file_info *file_info, const char *name)
{
  if (!file_info)
    return 1;
  return strcmp(file_info->name,name) ? 1 : 0;
}


/*
  mi_close() calls mi_state_info_write() if the table is corrupted.
  This can happen for example if the table is from an online backup which made
  a copy of its data file and of only its index header.
  But in that case, if we have executed some MI_LOG_WRITE_BYTES_MYI commands,
  the state in memory is older than the state on disk, so we do not want to
  call mi_state_info_write(), it would cancel what we have just done.
  The solution is to mark the table as "read only" before mi_close(). We have
  no problem with updating the MYISAM_SHARE structure as we are not
  multi-threaded (i.e. nobody uses the share while we are changing its mode in
  mi_close_wrap()). It is also not a problem if this "read only" influences
  next users of this same share, as a backup log contains all index header
  writes logged, and so all next users can skip calling mi_state_info_write().
*/
static int mi_close_wrap(MI_INFO *info)
{
  if (!update_index_on_close)
    info->s->mode= O_RDONLY;
  return mi_close(info);
}
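
/*
  Illustration only, not part of the changeset: the MI_LOG_WRITE_BYTES_*
  commands are replayed above as positional writes (my_pwrite() at the logged
  position). A minimal sketch of why such a physical log is idempotent:
  replaying it yields the same image whether the dirty copy already contained
  none, some, or all of the logged writes. struct write_record and apply_log()
  are hypothetical names, and an in-memory buffer stands in for the table
  file.
*/

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct write_record
{
  size_t      filepos;                  /* where the bytes were written */
  const char *bytes;                    /* what was written */
  size_t      length;                   /* how many bytes */
};

/* Applies all records in log order; returns 1 if a record is out of range */
static int apply_log(char *image, size_t image_size,
                     const struct write_record *log, size_t count)
{
  size_t i;

  assert(image && (log || count == 0));
  for (i= 0; i < count; i++)
  {
    if (log[i].filepos > image_size ||
        log[i].length > image_size - log[i].filepos)
      return 1;
    memcpy(image + log[i].filepos, log[i].bytes, log[i].length);
  }
  return 0;
}
```

/*
  Because each record overwrites a fixed range with fixed bytes, order within
  the log matters but the starting state of those ranges does not.
*/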

--- New file ---
+++ storage/myisam/myisam_backup_driver.cc	07/05/15 18:09:20
/* Copyright (C) 2007 MySQL AB

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
*/

/**
   @file
   @brief Online backup driver for the MyISAM storage engine.

   @see myisam_backup
*/

#define MYSQL_SERVER 1 // need it to have mysql_tmpdir defined
#include "mysql_priv.h"
#include "ha_myisam.h"
#include "myisamdef.h" // to access dfile and kfile
#include "backup/backup_engine.h"
#include "backup/tables.h"         // for build_table_list()
#include <hash.h>

/**
   @brief Online backup driver for the MyISAM storage engine.

   Reference of the Online Backup API:
   http://forge.mysql.com/source/OnlineBackup.

   Here is how the MyISAM online backup works.
   It is online because we dirtily copy the data and index files.
   As the tables maintain a physical idempotent log of changes done to them
   during the copy process, applying this log to the dirty copy yields a clean
   table corresponding to how the original table was when logging ended.

   A condition for this to work is that any update done to a table after the
   copy process started must be present in the log. So if an update is
   proceeding, we must wait for it to go to the log before starting the copy
   process.

   HOW THE BACKUP WORKS

   In Backup::begin(), we instruct all needed tables to do backup
   logging; this does not have to wait for existing updates to complete,
   nor does it stall new updates.

   Then we dirtily copy them in Backup::get_data(). That copy is intensive on
   the hard drive, so can be optionally throttled (via a configurable sleep).

   When the copy process is done with tables, it signals the backup kernel
   that it is ready to lock tables (to create a validity point).
   To not waste its time until the Backup::prelock() request is sent by the
   backup kernel, the copy process starts copying the log.

   Now the Backup::prelock() request comes.
   To finish the backup, we need to synchronize (=read-lock) all tables of the
   backup (thus creating a consistent state across them), stop logging for
   all of them, and unlock tables. This lock can wait for a long time if there
   is a long running update, so we do all the locking work before
   Backup::lock(): Backup::prelock() sets up a separate thread which issues a
   LOCK TABLES READ on our tables; without waiting for completion of that,
   prelock() returns backup::OK which means "I have not completed my
   preparations for locking".

   In Backup::get_data(), the driver monitors the status of the locking
   thread, and when finally that thread has managed to get its locks, we stop
   logging and reply backup::READY.

   So note the difference: this time, we have to wait for all updates to
   finish, and stall new ones.

   Subsequent Backup::get_data() calls, if any, send the final tail of the
   log.

   Backup::lock() comes, it's an empty operation for us.

   Later we get a Backup::unlock() request. That kills the locking thread,
   which thus unlocks tables. And Backup::end() cleans up memory.

   HOW THE RESTORE WORKS

   In Restore::send_data() we receive data which we write to tables (those
   tables have just been created with their correct structure, but no data, by
   the backup kernel). We similarly restore the log.

   In Restore::end(), we apply the log to tables, making them clean.
   If we backed up only the header of the index file (an option), we do an
   index rebuild here.
   Voila, the table is ready to work.

   @todo if an index rebuild is needed, possibly do it at backup time.
*/
namespace myisam_backup {

using backup::byte;
using backup::result_t;
using backup::version_t;
using backup::Table_list;
using backup::Table_ref;
using backup::Buffer;


/**
   An instance of Table represents a table (just a db/name pair) stored in the
   engine and part of the backup job.
*/

struct Table
{
  Table(const Table_ref &tbl);

 protected:
  String m_db, m_name;
  String m_complete_name; ///< concatenation of m_db and m_name
};

Table::Table(const Table_ref &tbl)
{
  /*
    Note: the approach below will break with non-ASCII characters;
    what the driver should be passed is what the engine used to create the
    table, i.e. the output of build_table_filename().
    Just replace "building" with "bûilding" in backup.test to see issues.
    Rafal knows about this issue, upper layer will soon pass escaped name.
  */
  m_complete_name.append("./");
  m_complete_name.append(tbl.db().name());
  m_complete_name.append("/");
  m_complete_name.append(tbl.name());
  m_db.append(tbl.db().name());
  m_name.append(tbl.name());
}


/**
   Backup engine class.
*/

class Engine: public Backup_engine
{
  public:
    Engine()
    {}
    virtual const version_t version() const { return 0; };
    virtual result_t get_backup(const uint32, const Table_list &tables,
                                Backup_driver* &drv);
    virtual result_t get_restore(version_t ver, const uint32,
                                 const Table_list &tables,
                                 Restore_driver* &drv);
    virtual void free()     { delete this; }
};

/*************************
 *
 *  BACKUP FUNCTIONALITY
 *
 *************************/

class Object_backup;


/**
   @brief Handles backup orders received from the backup kernel (implements the API).
*/
class Backup: public Backup_driver
{
public:
  Backup(const Table_list &tables);
  virtual ~Backup();
  /**  @brief estimate total size of backup. @todo improve it */
  virtual size_t    size() { return UNKNOWN_SIZE; };
  /**  @brief estimate size of backup before lock. @todo improve it */
  virtual size_t    init_size() { return UNKNOWN_SIZE; };
  virtual result_t  begin(const size_t);
  virtual result_t  end();
  virtual result_t  get_data(Buffer &buf);
  virtual result_t  prelock();
  virtual result_t  lock();
  virtual result_t  unlock();
  virtual result_t  cancel()
    {
      /* Nothing to do in cancel(); free() will suffice */
      return backup::OK;
    };
  virtual void free() { delete this; };
  void lock_tables_TL_READ_NO_INSERT();

private:
  enum { DUMPING_DATA_INDEX_FILES,
         DUMPING_LOG_FILE_BEFORE_TABLES_ARE_LOCKED,
         DUMPING_LOG_FILE_AFTER_TABLES_ARE_LOCKED,
         DONE, ERROR } state;
  Object_backup  **images; ///< one for the log and one per table
  uint stream; ///< which stream we are currently writing to
  char backup_log_name[FN_REFLEN];
  /**
     @brief All db||table names in a HASH structure.

     Passed to MyISAM functions for them to detect if a table is part of the
     backup (=> should do logging) or not.
  */
  HASH *hash_of_tables;
  /** Locking of tables goes through several states */
  enum { LOCK_NOT_STARTED, LOCK_IN_PROGRESS, LOCK_ACQUIRED, LOCK_ERROR } lock_state;
  /**
     @brief The locking thread (so that we can kill it).

     Creating a validity point is only possible by locking all tables (it is
     the only way to have tables consistent with each other, as we have no
     UNDO log).
     But locking via thr_lock() is blocking.
     So, to have a non-blocking prelock() call, this locking is done in a
     separate thread (named "the locking thread").
  */
  THD *lock_thd;
  pthread_cond_t COND_lock_state; ///< for communication with locking thread
  TABLE_LIST *tables_in_TABLE_LIST_form; ///< for open_and_lock_tables()
  void kill_locking_thread();
  static const ulong bytes_between_sleeps= 10*1024*1024;
  /** @brief after copying bytes_between_sleeps we sleep sleep_time */
  ulong sleep_time;
  ulong bytes_since_last_sleep; ///< how many bytes since we last slept
};
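
/*
  Illustration only, not part of the changeset: the members
  bytes_between_sleeps, sleep_time and bytes_since_last_sleep above describe
  a simple throttle for the copy. A minimal sketch of that accounting is
  given below; struct throttle and throttle_account() are hypothetical names,
  and the ">=" comparison and the reset to zero are assumptions, as the
  actual Backup::get_data() logic is not shown here.
*/

```c
#include <assert.h>

struct throttle
{
  unsigned long bytes_between_sleeps;   /* copy this much, then sleep */
  unsigned long bytes_since_last_sleep; /* copied since the last sleep */
};

/*
  Accounts for 'bytes_copied' more bytes. Returns 1 if the caller should now
  sleep for its configured sleep time, 0 otherwise.
*/
static int throttle_account(struct throttle *t, unsigned long bytes_copied)
{
  assert(t);
  t->bytes_since_last_sleep+= bytes_copied;
  if (t->bytes_since_last_sleep >= t->bytes_between_sleeps)
  {
    t->bytes_since_last_sleep= 0;
    return 1;
  }
  return 0;
}
```

/*
  Keeping the decision separate from the sleep itself makes the accounting
  easy to check without waiting on a real timer.
*/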

/**
   When we send a backup packet to the backup kernel, we prefix it with a code
   which tells which type of file this packet belongs to. Starts at 1 because
   garbage is often zeros and we want to spot it.
*/
enum enum_file_code { DATA_FILE_CODE= 1,
                      WHOLE_INDEX_FILE_CODE, HEADER_INDEX_FILE_CODE,
                      LOG_FILE_CODE };

/** @brief An object to backup; in practice, a table or the log */
class Object_backup
{
public:
  virtual result_t get_data(Buffer &)= 0;
  virtual ~Object_backup() {};
  bool internal_error() { return state == ERROR; }
  virtual result_t end()= 0; ///< cleanups
protected:
  enum { OK, ERROR } state; ///< serves to detect an error during construction
};


/**
   @brief An object to back up is made of one or more such files.

   This class does not open the file, user has to open it.
   This class does not close the file by default, but can do so if requested.
*/
class File_backup
{
public:
  result_t get_data(Buffer &);
  File_backup() : backup_file_size(0) {}
  void init(int fd_arg, my_off_t file_size_arg, enum_file_code file_code_arg)
    { fd= fd_arg; file_size= file_size_arg; file_code= file_code_arg; }
  result_t close_file();
private:
  int fd; ///< file descriptor
  /**
     @brief After backing up that much of the file, we can stop.

     In case of ftruncate() happening to the file, we may even copy less than
     this size.
  */
  my_off_t file_size;
  enum_file_code file_code; ///< code stored at start of each backup block
  my_off_t backup_file_size; ///< how much of the file we already backed up
};


/**
   @brief Handles backing up a single table
*/
class Table_backup: public Table, public Object_backup
{
public:
  Table_backup(const backup::Table_ref &tbl);
  virtual ~Table_backup();
  virtual result_t get_data(Buffer &);
  virtual result_t end(); ///< cleanups
private:
  MI_INFO     *mi_info; ///< MyISAM table structure
  File_backup dfile_backup, kfile_backup;
  enum { DATA_FILE= 1, INDEX_FILE } in_file; ///< which file we are dumping now
};


/**
   @brief Handles backing up the log
*/
class Log_backup: public Object_backup
{
public:
  Log_backup(const char *log_name_arg);
  virtual result_t get_data(Buffer &);
  virtual ~Log_backup();
  virtual result_t end();
private:
  String log_name;
  File_backup log_file_backup;
  bool log_deleted; ///< if we have already deleted the log or not
};


result_t Engine::get_backup(const uint32, const Table_list &tables,
                            Backup_driver* &drv)
{
  Backup *ptr= new myisam_backup::Backup(tables);
  if (unlikely(!ptr))
    return backup::ERROR;
  drv= ptr;
  return backup::OK;
}


Backup::Backup(const Table_list &tables):
  Backup_driver(tables), state(ERROR), images(NULL), stream(1),
  hash_of_tables(NULL), lock_state(LOCK_NOT_STARTED), lock_thd(NULL),
  bytes_since_last_sleep(0)
{
  /*
    Driver is not ready at this point, so state is ERROR.
    This constructor cannot fail. If it could, begin() would have to detect
    it.
  */
}


/**
   When the locking thread is not needed anymore (that is, it is ok to unlock
   the tables), this method kills it.
*/
void Backup::kill_locking_thread()
{
  /*
    If everything worked well, when unlock() calls us we kill the thread and
    so when free() calls us the locking thread is already dead here
    (LOCK_ERROR).
  */
retry:
  pthread_mutex_lock(&THR_LOCK_myisam);
  if (lock_state != LOCK_ERROR) // thread not already dead, kill it
  {
    /*
      If the locking thread had no time to create THD (very unlikely), wait
      for it.
    */
    if (unlikely(!lock_thd))
    {
      pthread_mutex_unlock(&THR_LOCK_myisam);
      sleep(1);
      goto retry;
    }
    /*
      Locking thread had time to create its THD, may be inside locking
      (waiting for others to release locks etc), wake it up and kill it.
      To wake it up we cannot hold THR_LOCK_myisam (awake() can't bear it),
      but we still have to make sure lock_thd is not being deleted at this
      moment. So we lock LOCK_delete and then can unlock THR_LOCK_myisam.
    */
    pthread_mutex_lock(&lock_thd->LOCK_delete);
    pthread_mutex_unlock(&THR_LOCK_myisam);
    /*
      Now we kill it (which will in particular work if the thread is waiting
      for some table locks).
    */
    lock_thd->awake(THD::KILL_CONNECTION);
    pthread_mutex_unlock(&lock_thd->LOCK_delete);
    pthread_mutex_lock(&THR_LOCK_myisam);
    /* And we wait for the thread to inform of its death */
    while (lock_state != LOCK_ERROR)
      pthread_cond_wait(&COND_lock_state, &THR_LOCK_myisam);
  }
  pthread_mutex_unlock(&THR_LOCK_myisam);
}


/**
   This destructor is only called by the class' free().
   It cleans up any leftover the driver could have. It is safe to call it at
   any point. In a normal (no error) situation, the hash freeing is the only
   operation done here, all the rest should already have been done by earlier
   stages.
*/
Backup::~Backup()
{
  /* If we had already started backup logging, we must dirtily stop it */
  mi_backup_stop_logging_for_tables(TRUE);
  if (images)
  {
    for (uint n= 0; n < (1 + m_tables.count()); ++n)
      delete images[n];
    delete [] images; // allocated with new[] in begin()
  }
  if (hash_of_tables)
  {
    hash_free(hash_of_tables);
    delete hash_of_tables;
    hash_of_tables= NULL;
  }
  kill_locking_thread();
}


/** @brief Usual parameter to hash_init() */
static ::byte
*backup_get_table_from_hash_key(const ::byte *lsc, uint *length,
                                my_bool not_used __attribute__ ((unused)))
{
  const LEX_STRING *ls= reinterpret_cast<const LEX_STRING *>(lsc);
  *length= ls->length;
  return static_cast< ::byte *>(ls->str);
}


/** @brief Usual parameter to hash_init() */
static void backup_free_hash_key(void *lsv)
{
  my_free(reinterpret_cast<gptr>(lsv), MYF(0));
}


#define SET_STATE_TO_ERROR_AND_DBUG_RETURN {                                 \
    state= ERROR;                                                       \
    DBUG_PRINT("error",("driver got an error at %s:%d",__FILE__,__LINE__)); \
    DBUG_RETURN(backup::ERROR); }

/* use this one only in constructors */
#define SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN {                                 \
    state= ERROR;                                                       \
    DBUG_PRINT("error",("driver got an error at %s:%d",__FILE__,__LINE__)); \
    DBUG_VOID_RETURN; }


/**
   @brief Sets MyISAM in a state ready for us to copy.

   I.e. builds hash_of_tables, starts MyISAM logging.
*/
result_t Backup::begin(const size_t)
{
  DBUG_ENTER("myisam/backup/Backup::begin");
  DBUG_PRINT("info",("%d tables", m_tables.count()));

  /*
    per the API, all significant allocations (large mem, opening files) must
    not be in the constructor but in begin() or later.
  */
  images= new Object_backup*[1+m_tables.count()];
  if (unlikely(!images))
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  bzero(images, (1+m_tables.count())*sizeof(*images));

  DBUG_ASSERT(!hash_of_tables); // no double call to begin()
  hash_of_tables= new HASH;
  if (!hash_of_tables ||
      hash_init(hash_of_tables, &my_charset_bin,
                m_tables.count(), 0, 0,
                backup_get_table_from_hash_key,
                backup_free_hash_key, 0))
  {
    delete hash_of_tables;
    hash_of_tables= NULL;
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  }
  /* Build the hash of tables for the MyISAM layer (mi_backup_log.c etc) */
  for (uint n=0 ; n < m_tables.count() ; n++ )
  {
    char unique_file_name[FN_REFLEN], *str;
    uint str_len;
    LEX_STRING *hash_key;
    my_realpath(unique_file_name,
                fn_format(unique_file_name,m_tables[n].name().ptr(),
                          m_tables[n].db().name().ptr(),
                          MI_NAME_IEXT,
                          MY_UNPACK_FILENAME), MYF(MY_WME));
    str_len= strlen(unique_file_name);
    /* On failure my_multi_malloc() returns NULL and leaves hash_key unset */
    if (!my_multi_malloc(MYF(MY_WME),
                         &hash_key, sizeof(*hash_key),
                         &str, str_len, NullS))
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    memcpy(str, unique_file_name, str_len);
    hash_key->length= str_len;
    hash_key->str= str;
    if (my_hash_insert(hash_of_tables,
                       reinterpret_cast< ::byte *>(hash_key)))
    {
      my_free(reinterpret_cast<gptr>(hash_key), MYF(0));
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    }
    DBUG_PRINT("info",("table '%.*s' inserted in hash",
                       hash_key->length, hash_key->str));
  }

  {
    THD *thd= current_thd;
    /*
      If tmpdir is in RAM (/dev/shm etc), we may exhaust it if our log is big
    */
    my_snprintf(backup_log_name, sizeof(backup_log_name),
                "%s/%s%lx_%lx_%x-backuplog", mysql_tmpdir,
                tmp_file_prefix, current_pid, thd->thread_id,
                thd->tmp_table++); // it's not a tmp table but what...
    unpack_filename(backup_log_name, backup_log_name);
#if 0
    /*
      Of course driver shouldn't pollute thd's state like this, it's just for
      debugging now.
    */
    thd->proc_info="MyISAM dump going to start";
#endif
  }

  {
    /*
      Don't worry, this getenv is just a temporary way for me to test both
      logging and no-logging of index pages.
    */
    char *env_arg= getenv("BACKUP_NO_INDEX");
    /* by default we log index pages */
    index_pages_in_backup_log= !(env_arg && atoi(env_arg));
    /*
      Again this getenv is just a temporary way for me to test different
      values of sleeps without adding a real variable in set_var.cc
      (not worth it, as sleeps, if they are implemented in the final version,
      will not be in this driver anyway, but rather in the backup kernel).
    */
    env_arg= getenv("BACKUP_SLEEP");
    /*
      By default we don't sleep at all; however, 500 ms every 10MB gives a
      low penalty on clients, so it can be a good choice.
    */
    sleep_time= env_arg ? atoi(env_arg) : 0;
  }

  if (mi_backup_start_logging_for_tables(hash_of_tables, backup_log_name))
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;

  state= DUMPING_DATA_INDEX_FILES;
  DBUG_RETURN(backup::OK);
}
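
The BACKUP_SLEEP throttle read in begin() above, and applied later in get_data(), can be sketched in isolation. This is a standalone illustration, not driver code: the Throttle struct, its account() method and the sleeps_done counter are hypothetical names, and the threshold passed to the constructor stands in for bytes_between_sleeps, whose real value is not shown in this part of the patch.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical standalone model of the driver's byte-count throttle.
struct Throttle
{
  std::size_t bytes_between_sleeps;   // assumed threshold (e.g. 10MB)
  unsigned    sleep_ms;               // 0 disables throttling
  std::size_t bytes_since_last_sleep;
  unsigned    sleeps_done;            // for observation only

  Throttle(std::size_t threshold, unsigned ms)
    : bytes_between_sleeps(threshold), sleep_ms(ms),
      bytes_since_last_sleep(0), sleeps_done(0)
  {
    assert(threshold > 0);
  }

  // Account for one buffer just sent; sleep once the threshold is exceeded.
  void account(std::size_t buf_size)
  {
    if (!sleep_ms)
      return;                         // like "if (sleep_time)" in get_data()
    bytes_since_last_sleep+= buf_size;
    if (bytes_since_last_sleep > bytes_between_sleeps)
    {
      std::this_thread::sleep_for(std::chrono::milliseconds(sleep_ms));
      bytes_since_last_sleep= 0;
      ++sleeps_done;
    }
  }
};
```

Counting bytes rather than calls keeps the pause rate proportional to the I/O actually done, whatever buffer size the backup kernel hands to the driver.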


/**
   If some error happened, end() is not called but free() is. So we do all
   cleanup in the destructor, and nothing here.
*/
result_t Backup::end()
{
  DBUG_ENTER("myisam/backup/Backup::end");
  DBUG_RETURN(backup::OK);
}


/**
   @brief Actually sends backup data to the backup kernel.
*/
result_t Backup::get_data(Buffer &buf)
{
  result_t ret;
  DBUG_ENTER("myisam/backup/Backup::get_data");
  DBUG_PRINT("enter",("stream %d",stream));

  /* we are currently on stream 'stream' */
  buf.table_no= stream;

  /*
    Rafal and I agreed that one single ERROR from the driver will cause the
    upper layer to not call the driver anymore except for free().
  */
  DBUG_ASSERT(state != ERROR);
  DBUG_ASSERT(buf.data != NULL); // to check that caller gave room

  if (state == DONE)
  {
    /*
      We never come here, because after returning from the call where we sent
      the last piece of the last stream (when we set our internal state to
      DONE), all streams were closed, so the upper layer wouldn't call us
      again. At least it was so during testing. But if it calls us, we do all
      that the API expects us to do:
    */
    buf.size= buf.table_no= 0;
    buf.last= TRUE;
    DBUG_RETURN(backup::DONE);
  }

  buf.last= TRUE; // by default

  Object_backup  *image= images[stream];
  if (unlikely(!image))
  {
    /*
      Let's create it.
      Table 0 will be image 1 on stream 1. Table N will be image N+1 on stream
      N+1. Log will be image 0 on stream 0.
    */
    if (stream >= 1)
      image= new Table_backup(m_tables[stream-1]);
    else
      image= new Log_backup(backup_log_name);
    images[stream]= image;
    if (!image || image->internal_error())
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  }

  if ((ret= image->get_data(buf)) == backup::ERROR)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;

  if (sleep_time)
  {
    bytes_since_last_sleep+= buf.size;
    /* sched_yield() is not as flexible (higher penalty) as sleep() */
    if (bytes_since_last_sleep > bytes_between_sleeps)
    {
      my_sleep(sleep_time*1000UL);
      bytes_since_last_sleep= 0;
    }
  }

  if (state == DUMPING_LOG_FILE_BEFORE_TABLES_ARE_LOCKED)
  {
    DBUG_ASSERT(stream == 0);
    /*
      We are sending the log; even if we reached its EOF, some more may be
      appended to it before prelock() ends, so this is not the stream's end.
    */
    buf.last= FALSE;
    /*
      API docs say we should return READY, but Rafal says OK is better (one
      READY to signal end of initial phase; then OKs; one READY to signal end
      of prelock(); then OKs).
    */
    if (lock_state == LOCK_NOT_STARTED)
      DBUG_RETURN(backup::OK);
    /* Let's see if the locking thread has finished locking all tables */
    pthread_mutex_lock(&THR_LOCK_myisam);
    if (lock_state == LOCK_IN_PROGRESS) // not yet
    {
      pthread_mutex_unlock(&THR_LOCK_myisam);
      DBUG_RETURN(backup::OK);
    }
    if (lock_state != LOCK_ACQUIRED) // it failed, so do we
    {
      pthread_mutex_unlock(&THR_LOCK_myisam);
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    }
    DBUG_PRINT("info",("locking thread acquired READ locks on tables"));
    pthread_mutex_unlock(&THR_LOCK_myisam);
    if (mi_backup_stop_logging_for_tables(FALSE))
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    state= DUMPING_LOG_FILE_AFTER_TABLES_ARE_LOCKED;
    /* signal "end of prepare-for-lock, ready for lock()" */
    DBUG_RETURN(backup::READY);
  }
  else if (buf.last)
  {
    /*
      we are sending the last chunk of the image, next call will be about the
      next image:
    */
    if (image->end())
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    delete image;
    images[stream++]= NULL;
    if (stream > m_tables.count())
    {
      stream= 0; // send the log on stream 0
      state= DUMPING_LOG_FILE_BEFORE_TABLES_ARE_LOCKED;
      ret= backup::READY; // end of initial phase
    }
    else if (stream == 1) // log done
    {
      state= DONE;
      /* destructor is safe, but save its useless work */
      delete [] images; // allocated with new[] in begin()
      images= NULL;
    }
  }

  DBUG_RETURN(ret);
}
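
The order in which get_data() walks the streams (tables on streams 1..N first, then the log on stream 0) can be modelled as below. This is only a sketch of the stream++ and wrap-around logic above: stream_order() is a hypothetical helper, and the starting value stream == 1 is an assumption, as the constructor which initializes it is not shown here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the stream numbers in the order their
// images are fully sent, for a backup of table_count tables.
std::vector<unsigned> stream_order(unsigned table_count)
{
  assert(table_count >= 1);
  std::vector<unsigned> order;
  unsigned stream= 1;          // assumption: the driver starts at stream 1
  bool log_phase= false;       // true once all tables have been sent
  for (;;)
  {
    order.push_back(stream);   // last chunk of this image was sent
    ++stream;                  // like "images[stream++]= NULL"
    if (!log_phase && stream > table_count)
    {
      stream= 0;               // send the log on stream 0
      log_phase= true;
    }
    else if (log_phase)        // stream == 1 again: log done
      break;
  }
  return order;
}
```

For two tables this yields streams 1, 2, 0: the log goes last because it must contain every change made while the tables were being copied.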


/**
   @brief Creates a validity point by locking all tables.

   This is the only job of the locking thread: call this function which locks
   tables, then wait for being killed (which will unlock tables).

   @todo make a function which does all needed initializations, for those
   threads (like this thread, the slave SQL thread, the NDB binlog thread)
   which open/lock tables without going through the usual client thread calls
   (do_command()/dispatch_command()/mysql_parse()/mysql_execute_command()
   etc).
   @todo use a method which does not open closed tables. This will be needed
   when backing up lots of tables (more than the limit of open file
   descriptors). Monty suggests calling thr_multi_lock()+mi_open() on already
   open tables (so, like open_and_lock_tables() on them), plus, for
   closed tables, a thread which opens a closed table would acquire a read
   thr_lock lock on it and give this lock to the backup thread...
*/
void Backup::lock_tables_TL_READ_NO_INSERT()
{
  THD *thd;
  DBUG_ENTER("myisam/backup/Backup::lock_tables_TL_READ_NO_INSERT");

  thd= new THD;
  if (unlikely(!thd))
    goto end2;
  thd->thread_stack = (char*)&thd; // remember where our stack is
  if (unlikely(thd->store_globals())) // for a proper MEM_ROOT
    goto end2;
  thd->init_for_queries(); // opening tables needs a proper LEX
  thd->command= COM_DAEMON;
  thd->system_thread= SYSTEM_THREAD_BACKUP;
  thd->version= refresh_version;
  thd->set_time();
  thd->main_security_ctx.host_or_ip= "";
  thd->client_capabilities= 0;
  my_net_init(&thd->net, 0);
  thd->main_security_ctx.master_access= ~0;
  thd->main_security_ctx.priv_user= 0;
  thd->real_id= pthread_self();
  /*
    Making this thread visible to SHOW PROCESSLIST is useful for
    troubleshooting a backup job (why does it stall etc).
  */
  pthread_mutex_lock(&LOCK_thread_count);
  threads.append(thd);
  pthread_mutex_unlock(&LOCK_thread_count);
  mysql_init_query(thd, NULL, 0);
  /*
    We need TL_READ_NO_INSERT (and not TL_READ) because we want to prevent
    concurrent inserts.
  */
  tables_in_TABLE_LIST_form= build_table_list(m_tables, TL_READ_NO_INSERT);
  if (!tables_in_TABLE_LIST_form)
    goto end2;
  /*
    As locking tables can be a long operation, we need to support
    cancellability during that time. So we publish our THD to the thread which
    created us.
  */
  pthread_mutex_lock(&THR_LOCK_myisam);
  lock_thd= thd; // so that we can be killed
  pthread_mutex_unlock(&THR_LOCK_myisam);

  if (open_and_lock_tables(thd, tables_in_TABLE_LIST_form))
    goto end;

  DBUG_PRINT("info",("MyISAM backup locking thread got locks"));
  pthread_mutex_lock(&THR_LOCK_myisam);
  lock_state= LOCK_ACQUIRED;
  thd->enter_cond(&COND_lock_state, &THR_LOCK_myisam,
                  "MyISAM backup: holding table locks");
  while (!thd->killed)
    pthread_cond_wait(&COND_lock_state, &THR_LOCK_myisam);
  thd->exit_cond("MyISAM backup: terminating");

end:
  DBUG_PRINT("info",("MyISAM backup locking thread dying"));
  close_thread_tables(thd);
end2:
  pthread_mutex_lock(&THR_LOCK_myisam);
  lock_state= LOCK_ERROR;
  pthread_mutex_unlock(&THR_LOCK_myisam);
  pthread_cond_broadcast(&COND_lock_state);
  if (thd) // NULL if "new THD" failed above
  {
    net_end(&thd->net);
    delete thd;
  }
  DBUG_VOID_RETURN;
}


/** @brief Entry point for the locking thread */
pthread_handler_t separate_thread_for_locking(void *arg)
{
  my_thread_init();
  DBUG_ENTER("myisam/backup/separate_thread_for_locking");
  pthread_detach_this_thread();
  (static_cast<Backup *>(arg))->lock_tables_TL_READ_NO_INSERT();
  my_thread_end();
  pthread_exit(0);
  DBUG_RETURN(0);
}


/**
   Launches a separate thread ("locking thread") which will lock
   tables. Locking in a separate thread is needed to have a non-blocking
   prelock() (given that thr_lock() is blocking).
*/
result_t Backup::prelock()
{
  DBUG_ENTER("myisam/backup/Backup::prelock");
  pthread_cond_init(&COND_lock_state, NULL);
  lock_state= LOCK_IN_PROGRESS;
  {
    pthread_t th;
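    if (pthread_create(&th, &connection_attrib,
                       separate_thread_for_locking, this))
    {
      /*
        No thread will ever update lock_state; mark it dead so that
        kill_locking_thread() does not wait for it.
      */
      lock_state= LOCK_ERROR;
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    }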
  }
  DBUG_RETURN(backup::OK);
}


result_t Backup::lock()
{
  DBUG_ENTER("myisam/backup/Backup::lock");
  /* locking was done in prelock() already, nothing to do */
  DBUG_RETURN(backup::OK);
}


result_t Backup::unlock()
{
  DBUG_ENTER("myisam/backup/Backup::unlock");
  /* kill the locking thread which owns table locks, it will unlock them */
  kill_locking_thread();
  DBUG_RETURN(backup::OK);
}
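
The prelock()/get_data()/unlock() handshake with the locking thread can be sketched with standard C++ threads. This is a simplified model under stated assumptions, not the driver's code: LockingThreadModel and its methods are hypothetical, a plain "killed" flag replaces THD::awake(), no tables are locked, and the thread is joined whereas the real one is detached.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

enum LockState { LOCK_NOT_STARTED, LOCK_IN_PROGRESS, LOCK_ACQUIRED, LOCK_ERROR };

// Hypothetical stand-in for the driver and its locking thread.
class LockingThreadModel
{
public:
  LockingThreadModel() : state(LOCK_NOT_STARTED), killed(false) {}
  ~LockingThreadModel() { kill_and_wait(); }

  // Like prelock(): start the holder thread and return immediately.
  void prelock()
  {
    assert(state == LOCK_NOT_STARTED);
    state= LOCK_IN_PROGRESS;
    worker= std::thread(&LockingThreadModel::hold_locks, this);
  }

  // Like the check in get_data(): non-blocking look at the progress.
  LockState poll()
  {
    std::lock_guard<std::mutex> guard(mtx);
    return state;
  }

  // Like unlock(): ask the holder to die, wait until it reports its death.
  void kill_and_wait()
  {
    if (!worker.joinable())
      return;                        // never started or already reaped
    {
      std::unique_lock<std::mutex> guard(mtx);
      killed= true;
      cond.notify_all();
      while (state != LOCK_ERROR)
        cond.wait(guard);
    }
    worker.join();
  }

private:
  void hold_locks()
  {
    std::unique_lock<std::mutex> guard(mtx);
    state= LOCK_ACQUIRED;            // the real thread locks tables first
    cond.notify_all();
    while (!killed)                  // hold the "locks" until killed
      cond.wait(guard);
    state= LOCK_ERROR;               // LOCK_ERROR doubles as "thread is dead"
    cond.notify_all();
  }

  LockState               state;
  bool                    killed;
  std::mutex              mtx;
  std::condition_variable cond;
  std::thread             worker;
};
```

The point of the design is that the caller never blocks while locks are being acquired: it polls the state on each get_data() call and only blocks, briefly, when it asks the thread to release the locks.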


/*
   Backing up the log

   @todo For now we read the log file from disk. We could instead try to
   "steal" it from its IO_CACHE; that might reduce the log portion which goes
   to disk, if the backup thread is fast enough to catch up on client threads
   filling the log. Maybe this cache should be made SEQ_READ_APPEND.
*/

Log_backup::Log_backup(const char *log_name_arg) : log_deleted(FALSE)
{
  DBUG_ENTER("myisam/backup/Log_backup::Log_backup");
  log_name.append(log_name_arg);
  int fd= my_open(log_name_arg, O_RDONLY, MYF(MY_WME));
  if (fd < 0)
    SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
  /*
    Log is alone on the shared stream for now, so a code is useless,
    except that it allows us to verify that what restore sends us is really a
    log.
  */
  log_file_backup.init(fd, ~(ULL(0)), LOG_FILE_CODE);
  state= OK;
  DBUG_VOID_RETURN;
}


/**
   @brief Closes and deletes the log.

   The only reason to have an end() and call it from the destructor, instead of
   putting the code into the destructor, is that when the caller does a "delete
   image", it cannot be told about errors, while if the caller does
   "image->end()" (and then "delete image") it can see an error.
*/
result_t Log_backup::end()
{
  DBUG_ENTER("myisam/backup/Log_backup::end");
  /* log is safe in the stream so we don't need it anymore */
  if (log_file_backup.close_file() ||
      (!log_deleted && my_delete(log_name.ptr(), MYF(MY_WME))))
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  log_deleted= TRUE;
  DBUG_RETURN(backup::OK);
}


Log_backup::~Log_backup()
{
  /* If all went well, we don't do anything here. */
  if (end())
  {
    /*
      Repetition of failing deletes could fill disk, warn.
      It's a destructor, no other way to report an error than print.
    */
    sql_print_error("MyISAM backup driver: could not close or delete log");
  }
}
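
The "end() plus destructor" idiom used by Log_backup (and by the other image classes) can be shown on a toy class. This is a minimal sketch: Image, its fail_on_close switch and finish() are hypothetical names invented for the illustration; only the shape of the pattern is taken from the code above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical resource whose cleanup can fail.
class Image
{
public:
  explicit Image(bool fail_on_close) : open(true), fail(fail_on_close) {}

  // Returns 0 on success, non-zero on error; safe to call repeatedly.
  int end()
  {
    if (!open)
      return 0;          // already cleaned up: nothing to do
    open= false;         // mark closed first, so a second call is a no-op
    return fail ? 1 : 0;
  }

  // Safety net only: a destructor cannot return an error, so it prints.
  ~Image()
  {
    if (end())
      std::fprintf(stderr, "Image: cleanup failed in destructor\n");
  }

private:
  bool open, fail;
};

// Caller side: check end() explicitly, then delete.
int finish(Image *image)
{
  assert(image != nullptr);
  int err= image->end();
  delete image;          // destructor finds nothing left to do
  return err;
}
```

A caller which wants to see cleanup errors calls finish(); a caller which is already handling another error can just delete the object and rely on the destructor.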


/* the header of a MYI file always fits in this size */
#define MAX_INDEX_HEADER_SIZE (64*1024)


/**
   @brief Opens a MyISAM table for backing it up.
   
   @param  tbl              The table to open
*/
Table_backup::Table_backup(const backup::Table_ref &tbl) :
  Table(tbl), mi_info(NULL)
{
  DBUG_ENTER("myisam/backup/Table_backup::Table_backup");
  DBUG_PRINT("info",("Initializing backup image for table %s",
                     m_complete_name.ptr()));
  /*
    Here we use low-level mi_* functions as all we want is a pair of file
    descriptors.
    O_RDONLY is not ok, as it forces all instances of the table to be
    read-only (sets HA_OPTION_READ_ONLY_DATA of share->options).
    HA_OPEN_FOR_REPAIR so that we can back it up even if it is corrupted (we
    don't do that now).
  */
  mi_info= mi_open(m_complete_name.ptr(), O_RDWR, HA_OPEN_FOR_REPAIR);
  if (!mi_info) // table does not exist? not normal
    SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
  if (mi_is_crashed(mi_info))
  {
    /*
      For some users, a backup has to be fully correct to be considered
      usable, so a corrupted table should stop the backup. However sometimes,
      user knows a table is corrupted but still wants to back it up (it could
      be that the machine's hardware is failing, currently only the index file
      is bad, then it makes sense to back it up before the table is completely
      destroyed). Or it could be 10000 tables to back up with a single one
      corrupted, then it makes sense to do the backup and later repair the
      single one.
      For now we abort the backup. Optionally we should not.
    */
    SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
  }
  DBUG_ASSERT(mi_info->dfile >= 0);
  DBUG_ASSERT(mi_info->s->kfile >= 0);
  /*
    We use the shared file descriptor on the index file, so use pread() and
    not read().
  */
  my_off_t file_size= my_seek(mi_info->dfile, 0, SEEK_END, MYF(MY_WME));
  if (file_size == MY_FILEPOS_ERROR)
    SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
  dfile_backup.init(mi_info->dfile, file_size, DATA_FILE_CODE);
  if (index_pages_in_backup_log)
  {
    file_size= my_seek(mi_info->s->kfile, 0, SEEK_END, MYF(MY_WME));
    if (file_size == MY_FILEPOS_ERROR)
      SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
    kfile_backup.init(mi_info->s->kfile, file_size, WHOLE_INDEX_FILE_CODE);
  }
  else
    kfile_backup.init(mi_info->s->kfile,
                      MAX_INDEX_HEADER_SIZE /* upper limit */ ,
                      HEADER_INDEX_FILE_CODE);
  in_file= DATA_FILE; // dump the data file first (no specific reason)
  state= OK;
  DBUG_VOID_RETURN;
  /*
    Note: we are copying an index file of a table, which may have instances in
    the MySQL table cache, so after restore it will show up as
    "warning: 1 client is using or hasn't closed the table properly".
    Maybe do a quick index update on the table at the end of restore to
    remove this warning. But how to know if the problem pre-dates backup ?
  */
}


/**
   @brief Closes the MyISAM table.

   The only reason to have an end() and call it from the destructor, instead of
   putting the code into the destructor, is that when the caller does a "delete
   image", it cannot be told about errors, while if the caller does
   "image->end()" (and then "delete image") it can see an error.
*/
result_t Table_backup::end()
{
  DBUG_ENTER("myisam/backup/Table_backup::end");
  if (mi_info)
  {
    if (mi_close(mi_info))
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    mi_info= NULL;
  }
  DBUG_RETURN(backup::OK);
}


Table_backup::~Table_backup()
{
  /* If all went well, we don't do anything here. */
  if (end())
  {
    /*
      Close failure is important, warn.
      It's a destructor, no other way to report an error than print.
    */
    sql_print_error("MyISAM backup driver: could not close table '%s'",
                    m_complete_name.ptr());
  }
}


result_t Table_backup::get_data(Buffer &buf)
{
  result_t ret;
  DBUG_ENTER("myisam/backup/Table_backup::get_data");
  switch (in_file)
    {
    case DATA_FILE:
      ret= dfile_backup.get_data(buf);
      if (buf.last) // move to dumping the index file...
      {
        in_file= INDEX_FILE;
        buf.last= FALSE; // ... so this is not the last buffer on this stream
      }
      break;
    case INDEX_FILE:
      ret= kfile_backup.get_data(buf);
      break;
    }
  if (ret == backup::ERROR)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  DBUG_RETURN(ret);
}


result_t Log_backup::get_data(Buffer &buf)
{
  result_t ret;
  DBUG_ENTER("myisam/backup/Log_backup::get_data");
  if (((ret= log_file_backup.get_data(buf)) == backup::ERROR) ||
      (myisam_backup_log.hard_write_error_in_the_past == -1))
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  /*
    See, we detect a log write error encountered by the MyISAM myisam_log*
    and mi_log* functions, every time we read a packet from the log file.
  */
  DBUG_RETURN(ret);
}


result_t File_backup::close_file()
{
  int ret;
  if (fd < 0)
    return backup::OK;
  ret= my_close(fd, MYF(MY_WME));
  fd= -1;
  return ret ? backup::ERROR : backup::OK;
}


result_t File_backup::get_data(Buffer &buf)
{
  int       res;
  uint      howmuch= buf.size;
  result_t  ret= backup::OK;

  DBUG_ENTER("myisam/backup/File_backup::get_data");

  buf.size= 1;
  *buf.data= static_cast<uchar>(file_code);
  howmuch--;
  DBUG_ASSERT(howmuch > 0);

  if (backup_file_size >= file_size)
    res= 0; // we don't have to read/send the rest of file
  else
  {
    res= my_pread(fd, reinterpret_cast< ::byte *>(buf.data) + 1,
                  howmuch, backup_file_size, MYF(MY_WME));
    // DBUG_DUMP("sending",buf_ptr-1, 16);
  }
  if (res < 0)
  {
    ret= backup::ERROR;
    goto end;
  }
  backup_file_size+= res;
  if (res == 0) // end of file
  {
    buf.size= 0; // don't even send a packet
    goto end;
  }
  buf.size+= res;
  buf.last= FALSE;
end:
  DBUG_PRINT("info",("ret %d buf.last %d buf.size %d",
                     ret, buf.last, buf.size));
  DBUG_RETURN(ret);
}
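
The packet layout produced by File_backup::get_data() (one byte of file code, then a chunk of the file) can be sketched without any MySQL dependency. This is an illustration under assumptions: fill_packet() is a hypothetical function, the file is modelled as an in-memory vector instead of a descriptor read with my_pread(), and the code value used in a test is arbitrary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical model: fills buf with [file_code][payload...] taken from
// "file" at "offset", advances offset, returns the packet size.
// Returns 0 at end of file, meaning "send no packet at all".
std::size_t fill_packet(const std::vector<unsigned char> &file,
                        std::size_t &offset, unsigned char file_code,
                        unsigned char *buf, std::size_t buf_size)
{
  assert(buf_size > 1);                  // room for the code + some payload
  if (offset >= file.size())
    return 0;                            // end of file
  std::size_t n= std::min(buf_size - 1, file.size() - offset);
  buf[0]= file_code;                     // tells restore which file this is
  std::memcpy(buf + 1, &file[offset], n);
  offset+= n;
  return n + 1;
}
```

Because every packet carries its own code, the restore side can route each one to the data file, the index file or the log without keeping any per-stream parsing state.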


/**************************************
 *
 *   RESTORE FUNCTIONALITY
 *
 **************************************/

class Object_restore;

/**
   @brief Handles restore orders received from the backup kernel (implements the API).
*/
class Restore: public Restore_driver
{
public:
  Restore(const Table_list &tables);
  virtual ~Restore();
  virtual result_t  begin(const size_t);
  virtual result_t  end();
  virtual result_t  send_data(Buffer &buf);
  virtual result_t  cancel()
    {
      /* Nothing to do in cancel(); free() will suffice */
      return backup::OK;
    };
  virtual void      free() { delete this; };

private:
  enum { PUMPING, DONE, ERROR } state;
  uint            images_left; ///< how many images left to restore in this job
  Object_restore  **images; ///< one for the log and one per table
  char restore_log_name[FN_REFLEN];
};


/** @brief An object to restore; in practice, a table or the log */
class Object_restore
{
public:
  virtual result_t send_data(const Buffer &buf)= 0;
  virtual ~Object_restore() {};
  virtual result_t close()= 0;
  virtual result_t post_restore() { return backup::OK; };
  bool internal_error() { return state == ERROR; }
  virtual result_t end()= 0;
protected:
  enum { OK, ERROR } state;
};


/**
   @brief An object to restore is made of one or more such files.

   This class does not open the file, user has to open it.
   This class does not close the file by default, but can do so if requested.
*/
class File_restore
{
public:
  void init(int fd_arg) { fd= fd_arg; }
  result_t send_data(const Buffer &);
  result_t close_file();
private:
  int fd; ///< file descriptor
};


/**
   @brief Handles restoring a single table
*/
class Table_restore: public Object_restore, public Table
{
public:
  Table_restore(const Table_ref &tbl);
  virtual result_t send_data(const Buffer &buf);
  virtual ~Table_restore();
  virtual result_t close();
  virtual result_t post_restore(); ///< what to do after restoring its files
  virtual result_t end(); ///< cleanups
 private:
  MI_INFO      *mi_info; ///< MyISAM table structure
  File_restore dfile_restore, kfile_restore;
  bool         rebuild_index; ///< if we have to rebuild index or not
  THD          *thd; ///< rebuilding index requires a THD
};


/**
   @brief Handles restoring the log
*/
class Log_restore: public Object_restore
{
public:
  Log_restore(const char *log_name_arg);
  virtual result_t send_data(const Buffer &buf);
  virtual ~Log_restore();
  virtual result_t close();
  virtual result_t end();
private:
  String log_name;
  File_restore log_file_restore;
  bool log_deleted; ///< if we have already deleted the log or not
};


result_t Engine::get_restore(version_t ver, const uint32,
                             const Table_list &tables, Restore_driver* &drv)
{
  Restore *ptr= new myisam_backup::Restore(tables);
  if (unlikely(!ptr))
    return backup::ERROR;
  drv= ptr;
  return backup::OK;
}


Restore::Restore(const Table_list &tables):
  Restore_driver(tables), state(ERROR), images_left(0), images(NULL)
{
  /* This constructor cannot fail, otherwise begin() would have to detect it */
}


/**
   This destructor is only called by the class' free().
   It cleans up any leftover the driver could have. It is safe to call it at
   any point. In a normal (no error) situation, it does nothing, all should
   already have been done by earlier stages.
*/
Restore::~Restore()
{
  if (images)
  {
    for (uint n= 0; n <= m_tables.count(); ++n)
      delete images[n];
    delete [] images; // allocated with new[] in begin()
  }
}


/**
   @brief Sets MyISAM in a state ready for us to restore.

   I.e. creates a temporary file to host the log's restored copy.
*/
result_t Restore::begin(const size_t)
{
  THD *thd= current_thd;
  DBUG_ENTER("myisam/restore/Restore::begin");
  my_snprintf(restore_log_name, sizeof(restore_log_name),
              "%s/%s%lx_%lx_%x-restorelog", mysql_tmpdir,
              tmp_file_prefix, current_pid, thd->thread_id,
              thd->tmp_table++);
  unpack_filename(restore_log_name, restore_log_name);

  images_left= 1 + m_tables.count();
  images= new Object_restore*[images_left];
  if (unlikely(!images))
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  bzero(images, (images_left)*sizeof(*images));

  state= PUMPING;
  DBUG_RETURN(backup::OK);
}


/**
   If no error happened, we have to apply the log and possibly repair
   indexes; this has to be done here and not in the destructor (as it has to
   be done only in case of success, while a destructor runs in all cases).
   Because we have no "end of stream" notifications yet, when we come here all
   our tables/logs are still open and the log is not applied (both things which
   could be done in send_data() if we knew end-of-stream). Repairing indexes, on
   the other hand, really has to be done here.

   @todo selective restore (this is just passing a proper function which
   checks if the table is in a hash of tables).
*/
result_t Restore::end()
{
  DBUG_ENTER("myisam/restore/Restore::end");
  /*
    Rafal said currently end() is called in case of error but said he'll fix
    that (only free() will be called)
  */
  DBUG_ASSERT(state != ERROR);
  if (images)
  {
    for (uint n=0; n <= m_tables.count(); ++n)
      if (images[n] && images[n]->close())
        SET_STATE_TO_ERROR_AND_DBUG_RETURN;

    /*
      Tables are closed. Apply backup log if it exists (it does not exist if
      it was empty at backup time).
    */
    if (images[0])
    {
      MI_EXAMINE_LOG_PARAM mi_exl;
      mi_examine_log_param_init(&mi_exl);
      mi_exl.log_filename= restore_log_name;
      mi_exl.update= 1;
      /*
        For max_files, the assumption is that at backup time the server had
        enough file descriptors and so should have that many now.
      */
      mi_exl.max_files= open_files_limit;
      if (mi_examine_log(&mi_exl))
        SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    }
 
    for (uint n=0; n <= m_tables.count(); ++n)
      if (images[n] && images[n]->post_restore())
        SET_STATE_TO_ERROR_AND_DBUG_RETURN;

    /*
      By doing here the work of the destructor we can test the return code of
      end(). We don't do it for tables as they will do nothing in end()
      (except freeing their memory) so that can be left to the destructor.
      Note: if the log was empty during backup, then we haven't sent any
      buffer for it, so restore sends us nothing on the shared stream, so
      there is no image for the log. That's why we have to test
      if(images[0]).
    */
    if (images[0] && images[0]->end())
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  }

  DBUG_RETURN(backup::OK);
}


/**
   @brief Actually receives restore data from the backup kernel.
*/
result_t Restore::send_data(Buffer &buf)
{
  result_t ret;
  uint stream= buf.table_no;
  DBUG_ENTER("myisam/restore/Restore::send_data");
  DBUG_PRINT("enter",("Got packet with %d bytes from stream %d",
                      buf.size, buf.table_no));

  if (state == DONE)
  {
    /* we never come here */
    DBUG_PRINT("info",("Ignoring the packet (all objects already restored)"));
    DBUG_RETURN(backup::DONE);
  }

  Object_restore *image= images[stream];

  /*
    We create an image when we see a new stream.
    Still we have N open tables during the last table's restore.
    But when Rafal implements that the last buffer of a stream has
    buf.last==TRUE (soon), we can close tables earlier.
  */
  if (!image)
  {
    if (stream >= 1)
      image= new Table_restore(m_tables[stream-1]);
    else
      image= new Log_restore(restore_log_name);
    images[stream]= image;
    if (unlikely(!image || image->internal_error()))
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  }

  if ((ret= image->send_data(buf)) == backup::ERROR)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;

  // for when we have "end of stream" notifications:
#if 0
  if (buf.last)
  {
    if (image->close())
      SET_STATE_TO_ERROR_AND_DBUG_RETURN;
    images_left--;
    if (images_left == 0)
    {
      state= DONE;
      /* DONE means done with all send_data() calls, but we have more work */
      DBUG_RETURN(backup::DONE);
    }
  }
#endif

  DBUG_RETURN(backup::OK);
}



/*
  Restoring the log
*/


Log_restore::Log_restore(const char *log_name_arg)
{
  DBUG_ENTER("myisam/restore/Log_restore::Log_restore");
  log_name.append(log_name_arg);
  int fd= my_create(log_name_arg, 0, O_WRONLY, MYF(MY_WME));
  if (fd < 0)
  {
    log_deleted= TRUE;
    SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
  }
  log_deleted= FALSE;
  log_file_restore.init(fd);
  state= OK;
  DBUG_VOID_RETURN;
}

result_t Log_restore::close()
{
  DBUG_ENTER("myisam/restore/Log_restore::close");
  result_t ret= log_file_restore.close_file();
  if (ret)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  DBUG_RETURN(ret);
}


/**
   @brief Closes and deletes the log.

   The only reason to have an end() and call it from the destructor, instead of
   putting the code into the destructor, is that when the caller does a "delete
   image", it cannot be told about errors, while if the caller does
   "image->end()" (and then "delete image") it can see an error.
*/
result_t Log_restore::end()
{
  DBUG_ENTER("myisam/restore/Log_restore::end");
  /*
    Log is applied so we don't need it anymore. For now we only close it:
    its deletion is disabled (see the commented-out my_delete() below).
  */
  if (log_file_restore.close_file()
      /* || (!log_deleted && my_delete(log_name.ptr(), MYF(MY_WME))) */ )
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  log_deleted= TRUE;
  DBUG_RETURN(backup::OK);
}


Log_restore::~Log_restore()
{
  /* If all went well, we don't do anything here. */
  if (end())
  {
    /*
      Repetition of failing deletes could fill disk, warn.
      It's a destructor, no other way to report an error than print.
    */
    sql_print_error("MyISAM restore driver: could not close or delete log");
  }
}


/** @brief Opens a MyISAM table for restoring it */
Table_restore::Table_restore(const Table_ref &tbl):
  Table(tbl), rebuild_index(FALSE)
{
  DBUG_ENTER("myisam/restore/Table_restore::Table_restore");
  DBUG_PRINT("enter",("Initializing restore image for table %s",
                      m_complete_name.ptr()));
  /*
    Here we use low-level mi_* functions as all we want is a pair of file
    descriptors.
    Though we only want to write (O_WRONLY), the SQL layer uses only O_RDONLY
    and O_RDWR, so here we don't try to be original.
  */
  mi_info= mi_open(m_complete_name.ptr(), O_RDWR, 0);
  if (!mi_info)
  {
    /* table does not exist or is corrupted? not normal, it's just created */
    SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
  }
  /*
    It's ok to copy the kfile descriptor and write() to it as the upper layers
    guarantee that we are the only user of the brand new table.
  */
  DBUG_ASSERT(mi_info->dfile >= 0);
  DBUG_ASSERT(mi_info->s->kfile >= 0);
  dfile_restore.init(mi_info->dfile);
  kfile_restore.init(mi_info->s->kfile);
  /* seek them at start, because we use my_write() */
  if ((my_seek(mi_info->dfile, 0, SEEK_SET, MYF(MY_WME)) ==
       MY_FILEPOS_ERROR) ||
      (my_seek(mi_info->s->kfile, 0, SEEK_SET, MYF(MY_WME)) ==
       MY_FILEPOS_ERROR))
    SET_STATE_TO_ERROR_AND_DBUG_VOID_RETURN;
  thd= current_thd;
  state= OK;
  DBUG_VOID_RETURN;
}


/** @brief Closes the MyISAM table */
result_t Table_restore::close()
{
  bool ret;
  DBUG_ENTER("myisam/restore/Table_restore::close");
  DBUG_PRINT("info",("table: %s", m_complete_name.ptr()));
  ret= mi_info ? mi_close(mi_info) : 0;
  mi_info= NULL;
  if (ret)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  DBUG_RETURN(backup::OK);
}


/**
   @brief Closes the MyISAM table.

   The only reason to have an end() and call it from the destructor, instead of
   putting the code into the destructor, is that when the caller does a "delete
   image", it cannot be told about errors, while if the caller does
   "image->end()" (and then "delete image") it can see an error.
*/
result_t Table_restore::end()
{
  return close();
}


Table_restore::~Table_restore()
{
  if (end())
  {
    /*
      Close failure is important, warn.
      It's a destructor, no other way to report an error than print.
    */
    sql_print_error("MyISAM restore driver: could not close table");
  }
}


/**
   @brief Repairs the table's index if needed.

   Has to be done after applying the log.
*/
result_t Table_restore::post_restore()
{
  HA_CHECK_OPT check_opt;
  int error;
  Vio* save_vio;

  DBUG_ENTER("myisam/restore/Table_restore::post_restore");

  if (!rebuild_index)
    DBUG_RETURN(backup::OK); // nothing to do

  /*
    myisamchk() as well as ha_myisam::repair() do a lot of operations before
    and after mi_repair(); to avoid duplicating code we reuse one of them.
    As we are in the server here, we use the server's one.
    A "new ha_myisam + ha_open()" is not sufficient as TABLE and TABLE_SHARE
    are needed for ha_myisam::open(). So we use open_temporary_table(), which
    sets everything up without touching the thread's structures (and so
    without causing problems with locks, without interfering with
    close_thread_tables() which would be done by another driver in the same
    thread, etc).
    Note that as the table has just been created, and in theory is protected
    from any usage by the upper backup layer, opening it with
    open_temporary_table() is correct.
  */
  TABLE *table;
  {
    char path[FN_REFLEN];
    const char *db= m_db.ptr();
    const char *name= m_name.ptr();
    build_table_filename(path, sizeof(path), db, name, "", 0);
    table= open_temporary_table(thd, path, db, name, 0);
  }
  if ((error= (!table || !table->file)))
    goto err;

  check_opt.init();
  check_opt.flags|= T_VERY_SILENT | T_CALC_CHECKSUM | T_QUICK;
  /*
    We do not want repair() to spam us with messages (protocol->store() etc).
    Just send them to the error log, and report the failure in case of
    problems.
    Note that ha_myisam::restore() does not do that (merely uses the same
    check_opt.flags as us), as it is allowed to return an array of errors.
  */
  save_vio= thd->net.vio;
  thd->net.vio= NULL;
  error= table->file->ha_repair(thd,&check_opt) != 0;
  thd->net.vio= save_vio;

err:
  if (table)
  {
    intern_close_table(table);
    my_free(reinterpret_cast<gptr>(table), MYF(0));
  }
  if (error)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  DBUG_RETURN(backup::OK);
}


result_t Table_restore::send_data(const Buffer &buf)
{
  enum enum_file_code file_code= static_cast<enum enum_file_code>(*buf.data);
  result_t ret;
  DBUG_ENTER("myisam/restore/Table_restore::send_data");

  switch (file_code)
  {
  case DATA_FILE_CODE:
    ret= dfile_restore.send_data(buf);
    break;
  case HEADER_INDEX_FILE_CODE:
    rebuild_index= TRUE; // because we are given only the index's header
    // fall through
  case WHOLE_INDEX_FILE_CODE:
    ret= kfile_restore.send_data(buf);
    break;
  default:
    DBUG_PRINT("info",("packet with code %d I didn't expect", file_code));
    ret= backup::ERROR;
  }
  if (ret == backup::ERROR)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  DBUG_RETURN(ret);
}


result_t Log_restore::send_data(const Buffer &buf)
{
  enum enum_file_code file_code= static_cast<enum enum_file_code>(*buf.data);
  result_t ret;
  DBUG_ENTER("myisam/restore/log/send_data");

  ret= (file_code == LOG_FILE_CODE) ? log_file_restore.send_data(buf) :
    backup::ERROR;
  if (ret)
    SET_STATE_TO_ERROR_AND_DBUG_RETURN;
  DBUG_RETURN(ret);
}


result_t File_restore::send_data(const Buffer &buf)
{
  uint howmuch= buf.size;

  DBUG_ENTER("myisam/restore/file/send_data");
  //DBUG_DUMP("receiving",buf.data + 1, 16);

  howmuch--; // skip the first byte which contains the code

  uint res= my_write(fd, reinterpret_cast< ::byte *>(buf.data) + 1,
                     howmuch, MYF(MY_WME));

  DBUG_RETURN((res != howmuch) ? backup::ERROR : backup::OK);
}


result_t File_restore::close_file()
{
  int ret;
  if (fd < 0)
    return backup::OK;
  ret= my_close(fd, MYF(MY_WME));
  fd= -1;
  return ret ? backup::ERROR : backup::OK;
}


} // myisam_backup namespace


Backup_result_t myisam_backup_engine(handlerton *self, Backup_engine* &be)
{
  be= new myisam_backup::Engine();

  if (unlikely(!be))
    return backup::ERROR;

  return backup::OK;
}
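
The comments on Log_restore::end() and Table_restore::end() above explain why
cleanup lives in end() and is only repeated from the destructor. A minimal
standalone sketch of that idiom follows. The names Resource, shutdown_driver,
RES_OK and RES_ERROR are hypothetical and invented for illustration; they are
not part of this patch, and std::fprintf stands in for sql_print_error().

```cpp
#include <cassert>
#include <cstdio>

// Stand-ins for backup::OK / backup::ERROR (assumed, not the real enum).
enum result_t { RES_OK = 0, RES_ERROR = 1 };

// Hypothetical resource whose cleanup can fail.
class Resource
{
public:
  explicit Resource(bool close_fails)
    : m_open(true), m_close_fails(close_fails) {}

  // Idempotent: a second call finds nothing left to do and returns RES_OK.
  result_t end()
  {
    if (!m_open)
      return RES_OK;
    m_open= false;
    return m_close_fails ? RES_ERROR : RES_OK;
  }

  // A destructor cannot return a status, so all it can do is print.
  ~Resource()
  {
    if (end() != RES_OK)
      std::fprintf(stderr, "could not close resource\n");
  }

private:
  bool m_open;
  bool m_close_fails;
};

// A caller that wants to see the error calls end() first, then deletes.
result_t shutdown_driver(Resource *r)
{
  result_t res= r->end();
  delete r;            // the destructor calls end() again, now a no-op
  return res;
}
```

The sketch relies on end() being safe to call twice, which matches how the
driver resets mi_info to NULL and fd to -1 after closing.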


--- 1.160/include/my_global.h	2007-02-25 00:12:06 +01:00
+++ 1.161/include/my_global.h	2007-05-15 18:09:18 +02:00
@@ -57,9 +57,11 @@
 #ifdef __cplusplus
 #define C_MODE_START    extern "C" {
 #define C_MODE_END	}
+#define static_cast_C_or_CPP(T) static_cast<T>
 #else
 #define C_MODE_START
 #define C_MODE_END
+#define static_cast_C_or_CPP(T) (T)
 #endif
 
 #if defined(_WIN32) || defined(_WIN64) || defined(__WIN32__) || defined(WIN32)
@@ -932,7 +934,7 @@
 #define my_offsetof(TYPE, MEMBER) \
         ((size_t)((char *)&(((TYPE *)0x10)->MEMBER) - (char*)0x10))
 
-#define NullS		(char *) 0
+#define NullS		static_cast_C_or_CPP(char *)(0)
 /* Nowdays we do not support MessyDos */
 #ifndef NEAR
 #define NEAR				/* Who needs segments ? */
@@ -1079,7 +1081,7 @@
 #define INT8(v)		(int8) (v)
 #define INT16(v)	(int16) (v)
 #define INT32(v)	(int32) (v)
-#define MYF(v)		(myf) (v)
+#define MYF(v)		static_cast_C_or_CPP(myf)(v)
 
 #ifndef LL
 #ifdef HAVE_LONG_LONG
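
For readers skimming the my_global.h hunks: a small self-contained sketch of
how the new static_cast_C_or_CPP macro expands when compiled as C++. Only the
macro definitions are copied from the hunks above. The typedef of myf as int,
the flag value 16, and the helpers is_null_string and example_flags are
assumptions made for illustration.

```cpp
#include <cassert>

typedef int myf;                        // assumed stand-in for the real typedef

#ifdef __cplusplus
#define static_cast_C_or_CPP(T) static_cast<T>
#else
#define static_cast_C_or_CPP(T) (T)
#endif

#define NullS   static_cast_C_or_CPP(char *)(0)
#define MYF(v)  static_cast_C_or_CPP(myf)(v)

// Under C++ these expand to static_cast<char *>(0) and static_cast<myf>(v).
inline bool is_null_string(const char *s) { return s == NullS; }
inline myf  example_flags() { return MYF(16); }   // 16 is an arbitrary value
```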

--- 1.19/mysql-test/t/key_cache.test	2006-11-15 10:23:23 +01:00
+++ 1.20/mysql-test/t/key_cache.test	2007-05-15 18:09:19 +02:00
@@ -71,7 +71,7 @@
 # Following results differs on 64 and 32 bit systems because of different
 # pointer sizes, which takes up different amount of space in key cache
 
---replace_result 1812 KEY_BLOCKS_UNUSED 1793 KEY_BLOCKS_UNUSED 1674 KEY_BLOCKS_UNUSED 1818 KEY_BLOCKS_UNUSED 1824 KEY_BLOCKS_UNUSED
+--replace_result 1799 KEY_BLOCKS_UNUSED 1793 KEY_BLOCKS_UNUSED 1674 KEY_BLOCKS_UNUSED 1818 KEY_BLOCKS_UNUSED 1824 KEY_BLOCKS_UNUSED
 show status like 'key_blocks_unused';
 
 insert into t1 values (1, 'qqqq'), (11, 'yyyy');
@@ -84,7 +84,7 @@
 update t2 set i=2 where i=1;
 
 show status like 'key_blocks_used';
---replace_result 1808 KEY_BLOCKS_UNUSED 1789 KEY_BLOCKS_UNUSED 1670 KEY_BLOCKS_UNUSED 1814 KEY_BLOCKS_UNUSED 1820 KEY_BLOCKS_UNUSED
+--replace_result 1795 KEY_BLOCKS_UNUSED 1789 KEY_BLOCKS_UNUSED 1670 KEY_BLOCKS_UNUSED 1814 KEY_BLOCKS_UNUSED 1820 KEY_BLOCKS_UNUSED
 show status like 'key_blocks_unused';
 
 cache index t1 key (`primary`) in keycache1;
@@ -146,7 +146,7 @@
 drop table t1,t2,t3;
 
 show status like 'key_blocks_used';
---replace_result 1812 KEY_BLOCKS_UNUSED 1793 KEY_BLOCKS_UNUSED 1674 KEY_BLOCKS_UNUSED 1818 KEY_BLOCKS_UNUSED 1824 KEY_BLOCKS_UNUSED
+--replace_result 1799 KEY_BLOCKS_UNUSED 1793 KEY_BLOCKS_UNUSED 1674 KEY_BLOCKS_UNUSED 1818 KEY_BLOCKS_UNUSED 1824 KEY_BLOCKS_UNUSED
 show status like 'key_blocks_unused';
 
 

--- 1.7/include/keycache.h	2006-12-23 20:04:04 +01:00
+++ 1.8/include/keycache.h	2007-05-15 18:09:18 +02:00
@@ -13,7 +13,10 @@
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
 
-/* Key cache variable structures */
+/**
+   @file
+   @brief Key cache API
+*/
 
 #ifndef _keycache_h
 #define _keycache_h
@@ -116,10 +119,15 @@
 extern int key_cache_insert(KEY_CACHE *keycache,
                             File file, my_off_t filepos, int level,
                             byte *buff, uint length);
+/** @brief per-block callback called when block is flushed */
+typedef void (*KEYCACHE_POST_WRITE_CALLBACK)(void *, int, const byte *,
+                                             uint, my_off_t);
 extern int key_cache_write(KEY_CACHE *keycache,
                            File file, my_off_t filepos, int level,
                            byte *buff, uint length,
-			   uint block_length,int force_write);
+			   uint block_length, int force_write,
+                           KEYCACHE_POST_WRITE_CALLBACK callback,
+                           void *callback_arg);
 extern int flush_key_blocks(KEY_CACHE *keycache,
                             int file, enum flush_type type);
 extern void end_key_cache(KEY_CACHE *keycache, my_bool cleanup);
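
To make the new KEYCACHE_POST_WRITE_CALLBACK hook concrete, here is a rough
sketch of a callback that records each flushed block. The typedef is copied
from the hunk above, but the parameter meanings (callback argument, file,
buffer, length, file position) are only my reading of the signature.
Flush_record, record_flush, fake_flush_block and the local typedefs of byte,
uint and my_off_t are hypothetical stand-ins, not key cache code.

```cpp
#include <cassert>
#include <vector>

typedef unsigned char byte;             // stand-ins for the my_global.h types
typedef unsigned int uint;
typedef unsigned long long my_off_t;

typedef void (*KEYCACHE_POST_WRITE_CALLBACK)(void *, int, const byte *,
                                             uint, my_off_t);

struct Flush_record { int file; uint length; my_off_t filepos; };

// Callback: appends one record per flushed block to the vector that is
// passed through the opaque callback argument.
static void record_flush(void *arg, int file, const byte *, uint length,
                         my_off_t filepos)
{
  Flush_record rec= { file, length, filepos };
  static_cast<std::vector<Flush_record> *>(arg)->push_back(rec);
}

// Hypothetical stand-in for the point where a block has just been written:
// it invokes the callback if one was supplied.
static void fake_flush_block(int file, const byte *buff, uint length,
                             my_off_t filepos,
                             KEYCACHE_POST_WRITE_CALLBACK callback,
                             void *callback_arg)
{
  /* ... real code would write buff to file at filepos here ... */
  if (callback)
    callback(callback_arg, file, buff, length, filepos);
}
```

Passing a null callback keeps the old behaviour in this sketch, which is
presumably what callers that do not need the hook would do.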

--- 1.44/storage/myisam/myisam_ftdump.c	2006-12-31 01:06:40 +01:00
+++ 1.45/storage/myisam/myisam_ftdump.c	2007-05-15 18:09:20 +02:00
@@ -98,7 +98,8 @@
   if ((inx >= info->s->base.keys) ||
       !(info->s->keyinfo[inx].flag & HA_FULLTEXT))
   {
-    printf("Key %d in table %s is not a FULLTEXT key\n", inx, info->filename);
+    printf("Key %d in table %s is not a FULLTEXT key\n", inx,
+           info->s->unresolv_file_name);
     goto err;
   }
 

--- 1.9/sql/backup/archive.h	2007-04-13 15:02:58 +02:00
+++ 1.10/sql/backup/archive.h	2007-05-15 18:09:19 +02:00
@@ -258,6 +258,19 @@
   {
     DBUG_ASSERT(hton);
     DBUG_ASSERT(hton->get_backup_engine);
+    /*
+      Currently the Archive and MyISAM both leak with that:
+      they allocate a bit of memory in their get_backup_engine, and as
+      Native_backup_info/Native_restore_info don't give a chance to free it
+      (they don't tell the engine that their backup engine is not needed
+      anymore), we get a leak shown by Valgrind.
+      What's the solution?
+      Should engine never allocate in get_backup_engine() (using a static
+      backup engine?)?
+      Should engine remember that it has allocated and then free in its deinit
+      function (a bit like Brian's idea of a per-engine MEM_ROOT)?
+      Should there be a release_backup_engine() call in hton?
+    */
     hton->get_backup_engine(const_cast< ::handlerton* >(hton),m_be);
     DBUG_ASSERT(m_be);
   }

--- 1.15/sql/backup/data_backup.cc	2007-04-20 07:37:52 +02:00
+++ 1.16/sql/backup/data_backup.cc	2007-05-15 18:09:19 +02:00
@@ -439,6 +439,15 @@
   // prepare for VP
   DBUG_PRINT("backup/data",("-- PREPARE PHASE --"));
 
+#if 0
+  /*
+    when I want to test if this backup works in an "online" situation (while
+    updates happen), I add a sleep here so that I can fire a concurrent UPDATE
+    in another connection.
+  */
+  sleep(5);
+#endif
+
   sch.prepare();
 
   while (sch.prepare_count > 0)
@@ -1022,7 +1031,8 @@
   for(uint no=0; no < drv_count; ++no)
   {
     DBUG_PRINT("restore",("Shutting down driver %u",no));
-    drv[no]->end();
+    if (drv[no]->end() != backup::OK)
+      DBUG_ASSERT(0);
     drv[no]->free();
   }
 

--- 1.15/sql/backup/meta_backup.cc	2007-04-16 19:22:48 +02:00
+++ 1.16/sql/backup/meta_backup.cc	2007-05-15 18:09:19 +02:00
@@ -374,6 +374,50 @@
     Insert the database.table string at the location
     of the table name in the CREATE command.
   */
+  /*
+    I sometimes observe these Valgrind errors:
+==9295== Thread 4:
+==9295== Conditional jump or move depends on uninitialised value(s)
+==9295==    at 0x8191364: String::c_ptr() (sql_string.h:99)
+==9295==    by 0x85D20A9: backup::insert_db_in_create(st_table_list*, String*) (
+meta_backup.cc:387)
+==9295==    by 0x85D22CC: backup::get_table_metadata(THD*, st_table_list*, List<
+String>*) (meta_backup.cc:450)
+==9295==    by 0x85C6D42: backup::collect_tables_to_backup(THD*, backup::Backup_
+info&) (sql_backup.cc:294)
+==9295==    by 0x85C796D: backup::do_backup(THD*, backup::Location const&, List<
+LEX_STRING>*) (sql_backup.cc:77)
+==9295==    by 0x8245E85: mysql_execute_command(THD*) (sql_parse.cc:1887)
+==9295==    by 0x824E6D5: mysql_parse(THD*, char*, unsigned) (sql_parse.cc:5246)
+==9295==    by 0x824F16D: dispatch_command(enum_server_command, THD*, char*, uns
+igned) (sql_parse.cc:885)
+==9295==    by 0x82502C3: do_command(THD*) (sql_parse.cc:654)
+==9295==    by 0x823E1D7: handle_one_connection (sql_connect.cc:1089)
+==9295==    by 0x4062111: start_thread (in /lib/libpthread-2.5.so)
+==9295==    by 0x41A42ED: clone (in /lib/libc-2.5.so)
+==9295==
+==9295== Conditional jump or move depends on uninitialised value(s)
+==9295==    at 0x4024267: strlen (in /usr/lib/valgrind/x86-linux/vgpreload_memch
+eck.so)
+==9295==    by 0x41240F0: vfprintf (in /lib/libc-2.5.so)
+==9295==    by 0x8547AC3: _db_doprnt_ (dbug.c:1136)
+==9295==    by 0x85D20CC: backup::insert_db_in_create(st_table_list*, String*) (
+meta_backup.cc:387)
+==9295==    by 0x85D22CC: backup::get_table_metadata(THD*, st_table_list*, List<
+String>*) (meta_backup.cc:450)
+==9295==    by 0x85C6D42: backup::collect_tables_to_backup(THD*, backup::Backup_
+info&) (sql_backup.cc:294)
+==9295==    by 0x85C796D: backup::do_backup(THD*, backup::Location const&, List<
+LEX_STRING>*) (sql_backup.cc:77)
+==9295==    by 0x8245E85: mysql_execute_command(THD*) (sql_parse.cc:1887)
+==9295==    by 0x824E6D5: mysql_parse(THD*, char*, unsigned) (sql_parse.cc:5246)
+==9295==    by 0x824F16D: dispatch_command(enum_server_command, THD*, char*, uns
+igned) (sql_parse.cc:885)
+==9295==    by 0x82502C3: do_command(THD*) (sql_parse.cc:654)
+==9295==    by 0x823E1D7: handle_one_connection (sql_connect.cc:1089)
+==9295==    by 0x4062111: start_thread (in /lib/libpthread-2.5.so)
+    (the line 387 it refers to is the DBUG_PRINT below)
+  */
   DBUG_PRINT("backup", ("inserting database %s into %s",
     db_table_name.c_ptr(), create_str->c_ptr()));
   DBUG_RETURN(create_str->replace(index, table_name.length(), db_table_name));
@@ -417,11 +461,11 @@
   DBUG_ENTER("get_table_metadata");
   /*
     Check to see if tables exist and open them. Abort if
-    something is wrong or tables cannot be locked.
+    something is wrong.
   */
-  table->lock_type= TL_READ;
   DBUG_PRINT("metadata_backup", ("opening the tables"));
-  if (open_and_lock_tables(thd, table))
+  uint counter;
+  if (open_tables(thd, &table, &counter, 0))
   {
     DBUG_PRINT("metadata_backup", ( "error opening tables!" ));
     DBUG_RETURN(-1);

--- 1.8/storage/myisam/CMakeLists.txt	2006-12-31 01:37:58 +01:00
+++ 1.9/storage/myisam/CMakeLists.txt	2007-05-15 18:09:19 +02:00
@@ -30,7 +30,8 @@
 				mi_rfirst.c mi_rlast.c mi_rnext.c mi_rnext_same.c mi_rprev.c mi_rrnd.c
 				mi_rsame.c mi_rsamepos.c mi_scan.c mi_search.c mi_static.c mi_statrec.c
 				mi_unique.c mi_update.c mi_write.c rt_index.c rt_key.c rt_mbr.c
-				rt_split.c sort.c sp_key.c ft_eval.h myisamdef.h rt_index.h mi_rkey.c)
+				rt_split.c sort.c sp_key.c ft_eval.h myisamdef.h rt_index.h mi_rkey.c
+				mi_backup_log.c mi_examine_log.c myisam_backup_driver.cc)
 
 ADD_EXECUTABLE(myisam_ftdump myisam_ftdump.c)
 TARGET_LINK_LIBRARIES(myisam_ftdump myisam mysys dbug strings zlib wsock32)

--- 1.32/mysql-test/r/backup.result	2007-04-12 09:06:32 +02:00
+++ 1.33/mysql-test/r/backup.result	2007-05-15 18:09:19 +02:00
@@ -55,15 +55,44 @@
 LOCK TABLES `tasking` WRITE;
 INSERT INTO `tasking` VALUES ('333445555','405',23),('123763153','405',33.5),('921312388','601',44),('800122337','300',13),('820123637','300',9.5),('830132335','401',8.5),('333445555','300',11),('921312388','500',13),('800122337','300',44),('820123637','401',500.5),('830132335','400',12),('333445665','600',300.25),('123654321','607',444.75),('123456789','300',1000);
 UNLOCK TABLES;
+DROP TABLE IF EXISTS `tasking2`;
+Warnings:
+Note	1051	Unknown table 'tasking2'
+CREATE TABLE `tasking2` (
+`id` char(9),
+`project_number` char(9),
+`hours_worked` double(10,2),
+index(`hours_worked`)
+) delay_key_write=1 ENGINE=MyISAM DEFAULT CHARSET=latin1;
+LOCK TABLES `tasking2` WRITE;
+INSERT INTO `tasking2` VALUES ('333445555','405',23),('123763153','405',33.5),('921312388','601',44),('800122337','300',13),('820123637','300',9.5),('830132335','401',8.5),('333445555','300',11),('921312388','500',13),('800122337','300',44),('820123637','401',500.5),('830132335','400',12),('333445665','600',300.25),('123654321','607',444.75),('123456789','300',1000);
+UNLOCK TABLES;
+use db1;
+SELECT COUNT(*) FROM `building`;
+COUNT(*)
+5
+SELECT COUNT(*) FROM `directorate`;
+COUNT(*)
+3
+use db2;
+SELECT COUNT(*) FROM `tasking`;
+COUNT(*)
+0
+SELECT COUNT(*) FROM `staff`;
+COUNT(*)
+10
+SELECT COUNT(*) FROM `tasking2`;
+COUNT(*)
+14
 BACKUP DATABASE db1,db2 TO 'test.ba';
 Backup Summary
 Backed up   2 tables in database db1.
-Backed up   2 tables in database db2.
+Backed up   3 tables in database db2.
  
- header     =     1028 bytes
- data       =     2276 bytes
+ header     =     1298 bytes
+ data       =     5760 bytes
               --------------
- total            3304 bytes
+ total            7058 bytes
 DROP DATABASE db1;
 DROP DATABASE db2;
 SHOW BACKUP 'test.ba';
@@ -91,16 +120,22 @@
   `project_number` char(9) DEFAULT NULL,
   `hours_worked` double(10,2) DEFAULT NULL
 ) ENGINE=BLACKHOLE DEFAULT CHARSET=latin1
+db2	tasking2	CREATE TABLE `db2`.`tasking2` (
+  `id` char(9) DEFAULT NULL,
+  `project_number` char(9) DEFAULT NULL,
+  `hours_worked` double(10,2) DEFAULT NULL,
+  KEY `hours_worked` (`hours_worked`)
+) ENGINE=MyISAM DEFAULT CHARSET=latin1 DELAY_KEY_WRITE=1
 USE mysql;
 RESTORE DATABASE FROM 'test.ba';
 Restore Summary
 Restored   2 tables in database db1.
-Restored   2 tables in database db2.
+Restored   3 tables in database db2.
  
- header     =     1028 bytes
- data       =     2276 bytes
+ header     =     1298 bytes
+ data       =     5760 bytes
               --------------
- total            3304 bytes
+ total            7058 bytes
 USE db1;
 SHOW TABLES;
 Tables_in_db1
@@ -123,6 +158,14 @@
 Tables_in_db2
 staff
 tasking
+tasking2
+SHOW CREATE TABLE tasking;
+Table	Create Table
+tasking	CREATE TABLE `tasking` (
+  `id` char(9) DEFAULT NULL,
+  `project_number` char(9) DEFAULT NULL,
+  `hours_worked` double(10,2) DEFAULT NULL
+) ENGINE=BLACKHOLE DEFAULT CHARSET=latin1
 SELECT * FROM staff;
 id	first_name	mid_name	last_name	sex	salary	mgr_id
 333445555	John	Q	Smith	M	30000	333444444
@@ -135,12 +178,37 @@
 333445665	Edward	E	Engles	M	25000	333445555
 123654321	Beware	D	Borg	F	55000	333444444
 123456789	Wilma	N	Maxima	F	43000	333445555
-SHOW CREATE TABLE tasking;
+CHECK TABLE staff EXTENDED;
+Table	Op	Msg_type	Msg_text
+db2.staff	check	warning	1 client is using or hasn't closed the table properly
+db2.staff	check	status	OK
+SHOW CREATE TABLE tasking2;
 Table	Create Table
-tasking	CREATE TABLE `tasking` (
+tasking2	CREATE TABLE `tasking2` (
   `id` char(9) DEFAULT NULL,
   `project_number` char(9) DEFAULT NULL,
-  `hours_worked` double(10,2) DEFAULT NULL
-) ENGINE=BLACKHOLE DEFAULT CHARSET=latin1
+  `hours_worked` double(10,2) DEFAULT NULL,
+  KEY `hours_worked` (`hours_worked`)
+) ENGINE=MyISAM DEFAULT CHARSET=latin1 DELAY_KEY_WRITE=1
+SELECT * FROM tasking2;
+id	project_number	hours_worked
+333445555	405	23.00
+123763153	405	33.50
+921312388	601	44.00
+800122337	300	13.00
+820123637	300	9.50
+830132335	401	8.50
+333445555	300	11.00
+921312388	500	13.00
+800122337	300	44.00
+820123637	401	500.50
+830132335	400	12.00
+333445665	600	300.25
+123654321	607	444.75
+123456789	300	1000.00
+CHECK TABLE tasking2 EXTENDED;
+Table	Op	Msg_type	Msg_text
+db2.tasking2	check	warning	1 client is using or hasn't closed the table properly
+db2.tasking2	check	status	OK
 DROP DATABASE IF EXISTS db1;
 DROP DATABASE IF EXISTS db2;

--- 1.32/mysql-test/t/backup.test	2007-04-11 22:29:50 +02:00
+++ 1.33/mysql-test/t/backup.test	2007-05-15 18:09:19 +02:00
@@ -55,6 +55,11 @@
 -- Table structure for table `staff`
 --
 
+# TODO: test with INDEX DIRECTORY and DATA DIRECTORY clauses
+# (cannot do it here as it would disable the test under Windows)
+# Test with more concurrency (to see it's an online backup)
+# Test with --myisam-use-mmap
+
 DROP TABLE IF EXISTS `staff`;
 CREATE TABLE `staff` (
   `id` char(9),
@@ -93,6 +98,29 @@
 INSERT INTO `tasking` VALUES ('333445555','405',23),('123763153','405',33.5),('921312388','601',44),('800122337','300',13),('820123637','300',9.5),('830132335','401',8.5),('333445555','300',11),('921312388','500',13),('800122337','300',44),('820123637','401',500.5),('830132335','400',12),('333445665','600',300.25),('123654321','607',444.75),('123456789','300',1000);
 UNLOCK TABLES;
 
+DROP TABLE IF EXISTS `tasking2`;
+CREATE TABLE `tasking2` (
+  `id` char(9),
+  `project_number` char(9),
+  `hours_worked` double(10,2),
+  index(`hours_worked`)
+) delay_key_write=1 ENGINE=MyISAM DEFAULT CHARSET=latin1;
+
+--
+-- Dumping data for table `tasking2`
+--
+
+LOCK TABLES `tasking2` WRITE;
+INSERT INTO `tasking2` VALUES ('333445555','405',23),('123763153','405',33.5),('921312388','601',44),('800122337','300',13),('820123637','300',9.5),('830132335','401',8.5),('333445555','300',11),('921312388','500',13),('800122337','300',44),('820123637','401',500.5),('830132335','400',12),('333445665','600',300.25),('123654321','607',444.75),('123456789','300',1000);
+UNLOCK TABLES;
+
+use db1;
+SELECT COUNT(*) FROM `building`;
+SELECT COUNT(*) FROM `directorate`;
+use db2;
+SELECT COUNT(*) FROM `tasking`;
+SELECT COUNT(*) FROM `staff`;
+SELECT COUNT(*) FROM `tasking2`;
 
 BACKUP DATABASE db1,db2 TO 'test.ba';
 
@@ -119,8 +147,13 @@
 USE db2;
 SHOW TABLES;
 
-SELECT * FROM staff;
 SHOW CREATE TABLE tasking;
+
+SELECT * FROM staff;
+CHECK TABLE staff EXTENDED;
+SHOW CREATE TABLE tasking2;
+SELECT * FROM tasking2;
+CHECK TABLE tasking2 EXTENDED;
 
 DROP DATABASE IF EXISTS db1;
 DROP DATABASE IF EXISTS db2;

--- 1.163/sql/sql_repl.cc	2007-03-01 15:24:03 +01:00
+++ 1.164/sql/sql_repl.cc	2007-05-15 18:09:19 +02:00
@@ -1566,7 +1566,10 @@
 }
 
 
-int log_loaded_block(IO_CACHE* file)
+int log_loaded_block(IO_CACHE* file,
+                     const byte *buffert __attribute__((unused)),
+                     uint count __attribute__((unused)),
+                     my_off_t filepos __attribute__((unused)))
 {
   LOAD_FILE_INFO *lf_info;
   uint block_len ;

--- 1.44/sql/sql_repl.h	2006-12-31 01:06:36 +01:00
+++ 1.45/sql/sql_repl.h	2007-05-15 18:09:19 +02:00
@@ -66,7 +66,7 @@
   bool wrote_create_file, log_delayed;
 } LOAD_FILE_INFO;
 
-int log_loaded_block(IO_CACHE* file);
+int log_loaded_block(IO_CACHE *, const byte *, uint, my_off_t);
 
 #endif /* HAVE_REPLICATION */
 

--- 1.4/include/atomic/nolock.h	2006-12-23 20:33:28 +01:00
+++ 1.5/include/atomic/nolock.h	2007-05-15 18:09:19 +02:00
@@ -30,7 +30,13 @@
 
 #ifdef make_atomic_cas_body
 
-typedef struct { } my_atomic_rwlock_t;
+/*
+  We need a type which uses the least possible memory as it will not be used.
+  Empty struct has different size in C vs C++.
+  0-length array does not have this problem (size is 0) but is gcc-specific.
+  For now we use char.
+*/
+typedef char my_atomic_rwlock_t;
 #define my_atomic_rwlock_destroy(name)
 #define my_atomic_rwlock_init(name)
 #define my_atomic_rwlock_rdlock(name)
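
The size claim in the new nolock.h comment can be checked directly. A small
sketch, compiled as C++: the standard guarantees sizeof(char) == 1, while an
empty struct has nonzero size in C++ (gcc's C mode gives it size 0 as an
extension, which is the mismatch the comment refers to). Empty_struct,
empty_struct_size and rwlock_placeholder_size are names made up for this
example.

```cpp
#include <cassert>
#include <cstddef>

struct Empty_struct { };                // what the old typedef declared
typedef char my_atomic_rwlock_t;        // what the patch uses instead

// Size of the old placeholder type when compiled as C++.
inline std::size_t empty_struct_size() { return sizeof(Empty_struct); }

// Size of the new placeholder type; identical in C and C++.
inline std::size_t rwlock_placeholder_size()
{
  return sizeof(my_atomic_rwlock_t);
}
```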